Some work I have been doing recently has involved setting up a PostGIS database to store spatio-temporal data and corresponding geometries. I like to do my analysis in R, so I needed to import the well-known binary (WKB) geometries into the R environment. If you have drivers for PostGIS in your gdal installation, this is straightforward, and instructions are here. However, getting the drivers can be tricky on Windows and was not something I wanted to spend time doing. Luckily, it is possible to get around the problem by returning the geometries as well-known text (WKT) from PostGIS and converting them to spatial objects in R. Lee Hachadoorian describes a way of doing this here. The issue with Lee’s method (as he points out) is that it uses loops, which most R users know are very inefficient. I wanted to avoid loops and came up with the following solution using RPostgreSQL, rgeos and sp:
#Load the required packages require(RPostgreSQL)
require(sp) # create driver dDriver <- dbDriver("PostgreSQL") #connect to the server, replacing with your credentials conn <- dbConnect(dDriver, user="user", password="password", dbname="dbname")
In the following query I am selecting the ID (id) and geometry (geom) columns from a table called ‘your_geometry’ that fall within a specified bounding box. The function
ST_AsText is used to convert WKB to WKT.
# Select and return the geometry as WKT rs <- dbSendQuery(conn, 'select id, ST_AsText(geom) from your_geometry where your_geometry.geom && ST_MakeEnvelope(-0.149, 51.51, -0.119, 51.532, 4326);') # Fetch the results res <- fetch(rs, -1) dbClearResult(rs)
readWKT function converts the WKT to R Spatial objects such as SpatialPoints, SpatialLines or SpatialPolygons. In this case I am using lines.
# Use the readWKT function to create a list of SpatialLines # objects from the PostGIS geometry column str <- lapply(res[,2], "readWKT", p4s=CRS("+proj=longlat +datum=WGS84")) # Add the IDs to the SpatialLines objects using spChFIDs coords <- mapply(spChFIDs, str, as.character(res[,1]))
Now it is a case of creating a SpatialLinesDataFrame and adding the remaining attributes from the attribute table.
# Query the remaining fields in the attribute table rs <- dbSendQuery(conn, 'select * from your_geometry where your_geometry.geom && ST_MakeEnvelope(-0.149, 51.51, -0.119, 51.532, 4326);') res <- fetch(rs, -1) dbClearResult(rs) # Create a SpatialLinesDataFrame with the geometry and the # attribute table rownames(res) <- res[,1] # Here I assume the geometry is in the last column and remove it from the attributes data <- SpatialLinesDataFrame(SpatialLines(unlist(lapply(coords, function(x) x@lines)),proj4string=CRS("+proj=longlat +datum=WGS84")), res[,-ncol(res)])
The tricky bit of this was working out the way that Line, Lines, SpatialLines and SpatialLinesDataFrame objects interact. readWKT makes SpatialLines objects. If you try to create a SpatialLinesDataFrame from the result, it will give you a single feature with space for one attribute. Therefore, it is necessary to extract the Lines from the SpatialLines, and then convert them back to individual SpatialLines objects… This was a real pain to work out, so I thought I would post it in case it is useful for anyone.
Note that I have queried the same data twice; first to select the geometry as WKT and then to get the remaining attributes. I did this because the query was fast and I didn’t want to manually type out the column names of the attribute table in the first query. It would be easy to do the whole operation in one query.
If you have any ways of simplifying this or speeding it up further I would be interested to know!