This tutorial walks you through reading a shapefile into R and using it in a ggplot2 visualization.
I tend to spend my time massaging and visualizing data, whether for work or just for fun (this website). Some of my tutorials are born from a question emailed to me, others from questions I ask myself. This tutorial is one of the latter. I recently found a really spiffy shapefile of the world's roads, which I mentioned and used in my Multiplot tutorial, and I have been looking for other ways to use it.
It just so happened that this week I was reading through the R Graphics Cookbook (available here) by Winston Chang and came across his use of the isabel dataset from the gcookbook package. This dataset contains temperature and wind data from Hurricane Isabel, and he uses it to demonstrate geom_segment visualizations. I thought maybe I could combine the two. A couple of Google searches later, to make sure I wasn't duplicating others' work, here we are.
- To start, we need to load all the packages we are going to use. ggplot2 is the graphing package; rgdal reads the shapefiles, as it provides R bindings to the Geospatial Data Abstraction Library (GDAL); gcookbook is of course where we get the data; and plyr provides the join that brings it all together.
library(ggplot2)
library(rgdal)
library(plyr)
library(gcookbook)
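- Then we massage the isabel dataset into a data frame we can plot. The plotting code at the end of this tutorial expects that data frame to be called hurricane, with columns x, y, vx and vy.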
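The massaging code itself is not reproduced in this step, so the following is only a sketch of one plausible version. It assumes the lowest-altitude slice of isabel, which is the slice the R Graphics Cookbook example works with. The name hurricane is chosen to match the plotting code later on; any further thinning or filtering the original may have done is unknown.

```r
library(gcookbook)  # provides the isabel data set

# Assumption: keep only the lowest-altitude slice of the hurricane data.
hurricane <- subset(isabel, z == min(z))

# Columns the final plot relies on: position (x, y) and wind components (vx, vy).
head(hurricane[, c("x", "y", "vx", "vy")])
```

Here x and y are longitude and latitude, so the segments can be drawn directly on top of the road map.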
- Now we use the readOGR function from the rgdal package to read in our shapefile. One important thing, if you do not know it already: R works within its working directory. So for these next few lines you will need to make sure the shapefiles you have downloaded from here are in your working directory. Because this data is specific to North America, I only downloaded the North America (NA) version. A shapefile is really a set of files sharing one base name (.shp, .shx, .dbf and so on); readOGR reads the whole set, with dsn pointing at the directory and layer giving the base name without an extension. We then give each road an id, flatten the geometry into points with fortify, and use plyr to join the attribute data back on, which leaves us with a data.frame we can plot.
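# Read the layer from the working directory (dsn = ".")
na <- readOGR(dsn = ".", layer = "ne_10m_roads_north_america")
# Give every feature an id so geometry and attributes can be matched up
na@data$id <- rownames(na@data)
# Flatten the line geometry into a data.frame of long/lat points
na.points <- fortify(na, region = "id")
# Attach the attribute columns to the points
na.df <- join(na.points, na@data, by = "id")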
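If readOGR complains that it cannot open the layer, the usual cause is that the files are not where R is looking. A small sketch of a check, assuming the shapefile was unzipped directly into the working directory under the layer name used above (the variable names layer, parts and present are just for illustration):

```r
# Assumption: the layer name matches the one passed to readOGR.
layer <- "ne_10m_roads_north_america"

getwd()  # the directory R is currently working in

# A shapefile needs at least these three companion files to be readable.
parts <- paste0(layer, c(".shp", ".shx", ".dbf"))
present <- file.exists(parts)
setNames(present, parts)
```

Any FALSE in the result means that file is missing from the working directory. Either move the files there or point R at their folder with setwd().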
- Now we create the map and adjust it to how we want it to appear. There is a lot going on here, so let me explain briefly. geom_path is the shape the data will be plotted in; it connects each road's points in the order they appear in the data, whereas geom_line would connect them in order of their x value and distort any road that doubles back. To play with the look, try adjusting alpha, which is the opacity of the lines, and color, which is self-explanatory. The theme call mostly affects the background: the element_blank entries get rid of the axes, gridlines, labels and so on. The element_rect is important because it sets the background color; you could change it to black, or any other color, to see how it works.
map <- ggplot(data = na.df) +
  geom_path(aes(long, lat, group = group), alpha = .5, color = "dark gray") +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major = element_blank(),
        panel.background = element_rect(fill = "white"),
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        axis.title = element_blank())
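To experiment with the grouping and theme settings without loading the full road network, here is a small sketch on a made-up data frame. The toy data and the names toy and dark_map are hypothetical stand-ins for na.df and map. It uses geom_path, which connects points in data order, and swaps the background to black with light lines.

```r
library(ggplot2)

# Hypothetical stand-in for na.df: two short "roads", told apart by group.
toy <- data.frame(
  long  = c(-80, -79, -78, -80, -79, -78),
  lat   = c(30, 31, 30, 33, 32, 33),
  group = rep(c("a", "b"), each = 3)
)

dark_map <- ggplot(data = toy) +
  # group keeps the two roads from being joined into one line
  geom_path(aes(long, lat, group = group), alpha = 0.8, color = "white") +
  theme(panel.grid.minor = element_blank(),
        panel.grid.major = element_blank(),
        panel.background = element_rect(fill = "black"),
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        axis.title = element_blank())

dark_map
```

Dropping group = group from the aes call is a quick way to see why it matters: the end of one road gets connected to the start of the next.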
- Finally, let's add the hurricane data to the map we just made. Each segment starts at a point (x, y) and ends at that point plus the wind components vx and vy, scaled down by 50 so the segments stay short. The two pieces that matter most are the scales: their limits restrict the plot to the area where the hurricane took place. You can delete these two scale calls to see the full map with the wind segments plotted on top.
map +
  geom_segment(data = hurricane,
               aes(x = x, y = y, xend = x + vx/50, yend = y + vy/50),
               color = "blue", size = .25, alpha = .15) +
  scale_x_continuous(limits = c(-85, -68)) +
  scale_y_continuous(limits = c(24, 42))
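One thing worth knowing about scale limits is that they discard any data outside the range before drawing. ggplot2 may therefore warn about removed rows, and roads that cross the edge of the window can be cut short. If you would rather zoom than filter, coord_cartesian is an alternative. The sketch below compares the two on a made-up road; the names road, base, filtered and zoomed are hypothetical.

```r
library(ggplot2)

# Hypothetical road that starts west of the hurricane window and runs into it.
road <- data.frame(long = c(-95, -80, -70), lat = c(30, 32, 35))

base <- ggplot(road, aes(long, lat)) + geom_path()

# Scale limits censor out-of-range values, so the first point is lost.
filtered <- base +
  scale_x_continuous(limits = c(-85, -68)) +
  scale_y_continuous(limits = c(24, 42))

# coord_cartesian only zooms: every point is kept and the line is clipped at the edge.
zoomed <- base + coord_cartesian(xlim = c(-85, -68), ylim = c(24, 42))

sum(is.na(layer_data(filtered)$x))  # points censored by the scale limits
sum(is.na(layer_data(zoomed)$x))    # points censored when zooming
```

For this tutorial's plot, that would mean replacing the two scale calls with a single coord_cartesian call using the same limits.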
This is just a quick tutorial on how to integrate shapefiles into your ggplots. One takeaway for me is the potential of gradients, which I may cover in a separate post: a color gradient could make for a nicer-looking plot and change how the data comes across. In the meantime, I hope this provided a few people with either some help or a fun project to recreate.