Monday, October 24, 2016

Static, multiple, to interactive - part I

Introduction:

This post was inspired by an article I read online by Nathan Yau comparing R and D3. In it, he briefly describes which technology he uses to generate a static versus an interactive visualization.

One of the advantages of using R is that you can add interactivity to a visualization quite easily with packages such as shiny, googleVis and plotly, to name a few. However, as Nathan pointed out, R is not very flexible when it comes to fully custom interactivity.

In this blog post I will introduce readers to generating a simple static map in R using the basic R plot() function. In parts II and III of this post the reader will also learn to add different elements to the map, such as text, rectangles and a legend. The maps generated in this post were inspired by the data analysis performed by Susie Lu, who used a unique method to generate a legend for her maps. In this post and the next we will try to replicate it in R.

Setting up:

The first step is to load all the necessary packages in R. For the current post we will use the dplyr and maps packages. If they are not already installed, they can be installed with install.packages(). Once the packages are installed, we load them into the current session using the library() function in R.

library(dplyr)
library(maps)
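
If the packages are not installed yet, the library() calls above will fail. A minimal sketch of one way to check for this first is shown here; the helper name missing_packages() is hypothetical and not part of any package, and the actual install call is left commented out so that nothing is downloaded unintentionally.

```r
# Hypothetical helper: returns the names of packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The two packages used in this post
to_install <- missing_packages(c("dplyr", "maps"))

# Uncomment the next line to install whatever is missing from CRAN
# if (length(to_install) > 0) install.packages(to_install)
```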

Data:

The data used for generating the spatial plot can be downloaded from the Data.gov website. Alternatively, you can download the file here.

lct <- read.csv("locations.csv", stringsAsFactors = FALSE)

A quick look at the data reveals that it consists of 59 variables and 8675 observations. The dim() function provides the dimensions of the data, and the colnames() function lets us review the list of variables in the data.

dim(lct)
## [1] 8675   59
colnames(lct)
##  [1] "FMID"          "MarketName"    "Website"       "Facebook"     
##  [5] "Twitter"       "Youtube"       "OtherMedia"    "street"       
##  [9] "city"          "County"        "State"         "zip"          
## [13] "Season1Date"   "Season1Time"   "Season2Date"   "Season2Time"  
## [17] "Season3Date"   "Season3Time"   "Season4Date"   "Season4Time"  
## [21] "x"             "y"             "Location"      "Credit"       
## [25] "WIC"           "WICcash"       "SFMNP"         "SNAP"         
## [29] "Organic"       "Bakedgoods"    "Cheese"        "Crafts"       
## [33] "Flowers"       "Eggs"          "Seafood"       "Herbs"        
## [37] "Vegetables"    "Honey"         "Jams"          "Maple"        
## [41] "Meat"          "Nursery"       "Nuts"          "Plants"       
## [45] "Poultry"       "Prepared"      "Soap"          "Trees"        
## [49] "Wine"          "Coffee"        "Beans"         "Fruits"       
## [53] "Grains"        "Juices"        "Mushrooms"     "PetFood"      
## [57] "Tofu"          "WildHarvested" "updateTime"

Finally, we use the summary() function to get a general idea of the distribution of the data. Most of the variables in the data are of type character. However, the latitude and longitude data are numeric. A first look at the data reveals that the dataset contains some NA values. Rows containing them can be removed using the na.omit() function in R.

summary(lct)
# Drop every row that contains at least one NA value
lct <- na.omit(lct)
dim(lct)

The number of observations in the data has been reduced to 8646.
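
To see what na.omit() does, here is a small sketch on a made-up data frame. The data frame toy and its values are assumptions for illustration only and are not taken from the farmers market file. Note that na.omit() drops a row if any column is NA, whereas complete.cases() can restrict the check to the coordinate columns.

```r
# Made-up data: three markets, two of them with a missing coordinate
toy <- data.frame(
  MarketName = c("A", "B", "C"),
  x = c(-77.0, NA, -122.3),   # longitude
  y = c(38.9, 40.7, NA),      # latitude
  stringsAsFactors = FALSE
)

# Count the NA values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Remove every row that has an NA in any column
clean <- na.omit(toy)
print(dim(clean))

# Alternative: only require the coordinate columns to be present
coords_ok <- toy[complete.cases(toy[, c("x", "y")]), ]
```

Empty strings in character columns are not NA, so na.omit() leaves such rows in place.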

Quick Map:

Inspecting the data, we realize that all the farmers markets have been assigned a longitude and latitude. The variable names corresponding to these locations are x and y respectively. We can generate a map using the plot() function. The first two arguments to plot() are the points to be plotted on the x and y axes. The type="p" argument instructs R to plot points and the pch=19 argument instructs R to plot filled points. To learn more about the plot function, type ?plot in the R console window.

plot(lct$x,lct$y, type="p", pch =19)

Even though the result looks like a map of the USA, it is not very appealing or informative. Firstly, the points are too big. Secondly, the map is enclosed within a box and has default axis labels and tick marks. Finally, there is no title.

Adding style elements:

To make the map more appealing:

  • Add a different color by using the col="#e6550d" argument.
  • Reduce the size of the points using the cex=0.1 argument.
  • Remove the box surrounding the map using the bty="n" argument.
  • Remove the axis labels using the xlab="" and ylab="" arguments.
  • Remove the axes and their tick marks using the axes=FALSE argument.
  • Add labels to the plot using the mtext() function.

par(mar=c(5,6,4,6))
plot(lct$x,lct$y, type="p", pch =19, cex= 0.1, 
     col="#e6550d",bty="n", xlab="",axes=FALSE,
     ylab="")
mtext("Farmers Market in USA", side=3)
mtext("source: data.gov", side=1, cex=0.6)

To learn more about the mtext function, type ?mtext in the R console window.
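
The styling steps above can also be bundled into a small function so that the same look can be reused. The following is only a sketch: the function name plot_markets(), its arguments, and the toy coordinates are hypothetical and chosen for illustration. It saves the current margins and restores them when the function exits, so the par() change does not leak into later plots.

```r
# Hypothetical helper: draws points with the styling used in this post
# and returns (invisibly) the number of points passed in.
plot_markets <- function(lon, lat, title = "Farmers Market in USA",
                         source_note = "source: data.gov") {
  old_par <- par(mar = c(5, 6, 4, 6))   # set margins, remember old settings
  on.exit(par(old_par), add = TRUE)     # restore them when the function exits
  plot(lon, lat, type = "p", pch = 19, cex = 0.1,
       col = "#e6550d", bty = "n", xlab = "", ylab = "", axes = FALSE)
  mtext(title, side = 3)                # title above the plot
  mtext(source_note, side = 1, cex = 0.6)  # small source note below the plot
  invisible(length(lon))
}

# Made-up coordinates, not taken from the farmers market data
toy_lon <- c(-77.0, -122.3, -87.6)
toy_lat <- c(38.9, 47.6, 41.9)
n_plotted <- plot_markets(toy_lon, toy_lat)
```

With the real data, the call would be plot_markets(lct$x, lct$y).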

Conclusion:

In this post we have touched on only a very small part of how to generate maps in R. In the coming weeks we will go a step further to generate multiple plots and also learn to make the map interactive.