Mapping with an 800 Pound Gorilla

I’ve been focusing on python recently to become a bi-lingual data scientist. Probably my least favorite thing about python is its plotting libraries - there are too many options built on top of matplotlib which pre-dates pandas dataframes. This makes for some clunky code and blurry boundaries (both “is that a seaborn, pandas, or matplotlib function?” and situations with 3 equally messy solutions but in very different ways). In my opinion, ggplot2’s deep interplay with dataframes makes a lot more sense and ggplot’s layers make it easy to change plot type (just switch the geom_), add facets, and tweak aesthetics. I think this post does a great job comparing the main python plotting libraries and also illustrates why matplotlib turns me off a bit - too many loops!

Despite matplotlib’s issues, it can make some really nice plots! Today I’ll be doing a little mapping so show off it’s prettier side.

Data

The data I’ll be using is of the places I’ve lived - GPS coordinates of 9 cities in 7 states (for security reasons I’m omitting where I was born). I’m going to use R’s maps package to get most of the data (because it’s soooo easy).

library(dplyr)
library(ggplot2)
library(maps)

cities <- filter(maps::us.cities, name %in% c("Reno NV", "Sacramento CA", "Salt Lake City UT", "Tacoma WA", "Boise ID", "Tulare CA", "Harrisonburg VA", "Charlotte NC", "Seattle WA"))

This dataframe was saved to a .csv for later use. The rest will be in python to draw a pretty map, overlay some points and connect the points with great circles.

Basemap

There are two main ways I know of to draw maps in python: matplotlib’s basemap toolkit and geopandas. As you might expect, geopandas is generally a lot nicer to use, but can be limiting. I’m going to use basemap since most of the data I need to plot is just to get the background map drawn and basemap has that built in (no shapefiles or extra datasets).

First, we’ll load our python libraries and set-up the underlying map.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from mpl_toolkits.basemap import Basemap
m = Basemap(llcrnrlon=-130.,llcrnrlat=20.,urcrnrlon=-60. , urcrnrlat=50., rsphere=(6378137.00,6356752.3142),
            resolution='l',projection='merc',
            lat_0=35.,lon_0=-95.)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.fillcontinents()
plt.title("The US")
plt.show()

The Basemap command has many options, here I’ve set the lower-left corner and upper-right corner, the Mercator projection, low resolution, and a center of \(35^{\circ}N\), \(95^{\circ}W\). This will include the first 48 states of the US roughly centered. Then we add coastlines, states, countries, and fill the land. These last commands are what makes Basemap nice - I don’t need shapefiles or other data to get all this detail onto the map.

Adding Points

Now I can add points for the cities. First, I’ll load the dataframe of the gps data and build lists of the order in which I moved.

city = pd.read_csv('../../static/files/city-gps2.csv')
print(city.head())
#lists of longitude and latitude in order.
##               name     pop    lat    long
## 0         Boise ID  193628  43.61 -116.23
## 1     Charlotte NC  607111  35.20  -80.83
## 2  Harrisonburg VA   41992  38.44  -78.87
## 3          Reno NV  206626  39.54 -119.82
## 4    Sacramento CA  480392  38.57 -121.47
clon = [city.long[3], city.long[8], city.long[4], 
        city.long[7], city.long[5], city.long[1], 
        city.long[6], city.long[7], city.long[2], 
        city.long[0]]
clat = [city.lat[3], city.lat[8], city.lat[4], 
        city.lat[7], city.lat[5], city.lat[1], 
        city.lat[6], city.lat[7], city.lat[2], 
        city.lat[0]]

Next, we add these points ((clon,clat) pairs) to the earlier basemap.

x,y = m(clon, clat)
m.scatter(x,y,marker='*', color='r', zorder=5) #the zorder is important
plt.show()

Connecting the Points

Now we can connect the points. If I were really bored, I could trace out the actual paths driven to get from one place to another. I’m not, so I’m just going to use Basemap’s drawgreatcircle() method to connect points with a great circle. This is why having the lists of latitude and longitude was really done.

for i in range(0,len(clon)-1):
    m.drawgreatcircle(clon[i],clat[i],clon[i+1],clat[i+1],linewidth=4,color='b')
plt.title("Places Lived with Great Circle Moves")
plt.show()

Not bad, but adding an alpha setting to change transparency based on the order of moves helps. The earliest move will be at \(1/3\) (light, but visible) while the most recent will be \(1\), with a linear increment. I’m also going to close the old plot and put all the mapping code together here.

plt.close() 
m = Basemap(llcrnrlon=-130.,llcrnrlat=20.,urcrnrlon=-60. , urcrnrlat=50., rsphere=(6378137.00,6356752.3142),
            resolution='l',projection='merc',
            lat_0=35.,lon_0=-95.)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.shadedrelief()
x,y = m(clon, clat)
m.scatter(x,y,marker='*', color='r', zorder=5)
for i in range(0,len(clon)-1):
    m.drawgreatcircle(clon[i],clat[i],clon[i+1],clat[i+1],linewidth=4,color='b', alpha=((1/3)+i*(2/27)))
plt.title("Places Lived with Great Circle Moves")
plt.show()

You may have noticed that I used .shadedrelief() instead of .fillcontinents(). Basemap has several similar options to provide some real quality to maps. Here are the .etopo() and .bluemarble() options as well:

plt.close()
plt.subplot(1,2,1)
m2 = Basemap(llcrnrlon=-130.,llcrnrlat=20.,urcrnrlon=-60. , urcrnrlat=50., rsphere=(6378137.00,6356752.3142),
            resolution='l',projection='merc',
            lat_0=35.,lon_0=-95.)
m2.drawcoastlines()
m2.drawstates()
m2.drawcountries()
m2.bluemarble()
x,y = m2(clon, clat)
m2.scatter(x,y,marker='*', color='r', zorder=5)
for i in range(0,len(clon)-1):
    m2.drawgreatcircle(clon[i],clat[i],clon[i+1],clat[i+1],linewidth=4,color='b', alpha=(.3+i*(2/27)))
plt.subplot(1,2,2)
m3 = Basemap(llcrnrlon=-130.,llcrnrlat=20.,urcrnrlon=-60. , urcrnrlat=50., rsphere=(6378137.00,6356752.3142),
            resolution='l',projection='merc',
            lat_0=35.,lon_0=-95.)
m3.drawcoastlines()
m3.drawstates()
m3.drawcountries()
m3.etopo()
x,y = m3(clon, clat)
m3.scatter(x,y,marker='*', color='r', zorder=5)
for i in range(0,len(clon)-1):
    m3.drawgreatcircle(clon[i],clat[i],clon[i+1],clat[i+1],linewidth=4,color='b', alpha=(.3+i*(2/27)))
plt.show()

It’s no ggplot2, but very nice for an 800 pound gorilla*.

Related