Making Maps (cont’d)

Wednesday March 30, 2022

In the past lecture we talked about making maps. Recall there are several ways to do this, e.g., pygmt, matplotlib’s mpl_tookit, cartopy, etc.. In this class we will expand on this a bit and look at how to add data (and/or analysis) to a map.

There is a whole discipline aimed at geospatial analysis (i.e., GIS). Traditionally this has been done with a completely different set of tools, for example the ESRI (now Esri) set (ArcGIS) and the opensource QGIS. Meanwhile, geophysical analysis is done with tools like Matlab.

Data Models

With the different tools (and disciplines) come different data models. In the geosciences we typically deal with “flat” data structures, e.g., a variable expressed as a two (or three) dimensional array where one dimension is latitude the other longitude (third being time and/or depth). The most common formats are ASCII tables, NetCDF and Matlab binary.

GIS applications on the other hand use either raster files or vector files. The former are akin to images (a value as a function of lat/lon). The latter are most commonly known as “shapefiles”

Shapefiles

Shapefiles are usually a series of files. For example, you might have a shapefile of schools in Hawaii. This would consist of schools.shp but also schools.dbf, schools.prj, etc..

QGIS example.

Data and maps

  1. projections and transforms

  2. GIS maps: geopandas (and matplotlib)

  3. GIS maps: geojson and plotly

  4. GIS maps: cartopy and shapely

1. Plotting data on maps: Understanding the transform and projection keywords

Notes taken from https://scitools.org.uk/cartopy/docs/latest/tutorials/understanding_transform.html

It can be easy to get confused about what the projection and transform keyword arguments actually mean. Here we’ll use some simple examples to illustrate the effect of each.

The core concept is that the projection of your axes is independent of the coordinate system your data is defined in. The projection argument is used when creating plots and determines the projection of the resulting plot (i.e. what the plot looks like). The transform argument to plotting functions tells Cartopy what coordinate system your data are defined in.

First we’ll create some dummy data defined on a regular latitude/longitude grid:

[1]:
import numpy as np

# make an example data set, just sin/cos as 2-D field
lon = np.linspace(-80, 80, 25)
lat = np.linspace(30, 70, 25)
lon2d, lat2d = np.meshgrid(lon, lat)
data = np.cos(np.deg2rad(lat2d) * 4) + np.sin(np.deg2rad(lon2d) * 4)

1A. Transform keyword

In this first example, we plot the data with a map using the PlateCarree projection without specifying the transform argument. Since the data happen to be defined in the same coordinate system as we are plotting in, this actually works correctly:

[2]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt


# The projection keyword determines how the plot will look
plt.figure(figsize=(12, 8))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_global()
ax.coastlines()

ax.contourf(lon, lat, data);  # didn't use transform, but looks ok...
../_images/notebooks_0330_LatLonMaps_8_0.png

1B. Add the projection

Again, since the data happen to be defined in the same coordinate system as we are plotting in adding the transformation doesn’t change things.

[3]:
# The data are defined in lat/lon coordinate system, so PlateCarree()
# is the appropriate choice:
data_crs = ccrs.PlateCarree()

# The projection keyword determines how the plot will look
plt.figure(figsize=(6, 3))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_global()
ax.coastlines()

ax.contourf(lon, lat, data, transform=data_crs);
../_images/notebooks_0330_LatLonMaps_10_0.png

1C. Data coordinate system differs from projection

We saw in the previous example that the plot didn’t change since the default assumption when the transform argument is not supplied is that the coordinate system matches the projection, which was indeed the case.

Now we’ll try this again but using a different projection for our plot. We’ll plot onto a rotated pole projection, and we’ll omit the transform argument to see what happens:

[4]:
# Now we plot a rotated pole projection
projection = ccrs.RotatedPole(pole_longitude=-177.5, pole_latitude=37.5)
plt.figure(figsize=(12, 8))
ax = plt.axes(projection=projection)
ax.set_global()
ax.coastlines()

ax.contourf(lon, lat, data);  # didn't use transform, uh oh!
../_images/notebooks_0330_LatLonMaps_12_0.png

The resulting plot is incorrect! We didn’t tell Cartopy what coordinate system our data are defined in, so it assumed it was the same as the projection we are plotting on, and the data are plotted in the wrong place.

We can fix this by supplying the transform argument, which remains the same as before since the data’s coordinate system hasn’t changed:

[5]:
# A rotated pole projection again...
projection = ccrs.RotatedPole(pole_longitude=-177.5, pole_latitude=37.5)
plt.figure(figsize=(6, 3))
ax = plt.axes(projection=projection)
ax.set_global()
ax.coastlines()

# ...but now using the transform argument
ax.contourf(lon, lat, data, transform=data_crs);
../_images/notebooks_0330_LatLonMaps_14_0.png

The safest thing to do is always provide the transform keyword regardless of the projection you are using, and avoid letting Cartopy make assumptions about your data’s coordinate system. Doing so allows you to choose any map projection for your plot and allow Cartopy to plot your data where it should be:

[6]:
# We can choose any projection we like...
projection = ccrs.InterruptedGoodeHomolosine()
plt.figure(figsize=(6, 7))
ax1 = plt.subplot(211, projection=projection)
ax1.set_global()
ax1.coastlines()
ax2 = plt.subplot(212, projection=ccrs.NorthPolarStereo())
ax2.set_extent([-180, 180, 20, 90], crs=ccrs.PlateCarree())
ax2.coastlines()

# ...as long as we provide the correct transform, the plot will be correct
ax1.contourf(lon, lat, data, transform=data_crs)
ax2.contourf(lon, lat, data, transform=data_crs);
../_images/notebooks_0330_LatLonMaps_16_0.png

2. GIS maps: geopandas (and matplotlib)

Geopandas provides a (very) easy way to interact with vector geospatial data (e.g., shapefiles). This provides a powerful tool for actually interacting with data and making plots. In this example we read data from a shapefile (New York City boroughs) and make a simple plot. Then, we read data from a separate file to color-shade the map based on a value.

[7]:
# load matplotlib, pandas, geopandas and pyproj
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
import pyproj
[8]:
# Read in a shapefile and print contents
fp = '../shapefiles/nybb.shp'
map_df = gpd.read_file(fp)
# check data type so we can see that this is not a normal dataframe, but a GEOdataframe
#map_df.head()
[9]:
map_df.head()
[9]:
BoroCode BoroName Shape_Leng Shape_Area geometry
0 5 Staten Island 330454.806607 1.623847e+09 MULTIPOLYGON (((970217.022 145643.332, 970227....
1 1 Manhattan 357176.132581 6.363978e+08 MULTIPOLYGON (((981219.056 188655.316, 980940....
2 2 Bronx 464475.067699 1.186824e+09 MULTIPOLYGON (((1012821.806 229228.265, 101278...
3 3 Brooklyn 742297.830402 1.937844e+09 MULTIPOLYGON (((1021176.479 151374.797, 102100...
4 4 Queens 874225.139404 3.048479e+09 MULTIPOLYGON (((1029606.077 156073.814, 102957...
[10]:
# Make a quick map
map_df.plot();
../_images/notebooks_0330_LatLonMaps_21_0.png
[11]:
# Read data to be used (plotted) on our map
df = pd.read_csv('../shapefiles/population.dat', header=0)
df.head()
[11]:
Borough County Census GDP sqmile sqkm personpsqmile personpsqkm
0 Bronx Bronx 1472654 42.695 42.2 109.3 34920 13482
1 Brooklyn Kings 2736074 91.559 69.4 179.7 39438 15227
2 Manhattan New York 1694251 600.244 22.7 58.8 74781 28872
3 Queens Queens 2405464 93.310 108.7 281.5 22125 8542
4 Staten Island Richmond 495747 14.514 57.5 148.9 8618 3327
[12]:
# merge (technically here we are doing a "spatial join") the two data
# sets.  Note here we specify the column from the first data set (the
# shapefile) that we want to match to the second (the ascii table).
# This would be the actual Burough name, and in the shapefile this is
# called "BoroName" while in the ascii file it's "Borough"
merged = map_df.set_index('BoroName').join(df.set_index('Borough'))
merged.head()
[12]:
BoroCode Shape_Leng Shape_Area geometry County Census GDP sqmile sqkm personpsqmile personpsqkm
BoroName
Staten Island 5 330454.806607 1.623847e+09 MULTIPOLYGON (((970217.022 145643.332, 970227.... Richmond 495747 14.514 57.5 148.9 8618 3327
Manhattan 1 357176.132581 6.363978e+08 MULTIPOLYGON (((981219.056 188655.316, 980940.... New York 1694251 600.244 22.7 58.8 74781 28872
Bronx 2 464475.067699 1.186824e+09 MULTIPOLYGON (((1012821.806 229228.265, 101278... Bronx 1472654 42.695 42.2 109.3 34920 13482
Brooklyn 3 742297.830402 1.937844e+09 MULTIPOLYGON (((1021176.479 151374.797, 102100... Kings 2736074 91.559 69.4 179.7 39438 15227
Queens 4 874225.139404 3.048479e+09 MULTIPOLYGON (((1029606.077 156073.814, 102957... Queens 2405464 93.310 108.7 281.5 22125 8542
[14]:
# We are now all set to make a map.  Here we will plot a map of the five buroughs
# and color-shade them based on some value.
#
# 1. Define the variable that we want to use for our shading (note this is a
#    column heading in the joined file); here is it the number of people ("Census")
variable = 'GDP'

# 2. Set the range for the color values
#vmin, vmax = 500000, 3000000
vmin, vmax = 0, 600
# 3. Initiate figure (fig) and axes (ax) objects
fig, ax = plt.subplots(1, figsize=(14, 14))

# 4. Create map; here we use the "plot" method in geopandas
merged.plot(column=variable, cmap='Blues', linewidth=0.8, ax=ax, edgecolor='0.8')

# 5. Clean up the plot a bit
#   a) remove the axis
ax.axis('off')
#   b) add a title
ax.set_title('GDP of NYC Boroughs', fontdict={'fontsize': '25', 'fontweight' : '3'})
#   c) create an annotation for the data source
ax.annotate('Source: Wikipedia',xy=(0.1, .08),  xycoords='figure fraction',
            horizontalalignment='left', verticalalignment='top', fontsize=12, color='#555555')
#   d) create colorbar as a legend
sm = plt.cm.ScalarMappable(cmap='Blues', norm=plt.Normalize(vmin=vmin, vmax=vmax))
sm._A = []
cbar = fig.colorbar(sm)
../_images/notebooks_0330_LatLonMaps_24_0.png

3. GIS maps: geojson and plotly

In this example we access geospatial data via a JSON request/reply. To plot the data, we will use plotly. This gives a nice, interactive plot that shows how to take advantage of databases within geospatial data.

[15]:
# load package to read from a URL, JSON and pandas for data handling and plotly for plotting
from urllib.request import urlopen
import json
import pandas as pd
import plotly.express as px
[16]:
# set the URL for a GeoJSON file and access
URL = 'https://raw.githubusercontent.com/plotly/datasets/master/'
# the JSON file is "geojson-counties-fips"
with urlopen(URL+'geojson-counties-fips.json') as response:
    counties = json.load(response)
[17]:
# print the head of the file; note is a dictionary
counties['features'][0]
[17]:
{'type': 'Feature',
 'properties': {'GEO_ID': '0500000US01001',
  'STATE': '01',
  'COUNTY': '001',
  'NAME': 'Autauga',
  'LSAD': 'County',
  'CENSUSAREA': 594.436},
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-86.496774, 32.344437],
    [-86.717897, 32.402814],
    [-86.814912, 32.340803],
    [-86.890581, 32.502974],
    [-86.917595, 32.664169],
    [-86.71339, 32.661732],
    [-86.714219, 32.705694],
    [-86.413116, 32.707386],
    [-86.411172, 32.409937],
    [-86.496774, 32.344437]]]},
 'id': '01001'}
[18]:
# Next we read the FIPS from a CSV file; same URL, different file name
df = pd.read_csv(URL+'fips-unemp-16.csv',dtype={"fips": str})
df.head()
[18]:
fips unemp
0 01001 5.3
1 01003 5.4
2 01005 8.6
3 01007 6.6
4 01009 5.5
[19]:
# Make a plot
fig = px.choropleth_mapbox(df, geojson=counties, locations='fips', color='unemp',
                           color_continuous_scale="Viridis",
                           range_color=(0, 12),
                           mapbox_style="carto-positron",
                           zoom=3, center = {"lat": 37.0902, "lon": -95.7129},
                           opacity=0.5,
                           labels={'unemp':'unemployment rate'}
                          )
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

4. GIS maps: cartopy and shapely

[20]:
import cartopy
import cartopy.crs as ccrs
import cartopy.io.shapereader as shpreader
import matplotlib.pyplot as plt
import matplotlib.patheffects as PathEffects
from shapely.geometry import LineString, MultiLineString, MultiPolygon
import shapely.wkt as wkt

4A. read shapefiles and highligh specific country

[21]:
shpfilename = shpreader.natural_earth(resolution='110m',
                                      category='cultural',
                                      name='admin_0_countries')
reader = shpreader.Reader(shpfilename)
countries = reader.records()

country1 = next(countries)
country1.attributes.keys()
[21]:
dict_keys(['featurecla', 'scalerank', 'LABELRANK', 'SOVEREIGNT', 'SOV_A3', 'ADM0_DIF', 'LEVEL', 'TYPE', 'ADMIN', 'ADM0_A3', 'GEOU_DIF', 'GEOUNIT', 'GU_A3', 'SU_DIF', 'SUBUNIT', 'SU_A3', 'BRK_DIFF', 'NAME', 'NAME_LONG', 'BRK_A3', 'BRK_NAME', 'BRK_GROUP', 'ABBREV', 'POSTAL', 'FORMAL_EN', 'FORMAL_FR', 'NAME_CIAWF', 'NOTE_ADM0', 'NOTE_BRK', 'NAME_SORT', 'NAME_ALT', 'MAPCOLOR7', 'MAPCOLOR8', 'MAPCOLOR9', 'MAPCOLOR13', 'POP_EST', 'POP_RANK', 'POP_YEAR', 'GDP_MD', 'GDP_YEAR', 'ECONOMY', 'INCOME_GRP', 'FIPS_10', 'ISO_A2', 'ISO_A2_EH', 'ISO_A3', 'ISO_A3_EH', 'ISO_N3', 'ISO_N3_EH', 'UN_A3', 'WB_A2', 'WB_A3', 'WOE_ID', 'WOE_ID_EH', 'WOE_NOTE', 'ADM0_A3_IS', 'ADM0_A3_US', 'ADM0_A3_FR', 'ADM0_A3_RU', 'ADM0_A3_ES', 'ADM0_A3_CN', 'ADM0_A3_TW', 'ADM0_A3_IN', 'ADM0_A3_NP', 'ADM0_A3_PK', 'ADM0_A3_DE', 'ADM0_A3_GB', 'ADM0_A3_BR', 'ADM0_A3_IL', 'ADM0_A3_PS', 'ADM0_A3_SA', 'ADM0_A3_EG', 'ADM0_A3_MA', 'ADM0_A3_PT', 'ADM0_A3_AR', 'ADM0_A3_JP', 'ADM0_A3_KO', 'ADM0_A3_VN', 'ADM0_A3_TR', 'ADM0_A3_ID', 'ADM0_A3_PL', 'ADM0_A3_GR', 'ADM0_A3_IT', 'ADM0_A3_NL', 'ADM0_A3_SE', 'ADM0_A3_BD', 'ADM0_A3_UA', 'ADM0_A3_UN', 'ADM0_A3_WB', 'CONTINENT', 'REGION_UN', 'SUBREGION', 'REGION_WB', 'NAME_LEN', 'LONG_LEN', 'ABBREV_LEN', 'TINY', 'HOMEPART', 'MIN_ZOOM', 'MIN_LABEL', 'MAX_LABEL', 'NE_ID', 'WIKIDATAID', 'NAME_AR', 'NAME_BN', 'NAME_DE', 'NAME_EN', 'NAME_ES', 'NAME_FA', 'NAME_FR', 'NAME_EL', 'NAME_HE', 'NAME_HI', 'NAME_HU', 'NAME_ID', 'NAME_IT', 'NAME_JA', 'NAME_KO', 'NAME_NL', 'NAME_PL', 'NAME_PT', 'NAME_RU', 'NAME_SV', 'NAME_TR', 'NAME_UK', 'NAME_UR', 'NAME_VI', 'NAME_ZH', 'NAME_ZHT', 'FCLASS_ISO', 'FCLASS_US', 'FCLASS_FR', 'FCLASS_RU', 'FCLASS_ES', 'FCLASS_CN', 'FCLASS_TW', 'FCLASS_IN', 'FCLASS_NP', 'FCLASS_PK', 'FCLASS_DE', 'FCLASS_GB', 'FCLASS_BR', 'FCLASS_IL', 'FCLASS_PS', 'FCLASS_SA', 'FCLASS_EG', 'FCLASS_MA', 'FCLASS_PT', 'FCLASS_AR', 'FCLASS_JP', 'FCLASS_KO', 'FCLASS_VN', 'FCLASS_TR', 'FCLASS_ID', 'FCLASS_PL', 'FCLASS_GR', 'FCLASS_IT', 'FCLASS_NL', 'FCLASS_SE', 'FCLASS_BD', 'FCLASS_UA'])
[23]:
# just like before, create a graphic object using a PlateCarree projection
ax = plt.axes(projection=ccrs.PlateCarree())

# now add a bunch o'features
#  land, ocean, coast, borders, lakes, rivers
ax.add_feature(cartopy.feature.LAND)
ax.add_feature(cartopy.feature.OCEAN)
ax.add_feature(cartopy.feature.COASTLINE)
ax.add_feature(cartopy.feature.BORDERS, linestyle='-', alpha=.5)
ax.add_feature(cartopy.feature.LAKES, alpha=0.95)
ax.add_feature(cartopy.feature.RIVERS)

# 'zoom' into a region
ax.set_extent([-20, 60, 10, 80],crs = ccrs.PlateCarree())

# read in shapefile (note this one comes with our install)
shpfilename = shpreader.natural_earth(resolution='110m',
                                      category='cultural',
                                      name='admin_0_countries')
reader = shpreader.Reader(shpfilename)
countries = reader.records()

# highligh specific country
for country in countries:
    if country.attributes['SOVEREIGNT'] == "Norway":
        ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(0, 1, 0))

plt.rcParams["figure.figsize"] = (20,20)
../_images/notebooks_0330_LatLonMaps_35_0.png

4B. make a map using an attribute from the shapefile

Typically shapefiles have a database attached, and we can access this information in our plots. In this case, we look for the gross regional product (GRP). We can then make a “heat map”, i.e., color shade by this value.

[24]:
ax = plt.axes(projection=ccrs.PlateCarree())
ax.add_feature(cartopy.feature.LAND)
ax.add_feature(cartopy.feature.OCEAN)
ax.add_feature(cartopy.feature.COASTLINE)
ax.add_feature(cartopy.feature.BORDERS, linestyle='-', alpha=.5)
ax.add_feature(cartopy.feature.LAKES, alpha=0.3)
ax.add_feature(cartopy.feature.RIVERS)

ax.set_extent([-150, 60, -25, 60])

shpfilename = shpreader.natural_earth(resolution='110m', category='cultural',
                                      name='admin_0_countries')
reader = shpreader.Reader(shpfilename)
countries = reader.records()

# loop through all countries, color shade as follows:
#   GRP starts with 1: cyan
#   GRP starts with 2: blue
#   GRP starts with 3: navy blue
#   GRP starts with 4: pink
#   GRP starts with 5: red
'''
for country in countries:
    if country.attributes['INCOME_GRP'].startswith("1"):
        ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(0, 1, 1))
    elif country.attributes['INCOME_GRP'].startswith("2"):
        ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(0, 0, 1))
    elif country.attributes['INCOME_GRP'].startswith("3"):
        ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(0, 0, 0.5))
    elif country.attributes['INCOME_GRP'].startswith("4"):
        ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(1, 0, 0.5))
    elif country.attributes['INCOME_GRP'].startswith("5"):
        ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(1, 0, 0))
'''
for country in countries:
    if country.attributes['INCOME_GRP'].startswith("1"):
        try:
            ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(1,1,1))
        except Exception as e:
            list_str_polygons = [str(country.geometry)]
            c = MultiPolygon(map(wkt.loads, list_str_polygons))
            ax.add_geometries(c, ccrs.PlateCarree(), facecolor=(1,1,1))
    elif country.attributes['INCOME_GRP'].startswith("2"):
        try:
            ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(1,0,1))
        except Exception as e:
            list_str_polygons = [str(country.geometry)]
            c = MultiPolygon(map(wkt.loads, list_str_polygons))
            ax.add_geometries(c, ccrs.PlateCarree(), facecolor=(1,1,1))
    elif country.attributes['INCOME_GRP'].startswith("3"):
        try:
            ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(0,0,0.5))
        except Exception as e:
            list_str_polygons = [str(country.geometry)]
            c = MultiPolygon(map(wkt.loads, list_str_polygons))
            ax.add_geometries(c, ccrs.PlateCarree(), facecolor=(1,1,1))
    elif country.attributes['INCOME_GRP'].startswith("4"):
        try:
            ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(1,0,0.5))
        except Exception as e:
            list_str_polygons = [str(country.geometry)]
            c = MultiPolygon(map(wkt.loads, list_str_polygons))
            ax.add_geometries(c, ccrs.PlateCarree(), facecolor=(1,1,1))
    elif country.attributes['INCOME_GRP'].startswith("5"):
        try:
            ax.add_geometries(country.geometry, ccrs.PlateCarree(), facecolor=(1,0,0))
        except Exception as e:
            list_str_polygons = [str(country.geometry)]
            c = MultiPolygon(map(wkt.loads, list_str_polygons))
            ax.add_geometries(c, ccrs.PlateCarree(), facecolor=(1,1,1))
../_images/notebooks_0330_LatLonMaps_37_0.png

4C. add labels to countries

[25]:
ax = plt.axes(projection=ccrs.PlateCarree())
ax.add_feature(cartopy.feature.LAND)
ax.add_feature(cartopy.feature.OCEAN)
ax.add_feature(cartopy.feature.COASTLINE)
ax.add_feature(cartopy.feature.BORDERS, linestyle='-', alpha=.5)
ax.add_feature(cartopy.feature.LAKES, alpha=0.3)
ax.add_feature(cartopy.feature.RIVERS)

ax.set_extent([-150, 60, -25, 60])

shpfilename = shpreader.natural_earth(resolution='110m', category='cultural',
                                      name='admin_0_countries')
reader = shpreader.Reader(shpfilename)
countries = reader.records()

my_color = 0.01
for country in countries:
    my_color = my_color + .1
    if my_color > 1:
        my_color = .01
    try:
        g = ax.add_geometries(country.geometry, ccrs.PlateCarree(),
                              facecolor=(my_color, 1, 0))
    except Exception as e:
        list_str_polygons = [str(country.geometry)]
        c = MultiPolygon(map(wkt.loads, list_str_polygons))
        g = ax.add_geometries(c, ccrs.PlateCarree(), facecolor=(my_color,1,0))
    x = country.geometry.centroid.x
    y = country.geometry.centroid.y
    ax.text(x, y, country.attributes['NAME'], color='blue', size=7,
            ha='center', va='center', transform=ccrs.PlateCarree(),
            path_effects=[PathEffects.withStroke(linewidth=0, foreground="k",
                                                 alpha=.6)])
../_images/notebooks_0330_LatLonMaps_39_0.png
[ ]: