Spatial data visualization (interactive map & choropleth)

Overview

In this report, I build an exploratory interactive map in tmap showing the locations of oil spill events in California. The final result is a choropleth map in ggplot in which each county's fill color depends on its count of inland oil spill events, identified with the variable InlandMari, in the 2008 oil spill data.

Data sources:

1. Oil Spill Incident Tracking [ds394] GIS Dataset. https://map.dfg.ca.gov/metadata/ds0394.html

2. CA Geographic Boundaries - California Open Data. https://data.ca.gov/dataset/ca-geographic-boundaries

Data Wrangling

library(tidyverse)
library(here)
library(sf)
library(tmap)
ca_counties_sf <- read_sf(here("data", "CA_Counties", "CA_Counties_TIGER2016.shp"))
ca_subset_sf <- ca_counties_sf %>%
  janitor::clean_names() %>%
  select(county_name = name, land_area = aland)
# head(ca_subset_sf)

# Check the CRS:
ca_subset_sf %>% st_crs()
ca_subset_sf %>% raster::crs() ### to show proj4 string

# ggplot(data = ca_subset_sf) +
#   geom_sf(aes(fill = land_area), color = "white", size = 0.1) +
#   theme_void() +
#   scale_fill_gradientn(colors = c("cyan","blue","purple"))

Data Analysis

Here we look at the 2008 data from the Oil Spill Incident Tracking GIS dataset for California.

oil_spill_sf <- read_sf(here("data","Oil_Spill_Incident_Tracking_[ds394]")) %>%
  janitor::clean_names()

# Check the CRS:
oil_spill_sf %>% st_crs()
oil_spill_sf %>% raster::crs()
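
A spatial join such as st_join() requires both layers to share the same coordinate reference system, so the two CRS checks above matter. If they report different systems, one layer can be transformed to match the other. The following is a minimal sketch on made-up toy layers; the names toy_points_sf and toy_polygon_sf and all coordinates are illustrative assumptions, not part of either dataset:

```r
library(sf)

# Toy point layer in WGS 84 (EPSG:4326); coordinates are made up
toy_points_sf <- st_as_sf(
  data.frame(id = 1:2, lon = c(-119.7, -118.2), lat = c(34.4, 34.1)),
  coords = c("lon", "lat"),
  crs = 4326
)

# Toy polygon built in WGS 84, then stored in Web Mercator (EPSG:3857)
toy_ring <- rbind(c(-120, 34), c(-118, 34), c(-118, 35), c(-120, 35), c(-120, 34))
toy_polygon_sf <- st_transform(
  st_sf(name = "box", geometry = st_sfc(st_polygon(list(toy_ring)), crs = 4326)),
  3857
)

# Compare the two CRSs; transform the points only if they differ
if (st_crs(toy_points_sf) != st_crs(toy_polygon_sf)) {
  toy_points_sf <- st_transform(toy_points_sf, st_crs(toy_polygon_sf))
}

# Both layers now report the same CRS
st_crs(toy_points_sf) == st_crs(toy_polygon_sf)
```

Applied to this report, the same pattern would be st_transform(oil_spill_sf, st_crs(ca_subset_sf)), and it is only needed if the two layers turn out to differ.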

Data Exploration & Visualization

  • Initial plot of the 2008 oil spill events in California
# Set the viewing mode to "interactive":
tmap_mode(mode = "view")

# Then make a map with the county polygons filled by 'land_area' (BuPu palette),
# and add a second shape layer for the oil spill records (shown as clustered markers):
tm_shape(ca_subset_sf) +
  tm_fill("land_area", palette = "BuPu") +
  tm_shape(oil_spill_sf) +
  tm_markers(
    shape = marker_icon(),
    col = NA,
    border.col = NULL,
    clustering = TRUE,
    text = NULL,
    text.just = "top",
    markers.on.top.of.text = TRUE,
    group = NA
  )
Figure 1: This interactive map shows the locations of oil spill incidents in California in 2008.
ca_oil_spill_sf <- ca_subset_sf %>% 
  st_join(oil_spill_sf)
head(ca_oil_spill_sf)


# Keep only inland spills, then count the records in each county
ca_oil_spill_counts_sf <- ca_oil_spill_sf %>%
  filter(inlandmari == "Inland") %>%
  group_by(inlandmari, county_name) %>%
  summarize(n_records = sum(!is.na(objectid)))
head(ca_oil_spill_counts_sf)
# Counties with at least 100 inland spill records
names <- ca_oil_spill_counts_sf %>%
  filter(n_records >= 100) %>%
  select(county_name)
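
To make the join-and-count steps above easier to follow, here is a small sketch on invented data. The three square "counties" (A, B, C), the four point records, and every toy_ name are assumptions for illustration only; the column names mirror the ones used in this report.

```r
library(sf)
library(dplyr)

# Helper for illustration: a unit square with its lower-left corner at (xmin, ymin)
toy_square <- function(xmin, ymin) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + 1, ymin), c(xmin + 1, ymin + 1),
    c(xmin, ymin + 1), c(xmin, ymin)
  )))
}

# Three made-up counties side by side (plain planar coordinates, no CRS set)
toy_counties_sf <- st_sf(
  county_name = c("A", "B", "C"),
  geometry = st_sfc(toy_square(0, 0), toy_square(1, 0), toy_square(2, 0))
)

# Four made-up spill records: three fall in A (one of them marine), one in B, none in C
toy_spills_sf <- st_as_sf(
  data.frame(
    objectid = 1:4,
    inlandmari = c("Inland", "Inland", "Marine", "Inland"),
    x = c(0.2, 0.7, 0.5, 1.5),
    y = c(0.5, 0.5, 0.2, 0.5)
  ),
  coords = c("x", "y")
)

# Left spatial join: one row per county-spill match, plus one NA row for county C
toy_joined_sf <- st_join(toy_counties_sf, toy_spills_sf)

# Keep inland records and count them per county
toy_counts_sf <- toy_joined_sf %>%
  filter(inlandmari == "Inland") %>%
  group_by(county_name) %>%
  summarize(n_records = sum(!is.na(objectid)))

toy_counts_sf
```

In this toy case county C has no inland records, so the filter removes it and it does not appear in the summarized result. The same holds for any real county without inland spill records, which would then be absent from a map drawn from the counts.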
  • Choropleth map: count of inland oil spill records by county
ggplot(data = ca_oil_spill_counts_sf) +
  geom_sf(aes(fill = n_records), color = "white", size = 0.1) +
  scale_fill_gradientn(colors = c("lightgray","orange","red")) +
  theme_minimal() +
  labs(fill = "Number of oil spills by county in CA")
Map of Inland Oil Spills by California County (through 2008)

Figure 2: This map shows the count of inland oil spills by county. The darkest red indicates the greatest number of recorded spills, which appears to be in Los Angeles County.

Summary

The choropleth map helps us visualize which counties had the highest counts of recorded inland oil spills in California in 2008. Los Angeles County had the most, with more than 200 spills. Other counties, such as San Diego and San Mateo, also had high record counts.