What we’re doing today

In the spirit of Emily Riederer’s ugliest ggplot ever, we’ll play around with ggplot code in order to learn how it works. The goal: make the ugliest plot possible.

Set up

We’ll load in our packages below:

# general use
library(tidyverse) # general tidying and visualization: ggplot2 is loaded by default with tidyverse
library(lterdatasampler) # data we're using comes from this package
library(lubridate) # working with dates
library(here) # folder organization

# extras
library(patchwork) # arranging plots
library(magick) # putting images into ggplots

Note: lterdatasampler has to be installed from the GitHub repo using the code below (copy, paste, and run in the console):

remotes::install_github("lter/lterdatasampler")

Today, we’ll use the Plum Island fiddler crab data from lterdatasampler to visualize relationships between latitude and crab size. Read the linked vignette to learn about Bergmann’s rule!

The Plum Island LTER has a data set of crab sizes (column size) from Summer 2016 at 13 different marshes spanning 12 degrees latitude. A sample is below, but try View(pie_crab) in the console to see the whole data frame.

pie_crab %>% 
  slice_sample(n = 5)
## # A tibble: 5 × 9
##   date       latitude site   size air_temp air_temp_sd water_temp water_…¹ name 
##   <date>        <dbl> <chr> <dbl>    <dbl>       <dbl>      <dbl>    <dbl> <chr>
## 1 2016-07-28     41.6 NB     17.0     12.2        9.48       17.5     7.86 Narr…
## 2 2016-08-12     42.2 BC     11.4     11.6        9.53       14.0     6.9  Bare…
## 3 2016-08-09     37.2 VCR    14.0     15.0        8.41       17.6     8.43 Virg…
## 4 2016-08-13     42.7 PIE    17.7     10.3        9.45       14.3     4.84 Plum…
## 5 2016-08-01     34.7 RC     17.4     18.6        8.40       20.5     7.00 Rach…
## # … with abbreviated variable name ¹​water_temp_sd

Just to make things a little more interesting, I’m going to extract the month from each date and save the result as a new data frame, crab_data.

crab_data <- pie_crab %>% 
  # extracting month from the date column using lubridate::month()
  # also making this a factor (instead of numeric) using as.factor()
  mutate(month = as.factor(month(date)))
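
To see why the as.factor() step matters, here is a minimal sketch using a few made-up dates (toy_dates is a hypothetical vector for illustration, not something taken from pie_crab). month() on its own returns numbers; wrapping it in as.factor() turns those numbers into categories, which is what ggplot needs if you want to map month to something discrete like point shape.

```r
library(lubridate)

# toy dates, made up for illustration (not from pie_crab)
toy_dates <- as.Date(c("2016-07-28", "2016-08-12", "2016-08-09"))

# month() pulls out the month as a number
toy_months <- month(toy_dates)
is.numeric(toy_months)

# as.factor() converts those numbers into categories (levels)
toy_months_factor <- as.factor(toy_months)
levels(toy_months_factor) # "7" "8"
```

Because the factor only has levels for months that actually appear in the data, a legend built from it will list only those months.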

ggplot grammar

ggplot works in layers. The code to make a plot can vary, but always includes:
1. the ggplot() call: this tells R that you want to use the function ggplot() from the ggplot2 package to plot things.
2. data and aesthetics within that ggplot() call: tells ggplot to use a specific data frame and which variables in that data frame should be represented in the plot (for example, x- and y-axes, colors, shapes).
3. a geom_*() function: short for “geometry”, geom_*() calls tell ggplot what kind of plot you want to make. With ggplot2 loaded, run apropos("^geom_") in the console to list the different options.

# step 1: call ggplot
ggplot(
  
  # step 2: specify the data and the aesthetics
  # plotting latitude on the x-axis and crab size on the y-axis
  data = crab_data, aes(x = latitude, y = size)) +
  
  # step 3: specify a geom - in this case, we're creating a scatter plot
  geom_point()

Note that when you’re adding on layers in ggplot, you’ll use + instead of the %>% operator. This is because %>% passes the result of one function into the next function, while + adds a new component (a geom, a facet, a theme) onto the plot that ggplot() started.
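
One way to see the difference between + and %>% is to build a plot in pieces. This is a rough sketch with a tiny made-up data frame (toy_crabs, base_plot, scatter_plot, and piped_plot are hypothetical names used only for illustration): + stacks a layer onto an existing plot object, while %>% hands a data frame along until it reaches ggplot().

```r
library(ggplot2)
library(dplyr)

# tiny made-up data frame standing in for crab_data
toy_crabs <- data.frame(
  latitude = c(30, 35, 42),
  size     = c(12, 15, 18)
)

# ggplot() alone sets up the data and aesthetics, with no layers yet
base_plot <- ggplot(data = toy_crabs, aes(x = latitude, y = size))

# + adds a geom layer on top of that existing plot object
scatter_plot <- base_plot + geom_point()

# count the layers before and after adding the geom
length(base_plot$layers)
length(scatter_plot$layers)

# %>% passes data from one function to the next;
# once the data reaches ggplot(), switch to + for layers
piped_plot <- toy_crabs %>%
  filter(latitude > 32) %>%
  ggplot(aes(x = latitude, y = size)) +
  geom_point()
```

The same pattern should work with crab_data in place of toy_crabs.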

So we’ve just made this plot. But how can we make it worse?

Aesthetics in ggplot() map variables in the data frame to visual properties, so I’m going to color the points by site and make the shapes represent month. I’m also going to make a jitter plot, which is a scatter plot with the points “jittered”, or randomly nudged so that it’s easier to see overlapping points (or be chaotic). Finally, I’m going to facet the plot by month using facet_wrap(), which splits the plot into one panel per level of a variable so you can compare groups side by side (facet_grid() is similar, but lays the panels out in a grid of rows and columns defined by the faceting variables).

ggplot(data = crab_data, aes(x = latitude, y = size)) +
  # putting the aesthetics in here: color points by site, shape points by month
  geom_jitter(aes(color = site, shape = month), 
  # anything that doesn't have to do with variables (like point size or transparency) goes outside the aesthetics
              size = 3, alpha = 0.6) +
  # facet by month
  facet_wrap(~ month)
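
To make the facet_wrap() versus facet_grid() comparison concrete, here is a small sketch on made-up data (toy_crabs and its site labels "A", "B", "C" are hypothetical, chosen only so the example runs on its own). facet_wrap(~ month) makes one panel per month, while facet_grid(site ~ month) puts sites in rows and months in columns.

```r
library(ggplot2)

# made-up data: three sites, each sampled in two months
toy_crabs <- data.frame(
  latitude = c(30, 30, 35, 35, 42, 42),
  size     = c(12, 13, 15, 14, 18, 17),
  site     = c("A", "A", "B", "B", "C", "C"),
  month    = factor(c(7, 8, 7, 8, 7, 8))
)

# one panel per month, wrapped into rows as needed
wrapped_plot <- ggplot(toy_crabs, aes(x = latitude, y = size)) +
  geom_point() +
  facet_wrap(~ month)

# a grid of panels: one row per site, one column per month
gridded_plot <- ggplot(toy_crabs, aes(x = latitude, y = size)) +
  geom_point() +
  facet_grid(site ~ month)

# print the plots to draw them
wrapped_plot
gridded_plot
```

With 13 sites in the real data, the grid version gets crowded fast, which may be exactly what an ugly plot calls for.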