Visualization with ggplot2

This page contains a quick overview of ggplot2 and the grammar of graphics, as well as resources for learning more about how to use ggplot2.

ggplot2 and the grammar of graphics

ggplot2 is built on a grammar of graphics that abstracts data visualizations into separate components.[1]

Component           Function            Description
Data                ggplot(data)        The data to be visualized.
Aesthetic mappings  aes()               Aesthetic mappings between variables and visual properties.
Geometries          geom_*()            The geometric shape used to represent the data.
Statistics          stat_*()            Any statistical transformations applied to the data.
Position            position_*()        Any positional adjustment: stack, jitter, dodge, etc.
Scales              scale_*()           How data values are mapped to aesthetic qualities.
Guides              guides() & labs()   Labels for axes and legends to help interpret a plot.
Coordinate system   coord_*()           How data coordinates are mapped onto the plane of the graphic.
Facets              facet_*()           Any breaking up of the data into multiple plots.
Theme               theme()/theme_*()   The overall visual defaults of a plot.

Central to how ggplot2 implements the grammar of graphics is the notion of layers, which is used in two different but overlapping ways.

  1. As described in the ggplot2 documentation, “A layer combines data, aesthetic mapping, a geom (geometric object), a stat (statistical transformation), and a position adjustment. Typically, you will create layers using a geom_ function, overriding the default position and stat if needed.”
  2. You also build plots in ggplot2 by adding layers, using + between the different components of the plot to iteratively develop a visualization.
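The two senses of "layer" can be seen side by side in a short sketch. The data frame df and the object names below are made up for illustration and are not part of the original page. The first plot spells out a layer in full with layer(), the second uses the geom_bar() shortcut with its default stat and position, and the third builds a plot step by step with +.

```r
library(ggplot2)

# Small made-up data frame, used only for illustration
df <- data.frame(group = c("a", "a", "b", "c", "c", "c"))

# Sense 1: a layer spelled out in full with layer()
p_explicit <- ggplot(data = df) +
  layer(
    geom = "bar",
    stat = "count",
    position = "stack",
    mapping = aes(x = group)
  )

# The same layer through the geom_bar() shortcut, which uses
# stat = "count" and position = "stack" by default
p_shortcut <- ggplot(data = df) +
  geom_bar(mapping = aes(x = group))

# Sense 2: develop the plot iteratively by adding components with +
p <- ggplot(data = df)
p <- p + geom_bar(mapping = aes(x = group))
p <- p + labs(x = "Group", y = "Count")
p
```

Printing p_explicit and p_shortcut should give the same bar chart, which is why the geom_*() shortcuts are usually all you need.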

We can translate the grammar of graphics into generalized ggplot2 code as follows:

ggplot(data = <DATA>) + 
  geom_*(
     mapping = aes(<MAPPINGS>),
     stat = <STAT>, 
     position = <POSITION>
  ) +
  scale_*(<SCALES>, <GUIDES>) + 
  labs(<LABELS>) + 
  coord_*() + 
  facet_*() + 
  theme()

Note that you do not need to specify all of these elements for your plots. ggplot2 will choose reasonable defaults for those elements you do not specify. Let’s now see what this looks like with an actual plot of the penguins data.

library(ggplot2)
library(palmerpenguins) # assumed source of the penguins data

ggplot(data = penguins) +
  geom_bar(
    mapping = aes(x = island, fill = species),
    stat = "count", # default stat for geom_bar(). Calculates y.
    position = "dodge" # position bars next to each other
  ) + 
  scale_fill_manual(values = c("darkorange", "purple", "cyan4")) + # fill scale
  scale_y_continuous(breaks = seq(0, 120, by = 20)) + # More guides for y axis
  coord_cartesian() + # default coordinates
  labs(title = "The Palmer Archipelago penguins",
       x = "Islands in the Palmer Archipelago, Antarctica",
       y = NULL, # Remove label for y axis
       fill = "Penguin species") + 
  theme_minimal(base_size = 13, # Change base theme and font
                base_family = "Times New Roman") + 
  theme(panel.grid.minor = element_blank(), # Remove minor grid lines
        legend.position = "top", # change position of legend
        legend.direction = "horizontal") # and legend direction
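To illustrate how much ggplot2 fills in on its own, here is a minimal sketch that specifies only the data, the aesthetic mappings, and a geom. It uses the mpg data bundled with ggplot2 rather than penguins, a substitution made here so that the example runs without extra packages; the object name p_defaults is likewise just illustrative.

```r
library(ggplot2)

# Only data, aesthetic mappings, and a geom are given.
# ggplot2 supplies the rest: stat = "count", position = "stack",
# default fill and axis scales, Cartesian coordinates, no faceting,
# and the default gray theme.
p_defaults <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv))

p_defaults
```

Each default can then be overridden one at a time, for example by adding position = "dodge" inside geom_bar() or appending theme_minimal(), as in the fuller penguins plot.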

ggplot2 reference guides

Here are a couple of quick reference guides for working with ggplot2.

More in-depth resources on ggplot2 and visualization

For more in-depth discussion of how to use ggplot2, and of data visualization in general, start with these resources:

For inspiration

If you want some inspiration for using ggplot2 to make some truly unique visualizations, check out the data visualization work of Nicola Rennie.

  • She has recently finished a book that goes through 12 different plots she has made, showing how and why she made them: Nicola Rennie, The Art of Data Visualization with Ggplot2: The TidyTuesday Cookbook (CRC Press, 2025), https://nrennie.rbind.io/art-of-viz/.
  • You can also look over the different data challenges she has participated in.

Visualization and historical argument

Footnotes

  1. Leland Wilkinson, The Grammar of Graphics, Second Edition (Springer-Verlag, 2005), https://doi.org/10.1007/0-387-28695-0; Hadley Wickham, “A Layered Grammar of Graphics,” Journal of Computational and Graphical Statistics 19, no. 1 (2010): 3–28, https://doi.org/10.1198/jcgs.2009.07098; Hadley Wickham, ggplot2: Elegant Graphics for Data Analysis, Second Edition (Springer, 2016), https://doi.org/10.1007/978-3-319-24277-4, page 4.