‘Tidy’ historical data
9 February 2026
This week our focus will be on the nature of data for historical research. We will also introduce R and see what working in a programming language looks like.
Reading
- Hadley Wickham, “Tidy Data,” Journal of Statistical Software 59, no. 10 (2014), http://www.jstatsoft.org/v59/i10/.
- Karl Broman and Kara Woo, “Data Organization in Spreadsheets,” The American Statistician 72, no. 1 (2018), https://doi.org/10.1080/00031305.2017.1375989.
- Katie Rawson and Trevor Muñoz, “Against Cleaning,” in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein (University of Minnesota Press, 2019).
- Catherine D’Ignazio and Lauren F. Klein, Data Feminism (The MIT Press, 2020), https://data-feminism.mitpress.mit.edu, Chapter 6: The Numbers Don’t Speak for Themselves.
Watch
A short video by Jenny Bryan on how to name files.
Assignment
The assignments this week are examples of material that belongs in your commonplace book, even though only one of them explicitly asks you to put it there.
- Download a text editor or Markdown app, such as Obsidian, to use for Markdown documents. If you are keeping a digital commonplace book, you should write it in Markdown.
  - See the Text editor guide for information on text editors and how to set them up.
- Commonplace book writing: Reflect on how you organize your research materials. How could this be improved? What are some of the features or capabilities you would like to have?
- Find one or more historical datasets related to your research interests and/or think of a dataset you could create from your research. Sketch out what it would look like.
- Look at Responsible datasets in context. How is the data documented? How is the data structured? Does it follow the tenets of tidy data?
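As a reference point for the last question: in Wickham's formulation, a dataset is tidy when each variable forms a column, each observation forms a row, and each type of observational unit forms a table. The following is only a minimal sketch in base R, using an invented table of parish baptism counts (the parish names, years, and numbers are made up for illustration and come from no real source). It shows a common untidy layout, with years spread across column headers, reshaped so that `year` becomes a variable of its own.

```r
# Hypothetical, invented data: one row per parish, one column per year.
# The column headers "1850" and "1860" are really values of a variable (year).
wide <- data.frame(
  parish = c("Parish A", "Parish B"),
  `1850` = c(120, 340),
  `1860` = c(150, 310),
  check.names = FALSE  # keep "1850" and "1860" as literal column names
)

year_columns <- c("1850", "1860")

# Tidy version: one row per parish-year observation, one column per variable.
tidy <- data.frame(
  parish   = rep(wide$parish, times = length(year_columns)),
  year     = rep(as.integer(year_columns), each = nrow(wide)),
  baptisms = unlist(wide[year_columns], use.names = FALSE)
)

print(tidy)
```

In the tidy version, every row answers the same question (how many baptisms in this parish in this year?), which is a useful test to apply to any dataset you examine this week.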
Activities
- Discussion: Organizing your research materials.
- Working with text editors and Markdown.
- Set up folder structure for the course.
- Discussion: Tidy data in the humanities.
- Worksheet: Getting started with R
- Worksheet: Working with data frames in R
Resources
- Data Carpentry - Data Organization in Spreadsheets for Social Scientists
- Kieran Healy, The Plain Person’s Guide to Plain Text Social Science
- Kieran Healy, Modern Plain Text Computing
- Susan Collins, Data Management Plans for Historians: How to Document and Protect Your Research.
- Many resources address data criticism from a humanities perspective.
  - Look at the resources in the Critical DH section of the DH research guide.
  - The work of Roopika Risam is a particularly good place to start.