TidyTuesday
    • About TidyTuesday
    • Datasets
      • 2025
      • 2024
      • 2023
      • 2022
      • 2021
      • 2020
      • 2019
      • 2018
    • Useful links

    On this page

    • Beer Production
      • Get the data here
      • Data Dictionary
    • brewing_materials.csv
    • beer_states.csv
    • beer_taxed.csv
    • brewer_size.csv
      • Cleaning Script

    Large brewing kettles

    Beer Production

    The data this week comes from the Alcohol and Tobacco Tax and Trade Bureau (TTB). H/t to Bart Watson for sharing the source of the data.

    There’s a literal treasure trove of data here: - State-level beer production by year (2008-2019) - Number of brewers by production size by year (2008-2019) - Monthly beer stats aggregated across the US (2008-2019)

    Some considerations: - A barrel of beer for this data is 31 gallons - Most data is in barrels removed/taxed or produced - Removals = “Total barrels removed subject to tax by the breweries comprising the named strata of data”, essentially how much was produced and removed for consumption. - A LOT of data came from PDFs - I included all the code I used to grab data and tidy it up, take a peek and try out your own mechanism for getting the tables out.

    Massive shoutout to pdftools by ROpenSci and stringr for doing a lot of heavy lifting with the datacleaning and prep here.

    Get the data here

    # Get the Data
    
    brewing_materials <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2020/2020-03-31/brewing_materials.csv')
    beer_taxed <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2020/2020-03-31/beer_taxed.csv')
    brewer_size <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2020/2020-03-31/brewer_size.csv')
    beer_states <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2020/2020-03-31/beer_states.csv')
    
    # Or read in with tidytuesdayR package (https://github.com/dslc-io/tidytuesdayR)
    # PLEASE NOTE TO USE 2020 DATA YOU NEED TO USE tidytuesdayR version ? from GitHub
    
    # Either ISO-8601 date or year/week works!
    
    # Install via pak::pak("dslc-io/tidytuesdayR")
    
    tuesdata <- tidytuesdayR::tt_load('2020-03-31')
    tuesdata <- tidytuesdayR::tt_load(2020, week = 14)
    
    
    brewing_materials <- tuesdata$brewing_materials

    Data Dictionary

    brewing_materials.csv

    variable class description
    data_type character Pounds of Material - this is a sub-table from beer_taxed
    material_type character Grain product, Totals, Non-Grain Product (basically hops vs grains)
    year double Year
    month integer Month
    type character Actual line-item from material type
    month_current double Current number of barrels for this year/month
    month_prior_year double Prior year number of barrels for same month
    ytd_current double Cumulative year to date of current year
    ytd_prior_year double Cumulative year to date for prior year

    beer_states.csv

    variable class description
    state character State abbreviated
    year integer Year
    barrels double Barrels produced within each type
    type character Type of production/use (On premise, Bottles/Cans, Kegs/Barrels)

    beer_taxed.csv

    variable class description
    data_type character Barrels Produced
    tax_status character The Tax Status, factor with Totals, Taxable, Sub Total Taxable, Tax Free, Sub Total Tax-Free
    year double Year
    month integer Month
    type character Type of production, either Total Production (Production) or specific sub-category and sub-totals
    month_current double Current number of barrels for this year/month
    month_prior_year double Prior year number of barrels for same month
    ytd_current double Cumulative year to date of current year
    ytd_prior_year double Cumulative year to date for prior year

    brewer_size.csv

    variable class description
    year integer Year
    brewer_size character Range of production for brewer size, number of barrels produced
    n_of_brewers double Number of brewers at that brewer size
    total_barrels double Total barrels of beer produced at that brewer size
    taxable_removals double Taxable barrels for removals - removals for consumption under taxation
    total_shipped double Total barrels shipped - produced beer that is not taxed

    Cleaning Script

    Please see download script via download_beer.R

    Please see cleaning scripts via scrape_beers.R