TidyTuesday
    • About TidyTuesday
    • Datasets
      • 2025
      • 2024
      • 2023
      • 2022
      • 2021
      • 2020
      • 2019
      • 2018
    • Useful links

    On this page

    • Art Collections
      • Some recommendations
      • Get the data here
      • Data Dictionary
    • artwork.csv
    • artists.csv
      • Cleaning Script

    Art Gallery

    Art Collections

    The data this week comes from the Tate Art Museum.

    The dataset in this repository was last updated in October 2014. Tate has no plans to resume updating this repository, but we are keeping it available for the time being in case this snapshot of the Tate collection is a useful tool for researchers and developers.

    Here we present the metadata for around 70,000 artworks that Tate owns or jointly owns with the National Galleries of Scotland as part of ARTIST ROOMS. Metadata for around 3,500 associated artists is also included.

    The metadata here is released under the Creative Commons Public Domain CC0 licence. Images are not included and are not part of the dataset. Use of Tate images is covered on the Copyright and permissions page. You may also license images for commercial use.

    Tate requests that you actively acknowledge and give attribution to Tate wherever possible. Attribution supports future efforts to release other data. It also reduces the amount of ‘orphaned data’, helping retain links to authoritative sources.

    Here are some examples of Tate data usage in the wild. Please submit a pull request with your creation added to this list.

    • Data visualisations by Florian Kräutli
    • machine imagined art by Shardcore
    • The Dimensions of Art by Jim Davenport
    • Art as Data as Art and Part II by Os Keyes
    • Tate Acquisition Data by Zenlan
    • Tate Collection Geolocalized by Corentin Cournac, Mathieu Dauré, William Duclot and Pierre Présent
    • Aspect Ratio of Tate Artworks through Time by Joseph Lewis

    There are JSON files with additional metadata in the original GitHub.

    Some recommendations

    This is a dataset has lots of room for cleaning.

    • The place of birth/death can be converted to city/country
    • The medium has many distinct categories, and some of them can be collapsed by overlap
      You could also practice converting the text dates to the actual years of interest (although the already exist in the data).

    Get the data here

    # Get the Data
    
    # Read in with tidytuesdayR package 
    # Install from CRAN via: install.packages("tidytuesdayR")
    # This loads the readme and all the datasets for the week of interest
    
    # Either ISO-8601 date or year/week works!
    
    tuesdata <- tidytuesdayR::tt_load('2021-01-12')
    tuesdata <- tidytuesdayR::tt_load(2021, week = 3)
    
    artwork <- tuesdata$artwork
    
    # Or read in the data manually
    
    artwork <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2021/2021-01-12/artwork.csv')
    artists <- readr::read_csv("https://github.com/tategallery/collection/raw/master/artist_data.csv")

    Data Dictionary

    artwork.csv

    variable class description
    id double Unique ID
    accession_number character Accession number
    artist character Artist Name
    artistRole character Artist or other attribution
    artistId double Artist ID
    title character Title of the piece of art
    dateText character Date as raw text (pretty messy)
    medium character Medium of art, quite a lot of overlap
    creditLine character How acquired
    year double Year of creation
    acquisitionYear double Year acquired
    dimensions character Dimensions as character
    width double Width of art
    height double Height of art
    depth double Depth of art
    units character units of measure
    inscription character inscription if present
    thumbnailCopyright logical Thumbnail copyright
    thumbnailUrl character Thumbnail URL
    url character art URL

    artists.csv

    variable class description
    id double Artist ID
    name character Artist Name
    gender character Artist gender
    dates character Date as a character
    yearOfBirth double Year of birth
    yearOfDeath double Year of death
    placeOfBirth character Place of birth (typically city, country)
    placeOfDeath character Place of death (typically city, country)
    url character Artist URL

    Cleaning Script

    There is no cleaning script for today, the data is already “tame”.