Italian industrial production

ISTAT (Istituto nazionale di statistica / Italian National Institute of Statistics) is the primary source of official statistics in Italy. It publishes long-running time series on the industrial production of food and beverages, transport equipment, and textiles, covering more than 100 years.

Information on the volumes of some industrial products has also been collected over time by public and private institutions and published by Istat in the previous Sommari di statistiche storiche (Historical statistics summaries) for the period from 1861 to 1985. This information does not provide a coherent reconstruction of the development of the Italian industrial system; however, to complete the historical overview of the statistics in the sector, this chapter presents some of the material collected over the years, with specific reference to the food, textiles and transport industries.

Which industries have increased production, and which have decreased?
How has the average weight per ship changed over time?

Thank you to Nicola Rennie for curating this week’s dataset.

The Data

# Using R
# Option 1: tidytuesdayR R package 
## install.packages("tidytuesdayR")

tuesdata <- tidytuesdayR::tt_load('2026-05-05')
## OR
tuesdata <- tidytuesdayR::tt_load(2026, week = 18)

food_beverages <- tuesdata$food_beverages
textiles <- tuesdata$textiles
transport <- tuesdata$transport

# Option 2: Read directly from GitHub

food_beverages <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/food_beverages.csv')
textiles <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/textiles.csv')
transport <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/transport.csv')

# Using Python
# Option 1: pydytuesday python library
## pip install pydytuesday

import pydytuesday

# Download files from the week, which you can then read in locally
pydytuesday.get_date('2026-05-05')

# Option 2: Read directly from GitHub and assign to an object

import pandas

food_beverages = pandas.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/food_beverages.csv')
textiles = pandas.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/textiles.csv')
transport = pandas.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/transport.csv')

# Using Julia
# Option 1: TidierTuesday.jl library
## Pkg.add(url="https://github.com/TidierOrg/TidierTuesday.jl")

using TidierTuesday

# Download datasets for the week, and load them as a NamedTuple of DataFrames
data = tt_load("2026-05-05")

# Option 2: Read directly from GitHub and assign to an object with TidierFiles

using TidierFiles

food_beverages = read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/food_beverages.csv")
textiles = read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/textiles.csv")
transport = read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/transport.csv")

# Option 3: Read directly from GitHub and assign without Tidier dependencies
using CSV, DataFrames
food_beverages = CSV.read("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/food_beverages.csv", DataFrame)
textiles = CSV.read("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/textiles.csv", DataFrame)
transport = CSV.read("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-05-05/transport.csv", DataFrame)

How to Participate

Explore the data, watching out for interesting relationships. We would like to emphasize that you should not draw conclusions about causation in the data. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our suggestion is to use the data provided to practice your data tidying and plotting techniques, and to consider for yourself what nuances might underlie these relationships.
Create a visualization, a model, a Quarto report, a shiny app, or some other piece of data-science-related output, using R, Python, or another programming language.
Share your output and the code used to generate it on social media with the #TidyTuesday hashtag.
Submit your own dataset!

PydyTuesday: A Posit collaboration with TidyTuesday

Exploring the TidyTuesday data in Python? Posit has some extra resources for you! Have you tried making a Quarto dashboard? Find videos and other resources in Posit’s PydyTuesday repo.
Share your work with the world using the hashtags #TidyTuesday and #PydyTuesday so that Posit has the chance to highlight your work, too!
Deploy or share your work however you want! If you’d like a super easy way to publish your work, give Connect Cloud a try.

Data Dictionary

`food_beverages.csv`

variable	class	description
Year	integer	Year. The figures for the years from 1871 to 1950 refer to the fiscal year, which does not necessarily coincide with the calendar year; in particular, from 1931 to 1950 the fiscal year began on 1st July. From 1951 onwards, the figures refer to the calendar year.
Sugar	integer	Sugar produced (tons)
Glucose	integer	Glucose, maltose, invert sugar produced (quintals)
Coffee_substitute	integer	Coffee substitute produced (quintals)
Seed_oil	integer	Seed oil produced (quintals)
Ethyl_alcohol_1	integer	Ethyl alcohol (1st category) produced (ettanidri; 1 ettanidro = 100 litres of pure alcohol).
Ethyl_alcohol_2	integer	Ethyl alcohol (2nd category) produced (ettanidri; 1 ettanidro = 100 litres of pure alcohol).
Beer	integer	Beer produced (hectolitres)
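
The mass columns in this table use two different units (tons for Sugar, quintals for the others), so they are not directly comparable as stored. The sketch below shows one way to bring them onto a common scale. It assumes the "tons" are metric tons and relies on 1 quintal = 100 kg, so 10 quintals make one ton. The `QUINTALS_OR_TONS_PER_TON` table, the `to_tons` helper, and the example row are hypothetical names and made-up values for illustration, not part of the dataset.

```python
from typing import Optional

# Divisors that turn each column's native unit into metric tons.
# Assumption: "tons" means metric tons; 1 quintal = 100 kg, so 10 quintals = 1 ton.
QUINTALS_OR_TONS_PER_TON = {
    "Sugar": 1,               # already in tons
    "Glucose": 10,            # quintals
    "Coffee_substitute": 10,  # quintals
    "Seed_oil": 10,           # quintals
}


def to_tons(column: str, value: Optional[float]) -> Optional[float]:
    """Convert a value from the column's native unit to tons; missing stays missing."""
    if value is None:
        return None
    return value / QUINTALS_OR_TONS_PER_TON[column]


# Made-up row for illustration only (not real ISTAT figures).
example_row = {"Year": 1900, "Sugar": 1000, "Glucose": 2500, "Seed_oil": None}
converted = {
    key: (value if key == "Year" else to_tons(key, value))
    for key, value in example_row.items()
}
print(converted)
```

The alcohol and beer columns are volumes (ettanidri and hectolitres), so they are deliberately left out of the mass conversion.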

`textiles.csv`

variable	class	description
Year	integer	Year.
Cotton_yarn	integer	Cotton yarn produced (tons).
Flock_yarn	integer	Flock cotton yarn produced (tons).
Other_yarn	integer	Other fibres and fibre blend cotton yarn produced (tons).
Total_yarn	integer	Total cotton yarn produced (tons).
Cotton_textiles	integer	Cotton textiles produced (tons).
Flock_textiles	integer	Flock cotton textiles produced (tons).
Other_textiles	integer	Other fibres and fibre blend cotton textiles produced (tons).
Total_textiles	integer	Total cotton textiles produced (tons).
Raw_silk	integer	Raw silk produced (tons).
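
The column descriptions above suggest that each Total column is the sum of its Cotton, Flock, and Other components, but the dictionary does not state this explicitly. A quick consistency check before plotting can be sketched as follows. The `total_mismatches` helper and the example rows are hypothetical and the values are made up; rows are assumed to be dictionaries with missing values stored as `None`.

```python
def total_mismatches(rows, parts, total):
    """Return the years where the component columns do not sum to the total column.

    Rows with a missing value in any of the involved columns are skipped.
    """
    bad_years = []
    for row in rows:
        values = [row.get(column) for column in parts + [total]]
        if any(value is None for value in values):
            continue
        if sum(row[column] for column in parts) != row[total]:
            bad_years.append(row["Year"])
    return bad_years


# Made-up rows for illustration only (not real ISTAT figures).
rows = [
    {"Year": 1950, "Cotton_yarn": 100, "Flock_yarn": 10, "Other_yarn": 5, "Total_yarn": 115},
    {"Year": 1951, "Cotton_yarn": 120, "Flock_yarn": 12, "Other_yarn": 6, "Total_yarn": 140},
    {"Year": 1952, "Cotton_yarn": 130, "Flock_yarn": None, "Other_yarn": 7, "Total_yarn": 150},
]
yarn_parts = ["Cotton_yarn", "Flock_yarn", "Other_yarn"]
print(total_mismatches(rows, yarn_parts, "Total_yarn"))
```

The same call with the three `*_textiles` columns and `Total_textiles` would check the textiles totals. Any mismatches found in the real data may reflect rounding or a different definition of the total rather than an error.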

`transport.csv`

variable	class	description
Year	integer	Year.
Ships_launched	integer	Number of ships launched.
Ships_weight	integer	Total gross tonnage of the ships launched (tons).
Steam_and_electric_engine	integer	Number of steam and electric engines in the rolling stock owned by the Italian Railway Service.
Rail_cars_and_electric_locomotives	integer	Number of rail cars and electric locomotives in the rolling stock owned by the Italian Railway Service.
Carriages_and_trailers	integer	Number of carriages and trailers in the rolling stock owned by the Italian Railway Service.
Mail_luggage_vans_and_carriages	integer	Number of mail luggage vans and carriages in the rolling stock owned by the Italian Railway Service.
Passenger_cars	integer	Number of passenger cars.
Other	integer	Number of other motor vehicles (e.g. lorries, vans and pick-ups, buses, trolleybuses, or special motor vehicles).
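
For the question of how the average weight per ship has changed over time, one possible starting point is to divide Ships_weight by Ships_launched for each year. The sketch below is a minimal version of that calculation; `tonnage_per_ship` is a hypothetical helper, and it assumes missing values arrive as `None`. Note that gross tonnage measures a ship's internal volume rather than its mass, so the result is an average size per ship, not a literal weight.

```python
from typing import Optional


def tonnage_per_ship(
    ships_launched: Optional[int], ships_weight: Optional[int]
) -> Optional[float]:
    """Average gross tonnage per ship launched in a year.

    Returns None when either value is missing or no ships were launched,
    so that those years do not cause a division by zero.
    """
    if ships_launched is None or ships_weight is None or ships_launched == 0:
        return None
    return ships_weight / ships_launched


# Made-up values for illustration only (not real ISTAT figures).
print(tonnage_per_ship(4, 10000))
print(tonnage_per_ship(0, 0))
```

Years that return `None` can then be dropped or shown as gaps when plotting the series.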

Cleaning Script

library(tidyverse)
library(readxl)

# Download files ----------------------------------------------------------

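# Make sure the target folder exists before downloading
dir.create("italian-industry-production", showWarnings = FALSE)

# mode = "wb" keeps the binary .xls files intact on Windows
download.file("https://seriestoriche.istat.it/fileadmin/documenti/Table_14.4.xls", "italian-industry-production/Table_14.4.xls", mode = "wb")
download.file("https://seriestoriche.istat.it/fileadmin/documenti/Table_14.5.xls", "italian-industry-production/Table_14.5.xls", mode = "wb")
download.file("https://seriestoriche.istat.it/fileadmin/documenti/Table_14.6.xls", "italian-industry-production/Table_14.6.xls", mode = "wb")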

# Food and beverages ------------------------------------------------------

raw_data1 <- read_xls("italian-industry-production/Table_14.4.xls",
                      skip = 5, sheet = 1)
raw_data2 <- read_xls("italian-industry-production/Table_14.4.xls",
                      skip = 5, sheet = 2)
data1 <- raw_data1 |>
  drop_na(YEARS) |>
  rename(Year = YEARS,
         Sugar = `Sugar (tons)`,
         Glucose = `Glucose, maltose, invert sugar \n(quintals)`,
         Coffee_substitute = `Coffee substitute (quintals)`,
         Seed_oil = `Seed oil (quintals)`,
         Ethyl_alcohol_1 = `Ethyl alcohol  (ettanidri) (c)`,
         Ethyl_alcohol_2 = `...11`,
         Beer = `Beer (hectolitres)`
  ) |>
  select(!starts_with("...")) |>
  mutate(
    across(everything(), as.numeric)
  ) |>
  drop_na(Year)

data2 <- raw_data2 |>
  rename(Year = YEARS,
         Sugar = `Sugar (tons)`,
         Glucose = `Glucose, maltose, invert sugar\n (quintals)`,
         Coffee_substitute = `Coffee substitute (quintals)`,
         Seed_oil = `Seed oil \n(quintals)`,
         Ethyl_alcohol_1 = `Ethyl alcohol  (ettanidri) (c)`,
         Ethyl_alcohol_2 = `...11`,
         Beer = `Beer \n(hectolitres)`
  ) |>
  select(!starts_with("...")) |>
  mutate(
    across(everything(), as.numeric)
  ) |>
  drop_na(Year)
food_beverages <- rbind(data1, data2) |>
  mutate(across(
    everything(), as.integer
  ))
write_csv(food_beverages, "italian-industry-production/food_beverages.csv")

# Textiles ----------------------------------------------------------------

raw_data1 <- read_xls("italian-industry-production/Table_14.5.xls",
                      skip = 6, sheet = 1)
raw_data2 <- read_xls("italian-industry-production/Table_14.5.xls",
                      skip = 7, sheet = 2)

data1 <- raw_data1 |>
  rename(Year = "...1",
         Raw_silk = "...12",
         Cotton_yarn = Cotton...2,
         Flock_yarn = Flock...3,
         Other_yarn = `Other fibres and fibre blend...4`,
         Total_yarn = Total...5,
         Cotton_textiles = Cotton...7,
         Flock_textiles = Flock...8,
         Other_textiles = `Other fibres and fibre blend...9`,
         Total_textiles = Total...10
  ) |>
  select(!starts_with("...")) |>
  mutate(
    across(everything(), as.numeric)
  ) |>
  drop_na(Year)

data2 <- raw_data2 |>
  rename(Year = "...1",
         Raw_silk = "...12",
         Cotton_yarn = Cotton...2,
         Flock_yarn = Flock...3,
         Other_yarn = `Other fibres and fibre blend...4`,
         Total_yarn = Total...5,
         Cotton_textiles = Cotton...7,
         Flock_textiles = Flock...8,
         Other_textiles = `Other fibres and fibre blend...9`,
         Total_textiles = Total...10
  ) |>
  select(!starts_with("...")) |>
  mutate(
    across(everything(), as.numeric)
  ) |>
  drop_na(Year)

textiles <- rbind(data1, data2) |>
  mutate(across(
    everything(), as.integer
  ))
write_csv(textiles, "italian-industry-production/textiles.csv")

# Transport ---------------------------------------------------------------

raw_data1 <- read_xls("italian-industry-production/Table_14.6.xls",
                      skip = 7, sheet = 1)
raw_data2 <- read_xls("italian-industry-production/Table_14.6.xls",
                      skip = 7, sheet = 2)

data1 <- raw_data1 |>
  rename(Year = "...1",
         Ships_launched = Number,
         Ships_weight  = `Tons of gross\n tonnage`
  ) |>
  select(!starts_with("...")) |>
  mutate(
    across(everything(), as.numeric)
  ) |>
  drop_na(Year) |>
  pivot_longer(-Year) |>
  mutate(name = str_remove(name, "\\(c\\)"),
         name = str_trim(name),
         name = str_replace_all(name, "\n", ""),
         name = str_replace_all(name, " ", "_")) |>
  pivot_wider()

data2 <- raw_data2 |>
  rename(Year = "...1",
         Ships_launched = Number,
         Ships_weight  = `Tons of gross tonnage`
  ) |>
  select(!starts_with("...")) |>
  mutate(
    across(everything(), as.numeric)
  ) |>
  drop_na(Year) |>
  pivot_longer(-Year) |>
  mutate(name = str_remove(name, "\\(c\\)"),
         name = str_trim(name),
         name = str_replace_all(name, "\n", ""),
         name = str_replace_all(name, " ", "_")) |>
  pivot_wider()

transport <- rbind(data1, data2) |>
  mutate(across(
    everything(), as.integer
  ))
write_csv(transport, "italian-industry-production/transport.csv")