Video Games and Sliced

This week's data comes from Steam by way of Kaggle. It was originally scraped from SteamCharts and then uploaded to Kaggle.

Note that there is a different video game dataset from a 2019 TidyTuesday; check it out here. Some of the data could possibly be joined on “name”.
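
As a rough sketch of what such a join could look like, the base R snippet below merges two toy data frames. Everything here is an assumption for illustration: the rows are invented, `price` is a hypothetical column, and the key column in the 2019 data is taken to be `name` only because that is how it is described above, so check the actual column names before joining.

```r
# Toy stand-in for this week's data (values invented for illustration).
monthly <- data.frame(
  gamename = c("Game A", "Game B", "Game C"),
  avg = c(500, 120, 80),
  stringsAsFactors = FALSE
)

# Toy stand-in for the 2019 dataset; `name` and `price` are assumed columns.
older <- data.frame(
  name = c("Game A", "Game C", "Game D"),
  price = c(9.99, 19.99, 4.99),
  stringsAsFactors = FALSE
)

# Left join: keep every row of `monthly`, fill unmatched games with NA.
joined <- merge(monthly, older, by.x = "gamename", by.y = "name", all.x = TRUE)
print(joined)
```

A left join keeps games that have no match in the older data, which makes it easy to see how many names fail to line up (for example because of differences in spelling or punctuation).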

Additionally, we are doing a crossover with the “Sliced” data science challenge this week!

Make sure to tune in to “Sliced” on Nick Wan’s Twitch stream, Tuesday March 16th at 8:30 pm ET!

What is Sliced? It’s like Chopped but for Data Science!

Data scientists get data they have never seen and have 2 hours to make a predictive model. Create the best data science or be sliced!

This is in line with the TidyTuesday efforts, and I look forward to seeing what they do with the stream.

Get the data here

# Get the Data

# Read in with tidytuesdayR package 
# Install from CRAN via: install.packages("tidytuesdayR")
# This loads the readme and all the datasets for the week of interest

# Either ISO-8601 date or year/week works!

tuesdata <- tidytuesdayR::tt_load('2021-03-16')
tuesdata <- tidytuesdayR::tt_load(2021, week = 12)

games <- tuesdata$games

# Or read in the data manually

games <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2021/2021-03-16/games.csv')

Data Dictionary

`games.csv`

|variable      |class     |description |
|:-------------|:---------|:-----------|
|gamename      |character |Name of the video game |
|year          |double    |Year of measure |
|month         |character |Month of measure |
|avg           |double    |Average number of players at the same time |
|gain          |double    |Gain (or loss): difference in average compared to the previous month (NA = first month) |
|peak          |double    |Highest number of players at the same time |
|avg_peak_perc |character |Average as a share of the peak (avg / peak), in % |
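
Because `month` and `avg_peak_perc` are stored as character, they usually need converting before plotting or modeling. The base R sketch below shows one possible approach on toy rows. It assumes that `avg_peak_perc` looks like "45.97%" and that `month` holds full English month names; the rows and the helper columns `avg_peak_num`, `month_num`, `date`, and `gain_check` are invented for illustration.

```r
# Toy rows mimicking the games.csv columns (values invented for illustration).
games_toy <- data.frame(
  gamename = c("Game A", "Game A", "Game B"),
  year = c(2021, 2021, 2020),
  month = c("February", "January", "December"),
  avg = c(500, 400, 120),
  peak = c(1000, 1000, 480),
  avg_peak_perc = c("50%", "40%", "25%"),
  stringsAsFactors = FALSE
)

# Strip the percent sign and convert the share to a number.
games_toy$avg_peak_num <- as.numeric(
  sub("%", "", games_toy$avg_peak_perc, fixed = TRUE)
)

# Map month names to numbers with the built-in month.name vector,
# then build a Date for the first day of each month.
games_toy$month_num <- match(trimws(games_toy$month), month.name)
games_toy$date <- as.Date(
  sprintf("%d-%02d-01", as.integer(games_toy$year), games_toy$month_num)
)

# Recompute the month-over-month gain per game; the first month is NA.
games_toy <- games_toy[order(games_toy$gamename, games_toy$date), ]
games_toy$gain_check <- ave(
  games_toy$avg, games_toy$gamename,
  FUN = function(x) c(NA, diff(x))
)
print(games_toy)
```

With a proper `date` column the rows can be sorted chronologically, and comparing `gain_check` against the provided `gain` column gives a quick consistency check.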

Cleaning Script

No cleaning this week!