Analyzing Spotify songs data in R, a quick rundown

A few days ago I ran into an underrated and little-known package called spotifyr, which pulls data from (you guessed it) Spotify. R gets an undeserved bad reputation, especially when compared to Python or other “cleaner” programming languages, but I find it fun to use and fairly quick at processing data when we’re not working with huge dataframes and thousands of data points.
Let’s dive into spotifyr and find out what exactly it can and can’t do. As described by its developer, Charlie Thompson:

spotifyr is an R wrapper for pulling track audio features and other information from Spotify’s Web API (…) it allows you to enter an artist’s name and retrieve their entire discography in seconds, along with Spotify’s audio features and track/album popularity metrics.

There are a few steps related to setting up your Spotify Developer account if you don’t have one already:

  1. Log in with your Spotify account here (create one if you don’t have it):
    https://developer.spotify.com/dashboard/login

As it’s not a commercial project (not yet, anyway), you can skip the part that concerns Commercial integration.

Next, open your project in the Dashboard, where you should see the Client ID and Client Secret. You’re gonna use those.

For more information on the Spotify API, including the metrics, functions and data you can use, see the Spotify for Developers documentation.

You’re all set up, now on to RStudio!
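As a minimal sketch of the setup step, this is roughly how the Client ID and Client Secret reach the package. It assumes the credentials are stored in the environment variables SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET, which is where get_spotify_access_token looks by default. The guard around the call is my own addition so the snippet does not fail when nothing is configured yet.

```r
# install.packages("spotifyr")  # if the package is not installed yet

# Set these once per session (or put them in ~/.Renviron).
# The values below are placeholders, not real credentials:
# Sys.setenv(SPOTIFY_CLIENT_ID = "your-client-id")
# Sys.setenv(SPOTIFY_CLIENT_SECRET = "your-client-secret")

client_id     <- Sys.getenv("SPOTIFY_CLIENT_ID")
client_secret <- Sys.getenv("SPOTIFY_CLIENT_SECRET")

# TRUE only when both variables are non-empty
credentials_ready <- nzchar(client_id) && nzchar(client_secret)

if (credentials_ready && requireNamespace("spotifyr", quietly = TRUE)) {
  # Reads the two environment variables and requests a token from Spotify
  access_token <- spotifyr::get_spotify_access_token()
} else {
  message("Set SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET before calling the API.")
}
```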

If you want the full code for this, go to my GitHub page. And now, let’s see what kind of information we can extract and use with spotifyr:

Your favorite songs/artists

The function get_my_top_artists_or_tracks is one of the best in the package. It returns the user’s favorite songs or artists over the short, medium or long term. You can also set the type parameter to “tracks” or “artists” to get exactly what you need.
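A call along these lines should produce that kind of table. This is a sketch rather than the exact code behind the results: summarise_top_artists is a hypothetical helper of mine, and it assumes the result has a name column and a genres list-column. The API call is wrapped in interactive() because it opens a browser window for you to authorize access.

```r
# Hypothetical helper: collapse the list-column of genres into one
# comma-separated string per artist.
summarise_top_artists <- function(top) {
  data.frame(
    name   = top$name,
    genres = vapply(top$genres, paste, character(1), collapse = ", "),
    stringsAsFactors = FALSE
  )
}

# Only call the API in an interactive session, since it needs a browser login
if (interactive() && requireNamespace("spotifyr", quietly = TRUE)) {
  top <- spotifyr::get_my_top_artists_or_tracks(
    type = "artists",            # or "tracks"
    time_range = "medium_term",  # "short_term", "medium_term" or "long_term"
    limit = 5
  )
  print(summarise_top_artists(top))
}
```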

These are my top 5 artists in the medium term. Looks pretty good to me, they are 5 of the bands I listened to the most on Spotify.

Your recently played songs

Self-explanatory: get_my_recently_played returns, up to the minute, the last songs you played on Spotify.
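Here is a small sketch of how that could look with spotifyr's get_my_recently_played. The helper first_artist is hypothetical, and the column names (track.name, track.artists, track.album.name, played_at) are what I would expect in the result, so check them against your own output.

```r
# Hypothetical helper: keep the first credited artist of each track.
# `artists_col` is a list of data frames, each with a `name` column.
first_artist <- function(artists_col) {
  vapply(artists_col, function(a) a$name[1], character(1))
}

# Only call the API in an interactive session, since it needs a browser login
if (interactive() && requireNamespace("spotifyr", quietly = TRUE)) {
  recent <- spotifyr::get_my_recently_played(limit = 5)
  recent$artist.name <- first_artist(recent$track.artists)
  print(recent[, c("track.name", "artist.name", "track.album.name", "played_at")])
}
```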

Analyzing a playlist

You can also pull data from your (or others’) playlists. Here I use data from my “Your Top Songs 2019” playlist, which is Spotify’s recap of your 100 most-listened songs of 2019.
I computed the average of two musical metrics (energy and valence) and of the popularity coefficient across all songs in my playlist. Then, with a tolerance of 5%, I found the songs that are most representative, i.e. the songs that most resemble my average taste in music.
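The "representative songs" idea can be sketched in a few lines of base R. Two assumptions here are mine: I read "tolerance of 5%" as "within 5% of the playlist mean, relative to that mean", and I require a song to pass on all three metrics. The column names follow what get_playlist_audio_features returns as far as I know, and the toy data is made up purely to show the mechanics.

```r
# Keep the tracks whose metrics all fall within `tolerance` (relative)
# of the playlist average. Column names are assumptions.
representative_tracks <- function(tracks,
                                  metrics = c("energy", "valence", "track.popularity"),
                                  tolerance = 0.05) {
  averages <- colMeans(tracks[, metrics, drop = FALSE])
  keep <- rep(TRUE, nrow(tracks))
  for (m in metrics) {
    close_enough <- abs(tracks[[m]] - averages[[m]]) <= tolerance * abs(averages[[m]])
    keep <- keep & close_enough
  }
  tracks[keep, c("track.name", metrics), drop = FALSE]
}

# With real data (needs credentials), something like:
# tracks <- spotifyr::get_playlist_audio_features("your-username", "your-playlist-id")

# Toy playlist with made-up values, just to show the function at work
toy <- data.frame(
  track.name       = c("Song A", "Song B", "Song C", "Song D"),
  energy           = c(0.80, 0.90, 0.70, 0.80),
  valence          = c(0.50, 0.30, 0.70, 0.51),
  track.popularity = c(60, 40, 80, 61),
  stringsAsFactors = FALSE
)
representative_tracks(toy)
```

In the toy playlist only Song A and Song D sit close to the average on all three metrics, so they are the ones returned.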

As you can see, one can extract many insights with just a quick look at the data, for instance: apparently, I like songs with high energy and middle-of-the-pack in terms of valence and popularity. :)

A deeper analysis of the whole catalog of an artist

Spotify has a pretty comprehensive database of information for millions of songs. Each song has a few metrics that numerically represent musical features such as loudness, tempo, key, etc. A full list can be found on the above-mentioned Spotify page for developers, but for brevity we’ll focus on only one of them, one that’s pretty characteristic of Foo Fighters throughout their career: Energy.
In the context of Spotify, Energy is described as:

A measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.

Fast, loud and noisy, they said? Sounds like Foo Fighters to me.
One cool way to visualize a single metric is a ridge plot, which shows the distribution of that metric within each album. Let’s do that for Energy.
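A ridge plot like this can be drawn with ggplot2 and ggridges. Treat the following as a sketch: the plotting packages are my assumption, and so are the column names energy and album_name, which is what I would expect from get_artist_audio_features.

```r
# Build a ridge plot of energy per album.
# Assumes `tracks` has the columns `energy` and `album_name`.
plot_energy_ridges <- function(tracks) {
  ggplot2::ggplot(tracks, ggplot2::aes(x = energy, y = album_name)) +
    ggridges::geom_density_ridges() +
    ggridges::theme_ridges() +
    ggplot2::labs(title = "Energy by album", x = "Energy", y = NULL)
}

# With real data (needs credentials and both plotting packages):
# tracks <- spotifyr::get_artist_audio_features("foo fighters")
# plot_energy_ridges(tracks)
```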

And wouldn’t you know, most Foo Fighters songs rate pretty high on the energy spectrum. No surprises here!

Just as easily, we can show a table with the 10 most energetic songs:
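Ranking by energy is a one-liner once the data is in a data frame. This is a base R sketch: top_energy is a hypothetical helper, and track_name, album_name and energy are assumed column names.

```r
# Return the `n` tracks with the highest energy.
# Assumes the columns `track_name`, `album_name` and `energy`.
top_energy <- function(tracks, n = 10) {
  ranked <- tracks[order(tracks$energy, decreasing = TRUE),
                   c("track_name", "album_name", "energy")]
  head(ranked, n)
}

# With real data (needs credentials):
# tracks <- spotifyr::get_artist_audio_features("foo fighters")
# top_energy(tracks, n = 10)
```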

Analysis of a specific album

The Colour and the Shape, Foo Fighters’ second album, released in 1997, is their most critically acclaimed and is widely regarded as their masterpiece. It has easily recognizable songs such as Everlong, My Hero and Monkey Wrench.
Here’s how you can see information for each song individually. Through conditional formatting, we’re highlighting the songs with the most energy and valence within this album.
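As a rough stand-in for the conditional formatting, the sketch below filters one album and flags the songs with the highest energy and valence. The helper album_highlights and the column names are assumptions of mine. For actual colored cells, a package such as formattable is one option, shown here only as a commented-out suggestion.

```r
# Per-song view of one album, flagging the maximum energy and valence.
# Assumes the columns `album_name`, `track_name`, `energy` and `valence`.
album_highlights <- function(tracks, album) {
  songs <- tracks[tracks$album_name == album,
                  c("track_name", "energy", "valence")]
  songs$top_energy  <- songs$energy  == max(songs$energy)
  songs$top_valence <- songs$valence == max(songs$valence)
  songs
}

# With real data (needs credentials):
# tracks <- spotifyr::get_artist_audio_features("foo fighters")
# songs  <- album_highlights(tracks, "The Colour And The Shape")  # check the exact album name in your data
#
# Optional colored table, assuming the formattable package is installed:
# formattable::formattable(songs, list(
#   energy  = formattable::color_tile("white", "orange"),
#   valence = formattable::color_tile("white", "lightblue")
# ))
```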

Final Thoughts

spotifyr makes it super easy to analyze Spotify data for playlists, artists, songs and users. It can be used in a variety of ways, be it for classification/recommendation systems or just for fun if you are a data enthusiast like me.

Following this article, I’ll look at some of Foo Fighters’ albums in more depth, including lyrics analysis.

Aspiring Data Scientist, compulsive reader and just hoping to survive the pandemic one cup of coffee at a time
