The gender income gap, squared
November 14th, 2024
3 min
This article is brought to you by Datawrapper, a data visualization tool for creating charts, maps, and tables. Learn more.
Hello, I’m Jack, a developer at Datawrapper. It’s Thursday again, so here comes a new musical Weekly Chart!
This week I took inspiration from two sources: my colleague Vivien’s chart from a couple of weeks ago, where she analyzed lyrics from the band TEMMIS, and Matt Daniels’ superb article “The Largest Vocabulary In Hip Hop” where he ranked rappers by the number of unique words used in their lyrics.
I wanted to try to visualize not only the absolute size of my favorite musicians’ vocabulary, but also how unique their lyrics are relative to other artists’.
As a big fan of concept albums, I decided to analyze the lyrics on a per-album basis. My first step was to find my 200 most-listened-to albums using the LastFM API. I then filtered out singles, instrumental albums, and albums in languages other than English. Finally, using the Genius API, I fetched the full lyrics for each remaining album.
Next the lyrics need to be processed:
Now that the data is clean, I can start analyzing it.
First, I counted the number of non-repeated words in each album, counting a word only the first time it appeared in that album. Dividing this by the total number of words in the album shows how varied the lyrics are within the album. Let’s call this measure variety; albums with higher variety spend less time repeating themselves.
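Here is a minimal sketch of how the variety measure could be computed in Python. The `tokenize` helper and its regular expression are assumptions for illustration, not the cleaning steps actually used for this chart.

```python
import re


def tokenize(lyrics: str) -> list[str]:
    """Lowercase the lyrics and split them into words.

    Hypothetical stand-in for the real cleaning step, which is not
    described in detail here.
    """
    return re.findall(r"[a-z']+", lyrics.lower())


def variety(lyrics: str) -> float:
    """Share of an album's words that are non-repeated (distinct)."""
    words = tokenize(lyrics)
    if not words:
        return 0.0  # avoid dividing by zero for empty lyrics
    # set() keeps each word once, i.e. only its first appearance counts
    return len(set(words)) / len(words)


# Four words in total, two of them distinct ("la" and "hey")
print(variety("la la la hey"))  # 0.5
```

An album that never repeats a word would score 1.0, and the score drops as the same words come back again and again.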
I then counted all the words that appeared only in that album, and not in any other from my top 200, and divided that by the number of non-repeated words. Let’s call this uniqueness; albums with higher uniqueness spend more time saying things no one else has said. I plotted the albums with variety on the vertical axis and uniqueness on the horizontal. (To keep the chart from getting too crowded, only the top 100 albums are actually shown.)
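The uniqueness measure can be sketched in a similar way. This is an illustration under assumptions, not the code behind the chart: the `uniqueness` function, the album names, and the pre-tokenized word lists are all hypothetical.

```python
from collections import Counter


def uniqueness(albums: dict[str, list[str]]) -> dict[str, float]:
    """For each album, the share of its non-repeated words that
    appear in no other album of the collection."""
    # Reduce every album to its set of non-repeated words
    vocab = {name: set(words) for name, words in albums.items()}
    # Count in how many albums each word occurs
    album_count = Counter(word for words in vocab.values() for word in words)

    scores = {}
    for name, words in vocab.items():
        if not words:
            scores[name] = 0.0  # no lyrics, nothing unique
            continue
        only_here = sum(1 for word in words if album_count[word] == 1)
        scores[name] = only_here / len(words)
    return scores


# Made-up example data: "love" is shared, every other word is unique
albums = {
    "Album A": ["love", "me", "do", "love"],
    "Album B": ["love", "zebra"],
}
scores = uniqueness(albums)
print(scores["Album B"])  # 0.5
```

Note that the score depends on the whole collection: adding or removing albums changes which words count as unique, so the numbers are only comparable within one data set.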
Interestingly, the albums from a given artist tend to be close to each other on both axes, showing consistency in their lyrical style. Hip-hop albums (the diamond-shaped symbols) tended to be higher on both metrics than other genres, with "Get Rich Or Die Tryin’" by 50 Cent being the only rap album with no unique lyrics.
As I expected, Aesop Rock ends up at the top of both rankings. With his famously verbose and high-concept albums — including songs about politics, murders in the 80s, therapy, his cat Kirby, and everything in between — it’s unsurprising that those lyrics are so unique. Lupe Fiasco and Sa-Roc, with their dense political lyrics, are similarly close to the top-right. Surprisingly, although 2Pac’s albums "All Eyez on Me" and "Better Dayz" have the most total lyrics, they have the least variety of any rap albums in the data set.
Pink Floyd's "The Wall" ended up with a much higher uniqueness score than I expected; I took a closer look and realized this was because of one song, "Empty Spaces." It contains a dialog with backwards speech, which makes for some very unique words…
I hope you enjoyed visualizing some vernacular variability with me. Check in next week for a new Weekly Chart from Julian.