Hi! This is Ivan, developer here at Datawrapper. This week we’ll look at the ridiculous amount of money that some people pay for music.
In the age of digital streaming, it’s hard to imagine that anyone would value music at more than the $10 per month that a subscription to Spotify (or an alternative service) costs. This relatively low fee gives access to pretty much all music ever recorded, so why would anyone pay more?
Turns out that sometimes it’s not that simple. For some people, there is a certain value attached to owning music and not just being able to listen to it. And what matters is not only owning the music, but owning the object that the music is recorded on.
So let’s find out exactly how much some records have been sold for:
And the winner is: The Black Album by Prince, which reportedly sold for $27,500. That’s quite a lot of money to pay for 8 songs. In fact, at $10 per month, the same amount of money would buy you a Spotify subscription in the US that lasts 2,750 months (that’s a little over 229 years). I checked and you can’t stream The Black Album on Spotify (at least in Germany), but you can listen to it for free on YouTube.
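If you want to check the subscription math yourself, here is a minimal sketch. It assumes a flat $10 monthly fee, as in the comparison above, and ignores price changes and taxes.

```python
# Sale price reported in the post, in US dollars.
record_price_usd = 27_500
# Assumption: a flat subscription fee of $10 per month.
subscription_usd_per_month = 10

months = record_price_usd / subscription_usd_per_month
years = months / 12

print(f"{months:,.0f} months, or about {years:.1f} years")
# prints "2,750 months, or about 229.2 years"
```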
We can only speculate about the exact reasons why someone would pay $27,500 for a single vinyl record, but I think it must be related to exclusivity. Most records that sell for a lot of money are rare (sometimes first) editions, or they are special in some other way. For example, the Prince record mentioned above is an album that was never properly released, so there must be only a few copies of it in existence. This scarcity is what drives prices up and some music collectors are prepared to pay up to own these records.
I chose to present the data in a table. Each music release transaction in the dataset is unique and a table is a great medium to showcase detailed and precise information for each entry. A table also lets you easily explore the data.
It’s possible to sort the table, so you can, for example, see if any recently released records sell for a lot of money (click on “Released in” twice to do that). You can also search the table, so you can find out if your favorite artist makes an appearance. Or you can search for “Melody A.M.” and see how often this release by Röyksopp made it into the list (it turns out that Banksy hand-made the artwork for it, hence the price).
Each entry in the table also includes a link to Discogs, so you can find out more information about it: for example, some users leave interesting comments or you can see if there is a copy that you want to buy on the marketplace. I used the awesome Markdown feature to put links and images into the table.
Getting and preparing the data
The table was really easy to make thanks to Datawrapper, but unfortunately, the data itself was not as easy to obtain.
The Discogs API does not allow searching the marketplace for current listings or purchase history, so I had to pull the data from the Discogs blog. There is a blog post on the top 100 most expensive records sold on Discogs, but it came out more than a year ago, so I also had to include the monthly posts that list the 30 most expensive records sold in each month. I ended up scraping all those blog posts and combining the data to get the dataset that I needed.
I won’t go into full details in this post, but will instead outline the steps that I took to get the dataset.
- Get a list of releases by scraping the blog post with the top 100 most expensive records, which was published in March 2019 (to make multiple requests I used the same technique that I recently described on my personal blog).
- Get a list of releases by scraping the monthly top 30 posts from March 2019 to February 2020.
- Combine the two lists from above, sort the result by price, and keep the top 100 entries as the final list.
- Download and resize images for the top 100 entries that made it into the final list.
- Assemble the dataset in the correct format for Markdown and export the columns as CSV.
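To give a rough idea of what the combining, sorting, and exporting steps involve, here is a minimal Python sketch. It is not the script I actually used (those are linked below), and it makes several assumptions: the scraped sales are already available as dictionaries, the field names (`artist`, `title`, `year`, `price`, `release_url`, `image_url`) are hypothetical, and apart from “Released in” the column names are made up for illustration.

```python
import csv
import io

# Column names are illustrative; only "Released in" appears in the post.
COLUMNS = ["Cover", "Release", "Released in", "Price (USD)"]


def top_sales(all_time, monthly, limit=100):
    """Merge both scraped lists and keep the `limit` most expensive sales.

    Repeated sales of the same release are kept on purpose, because every
    transaction counts as its own entry.
    """
    combined = list(all_time) + list(monthly)
    combined.sort(key=lambda sale: sale["price"], reverse=True)
    return combined[:limit]


def to_markdown_row(sale):
    """Build one table row, using Markdown for the image and the link."""
    return {
        "Cover": f"![]({sale['image_url']})",
        "Release": f"[{sale['artist']} - {sale['title']}]({sale['release_url']})",
        "Released in": sale["year"],
        "Price (USD)": sale["price"],
    }


def to_csv(sales):
    """Serialize the sales as CSV text that can be pasted into a table tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for sale in sales:
        writer.writerow(to_markdown_row(sale))
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder data, not real sales.
    all_time = [{
        "artist": "Artist A", "title": "Title A", "year": 1970, "price": 300,
        "release_url": "https://example.com/a",
        "image_url": "https://example.com/a.jpg",
    }]
    monthly = [{
        "artist": "Artist B", "title": "Title B", "year": 1999, "price": 500,
        "release_url": "https://example.com/b",
        "image_url": "https://example.com/b.jpg",
    }]
    print(to_csv(top_sales(all_time, monthly)))
```

Sorting the merged list once and slicing it keeps the logic simple. With only a few hundred scraped entries, there is no need for anything more clever.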
If you’re interested, you can find all scripts that I used to get the dataset here.
After going through all of the steps above, I finally had the dataset that I used to create the table!
I hope you enjoyed my journey into the music collectors’ world. Next time you’re looking through a stack of vinyl at a flea market or a charity shop, watch out, you might be flipping through a goldmine! If you have any questions or comments, you can reach me at email@example.com or on Twitter. See you next week!
A release on Discogs is a specific version of a music release which shares all the same features. For example, an LP album by a band might have been originally released in 1970 in the US, whereby 10,000 copies of it got pressed on vinyl. This counts as a single release. If the same album got pressed on vinyl in the same year in Germany, that’s a different release. If the same album got re-pressed in 1975 in the US, that’s yet another release. And so on. ↩︎
When a purchase is made on the Discogs marketplace, we don’t know exactly which copy of the release gets sold. So for all the duplicate entries in the table above, we don’t know whether different copies of the release were sold or the same copy changed hands repeatedly. To use an example from the table: Melody A.M. by Röyksopp is included 5 times. However, we don’t know if it’s literally the same copy that got sold 5 times from one person to the next, or if 5 different copies each got sold once (or a combination of the two). ↩︎