Jittering, explained
This week, let’s talk about chocolate. “We did that already, Lisa”, I hear you saying, “remember last week? Snickers, Bounty, Galaxy, we got it.” Yeeeah, but can we really call that chocolate? Don’t we feel a bit uncomfortable every time we call Galaxy/Dove “chocolate”, knowing that this experience is miles away from what ground cacao beans can truly be?
“Transcending beyond the ordinary limits”, Brady Brelinski promises us on his website. He’s the founding member of the Manhattan Chocolate Society, and he has tried and rated 1800 dark chocolate bars over the last ten years. (If a chocolate transcends Brelinski, he gives it the highest rating, 5, “Elite”.)
This data gives me hope. When I tried 100% chocolate ten years ago and found it dry and bitter, it made me a bit sad. It meant (so I thought) that I don’t appreciate cacao enough. It meant that my career as a chocolate expert was over before it had begun. But maybe I’ve been a top-notch cacao appreciator all this time! Maybe I was right in despising 100% chocolate! Because, yes: Brelinski doesn’t like it either. And he must know, right?
The evidence is clear: For all but four companies of the 17 shown, Brelinski rated their chocolates with more than 90% cacao the worst. And all but two companies would have a higher average rating on Brelinski’s list if they had not sold him their darkest chocolate. Seems like the sugar in chocolate is useful after all. If data vis ever gets boring, there is a plan B.
This week, I chose a scatterplot for my data. “Wait a minute,” I hear you saying (you have a lot to say this week), “that’s clearly a dot plot.” How about we’re both right: It looks like a dot plot, but I did create it with the scatterplot feature in Datawrapper. A dot plot is basically just a scatterplot with “rows” – and the scatterplot feature gave me options (like tooltips and customized colors) that we don’t currently offer in our dot plot feature. Hover over the chart and click on “Edit this chart” to see how I created it!
The most interesting part of this chart might be the jittering. Every chocolate is rated at intervals of 0.25, e.g. as a 3.75, a 4.0 or a 4.25. Different chocolates from one maker can have the same rating, though. For example, Brelinski gave six chocolates from Bonnat a 3.5-rating. Without jittering, the six triangles representing these chocolates would lie exactly on top of each other. That makes it impossible to see how many chocolates got this rating, and it also makes it impossible to see the tooltips for all these triangles. “Jittering” means that I gave every triangle a slightly different (random) position on the x-axis, like 3.55 or 3.47. You can hover over all of them and can still see the actual rating in the tooltip. Read this footnote if you want to find out how to calculate these jittered ratings → [1].
Enjoy a chocolate bar or two…and I’ll see you next Thursday!
You can calculate a jittered rating either in your spreadsheet application, like Excel or Google Sheets, or directly in Datawrapper. I’ll explain the concept first.
What do we want to achieve? We want to replace every 3.5-rating with a new, slightly offset jittered rating that won’t collide with the jittered versions of the 3.25-ratings and the 3.75-ratings. So every rating gets a “buffer zone” in which its randomly assigned values can live. The buffer zone around each rating is 0.25 wide. That’s 0.125 to each side, meaning our 3.5-ratings should live between 3.375 and 3.625 (the 3.75-ratings will start at 3.625).
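To make the buffer zones concrete, here is a minimal sketch in JavaScript. The helper name bufferZone and its step parameter are hypothetical and only there for illustration; the numbers are the ones from the paragraph above.

```javascript
// Hypothetical helper: the buffer zone that belongs to one rating.
// `step` is the distance between two possible ratings (0.25 in this data).
function bufferZone(rating, step = 0.25) {
  const half = step / 2; // 0.125 to each side
  return { min: rating - half, max: rating + half };
}

const zone350 = bufferZone(3.5);
const zone375 = bufferZone(3.75);

console.log(zone350); // { min: 3.375, max: 3.625 }
console.log(zone375.min === zone350.max); // true: the zones touch but don't overlap
```

Because each zone ends where the next one begins, a jittered 3.5 should not wander into the territory of the 3.75-ratings.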
And that’s how we calculate it: We generate random numbers between 0 and 0.25 and add them to our ratings (3.5 + 0.21, 3.5 + 0.01, 3.5 + 0.19, etc.). Now all our jittered ratings are between 3.5 and 3.75. To bring them into the range of 3.375 to 3.625, we subtract 0.125.
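The two steps (add a random number between 0 and 0.25, then subtract 0.125) can be sketched as a small JavaScript function. The name jitterRating and the injectable random parameter are assumptions made for this example, so that the result can be checked with fixed values instead of real randomness.

```javascript
// Hypothetical helper: rating + random/4 - 0.125.
// `random` defaults to Math.random, which returns a number from 0 up to
// (but not including) 1. Passing a fixed function makes the result predictable.
function jitterRating(rating, random = Math.random) {
  const offset = random() / 4; // a number between 0 and 0.25
  return rating + offset - 0.125; // shift so the range is centered on the rating
}

console.log(jitterRating(3.5, () => 0));   // 3.375, the lower edge of the zone
console.log(jitterRating(3.5, () => 0.5)); // 3.5, the middle of the zone
console.log(jitterRating(3.5));            // a random value close to 3.5
```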
In Excel, that’s =A1+(RAND()/4)-0.125. (RAND() creates a number between 0 and 1. We divide it by 4 to get a number between 0 and 0.25.)
In Datawrapper, you can click on “Add column” in step 2. Add a new column, give it a name like “jitter_ratings” and then click on “ratings” in the list of available columns below. Now we use the JavaScript way to create a random number; everything else stays the same: ratings + (Math.random()/4)-0.125. Done! Click on Close and admire your new column.
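If you want to sanity-check the result outside of Datawrapper, here is a rough sketch in plain JavaScript. It applies the same formula to six identical ratings (like the six Bonnat chocolates) and then rounds each jittered value to the nearest quarter to see which rating it came from. The array and the helper names are made up for this example.

```javascript
// Six identical ratings, like the six Bonnat chocolates rated 3.5.
const ratings = [3.5, 3.5, 3.5, 3.5, 3.5, 3.5];

// Same formula as above: rating + random/4 - 0.125.
const jitter = (rating) => rating + Math.random() / 4 - 0.125;

// Rounding to the nearest quarter points back to the original rating.
const toNearestQuarter = (value) => Math.round(value * 4) / 4;

const jittered = ratings.map(jitter);
const recovered = jittered.map(toNearestQuarter);

console.log(jittered);  // six slightly different values around 3.5
console.log(recovered); // in practice, 3.5 six times
```

The jittered values are only meant for positioning the triangles; keeping the original ratings column next to them is what lets the tooltip show the actual rating.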