The gender income gap, squared
November 14th, 2024
3 min
Hi, this is Hendrik, backend developer at Datawrapper. Most of the time you won’t hear about my work, which is good: it means Datawrapper is healthy and everything is available to you the way you are used to. When I was thinking about what to write this week, area charts came to mind. I’ve been making a lot of them lately to visualize Datawrapper internals.
Have you ever stared at an area chart thinking: “This looks like something completely unrelated”? Almost forgetting about the data that the chart was designed to actually show? It happens to me all the time, so I thought, let’s push it to the extreme. How much can I make an area chart look like something else?
I decided to draw a skyline of the Cathedral of Magdeburg. More than 800 years old, it’s the most famous building in Magdeburg, where I live. So I took my bike, rode out to the east side of the river Elbe, tried out some lenses and took a decent shot of it. At that point I thought I was almost done.
All that was left to be done was converting the picture to polygons and turning them into a chart.
Simple, right?
The first thing I got wrong: Area charts don’t render polygons. They’re basically line charts, just filled below the lines. As such they can visualize mathematical functions, but not polygons.
What’s the difference? I’ll spare you the boring mathematical theory and put it very simply. For every value of X there can be at most one value of Y. Given that, it’s impossible to draw even something as simple as a circle using a function.
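To make the rule concrete, here is a minimal sketch of a check for it. The function name `isFunction` and the sample points are made up for illustration; they are not part of the original project.

```javascript
// Returns true if no x value appears with two different y values,
// i.e. the points could be drawn as a single line in an area chart.
function isFunction(points) {
  const seen = new Map();
  for (const [x, y] of points) {
    if (seen.has(x) && seen.get(x) !== y) return false;
    seen.set(x, y);
  }
  return true;
}

// Hypothetical sample data
const roofline = [[0, 10], [1, 12], [2, 11]];
const circle = [[1, 0], [0, 1], [-1, 0], [0, -1]]; // x = 0 has y = 1 and y = -1

console.log(isFunction(roofline)); // true
console.log(isFunction(circle));   // false
```

Even four points of a circle fail the check, because the top and the bottom share the same X.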
So I sat back for about three hours, drawing one line after another in Inkscape, making use of the advantage that I, as a human, have an easy understanding of concepts like “this is a tree.” Then the only thing left to do was convert these lines to CSV data columns. Once again, I thought, this must be easy — SVG is a human-readable XML format after all. Parsing the data should be straightforward.
Well, not quite. Most things in Inkscape are paths, the most versatile type in SVG. Unfortunately the default format of paths does not store absolute points, like those I would need for a CSV. They have a single starting point and every other point is described in relation to the point before.
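To illustrate what "in relation to the point before" means in practice, here is a rough sketch that walks through a path's `d` attribute and accumulates absolute coordinates. It is not the library or code used for this chart. It assumes the path consists only of straight-line commands (`M`/`m`, `L`/`l`, `H`/`h`, `V`/`v`, `Z`/`z`) and a single subpath; curves and arcs are not handled. The function name `pathToAbsolutePoints` is hypothetical.

```javascript
// Convert a simple SVG path "d" string into absolute [x, y] points.
// Sketch only: supports M/m, L/l, H/h, V/v and Z/z. Lowercase commands
// are relative to the previous point, uppercase commands are absolute.
function pathToAbsolutePoints(d) {
  // Split into numbers and single-letter commands; commas and spaces are skipped.
  const tokens = d.match(/-?\d*\.?\d+(?:[eE][-+]?\d+)?|[a-zA-Z]/g) || [];
  const points = [];
  let x = 0, y = 0; // current pen position
  let command = null;
  let i = 0;
  const next = () => parseFloat(tokens[i++]);

  while (i < tokens.length) {
    // A letter starts a new command; otherwise the previous command repeats.
    if (/^[a-zA-Z]$/.test(tokens[i])) command = tokens[i++];
    if (command === null) throw new Error('Expected a path command');
    const relative = command === command.toLowerCase();
    switch (command.toLowerCase()) {
      case 'm':
      case 'l': {
        const dx = next(), dy = next();
        x = relative ? x + dx : dx;
        y = relative ? y + dy : dy;
        break;
      }
      case 'h': {
        const dx = next();
        x = relative ? x + dx : dx;
        break;
      }
      case 'v': {
        const dy = next();
        y = relative ? y + dy : dy;
        break;
      }
      case 'z':
        command = null; // closing the path adds no new point in this sketch
        continue;
      default:
        throw new Error(`Unsupported path command: ${command}`);
    }
    points.push([x, y]);
  }
  return points;
}

console.log(pathToAbsolutePoints('m 10,20 5,5 h 10 v -5 L 100,50'));
// [[10, 20], [15, 25], [25, 25], [25, 20], [100, 50]]
```

The second pair `5,5` has no command letter of its own: after a relative `m`, further coordinate pairs are treated as relative line segments, which is why it ends up at `[15, 25]`.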
Fortunately, converting SVG paths back to polygons is already a solved problem. At this point I decided to go with JavaScript, though most of the time when I work with datasets I use sqlite3. When you’re in backend long enough, everything starts to look like a database.
I ended up having an array of arrays of points (which were themselves little arrays). Some merging, sorting, and clamping and I had everything in the right order as plain integer values. (Well, I needed to mirror the Y values because SVG has the origin in the top-left corner.)
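The mirroring, rounding, clamping, and sorting might look roughly like the sketch below. The canvas size (600 by 400) and the function name `toChartPoints` are assumptions made for the example, not values from the original script.

```javascript
// Flip SVG coordinates (origin top-left, y grows downward) into chart
// coordinates (y grows upward), round to integers, clamp to the canvas,
// and sort by x.
function toChartPoints(points, width, height) {
  const clamp = (value, min, max) => Math.min(Math.max(value, min), max);
  return points
    .map(([x, y]) => [
      clamp(Math.round(x), 0, width - 1),
      clamp(Math.round(height - y), 0, height),
    ])
    .sort((a, b) => a[0] - b[0]);
}

// Hypothetical input on an assumed 600 x 400 canvas
console.log(toChartPoints([[599.4, 350], [0.2, 350], [300, 120.6]], 600, 400));
// [[0, 50], [300, 279], [599, 50]]
```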
My initial idea for writing out the data was to write a row for a given value of X only when at least one of my lines had a point there. Every other row would simply be skipped.
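As a sketch of that first idea: one CSV row per X that occurs in any line, with an empty cell wherever a line has no point at that X. The column names and the `tower` points are invented for the example.

```javascript
// Build CSV text with one row per x that occurs in at least one line.
// `lines` maps a (hypothetical) column name to its [x, y] points.
function toSparseCsv(lines) {
  const names = Object.keys(lines);
  const lookup = names.map((name) => new Map(lines[name]));
  // Collect every x used by any line, in ascending order.
  const xs = [...new Set(lookup.flatMap((m) => [...m.keys()]))].sort((a, b) => a - b);
  const rows = xs.map((x) =>
    [x, ...lookup.map((m) => (m.has(x) ? m.get(x) : ''))].join(',')
  );
  return [['x', ...names].join(','), ...rows].join('\n');
}

console.log(toSparseCsv({
  elbe: [[0, 50], [599, 50]],
  tower: [[200, 50], [210, 300], [220, 50]],
}));
// x,elbe,tower
// 0,50,
// 200,,50
// 210,,300
// 220,,50
// 599,50,
```

Note the empty `elbe` cells in the rows for 200, 210, and 220.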
Here comes the final shock: This doesn’t work with Datawrapper.
The river Elbe, for example, was approximated by a single line from [0, 50] to [599, 50]. That worked for as long as the river was the only thing in my chart. But as soon as I added different lines for other things, having more points at intermediate X values, the “water level” of the Elbe dropped to zero. Datawrapper will only interpolate between two points if there are no other undefined points in between. So I needed to apply linear interpolation to my dataset before even uploading it to Datawrapper.
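A linear interpolation step of this kind could look like the following sketch. It is not the original script; it assumes integer X values, points already sorted by X, and at most one point per X. The name `interpolateLine` is hypothetical.

```javascript
// Fill in a y value for every integer x between a line's first and last point.
// `points` must be sorted by x and contain at most one point per x.
function interpolateLine(points) {
  const filled = [];
  for (let i = 0; i < points.length - 1; i++) {
    const [x0, y0] = points[i];
    const [x1, y1] = points[i + 1];
    for (let x = x0; x < x1; x++) {
      const t = (x - x0) / (x1 - x0); // 0 at the left point, approaching 1 at the right
      filled.push([x, y0 + t * (y1 - y0)]);
    }
  }
  // The loop above stops just before the last point, so add it explicitly.
  if (points.length > 0) filled.push(points[points.length - 1]);
  return filled;
}

const elbe = interpolateLine([[0, 50], [599, 50]]);
console.log(elbe.length); // 600
console.log(elbe[300]);   // [300, 50]
```

With a value at every X from 0 to 599, the river no longer has any empty cells between its two original points.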
Ten lines of code later, I had a datapoint for every value of X in every line of the dataset. Here’s the chart it made:
I think this is a good example of what software development looks like most of the time. It's always a good thing to plan your work, but if your plans are too detailed, reality will prove them wrong at some point. You’re going to need to adapt your plans to overcome these challenges.
If someone is telling you that their software development plans work out 100% of the time, then they’re probably lying — or being lied to themselves, by the ones who do the actual work.
That’s it for today, I hope you enjoyed this bumpy ride as much as I did. Don’t miss out on our next Weekly Chart!