A recipe for Datawrapper

Heyo, I’m Antonio, a developer on Datawrapper’s app team. This week, I wanted to find out how the technologies that go into Datawrapper have evolved over the past few years.

As Marten, head of Datawrapper’s app development, wrote in a blog post last week, our team recently turned off a homegrown solution for server-side rendering in favor of SvelteKit, the official solution for app development made by the maintainers of Svelte. Since I only started working at Datawrapper a couple of months ago, that article inspired me to go back in time and try to visualize the history of our tech stack!

I’ve written a little script that goes back month by month and counts the lines of code per technology, allowing me to visualize the composition of Datawrapper over time inside Datawrapper itself:

The first thing that jumps out is that our codebase is still dominated by JavaScript. We’re in the process of migrating these modules to TypeScript, a more modern and less error-prone flavor of JavaScript, and you can see how TypeScript has gained steam since we first introduced it in October 2022. By now, newly written modules are generally TypeScript by default.

The beginning of this year also marked the end of PHP at Datawrapper; as a result of the major migration Marten described, the last lines of leftover PHP were ceremonially deleted in February. We also see the continual rise of Svelte, which goes along with a decrease in lines of code written in HTML. That decrease is partly because, as of version 3, Svelte code is written in its own .svelte files instead of .html files.

Generating the data

I used TypeScript to write the script that gathered this data. To find out the composition of our tech stack, I chose to use a program called cloc to count the lines of code per technology inside our code repository. It detects the programming language and can filter out blank lines and lines that are just comments, so we only count actual source code. Then I used git, our version control system, to see how the code looked in the past.
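To give a rough idea of how such a script can be put together, here is a minimal sketch. It is not the actual script: the function names, the branch name "main", and the excluded node_modules directory are assumptions made for illustration. It lists the months to visit, builds the git and cloc arguments for each month, and reads the "code" count per language out of cloc's JSON report.

```typescript
// Lines of code per language, e.g. { JavaScript: 1200, Svelte: 300 }
type LinesByLanguage = Record<string, number>;

// First day of every month from `from` to `to` (both "YYYY-MM", inclusive).
function monthStarts(from: string, to: string): string[] {
  const [fromYear, fromMonth] = from.split("-").map(Number);
  const [toYear, toMonth] = to.split("-").map(Number);
  const result: string[] = [];
  let year = fromYear;
  let month = fromMonth;
  while (year < toYear || (year === toYear && month <= toMonth)) {
    result.push(`${year}-${month < 10 ? "0" : ""}${month}-01`);
    month += 1;
    if (month > 12) {
      month = 1;
      year += 1;
    }
  }
  return result;
}

// Arguments for `git` that print the last commit before a given date.
// The branch name is an assumption; the real repository may use another one.
function commitBeforeArgs(date: string, branch = "main"): string[] {
  return ["rev-list", "-1", `--before=${date}`, branch];
}

// Arguments for `cloc` that count the current directory and print JSON.
// Excluding node_modules is an assumed example of skipping dependencies.
function clocArgs(): string[] {
  return ["--json", "--quiet", "--exclude-dir=node_modules", "."];
}

// cloc's JSON report has one key per language plus "header" and "SUM".
// Each language entry carries a "code" field that already excludes
// blank lines and comment-only lines.
function parseClocJson(jsonText: string): LinesByLanguage {
  const report = JSON.parse(jsonText) as Record<string, { code?: number }>;
  const lines: LinesByLanguage = {};
  for (const language of Object.keys(report)) {
    if (language === "header" || language === "SUM") continue;
    const code = report[language]?.code;
    if (typeof code === "number") lines[language] = code;
  }
  return lines;
}
```

In a real run, each argument list would be handed to a process runner such as Node's execFileSync: first git to find the commit for a month, then git checkout of that commit, then cloc, whose output goes into parseClocJson.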

I’ve chosen to only go back as far as October 2021, since that’s when all the code used to build Datawrapper was added to a single code repository. I’ve also chosen to ignore some technologies that didn’t seem interesting, like Markdown (used for internal documentation) and JSON (used for translation strings, dependency lists, and other configuration). To remove some clutter, the script also discards any technology that never exceeded 200 lines of code. I also haven’t counted every single file in our codebase, but tried to filter out what wasn’t written by the staff, like generated output files and external dependencies. The script can be found in my GitHub repository weekly-chart-technologies-over-time, if you’re interested in seeing the exact commands used or want to play around with it yourself.
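The two filtering rules described above can be sketched roughly like this. Again, this is an illustration under assumptions rather than the code from the repository: the Snapshot type and the filterTechnologies function are hypothetical names, and the ignore list and threshold are passed in as parameters.

```typescript
// One data point per month: lines of code per technology.
type Snapshot = { month: string; lines: Record<string, number> };

// Drops technologies on the ignore list and technologies whose line
// count never rose above `minPeak` in any month.
function filterTechnologies(
  snapshots: Snapshot[],
  ignored: string[],
  minPeak: number
): Snapshot[] {
  // Find the highest line count each technology ever reached.
  const peak: Record<string, number> = {};
  for (const snapshot of snapshots) {
    for (const technology of Object.keys(snapshot.lines)) {
      const count = snapshot.lines[technology] ?? 0;
      peak[technology] = Math.max(peak[technology] ?? 0, count);
    }
  }

  // Keep a technology only if it is not ignored and its peak is above the threshold.
  const keep = (technology: string): boolean =>
    ignored.indexOf(technology) === -1 && (peak[technology] ?? 0) > minPeak;

  return snapshots.map((snapshot) => {
    const lines: Record<string, number> = {};
    for (const technology of Object.keys(snapshot.lines)) {
      if (keep(technology)) lines[technology] = snapshot.lines[technology] ?? 0;
    }
    return { month: snapshot.month, lines };
  });
}
```

Called as filterTechnologies(snapshots, ["Markdown", "JSON"], 200), this keeps a technology in every month as long as it exceeded 200 lines at least once, so a language that starts small and grows later still shows up from its first appearance.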

I hope you enjoyed this look behind the scenes of Datawrapper’s tech stack, a companion piece to the SvelteKit migration article. Tune in next week for a fresh new Weekly Chart by Ivan of our visualization team. 🌸