A retrospective of 15 years of data visualization projects
October 24th, 2024
4 min
Hi, I’m Jakub, the backend team leader at Datawrapper. Today I will dive into the challenges of running a large web application.
Datawrapper is available 24/7, but most people make visualizations on a more regular schedule. Our users create and publish a lot of charts during the day and fewer during the night. Moreover, exceptionally high usage happens during large media events such as elections. We have to adjust our infrastructure to handle demand when it’s high without wasting resources when it’s low.
When talking about software infrastructure, scaling is the practice of running your product on several servers. Autoscaling is then the practice of automatically adjusting the number of these servers: when usage of your product increases, autoscaling starts more servers, and when usage drops, it stops them again. That way the whole system is neither overloaded because too few servers are running, nor wasting money because too many are. It’s optimized.
To configure autoscaling correctly, you have to choose the right metric for deciding whether servers are currently overloaded or idling. It can be CPU or memory utilization, or, in the case I’m looking at today, the number of visualizations that users are trying to export as PNG, PDF, or SVG files. Then you have to decide how to respond to that metric: how many servers should start when it rises too high, and how many should stop when it dips too low?
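To make this concrete, here's a minimal sketch in Python of such a scaling rule, keyed on the number of visualizations waiting to be exported. The thresholds, step sizes, and server limits are illustrative assumptions, not Datawrapper's actual configuration.

```python
# Minimal sketch of a metric-driven scaling rule (illustrative values only).
SCALE_OUT_THRESHOLD = 50  # queue length above which we add servers (assumed)
SCALE_IN_THRESHOLD = 5    # queue length below which we remove servers (assumed)
SCALE_OUT_STEP = 3        # servers to start per scale-out decision (assumed)
SCALE_IN_STEP = 1         # servers to stop per scale-in decision (assumed)
MIN_SERVERS, MAX_SERVERS = 1, 20  # hard limits (assumed)


def desired_server_count(pending_exports: int, current_servers: int) -> int:
    """Return how many export servers should be running for the current load."""
    if pending_exports > SCALE_OUT_THRESHOLD:
        target = current_servers + SCALE_OUT_STEP   # overloaded: start more servers
    elif pending_exports < SCALE_IN_THRESHOLD:
        target = current_servers - SCALE_IN_STEP    # idling: stop some servers
    else:
        target = current_servers                    # load is in the comfortable range
    return max(MIN_SERVERS, min(MAX_SERVERS, target))
```

A small control loop would then call a function like this every minute or so with the current queue length and start or stop servers until the running count matches the returned target.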
Knowing all the variables, we wrote Datawrapper’s first autoscaling configuration and naturally chose to make things fast. As soon as we registered a high number of visualizations that needed exporting, we started several servers, and we stopped them equally quickly when the number of exports decreased.
Yet starting the servers wasn't fast enough. The number of waiting visualizations often reached unacceptable levels. Visualizations waiting meant users were waiting. And no one likes to wait.
One solution to such a problem is to try to start even faster. You can, for example, reduce the size of each server by installing as little software on it as possible. But what if you could win the race by going slower?
We figured out that a better strategy was to stop our servers much more gradually. This is called a scale-in cooldown. So even after a peak in the number of visualizations to export, we still keep a good number of servers available to handle the next peak. The unacceptable waiting times are gone.
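Sketching how that changes the rule above: scale out immediately, but only scale in once the export queue has stayed quiet for a while. The 15-minute cooldown below is again an assumed value, purely for illustration.

```python
import time

# Same illustrative thresholds as above, plus an assumed scale-in cooldown.
SCALE_OUT_THRESHOLD, SCALE_IN_THRESHOLD = 50, 5
SCALE_OUT_STEP, SCALE_IN_STEP = 3, 1
MIN_SERVERS, MAX_SERVERS = 1, 20
SCALE_IN_COOLDOWN_SECONDS = 15 * 60  # assumed cooldown length, for illustration

_last_busy = time.monotonic()  # when the export queue last showed real load


def desired_server_count(pending_exports: int, current_servers: int) -> int:
    """Scale out immediately, but scale in only after a sustained quiet period."""
    global _last_busy
    now = time.monotonic()

    if pending_exports >= SCALE_IN_THRESHOLD:
        _last_busy = now  # still busy, so the scale-in cooldown starts over

    if pending_exports > SCALE_OUT_THRESHOLD:
        target = current_servers + SCALE_OUT_STEP   # react to peaks right away
    elif now - _last_busy >= SCALE_IN_COOLDOWN_SECONDS:
        target = current_servers - SCALE_IN_STEP    # shrink only after a long quiet spell
    else:
        target = current_servers                    # keep spare capacity for the next peak
    return max(MIN_SERVERS, min(MAX_SERVERS, target))
```

The trade-off is exactly the one described above: a slightly higher cost for keeping servers around after a peak, in exchange for users never having to wait.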
I hope you enjoyed today's software-focused blog post and, if you are a Datawrapper user, the chance to export your charts any time of the day, even during elections. Next week, our support engineer Shaylee is going to make a brand new Weekly Chart.