From news media to Datawrapper: Six things I’ve learned in my first year
November 8th, 2023
5 min
Hi, this is Marten, head of app development at Datawrapper. This article gives a technical deep dive into the challenges we encountered while migrating our web app to SvelteKit. If terms like server-side rendering or client-side hydration excite you — read on! If not, fear not: The next data vis article will come to this blog soon.
Last week, we celebrated a major milestone at Datawrapper: Our web application is now served entirely by SvelteKit!
This milestone represents the final episode of a journey we embarked on over five years ago: modernizing the tech stack that had been the backbone of Datawrapper since its launch in 2012. In this post, we’ll share the lessons learned and challenges encountered while migrating an app from PHP + jQuery to Node + SvelteKit — an app that hundreds of thousands of people around the world use to create millions of charts, maps, and tables for publications including the New York Times, Washington Post, and Spiegel.
Before we get there, let’s rewind the clock a bit.
⏩ If you want to skip the history lesson and jump straight to the lessons learned, click here.
1 The long refactoring journey: The need for change / Hello Svelte
2 Hapi-ly ever after?
3 Going all in on SvelteKit: What went well / Challenges of the migration
4 The journey continues
Back in 2012, when our cofounder Gregor built the first version of Datawrapper,[1] the tech stack was quite different from what we have today. It consisted of MySQL, PHP using the Slim framework and Twig templating engine, and jQuery running on the client.
This made perfect sense back then. PHP was extremely popular, powering big sites like Facebook, Wikipedia, and the whole WordPress ecosystem. jQuery was a necessity on the client side. Gregor did also consider the flashier, Python-based Django framework, but PHP was easier to install and self-host, which — following WordPress’ successful example — was a key feature of Datawrapper’s initial concept as an open-source service.
For a long time after launch, Datawrapper remained an open-source project. Usage was constantly increasing, but with no full-time employees any bigger refactorings of the codebase were out of the question — all efforts went into launching new features!
Fast forward to 2017. The world of web development had evolved: jQuery was approaching its last major version, web APIs were standardizing across browsers, and frameworks like React were leading the charge in frontend development.
Cofounders Mirko, Gregor, and David were ready to turn Datawrapper into a full-fledged company, but our tech stack was still stuck in 2012. Many of the PHP dependencies the app relied on were no longer maintained, and all of the newly hired developers had a clear preference for the JavaScript/TypeScript ecosystem.
It would have been tempting to rewrite the app from scratch with more modern tools. But for a small company of still fewer than five developers, a full rewrite would have taken months or years, right at a time when we really needed to be launching new features and growing our customer base. Rewrites also tend to take longer than initially estimated, may not yield a better end result, and could even have put the company’s future at risk!
This is where Svelte came into the picture. After seeing Rich Harris’ launch announcement, we were immediately sold on his idea of “a framework without a framework.” You could build reactive components using HTML, CSS, and JavaScript, and the Svelte compiler would turn them into tiny, standalone JavaScript modules without requiring an expensive runtime.
We started experimenting with the new framework and soon realized a path forward: We could replace individual pieces of our front end with standalone Svelte apps, slowly expanding these apps to cover more and more functionality until only the route handlers were left in PHP. Then those handlers could be rewritten with Node.js, and the transition away from PHP would be complete.
With this strategy, we could modernize our app from the inside out without a risky rewrite!
And so, step by step, we started the migration. Here’s a rough timeline of how it played out:
If we were starting to migrate our controllers from PHP to Node today, we’d probably go with SvelteKit right away. But when we started in early 2021, SvelteKit was still called “Sapper” and was nowhere near ready for production.
Instead, we built a custom setup based on the Hapi web framework. Our application consisted of different server-side rendered pages, which then got hydrated client-side to make the page interactive.
In server-side rendering, the HTML content of a web page is generated on the server in response to a client request (rather than being rendered in the browser using JavaScript). This makes its content immediately viewable and often results in improved performance and better search engine optimization.
Client-side hydration is the process by which a statically rendered page becomes fully interactive in the browser. Once the page's static content is loaded and displayed, it’s then “hydrated” by client-side JavaScript, which binds event listeners and adds interactivity.
Using these techniques together allows for a rapid display of initial content while still offering a dynamic, app-like experience once the page is hydrated.
Here’s an example of a typical route handler in our Hapi setup — this one handles requests for the page displaying the Datawrapper dashboard:
// routes/dashboard.js
server.route({
method: 'GET',
path: '/',
options: {
auth: 'user',
validate: {
query: Joi.object().keys({
chart: Joi.string().optional(),
...
})
},
async handler(request, h) {
...
const recentlyEdited = ...
const recentlyPublished = ...
return h.view('dashboard/Index.svelte', {
props: {
...
recentlyEdited,
recentlyPublished
}
});
}
}
});
This route handler does a few things:
auth: 'user' makes sure that only logged-in users can access the page.
query: Joi.object(...) uses Joi to validate the route and its query parameters.
handler collects the necessary data to serve the request. In this case, that’s lists of the current user’s recently edited and published visualizations.
The h.view method invokes our custom rendering engine for the @hapi/vision plugin, returning a view compiled from a Svelte component (dashboard/Index.svelte) that passes the lists of recent visualizations as props.
This is what the corresponding Svelte component of the dashboard page looks like:
// dashboard/Index.svelte
<script type="text/javascript">
import MainLayout from '_layout/MainLayout.svelte';
import RecentVisualizations from './RecentVisualizations.svelte';
export let recentlyEdited;
export let recentlyPublished;
...
</script>
<MainLayout>
<RecentVisualizations {__} {recentlyEdited} {recentlyPublished} />
</MainLayout>
When building our app, we’d use Rollup to compile two different versions of this component — one bundle used for server-side rendering, and one for the client side. When the route handler invoked our custom Hapi rendering engine, the following happened under the hood:
Load the server-side bundle of the page component (e.g. dashboard/Index.svelte) and generate HTML and CSS with properties (i.e. the lists of recent visualizations) from the route handler:
const app = requireViewSSR(page); // e.g. page = 'dashboard/Index.svelte'
const { css, html, head } = app.render(props);
const template = await getTemplate('base.ejs');
return ejs.render(template, {
...
SSR_CSS: css,
SSR_HTML: html,
PAGE: page,
PAGE_PROPS: jsesc(JSON.stringify(props)),
...
});
// base.ejs
<html lang="en" class="<%- HTML_CLASS %>">
<head>
...
<style type="text/css">
<%- SSR_CSS %>
</style>
</head>
<body>
<%- SSR_HTML %>
<script async defer>
require(['/csr/<%= PAGE %>.js'], function(App) {
var props = JSON.parse(<%- PAGE_PROPS %>);
var app = new App({
target: document.body,
props: props,
hydrate: true
});
...
});
</script>
...
</body>
</html>
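One detail that deserves care in a template like this is how the props end up inside the inline script: if the serialized JSON contains a sequence such as `</script>`, the browser closes the script block early. The handler above passes the props through jsesc; the following is a dependency-free sketch of the same idea. The helper name `serializeProps` and the escape table are assumptions made for illustration, not code from the Datawrapper codebase.

```typescript
// Hypothetical helper: serialize props for embedding inside an inline <script>.
// Characters that could end the script block or trip up older JS parsers are
// replaced with \uXXXX escapes, which JSON.parse decodes back transparently.
const ESCAPES: Record<string, string> = {
  '<': '\\u003c',
  '>': '\\u003e',
  '&': '\\u0026',
  '\u2028': '\\u2028',
  '\u2029': '\\u2029'
};

function serializeProps(props: unknown): string {
  return JSON.stringify(props).replace(
    /[<>&\u2028\u2029]/g,
    (char) => ESCAPES[char] ?? char
  );
}

// A prop value that would otherwise terminate the surrounding <script> tag.
const payload = serializeProps({ title: '</script><script>alert(1)</script>' });
```

Because the escapes are valid JSON string escapes, the client can call `JSON.parse` on the payload and get the original props back unchanged.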
With this setup, we now had server-side rendered pages with client-side hydration, and our page components were written entirely in Svelte.
The Hapi migration also gave us an opportunity to improve the user experience on some parts of our app. In the old PHP setup, switching between steps in our chart editor required waiting for a full page reload in the browser. As we migrated the editor to our Hapi-based setup, we implemented a new client-side routing logic in the editor, making it possible to switch between steps without reloading. Other parts of the app, such as the team settings or account pages, each got their own client-side route logic — essentially turning Datawrapper into a mixture of multi-page application (MPA) and single-page application (SPA) experiences.
A multi-page application is a traditional web app model where navigating between pages causes a whole new HTML document to be loaded from the server. In contrast, a single-page application loads just one HTML document and then updates it dynamically on the client side as the user interacts with the app.
Step by step, we migrated all our old PHP routes to the new setup. And it worked great!
Your whole charting experience got a lot faster! We turned our app into a single-page application, so there’s no wait for the page to reload when moving between steps in the editor (e.g., from “Check & Describe” to “Visualize”) ⚡️ pic.twitter.com/sivqswJBr5
— Datawrapper (@Datawrapper) November 30, 2022
There was just one problem: As mentioned earlier, each Svelte page component had to be pre-compiled into two separate bundles, one server-side and one client-side. The more stuff we migrated to our new setup, the longer our compile times got.
As a result, the developer experience suffered. After any little change, we were stuck waiting 30–60 seconds for the compile step to finish before the change could be observed and tested in the browser. We were able to keep compile times manageable for a while by adding smaller optimizations like code-splitting. But by the time we had migrated our beloved chart editor and turned it into a single-page app, it had become painfully obvious that something needed to change.
Again.
Around the time that we shut down our PHP server for good, SvelteKit 1.0 had just been released. SvelteKit is a web application framework centered around Svelte, and it offered many of the features we’d already set up for ourselves, such as server-side rendering, client-side hydration, routing, and bundling. What’s more, SvelteKit leveraged Vite’s Hot Module Replacement (HMR) to deliver a lightning-fast developer experience — plus additional features like smart data loading, integrated support for layouts, and progressive enhancement.[2]
Hot Module Replacement (HMR) allows you to swap out modules in a running application without a full page reload and without losing application state.
Last but not least, SvelteKit was and still is a hot topic in the web development community. Soon-to-be hires like Jack and Toni were already big proponents and users of the framework. Having SvelteKit as part of our stack would make it easy for them to familiarize themselves with our codebase during onboarding. And we were hopeful that SvelteKit would give us an advantage when hiring more new developers in the future!
With these arguments in mind, we embarked on the next refactoring adventure and migrated from our custom Hapi setup to SvelteKit.
SvelteKit divides routes into page components, which are written in Svelte, and load functions, which are written in JavaScript/TypeScript.
/src
|-- /routes
| |-- /(dashboard)
| | |-- +page.svelte # contains the page component
| | |-- +page.server.ts # contains the load function
| |
| |-- +layout.svelte
|-- /...
This architecture mapped quite easily onto our existing architecture with Hapi route handlers and Svelte page components. Here’s what the dashboard route handler from above looks like in SvelteKit:
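// +page.server.ts
import { error } from '@sveltejs/kit';
import Joi from 'joi';
import type { PageServerLoad } from './$types';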
const searchParamsSchema = Joi.object().keys({
chart: Joi.string().optional(),
...
});
export const load = authUser(async ({ url, ... }) => {
const query = {
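// get() returns null for a missing parameter, but Joi's optional() only allows undefined
chart: url.searchParams.get('chart') ?? undefined,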
...
};
const validation = searchParamsSchema.validate(query);
if (validation.error) {
throw error(400, `Invalid query parameters: ${validation.error.message}`);
}
const { chart, ... } = validation.value;
...
const recentlyEdited = ...
const recentlyPublished = ...
return {
title: 'Dashboard',
recentlyEdited,
recentlyPublished,
...
};
}) satisfies PageServerLoad;
And here’s the corresponding page component:
// +page.svelte
<script lang="ts">
import RecentVisualizations from './RecentVisualizations.svelte';
export let data;
$: recentlyEdited = data.recentlyEdited;
$: recentlyPublished = data.recentlyPublished;
</script>
<RecentVisualizations {__} {recentlyEdited} {recentlyPublished} />
Despite some small differences, the setup looks very similar to what we had in Hapi!
authUser(...) makes sure that only logged-in users can access the page.
Query parameters are validated manually (inside the load function).
On the component side, the most notable differences are the missing <MainLayout> component (which has become a layout component that’s applied automatically by SvelteKit) and the fact that recentlyEdited and recentlyPublished are now reactive properties:
export let data;
$: recentlyEdited = data.recentlyEdited;
$: recentlyPublished = data.recentlyPublished;
Having these two properties declared reactively means we can invalidate and re-run the load function, causing SvelteKit to return the new lists of recently edited and recently published charts to the page. The reactivity then updates the page without requiring a full page reload.
Because this SvelteKit setup was so similar to our Hapi one, we were able to migrate the first couple of routes fairly quickly and could immediately feel the improvements that hot module replacement brings to the developer experience. We were also very happy to see there was no significant change in performance after migrating the first few routes. If anything, we achieved a slightly improved performance by correctly leveraging SvelteKit’s load functions.
SvelteKit’s built-in client-side navigation also meant that our entire app, not just parts of it, was slowly becoming one single-page application — for example, no more full page reload when opening the chart editor from the archive!
With a faster experience for both users and developers, we’re quite happy about the results of this migration. However, switching to SvelteKit has also come with a few hiccups, notably around authentication, validation, and testing.
One question we encountered during the migration was how to add authorization to routes. While it was relatively easy to parse session information with SvelteKit’s hook function, we haven’t found a 100% satisfying way to protect routes and sub-routes.
We tried out a few approaches and finally settled on authentication guards that wrap around load functions:
// +page.server.ts
export const load = authUser(async ({ url, ... }) => { ... });
The guard function itself looks like this:
// guards.ts
import { error } from '@sveltejs/kit';
export function authUser(loadFn) {
return async function load(page) {
const { user } = page.locals;
if (!user) {
throw error(401, 'Unauthorized');
}
return await loadFn(page);
};
}
We have similar guard functions for different levels of authorization like authPublic, authGuest, authUser, and so on.
The advantage of this approach is that authorization for each route lives at the route level, not all mixed together as a single array of route URLs and regexes in a place like hooks.server.ts. We find that managing authorization at the route level in this way is simpler and less error-prone as the number of routes in the project increases.
⚠️ If you try this at home!
One could easily fall into the trap of only protecting the layout load function in a +layout.server.ts file, assuming that this protects the full subtree of pages living under that layout. However, layout load functions aren’t guaranteed to run every time a user navigates, which could cause important authorization logic to be skipped. To be safe, protect each route in +page.server.ts independently!
The biggest downside of this approach is that nothing enforces the guard function — a developer could simply forget to include it, leaving the application open and vulnerable. To prevent this from happening, we implemented a custom ESLint rule that throws an error when it encounters a load function with no guard.
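The rule itself is not shown in this post. As a rough sketch of how such a check could be written, assuming that guards are plain function calls whose names start with `auth`: the `AstNode` and `RuleContext` types below are simplified stand-ins for ESLint's own types, and `requireLoadGuard` is a made-up rule name.

```typescript
// Simplified stand-ins for ESTree nodes and ESLint's rule context (assumptions).
type AstNode = { type: string; [key: string]: any };
type RuleContext = { report(descriptor: { node: AstNode; message: string }): void };

// Assumed convention: guard names start with "auth" (authUser, authGuest, ...).
const GUARD_NAME = /^auth[A-Z]/;
const MESSAGE = 'Wrap this load function in an auth guard.';

// The TypeScript parser wraps `expr satisfies T` and `expr as T` around the call.
function unwrap(node: AstNode | null): AstNode | null {
  let current = node;
  while (current && (current.type === 'TSSatisfiesExpression' || current.type === 'TSAsExpression')) {
    current = current.expression;
  }
  return current;
}

function isGuardCall(node: AstNode | null): boolean {
  return (
    !!node &&
    node.type === 'CallExpression' &&
    node.callee.type === 'Identifier' &&
    GUARD_NAME.test(node.callee.name)
  );
}

const requireLoadGuard = {
  meta: { type: 'problem', schema: [] },
  create(context: RuleContext) {
    return {
      ExportNamedDeclaration(node: AstNode) {
        const declaration: AstNode | null = node.declaration;
        if (!declaration) return;
        // `export function load() {}` is never wrapped in a guard call.
        if (declaration.type === 'FunctionDeclaration' && declaration.id.name === 'load') {
          context.report({ node: declaration, message: MESSAGE });
        }
        if (declaration.type !== 'VariableDeclaration') return;
        declaration.declarations.forEach((declarator: AstNode) => {
          const isLoad = declarator.id.type === 'Identifier' && declarator.id.name === 'load';
          if (isLoad && !isGuardCall(unwrap(declarator.init))) {
            context.report({ node: declarator, message: MESSAGE });
          }
        });
      }
    };
  }
};
```

This sketch only inspects `export const load = ...` and `export function load`; re-exports such as `export { load }` would need extra handling.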
Another problem with this approach is that SvelteKit has trouble generating correct type information for the load function’s input parameters. We have to resort to type assertions:
export const load = authUser(async ({ locals, ... }) => {
const user = locals.user!; // populated by `authUser` but SvelteKit misses this
});
We’re not quite satisfied with this solution, and there are still other options to explore (not to mention relevant roadmap items to look forward to from SvelteKit and libraries like Auth.js).
Ideally, we want a way to protect routes that’s enforced at the framework level — something that wouldn’t even let you build the app if any route had been left unprotected. We might do another blog post on route protection in SvelteKit once we’ve found a solution that we’re completely happy with.
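One possible direction for the typing problem mentioned above, sketched under assumptions: make the guard generic, so that the wrapped function receives an event whose `locals.user` is already non-nullable. `User`, `Locals`, `LoadEvent`, and `HttpError` are simplified stand-ins rather than SvelteKit's real types, and whether this composes cleanly with SvelteKit's generated `PageServerLoad` types is not verified here.

```typescript
// Simplified stand-ins for the app's and SvelteKit's types (assumptions).
type User = { id: number; name: string };
type Locals = { user?: User };
type LoadEvent = { locals: Locals };

// Same event shape, but with `locals.user` guaranteed to be present.
type AuthedEvent<E extends LoadEvent> = E & { locals: { user: User } };

// Stand-in for the error created by SvelteKit's `error(401, ...)`.
class HttpError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

function authUser<E extends LoadEvent, T>(loadFn: (authed: AuthedEvent<E>) => T) {
  return function load(loadEvent: E): T {
    if (!loadEvent.locals.user) {
      throw new HttpError(401, 'Unauthorized');
    }
    // The check above justifies this single, centralized type assertion.
    return loadFn(loadEvent as AuthedEvent<E>);
  };
}

// No `!` needed: `locals.user` is typed as `User` inside the callback.
const loadDashboard = authUser(({ locals }) => ({
  greeting: `Hello ${locals.user.name}`
}));
```

The type assertion does not disappear, but it moves into the guard, where the runtime check right above it makes it safe. `T` can just as well be a `Promise`, so async load functions fit the same shape.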
Another missing piece in the framework is the validation of query parameters (the part of a URL that follows the ?, as in ?search=Hello+SvelteKit). While SvelteKit offers a way to validate route parameters using matchers, query parameter validation has to be manually implemented in the load function.
// +page.server.ts
const query = {
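// get() returns null for a missing parameter, but Joi's optional() only allows undefined
chart: url.searchParams.get('chart') ?? undefined,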
};
const validation = Joi.object().keys({
chart: Joi.string().optional(),
...
}).validate(query);
if (validation.error) {
throw error(400, `Invalid query parameters: ${validation.error.message}`);
}
const { chart, ... } = validation.value;
Since query parameter validation is not enforced at the framework level, we get the same problem we had with route protection: It’s easy to overlook.
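Analogous to the auth guards, the validation boilerplate could be pulled into a wrapper so that each load function declares its schema up front. This is only a sketch of that idea: `withQuery` is a hypothetical name, `SchemaLike` captures just the `validate()` result shape (`{ error, value }`) that Joi schemas expose, and the hand-written `chartQuery` stands in for a real Joi schema to keep the example free of dependencies.

```typescript
// The subset of a Joi schema this sketch relies on: validate() returns { error?, value }.
type SchemaLike<Q> = {
  validate(input: Record<string, string>): { error?: { message: string }; value: Q };
};

type QueryEvent = { url: URL };

// Hypothetical wrapper: validates the query string, then hands the result to loadFn.
function withQuery<Q, T>(schema: SchemaLike<Q>, loadFn: (query: Q, queryEvent: QueryEvent) => T) {
  return function load(queryEvent: QueryEvent): T {
    // Only parameters that are actually present end up in `raw`.
    const raw: Record<string, string> = {};
    queryEvent.url.searchParams.forEach((value, key) => {
      raw[key] = value;
    });
    const { error, value } = schema.validate(raw);
    if (error) {
      // In a SvelteKit route this would be `throw error(400, ...)` from @sveltejs/kit.
      throw new Error(`Invalid query parameters: ${error.message}`);
    }
    return loadFn(value, queryEvent);
  };
}

// Hand-written stand-in for Joi.object().keys({ chart: Joi.string().optional() }).
const chartQuery: SchemaLike<{ chart?: string }> = {
  validate(input) {
    const unknownKey = Object.keys(input).filter((key) => key !== 'chart')[0];
    if (unknownKey !== undefined) {
      return { error: { message: `"${unknownKey}" is not allowed` }, value: {} };
    }
    return { value: { chart: input.chart } };
  }
};

const loadChart = withQuery(chartQuery, (query) => ({
  selectedChart: query.chart ?? null
}));
```

Like the auth guards, a wrapper of this kind is still opt-in, so it makes validation harder to get wrong but not impossible to forget.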
With each migrated route, Datawrapper became more and more of a single-page application. Client-side navigation across the whole app meant no more full reloads when switching between pages — which made for a smoother experience, but also meant the browser could no longer provide immediate feedback that a new page was loading. To make up for this, we added a new loading indicator at the top of the app:
Moving more of our app to the client side raised several similar UX issues — some small and some large — which we generally didn’t have in mind beforehand.
SvelteKit layouts and load functions can be great for optimizing the data loading in your app: SvelteKit tries to track the dependencies of your load function, and will avoid re-running it during navigation whenever possible. But the complexity of an application may still cause load functions to run more times than necessary. As such, we keep a close eye on the network tab of the developer tools for any unusual load function behavior.
One thing we appreciated about Hapi was how easy it was to add tests for routes and pages using the convenient server.inject method. This method allowed us to inject requests into a server instance without actually going over HTTP. SvelteKit doesn’t offer a straightforward way to set up a test server, inject requests, and then make assertions on the response. We resorted to testing load functions instead.
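Testing a load function directly amounts to calling it with a hand-built event object and asserting on the returned data. A minimal sketch, assuming a simplified event shape: `ServerEvent`, `fakeEvent`, and the inline `load` are hypothetical, and a real test would import the load function from +page.server.ts and use a test runner's assertions.

```typescript
// Simplified stand-ins for the SvelteKit event and app types (assumptions).
type TestUser = { id: number };
type ServerEvent = { url: URL; locals: { user?: TestUser } };

// Hypothetical load function; a real test would import it from +page.server.ts.
function load({ url, locals }: ServerEvent) {
  return {
    title: 'Dashboard',
    userId: locals.user ? locals.user.id : null,
    chart: url.searchParams.get('chart')
  };
}

// Builds a fake event so that no HTTP server has to be started.
function fakeEvent(path: string, user?: TestUser): ServerEvent {
  return { url: new URL(path, 'http://localhost'), locals: { user } };
}

// Call the load function the way a test would and inspect the returned data.
const dashboardData = load(fakeEvent('/?chart=abc12', { id: 7 }));
```

Calling the function directly skips everything that normally runs before it, such as the hooks that parse the session, so a test like this covers the data-loading logic but not the full request path that server.inject exercised.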
OAuth providers like Google, GitHub, and X (formerly Twitter) allow users to sign into an application without creating a new username and password. In our Hapi-based server we handled this through the @hapi/bell plugin which required minimal effort to set up and worked pretty much out of the box. But we haven’t found an easy-to-use OAuth implementation for SvelteKit. Using a library such as Auth.js for SvelteKit or Passport.js would have meant revamping our entire authentication/authorization logic, which we wanted to avoid if possible! Luckily our API server still runs on Hapi, so we simply moved the OAuth logic there.
The first migrated SvelteKit routes, including the chart editor, have been running smoothly in production for over three months now. The improved developer experience means we can now focus all our efforts on implementing exciting new features for our users!
However, as every developer knows all too well, the refactoring process is never complete. It's an ongoing journey. As you can see in the timeline above, we’re still transitioning some parts of our app from Svelte 2 to Svelte 4, and we’re also moving more and more code to TypeScript. All the while, Svelte 5 is already peeking around the corner.
As we say here in Germany, "Nach dem Refactoring ist vor dem Refactoring" — the end of one refactoring is just the beginning of another.
This years-long journey has taught us valuable lessons, helped us build a stronger product, and (hopefully) set Datawrapper up for a successful future. If you’re currently adopting SvelteKit at your workplace, we’d love to hear how you’ve approached some of these challenges! Write us at hello@datawrapper.de or hit us up on X (formerly Twitter).