This article is brought to you by Datawrapper, a data visualization tool for creating charts, maps, and tables.

August 4th, 2022

Is it a bug or a feature?

Jakub Valenta

Hello, I’m Jakub, a software engineer at Datawrapper.

Recently, I read a few books about software development that I found in our library here in the Berlin office. One statement in Facts and Fallacies of Software Engineering [1] caught my attention:

Errors tend to cluster.

This fact is useful: when you find a problem in one module, you should pay close attention to that module, because it probably contains more problems.

By problems, or bugs, I mean improvements that need to be made to existing code to make it match the current requirements. Maybe the app didn’t support a specific web browser version, or it was slower than desirable, or it relied on third-party code that has become outdated. Many bugs aren’t visible to users and don’t affect them. What they have in common is that fixing them means changing existing code without changing the product requirements. This distinguishes them from features, which extend the requirements.

So to see if errors indeed cluster, and to find modules that are likely to contain more errors, I decided to analyze the code of Datawrapper by looking at the history of changes to our project and counting which modules received the most improvements. A “module” in a software project is usually a single source code file, but to get a higher-level view, I will be looking at whole folders of files in this article.

There were, however, a few caveats:

Large modules (defined by the number of code lines) generally need more improvements, solely because they’re large.
Modules in which we add a lot of new features would probably also see a larger number of fixes.

Therefore, a meaningful visualization of bug fixes in a software project should take into account the size of a module and the number of features added to it. Then the focus should be on the outliers — modules that we improved more often than their size and number of features would suggest. Bear in mind that a larger number of fixes doesn’t necessarily mean a bad piece of code: it can also mean that the module just received fewer new features in comparison, because it’s considered complete.
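One way to make "more fixes than size and features would suggest" concrete is sketched below. This is only an illustration, not the method behind the chart: the function name `fix_outliers`, the `factor` threshold, and the idea of comparing each module's fix rate with the project-wide rate are all assumptions made for this example.

```python
def fix_outliers(modules, factor=2.0):
    """Flag modules whose fix rate is well above the project-wide rate.

    `modules` maps a module name to a dict with three line counts:
    "size" (current lines), "fix" (lines changed in fixes) and
    "feat" (lines changed in features). All names are hypothetical.
    """
    total_fix = sum(m["fix"] for m in modules.values())
    total_base = sum(m["size"] + m["feat"] for m in modules.values())
    if total_base == 0:
        return []
    overall_rate = total_fix / total_base

    flagged = []
    for name, m in modules.items():
        base = m["size"] + m["feat"]
        if base == 0:
            continue
        rate = m["fix"] / base
        # Assumed heuristic: a module is an outlier when its fix rate
        # exceeds `factor` times the project-wide fix rate.
        if rate > factor * overall_rate:
            flagged.append((name, rate / overall_rate))
    # Strongest outliers first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

A module that is flagged here is only a candidate for a closer look. As noted above, a high share of fixes can also mean the module is considered complete and simply receives few new features.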

Now, without further ado, here is the result of my analysis of the Datawrapper code:

What are we looking at?

The circles represent different modules of Datawrapper.
The size of each circle represents the size of each module, measured by the current number of lines in all the files that the module contains.
The horizontal axis shows the number of lines of code that we changed in a given module as part of improvements since September 2021.
The vertical axis shows the number of lines of code that we changed in a given module as part of new features since September 2021.
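The circle size described above could be measured roughly as follows. This is a sketch under assumptions: it treats every top-level folder of a checkout as one module, skips hidden folders such as `.git`, and ignores files that cannot be read as UTF-8 text. The function name `count_lines_per_module` is made up for this example.

```python
import os


def count_lines_per_module(root):
    """Return {top-level folder name: total number of lines in its files}."""
    sizes = {}
    for entry in sorted(os.listdir(root)):
        module_path = os.path.join(root, entry)
        # Only folders count as modules here; hidden folders are skipped.
        if not os.path.isdir(module_path) or entry.startswith("."):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(module_path):
            for filename in filenames:
                file_path = os.path.join(dirpath, filename)
                try:
                    with open(file_path, encoding="utf-8") as source:
                        total += sum(1 for _ in source)
                except (UnicodeDecodeError, OSError):
                    # Binary or unreadable files contribute no lines.
                    continue
        sizes[entry] = total
    return sizes
```

Calling `count_lines_per_module(".")` from the root of a repository would give one number per folder, which could then be mapped to the circle area.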

From my perspective as a Datawrapper developer, the chart matches what I remember we’ve been working on. Since my goal was to find code that deserves extra attention and may hide problems we haven’t noticed yet, I might need to continue the research and zoom in on particular modules or even individual lines of code. Finally, looking at the improvements we made says a lot about which parts of the project we prioritized in the past few years.

The data were collected by analyzing the log of changes to our project (commits in our version control system) since September 2021. Each entry in the log is a set of changes to one or more files. It also comes with a message that the developer added to describe what the changes are about. Since we follow a fairly precise format for these messages, I could tell if the changes created a new feature or improved an existing one.
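The counting step described above can be sketched in a few lines. The article does not name the message format, so this example assumes a Conventional Commits style prefix (`fix:` or `feat:`, optionally with a scope). The `@@@` marker, the function names, and the choice of the first folder as the module are likewise assumptions made for illustration. The input is the text printed by `git log --since="2021-09-01" --numstat --format="@@@%s"`.

```python
import re
from collections import defaultdict

# Hypothetical marker put in front of each commit subject via --format="@@@%s".
COMMIT_MARKER = "@@@"

# Assumed message convention: "fix: ...", "feat(scope): ...", "feat!: ...".
TYPE_PATTERN = re.compile(r"^(fix|feat)(\([^)]*\))?!?:")


def classify(subject):
    """Return "fix", "feat", or None for a commit subject line."""
    match = TYPE_PATTERN.match(subject)
    return match.group(1) if match else None


def module_of(path, depth=1):
    """Use the first `depth` folders of a file path as the module name."""
    parts = path.split("/")
    if len(parts) <= depth:
        return "(root)"
    return "/".join(parts[:depth])


def count_changed_lines(log_text, depth=1):
    """Sum added and deleted lines per module, split into fixes and features."""
    totals = defaultdict(lambda: {"fix": 0, "feat": 0})
    kind = None
    for line in log_text.splitlines():
        if line.startswith(COMMIT_MARKER):
            kind = classify(line[len(COMMIT_MARKER):].strip())
            continue
        fields = line.split("\t")
        # --numstat lines look like "added<TAB>deleted<TAB>path".
        if kind is None or len(fields) != 3:
            continue
        added, deleted, path = fields
        if added == "-" or deleted == "-":
            continue  # binary files have no line counts
        totals[module_of(path, depth)][kind] += int(added) + int(deleted)
    return {module: dict(counts) for module, counts in totals.items()}
```

Commits whose message matches neither prefix are ignored in this sketch, and renamed files (which git prints with `=>` in the path) are not handled specially.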

That's it from me for today. I hope you enjoyed this software-themed chart.

[1] Robert L. Glass, 2007

Jakub Valenta (he/him) is Datawrapper’s head of platform development. He makes sure our servers run smoothly and the print export produces high-quality results. He also makes conceptual art and never stops being excited that the world exists. Jakub lives in Berlin.
