How Datawrapper developers perceive the cost of function names

Does “get” sound faster than “lookup”?

Welcome to the first Weekly Chart of 2022. I’m Jakub, a backend developer at Datawrapper, and because I’m a developer, today’s topic will be software. So easy.

Function… what?

Quick programming 101: In virtually all programming languages, code is divided into functions. (Some people call them subroutines. This is how you know those people have computer science degrees. We’ll stick to functions.) A function is several lines of code that you give a name to. The lines can be:

  • math operations,
  • code that downloads a document from the web,
  • or anything else.

For example, this function tells the program to print a two-line message, “Hello world!” followed by “Hallo Welt!”:

def say_greetings():
    print('Hello world!')
    print('Hallo Welt!')

I named this function say_greetings. Now whenever I “call” say_greetings(), the program will print that message. Functions allow us to reuse pieces of code and refer to them by name.

But can you already spot the problem with function names?

The two hard things in programming

There are only two hard things in computer science: cache invalidation and naming things.

— Phil Karlton

This was a developer joke. A true classic. Developers are cracking up now. They can relate. Naming things is indeed hard. (Same with cache invalidation, but that part is not the funny part. That’s just a hard fact.)

See, the problem with function names is that it is completely up to the author to choose the name. The sky is the limit. The name can be as short as a single letter like f or it can be a whole sentence like say_greetings_in_two_different_languages.

Function names don’t matter from the perspective of the computer, but they do matter to humans. Once several people start working on a project, it greatly simplifies the work if one developer can make a correct assumption about what a function that another developer wrote does, even without reading every single line of code inside the function.
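To make that concrete, here is a minimal sketch with the same code under two names. The names f and say_greetings_in_two_different_languages are the ones mentioned above. Returning the lines instead of printing them is my own choice, made only so the two results can be compared.

```python
def f():
    # Terse name: tells a human reader nothing about what happens inside.
    return ['Hello world!', 'Hallo Welt!']


def say_greetings_in_two_different_languages():
    # Descriptive name: identical body, identical behavior.
    return ['Hello world!', 'Hallo Welt!']


# The interpreter runs both the same way; only human readers notice a difference.
same_result = f() == say_greetings_in_two_different_languages()
print(same_result)
```

The program behaves identically either way. The difference only shows up when a colleague has to guess what f() does.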

That’s why developers think a lot about naming things. They try to find the ideal function name that describes:

  • what the function does (duh)
  • what data it receives
  • what data it produces
  • what transformation of the data it does
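As an illustration of the checklist above, here is a sketch of a name that tries to cover all four points. The function average_rating_per_verb and its sample data are hypothetical, invented for this example and not taken from our codebase.

```python
from statistics import mean
from typing import Dict, List


def average_rating_per_verb(ratings_by_verb: Dict[str, List[int]]) -> Dict[str, float]:
    """Receive a mapping of verb -> ratings and produce a mapping of verb -> mean rating."""
    # The name hints at the input (ratings grouped per verb),
    # the output (one number per verb) and the transformation (averaging).
    return {verb: mean(ratings) for verb, ratings in ratings_by_verb.items()}


# Made-up ratings, only to show the call.
averages = average_rating_per_verb({'get': [8, 10], 'lookup': [3, 5]})
print(averages)
```

Compare that with calling the same function process(data): the code would run just as well, but the reader would have to open it to learn anything.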

Which brings us to the question of the day:

Can a function name describe the speed of its code?

I should have said up front that I copied the idea for my Weekly Chart from this tweet:

To paraphrase: “It seems that using different verbs in function names implies a different speed of the function.” I think this is a brilliant idea. It suggests that we could tell whether a function is slow (because it does many computations on the CPU or because it stores a lot of data in memory) just by looking at its name. Maybe programmers perceive “get” as fast or “lookup” as slow? And if so, how much agreement is there among them? Can I rely on this perception when choosing a function name? I shall investigate!
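For the curious, here is a sketch of what such a difference could look like in code. Everything in it is hypothetical: get_user, lookup_user and the USERS data are invented for illustration, and nothing forces a function called “get” to be the fast one. The standard-library timeit module measures how long each call really takes.

```python
import timeit

# Made-up data: 1,000 users in a list, plus the same users indexed by id.
USERS = [{'id': i, 'name': f'user{i}'} for i in range(1_000)]
USERS_BY_ID = {user['id']: user for user in USERS}


def get_user(user_id):
    # Hypothetical "fast-sounding" function: a single dictionary access.
    return USERS_BY_ID[user_id]


def lookup_user(user_id):
    # Hypothetical "slow-sounding" function: scans the list until it finds a match.
    for user in USERS:
        if user['id'] == user_id:
            return user
    raise KeyError(user_id)


# Time 200 calls of each function for the last user in the list.
get_seconds = timeit.timeit(lambda: get_user(999), number=200)
lookup_seconds = timeit.timeit(lambda: lookup_user(999), number=200)
print(f'get_user: {get_seconds:.6f} s, lookup_user: {lookup_seconds:.6f} s')
```

Both functions return the same user. Only the measurement tells you which one is slower, and the name is at best a hint.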

I set out to run a survey among the developers at Datawrapper. I came up with twelve function names that all describe a similar task: retrieving a piece of information (“get,” “find,” “select,” “read,” etc.). Then I asked my colleagues to rate each function on a scale from 1 to 10, where 1 is the slowest and 10 the fastest. Once I got the answers, I tried to visualize them like the “Perceptions of probability” chart in the tweet above, because this chart nicely shows not only the overall results, but also how much the respondents agree with each other.

So I opened our app (and then I realized I’m a backend developer and I don’t really know how to make charts, but I persevered), and here’s the result:

Not very conclusive, sure, but also not completely random. Notice, for example, that there is decent agreement on “check” being fast, whereas “select” was rated differently by each respondent. It is also interesting that “check” and “get” were rated as very slow by some people but as very fast by others. That makes me think: Did I not ask the question clearly? Maybe some developers assigned the numbers the other way around, with 1 as fast and 10 as slow?
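If a respondent really did read the scale backwards, mirroring their answers would be simple arithmetic. The helper below is a hypothetical sketch. I did not apply it to the survey data, because I have no way of knowing who, if anyone, flipped the scale.

```python
def flip_rating(rating: int) -> int:
    """Mirror a rating on the 1-10 scale: 1 becomes 10, 2 becomes 9, and so on."""
    if not 1 <= rating <= 10:
        raise ValueError(f'rating must be between 1 and 10, got {rating}')
    return 11 - rating


# Made-up answers from one imaginary respondent.
answers = [1, 2, 10]
flipped = [flip_rating(answer) for answer in answers]
print(flipped)
```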

Anyway... What is the answer to the question of whether function names imply the speed of code? Probably maybe sometimes, according to nine Datawrapper developers.

What kind of chart is that?

  • It’s a Datawrapper table with sparkline tiny-charts.
  • The tiny-charts have the “fill area under line” setting turned on.
  • Moreover, the “max” value of the sparkline range was set, so that all the tiny-charts have the same scale.

The obligatory Python code

In our code repository, you can find annotated code that transforms a Google Forms CSV file into a CSV that can be used in a Datawrapper table.
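If you just want the gist, here is a rough sketch of that kind of transformation. It is not the code from the repository. It assumes the form export has a Timestamp column followed by one column per function name, and that the table wants one row per function name with one count column per rating. The sample input is made up.

```python
import csv
import io
from collections import Counter

SCALE = range(1, 11)  # ratings go from 1 (slowest) to 10 (fastest)


def responses_to_distribution_rows(forms_csv_text):
    """Turn one-row-per-respondent CSV text into one-row-per-verb rating counts."""
    reader = csv.DictReader(io.StringIO(forms_csv_text))
    # Assumption: every column except 'Timestamp' is one function name.
    verbs = [name for name in reader.fieldnames if name != 'Timestamp']
    counts = {verb: Counter() for verb in verbs}
    for response in reader:
        for verb in verbs:
            counts[verb][int(response[verb])] += 1
    header = ['function'] + [str(rating) for rating in SCALE]
    rows = [[verb] + [counts[verb][rating] for rating in SCALE] for verb in verbs]
    return [header] + rows


def rows_to_csv_text(rows):
    """Serialize the rows as CSV text, ready to paste or upload."""
    output = io.StringIO()
    csv.writer(output, lineterminator='\n').writerows(rows)
    return output.getvalue()


# Made-up sample input, not the real survey answers.
SAMPLE = """Timestamp,get,lookup
2022/01/03 10:00:00,9,3
2022/01/03 10:05:00,10,3
2022/01/03 10:09:00,2,6
"""

table = responses_to_distribution_rows(SAMPLE)
print(rows_to_csv_text(table))
```

Each output row holds how many respondents picked each rating for one function name, which is the series a sparkline can draw.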

That's it for today. I hope you found this software-themed chart exciting. Happy coding!