Hey, it’s Ivan – developer here at Datawrapper. This week we’ll look at why the Cebuano and Swedish editions of Wikipedia have the 2nd and 3rd highest number of articles out of all the languages.
Wikipedia is divided by language into separate editions – there are currently 317 in total. The three editions with the most articles at the time of writing are:
English holds the top spot – this is presumably because it’s the largest language by number of speakers, with approximately 1.268 billion speakers. But how can Cebuano, with an estimated 20 million speakers, and Swedish, with 10 million speakers, have the 2nd and 3rd highest number of Wikipedia articles?
It turns out that there is indeed a catch: most articles in the Cebuano and Swedish Wikipedias were not written by humans.
Up until 2012, the number of articles in most Wikipedias was growing "organically" – that is, human authors were writing them. Then, in 2012, a bot called Lsjbot was launched. It started writing articles for the Swedish and Cebuano Wikipedias at lightning speed. Within a few years, it took these two editions into 2nd and 3rd place.
You might ask: but why specifically Swedish and Cebuano? It turns out that Sverker Johansson, the programmer who created Lsjbot, is Swedish. And Cebuano is the native language of his wife.
Each Wikipedia edition has rules on which articles are accepted and whether bots can write them. The Swedish and Cebuano Wikipedia communities evidently decided to allow submissions from bots, but at some point around 2017–2018, this decision was reversed. As can be seen in the chart, the rate of new articles has flattened out considerably since then.
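If you want to check a flattening like this in numbers rather than by eye, one option is to turn the cumulative article totals into the number of articles added per month. The snippet below is only a minimal sketch: the function name `monthly_new_articles` is made up for illustration, and the totals are placeholder values, not real Wikipedia figures.

```python
def monthly_new_articles(cumulative_counts):
    """Turn cumulative article totals into the number of articles added each month."""
    return [
        later - earlier
        for earlier, later in zip(cumulative_counts, cumulative_counts[1:])
    ]


# Placeholder totals for five consecutive months (NOT real Wikipedia data).
totals = [1000, 1010, 1500, 2000, 2005]
added = monthly_new_articles(totals)
print(added)
```

A steep line in the chart shows up here as large monthly values, and a flat one as values close to zero.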
The articles written by Lsjbot are typically very short and factual and have been criticized for lacking meaningful content. So we're still some way away from quality Wikipedia content being automatically generated by bots.
There are other bots that write Wikipedia articles, but none of them have been as prolific as Lsjbot.
Getting the data was easy thanks to the Wikipedia Statistics portal, which I can recommend if you want to explore data on Wikipedia. Here is an example URL which shows the monthly count of articles on the English Wikipedia.
To visualize the data, I used a line chart, which is perfect for illustrating trend changes over time. To focus on the story, I de-emphasized the lines of Wikipedias other than English, Cebuano, and Swedish by using a gray color. Finally, I used a highlight range and a text annotation to communicate the Lsjbot launch date.
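If you build a similar chart in code rather than in Datawrapper, the de-emphasizing step boils down to a color lookup with a gray fallback. Here is a rough sketch of that idea: the three edition names come from this article, while the hex colors, the extra edition names, and the helper `line_colors` are assumptions chosen purely for illustration.

```python
# Hypothetical palette: only the edition names come from the article.
HIGHLIGHT_COLORS = {
    "English": "#1d81a2",
    "Cebuano": "#c71e1d",
    "Swedish": "#fa8c00",
}
MUTED_GRAY = "#cccccc"


def line_colors(editions, highlights=HIGHLIGHT_COLORS, muted=MUTED_GRAY):
    """Map each edition to its highlight color, or to gray if it is not highlighted."""
    return {name: highlights.get(name, muted) for name in editions}


# "German" and "French" stand in for any other editions on the chart.
colors = line_colors(["English", "Cebuano", "Swedish", "German", "French"])
print(colors)
```

The resulting dictionary can be passed to whatever plotting library you use, so the three highlighted lines stand out and all others fade into the background.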
Maybe next time you're on Wikipedia, you'll be reading an article written by a bot! Let me know on Twitter or at ivan@datawrapper.de if you have any questions about the chart or the article. We'll see you next week!