This article is brought to you by Datawrapper, a data visualization tool for creating charts, maps, and tables. Learn more.

All Blog Topics

User stories

January 15th, 2020

How to use the new Python library “datawrapper” to create 34 charts in a few seconds

Sergio Sánchez

Today we’re happy to bring you a guest post by Sergio Sánchez, who’ll tell us how to automate creating charts with a Python library he built.

¡Hola! This is Sergio Sánchez, a research associate at the Public Policy Institute of California. I research higher education and migration policy in California.

I happen to love data visualization because of its capacity to convey complex information quickly and easily which is how I ended up using (and even writing a little about) Datawrapper. A few weeks ago Benedict Witzenberger had a guest blog post here announcing DatawRappr, the R package. That inspired me to write the Python equivalent. So here we are: A Python package for the new Datawrapper API called “datawrapper”.

How to use datawrapper (the Python library)

You need to know a little bit of Python to use this library (and I prefer to work in JupyterLab, so knowing Jupyter is another semi-requirement if you want to follow along my code).

First, we need to install datawrapper. To do so, we run pip install datawrapper or choose another way here.

Now we pass our Datawrapper API key to the Datawrapper instance:

from datawrapper import Datawrapper
dw = Datawrapper(access_token = "1234567890")

(If you have your API key exported as an environmental variable, you can skip this step.)

After that, we can check our account info to make sure it’s working with dw.account_info().

Now we can start creating charts. dw.create_chart() will return a Python dictionary containing the information of our newly created chart. This includes its ID, which we’ll need if we want to edit it later. For this reason, I’d suggest saving it to a variable.
The function create_chart() can take four arguments: title, chart_type, data, and folder_id. The most important one is chart_type. We can choose one of many ways to display our data (e.g. chart_type = 'tables'). All of them can be found in the Datawrapper API docs.

As an example, we can recreate Datawrapper’s API tutorial on how to create a chart but with a twist: We’ll use publicly available data found at IPUMS.org from the US American Community Survey to create a stacked bar chart:

This is the type of data I use day to day in my job as a public policy researcher. My research focuses around two topics, California’s colleges and universities, and immigration. For both topics, I use a lot of census data to look at demographic characteristics of certain geographies or people, i.e. to ask questions like “What is the educational background of those arriving in California in the past 5 years?”, like we are asking in this exercise.

So let’s see how we can create the chart above with the Python library datawrapper. Later, we will build even more charts: Using Python and Pandas, we’ll create one chart for California as a whole and one chart for each of the 34 counties for which the data is available. First, California:

california_chart = dw.create_chart(title = "California's Recently Arrived Immigrants", 
chart_type = 'd3-bars-stacked', data = dw_data)

You’ll notice we are passing something called dw_data to data. create_chart() can take a Pandas DataFrame as its data argument and upload it to Datawrapper at the moment of creating a chart. Let’s take a quick detour to learn more about how this data looks:

Preparing the data

After getting the data from IPUMS.org, dropping some columns and aggregating the education variable into only three categories, I get a DataFrame with five columns that looks like this:

year	perwt	countyfip	edu-three-cat	county
2007	92	06037	Advanced degree	Los Angeles
2007	182	06077	High school or less	San Joaquin
2007	217	06065	Bachelor’s degree	Riverside

This is weighted data (each row represents more than one person), and it’s not ready to be used in Datawrapper yet. It takes two simple steps with Pandas to change that:

dw_data = data.pivot_table
(columns = 'edu-three-cat', index = 'year', 
values = 'perwt', aggfunc='sum')
dw_data = dw_data.apply
(lambda x: x*100 / x.sum(), axis = 1).reset_index()

After this, our dw_data DataFrame looks like this:

year	Advanced degree	Bachelor’s degree	High school or less
2007	13.490163	24.928126	61.581711
2008	14.776653	25.3320222	59.903125
2009	15.518236	26.25654	58.231110

All we did was group our data by year and sum up how many people had an Advanced degree, a Bachelor’s degree, or a “High school or less” education. The second step was to calculate each column’s share of the total for that year. You can see all the code for this in this jupyter notebook on GitHub. Ok, back to the chart!

Editing & publishing the chart

To add a source for the data we can use dw.update_description:

dw.update_description(
 california_chart['id'],
 source_name = 'IPUMS',
 source_url = 'https://ipums.org',
 byline = 'Sergio Sánchez',
)

Because we’ve saved our new charts info to the variable california_chart, we can pass its ID to dw.update_description() like we would access any other key-value pair in Python, california_chart['id'].

In the tutorial, we also update the look of our chart. We’d like the bars to be thick and we want to use specific colors:

properties = {
 'visualize' : {
     'thick': True,
     'custom-colors': {
         'Advanced degree': '#15607a',
         "Bachelor's degree": '#1d81a2',
         'High school or less': '#dadada'
     },
    }
}
dw.update_metadata(california_chart['id'], properties)

After all this, we’ll want to publish our chart:

dw.publish_chart(california_chart['id'])

Et voilá! We have published and customized our first datawrapper chart using Python!

Creating 34 Datawrapper charts automatically

This is a basic workflow, but because it’s automated, we can scale things up. We can wrap this all up in a for-loop and create a chart for every county we have data for in just a couple lines of code! All with a consistent look and feel.

properties = {
    'visualize' : {
        'thick': True,
        'custom-colors': {
            'Advanced degree': '#15607a',
            "Bachelor's degree": '#1d81a2',
            'High school or less': '#dadada'
        },
    }
}
# For each county
for county in data['county'].unique():
    if pd.notna(county):
    # prep data
        mask_county = data['county'] == county
        dw_data = data[mask_county].pivot_table(columns = 'edu-three-cat',
index = 'year', values = 'perwt', aggfunc='sum')
        dw_data = dw_data.apply(lambda x: (x / x.sum()) * 100, axis = 1)
        dw_data.reset_index(inplace = True)
        # publish chart
        county_chart = dw.create_chart(title = f"{county} recently arrived immigrants",
chart_type = "d3-bars-stacked", data = dw_data, folder_id=13942)
        dw.update_description(
            county_chart['id'],
            source_name = 'IPUMS',
            source_url = 'https://ipums.org/',
            byline = 'Sergio Sanchez',
        )
        dw.update_metadata(county_chart['id'], properties)
        dw.publish_chart(county_chart['id'], display = False)
    else:
        print("Skipping! Null value.")

And that’s what we’ll see in our Datawrapper dashboard after running this code:

many many similar looking charts exported with datawrapper code

We just generated 34 Datawrapper charts in a few seconds.

But wait, there’s more

There is more we can do with Datawrapper’s API (like exporting charts!). Or we can get a chart’s iframe code to embed it anywhere we like:

dw.get_iframe_code(california_chart['id'])

This means we can integrate awesome Datawrapper charts in our Python apps, Django websites, or anywhere we’re using Python.

You can find the documentation of datawrapper (the Python library) at datawrapper.readthedocs.io. Its code is available on GitHub. There’s even been one pull request already! (I’m looking forward to more.)

I hope you liked this tutorial/explanation of datawrapper, the Python package for Datawrapper. If you want to learn more about my work as a policy researcher you can find it here. Other projects are on my personal website soyserg.io or tacosdedatos.com, where I and other data-enthusiasts from Latin America and Spain write about data analysis and visualization in Spanish. Follow me on twitter at @tacosdedatos too!

Sergio is always happy about pull requests, so if you’re using Python (and Datawrapper!), consider helping him to build the best Datawrapper Python library out there (also the only one so far. But also the best!). If you’d like to write a guestpost yourself, get in touch with me (Lisa) at lisa@datawrapper.de. As always, I’m looking forward to hearing from you.

Sergio Sánchez

(@tacosdedatos) is a computational social scientist and public policy researcher in California. He is very passionate about the democratization of cutting-edge knowledge of all things data and data viz. You can learn more about him on his website tacosdedatos.com.

Liked this article? Maybe your friends will too:

Twitter Facebook

Product

Charts

Maps

Tables

Feature highlights

Solutions

Media

Finance

Goverment

Case Studies

Resources

Blog

Academy

FAQ

River

Webinars

Careers

Changelog

API Documentation

Get support

Latest Improvements

Terms of Service

Privacy

Support

Imprint

About Us

Contact Us

How to use the new Python library “datawrapper” to create 34 charts in a few seconds

How to use datawrapper (the Python library)

Preparing the data

Editing & publishing the chart

Creating 34 Datawrapper charts automatically

But wait, there’s more

Comments

All Blog Topics

All Blog Topics

Product

Charts

Maps

Tables

Feature highlights

Solutions

Media

Finance

Goverment

Case Studies

Resources

Blog

Academy

FAQ

River

Webinars

Careers

Changelog

API Documentation

Get support

Latest Improvements

Terms of Service

Privacy

Support

Imprint

About Us

Contact Us

How to use the new Python library “datawrapper” to create 34 charts in a few seconds

How to use datawrapper (the Python library)

Preparing the data

Editing & publishing the chart

Creating 34 Datawrapper charts automatically

But wait, there’s more

Comments

NEWSLETTER