Explaining how to use the new data engine introduced with Datawrapper 1.5.
First of all: Why the change? In previous versions of Datawrapper there was no way to change anything in the dataset once you had uploaded it from a spreadsheet program. Instead, if a label, a number or any other data needed to be changed, you had to go back to the spreadsheet, change it there, upload again, and so forth.
No more. Since version 1.5 (released late August 2013) you can use a versatile data editor in Datawrapper. This has two main benefits: First, small corrections and changes can be made very quickly. Second, the tool can interpret the imported data much better, which is the foundation of better and more functional charts.
With great power comes great responsibility. So, in order to make the most of this feature, some knowledge about how to pre-format data and labels is key. This tutorial covers that; it will take five minutes to read through and apply to your next chart.
Checking the column types
Datawrapper now knows three different column types: text, number and date. The types are auto-detected by analyzing the content of each column. It works roughly like this: if most of the values in a column look like valid dates (we’ll get into this in a moment), we assume the type is date. Otherwise we check whether most of the values look like valid numbers and, if so, assume the type is number. If neither the date nor the number type has been detected, we assume that the column contains plain text.
How do I know what type is detected?
To make the detected types easily recognizable we gave each type a unique styling. Text columns are shown in the default text format: black and left-aligned. Number columns are shown in blue and right-aligned (as you would expect from spreadsheet applications). For date columns we chose green and center alignment. The colors show at a glance which type Datawrapper has detected.
What exactly are ‘most of the values’?
We decided to make the column type detection tolerant of a few mis-parsed values. Instead of requiring all values to be of the right type, we accept an error rate of 10%. That means that if at least 90% of the values look like valid dates, the column is detected as a date column.
Why do you prefer the date type over the number type?
That’s because one of the simplest date formats is the full year (e.g. 2013). Without the preference for the date type, columns of years would always be detected as number columns.
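The detection order and the 10% tolerance described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Datawrapper's actual code: the function names, the regular expressions and the small set of recognized date shapes are all hypothetical simplifications.

```python
import re

# Assumption: a deliberately small subset of date shapes (year, month, day).
DATE_PATTERNS = [r"\d{4}", r"\d{4}-\d{2}", r"\d{4}-\d{2}-\d{2}"]
# Assumption: plain integers and decimals with a dot as decimal separator.
NUMBER_PATTERN = r"-?\d+(\.\d+)?"

MIN_VALID_PERCENT = 90  # from the text: an error rate of up to 10% is accepted


def looks_like_date(value):
    return any(re.fullmatch(p, value) for p in DATE_PATTERNS)


def looks_like_number(value):
    return re.fullmatch(NUMBER_PATTERN, value) is not None


def has_enough_matches(values, predicate):
    # Integer arithmetic avoids floating-point surprises at exactly 90%.
    matches = sum(1 for v in values if predicate(v.strip()))
    return matches * 100 >= MIN_VALID_PERCENT * len(values)


def detect_column_type(values):
    if not values:
        return "text"
    # Dates are checked first, so a column of full years is not
    # swallowed by the number check.
    if has_enough_matches(values, looks_like_date):
        return "date"
    if has_enough_matches(values, looks_like_number):
        return "number"
    return "text"


years = ["2010", "2011", "2012", "2013", "2014",
         "2015", "2016", "2017", "2018", "n/a"]
print(detect_column_type(years))                  # date (9 of 10 values match)
print(detect_column_type(["12", "15.5", "-3"]))   # number
print(detect_column_type(["1.5", "2", "3", "x"])) # text (only 75% are numbers)
```

Swapping the two checks in detect_column_type would turn the years column into a number column, which is exactly the case the date preference is meant to avoid.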
Datawrapper detected the wrong type. How can I fix that?
We included the auto-detection of the column type to simplify working with Datawrapper, but of course we know that it cannot be perfect for all datasets. For instance, if a column happens to contain a lot of values between 1800 and 2100, the auto-detection will interpret them as years.
Changing column types is simple: to set the column type to number, select the column (by clicking on the column index letter above it) and pick the right type in the type select box that appears in the right sidebar. By selecting multiple columns you can also set several column types at once.
What date formats are detected?
With 1.5 we enhanced the date handling. For quick results, try to stick to one of the formats below. We now detect: full years (2013), half years (2013 H1), quarters (2013 Q3), months (2013-08), dates (2013-08-27) and even time values (2013-08-27 09:47). The date detection not only accepts the recommended ISO format but also tries to identify a few local variants (e.g. 8/27/2013).
You can find the full list of date formats in our GitHub wiki.
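To check quickly whether your own labels follow one of the formats named above, a rough Python sketch like the following may help. The patterns are written from the examples in this tutorial only; the format labels and the function name are made up for illustration, and the real detection supports more variants than shown here.

```python
import re

# Assumption: one pattern per example given in the text. These check the
# shape only, so an impossible date such as 2013-13-45 would still match.
DATE_FORMATS = [
    ("year", r"\d{4}"),                                  # 2013
    ("half-year", r"\d{4} H[12]"),                       # 2013 H1
    ("quarter", r"\d{4} Q[1-4]"),                        # 2013 Q3
    ("month", r"\d{4}-\d{2}"),                           # 2013-08
    ("date", r"\d{4}-\d{2}-\d{2}"),                      # 2013-08-27
    ("date and time", r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}"), # 2013-08-27 09:47
    ("local variant", r"\d{1,2}/\d{1,2}/\d{4}"),         # 8/27/2013
]


def identify_date_format(value):
    """Return the label of the first matching format, or None."""
    for label, pattern in DATE_FORMATS:
        if re.fullmatch(pattern, value.strip()):
            return label
    return None


for sample in ["2013", "2013 H1", "2013 Q3", "2013-08",
               "2013-08-27", "2013-08-27 09:47", "8/27/2013", "August"]:
    print(sample, "->", identify_date_format(sample))
```

A value that returns None here (such as "August") is a hint that the column may end up as text rather than date, so it is worth reformatting it in the data editor.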