The gender income gap, squared
November 14th, 2024
3 min
This article is brought to you by Datawrapper, a data visualization tool for creating charts, maps, and tables. Learn more.
Hi, this is Rose. I write for Datawrapper’s blog — and this week, I made it my business to play some games as well.
If you’ve been on social media in the past month or so, you’ve probably seen people posting their Wordle scores. They look like this:
Wordle 194 6/6
— Josh Wardle (@powerlanguish) December 30, 2021
⬛🟨⬛🟨⬛
⬛🟩🟩⬛⬛
⬛⬛⬛🟨🟨
⬛🟩🟩🟨⬛
⬛🟨🟩⬛⬛
🟩🟩🟩🟩🟩
Almost hoisted by my own petard
For those of you who haven’t been on social media this year — first of all, congratulations — Wordle is an online game in which you have six tries to guess a secret five-letter word. After each try, your guess is colored in to show how close it was: green for letters in the right place, yellow for letters in the wrong place, and gray for letters that aren’t in the target word at all. It’s a fun way to pass five minutes and you definitely don’t need to overthink it.
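For readers who want to see the coloring rule as code, here is a minimal Python sketch. The function name `score` and the `G`/`Y`/`-` output format are made up for illustration, and the handling of repeated letters (a letter only turns yellow as often as it still appears unmatched in the target) is an assumption about how the game behaves, not something spelled out above.

```python
from collections import Counter

def score(guess: str, target: str) -> str:
    """Return Wordle-style feedback: G = green, Y = yellow, - = gray."""
    feedback = ["-"] * len(guess)
    # Target letters that were not matched in place can still earn a yellow.
    unmatched = Counter(t for g, t in zip(guess, target) if g != t)
    # First pass: letters in the right place are green.
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            feedback[i] = "G"
    # Second pass: letters in the wrong place are yellow,
    # as long as an unmatched copy is left in the target.
    for i, g in enumerate(guess):
        if feedback[i] == "-" and unmatched[g] > 0:
            feedback[i] = "Y"
            unmatched[g] -= 1
    return "".join(feedback)

# Example with a made-up guess and target:
print(score("CRANE", "REACT"))  # YYG-Y
```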
But can you overthink it if you want to? Of course, and many people do. In the few weeks since the game took off, there have already been dozens of blog posts and videos that crunch the numbers to work out how to maximize your chance of winning, what the most efficient guesses are, and which word makes the best first move.
It’s cool to know that optimal strategies are out there, but personally I enjoy some guessing in my guessing games. So instead of trying to look forward and solve the game in advance, I decided to look backwards at a real round I’d already played. With the benefit of hindsight — and some computing power, and a public list of all five-letter words the game can choose from — I can see how well-chosen my guesses were and what exactly I learned from each one.
Wordle seems to start as a blank slate, but we know a lot about the target word before even making our first guess. We know that it’s a real, five-letter English word — so it definitely doesn’t start with JF or end with KM. We know that it’s more likely to end with a Y than to start with one, and that it’s more likely to have an A than an X in any position.
One way to quantify this kind of knowledge is to think about how much uncertainty we have about each letter. I'm already not as uncertain as someone who knows nothing about English; with every guess, I'll learn something new and decrease my uncertainty even more.[1] That uncertainty is measured in units called bits.
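To make "bits" a little more concrete, here is a small sketch of the standard Shannon entropy formula in Python. The function name `entropy_bits` and the example probabilities are made up for illustration; they are not the numbers behind the charts in this post.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits of a list of probabilities that sum to 1."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Someone who knows nothing about English treats all 26 letters as equally likely.
print(round(entropy_bits([1 / 26] * 26), 2))  # 4.7

# A fair coin flip is exactly one bit of uncertainty.
print(entropy_bits([0.5, 0.5]))  # 1.0

# If one outcome is much more likely than the other, there is less to learn.
print(round(entropy_bits([0.9, 0.1]), 2))  # 0.47
```

The more lopsided the probabilities, the fewer bits are left, which is why knowing that A is more common than X already counts as knowledge.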
On the first round I guessed STARE. All of those letters are pretty common, so there's a good chance of getting one right:
I got lucky: Of these five letters, only R is in the target word. How is it lucky to get so many wrong? Because these letters are so common, it's somewhat unusual to find an English word without S, T, A, or E. With one guess, the number of possible answers goes from 2315 words to 100.
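As a rough check on what that first guess was worth, the drop from 2315 to 100 words can be turned into bits. This sketch assumes every remaining word is equally likely; the helper name `bits` is made up for illustration.

```python
import math

def bits(n_candidates: int) -> float:
    """Uncertainty in bits when n_candidates words are all equally likely."""
    return math.log2(n_candidates)

before = bits(2315)  # every word on the solution list
after = bits(100)    # words still possible after the first guess

print(f"before: {before:.2f} bits")          # before: 11.18 bits
print(f"after:  {after:.2f} bits")           # after:  6.64 bits
print(f"gained: {before - after:.2f} bits")  # gained: 4.53 bits
```

Each bit gained cuts the list of candidates in half, so about four and a half bits corresponds to a list roughly 23 times shorter.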
There has to be at least one vowel in this word though, so for the next round I guessed PROUD:
Now I have three letters down, though I'm still not sure where they belong.
You might have noticed that these bits of uncertainty aren't evenly distributed. The first slot always has the highest number, and the last three slots are down to zero bits — even though we don't yet know what they are! The reason is that letters aren't independent of each other. There are words that start with J (like JOLLY) and words with second letter F (like OFTEN), but there aren't any words that start with JF. Calculating every slot's uncertainty independently would be way too pessimistic: we know more about the word than that, because each letter gives us clues to the other ones. Instead, I read the word from left to right and calculated the slots conditionally — that is, subtracting out the uncertainty that's already been accounted for.
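Here is one way the left-to-right calculation could be sketched in Python. It is not the code behind the charts in this post: it assumes every candidate word is equally likely, the function names are made up, and the four-word list is a toy example (JOLLY and OFTEN come from the paragraph above, the other two are added for illustration).

```python
import math
from collections import Counter

def prefix_entropy(words, k):
    """Entropy in bits of the first k letters, with all words equally likely."""
    counts = Counter(word[:k] for word in words)
    n = len(words)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def slot_uncertainty(words):
    """Uncertainty of each slot, given the letters to its left.

    The uncertainty of slot k is what the first k letters add on top of
    the first k - 1 letters, so the slots sum to the total uncertainty.
    """
    length = len(words[0])
    totals = [prefix_entropy(words, k) for k in range(length + 1)]
    return [totals[k + 1] - totals[k] for k in range(length)]

words = ["JOLLY", "JOINT", "OFTEN", "OTHER"]  # toy candidate list
print(slot_uncertainty(words))  # [1.0, 0.5, 0.5, 0.0, 0.0]
```

In this toy list, the first three letters are enough to pin down the word, so the last two slots carry zero bits even though they differ from word to word. The slot values add up to 2 bits, which is the uncertainty of picking one word out of four.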
At this point I got excited and just tried to guess the final word with RUMOR. It wasn't a super-logical choice: words from the solution list are chosen at random, so it doesn't matter that RUMOR is more common than CURIO or FUROR.
It wasn't correct. But it left me with zero uncertainty — there's only one word (on the Wordle list) that fits the evidence we've gathered so far:
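Narrowing a word list down to the words that fit the evidence can be sketched in a few lines of Python. The words, guesses, and feedback strings below are a made-up example, not the round described in this post, and the function names and the `G`/`Y`/`-` notation (green, yellow, gray) are assumptions for illustration.

```python
from collections import Counter

def score(guess, target):
    """Wordle-style feedback: G = green, Y = yellow, - = gray."""
    feedback = ["-"] * len(guess)
    unmatched = Counter(t for g, t in zip(guess, target) if g != t)
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            feedback[i] = "G"
    for i, g in enumerate(guess):
        if feedback[i] == "-" and unmatched[g] > 0:
            feedback[i] = "Y"
            unmatched[g] -= 1
    return "".join(feedback)

def filter_candidates(words, evidence):
    """Keep the words that would have produced every observed feedback."""
    return [
        word for word in words
        if all(score(guess, word) == seen for guess, seen in evidence)
    ]

words = ["REACT", "TRACE", "CRATE", "CATER", "CARET", "RECAP"]  # toy list

one_guess = [("CHALK", "Y-G--")]
two_guesses = one_guess + [("STARE", "-YGYY")]

print(filter_candidates(words, one_guess))    # ['REACT', 'TRACE']
print(filter_candidates(words, two_guesses))  # ['REACT']
```

A word survives only if it is consistent with every guess so far, which is why a wrong guess can still leave zero uncertainty.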
The most surprising thing about this game to me is how winnable it is. It may not feel like these colors give much to go on, but just three guesses took us from 2315 possibilities to one certain answer. Of course, no answer is so certain that it's beyond argument — so apologies to the Brits, Canadians, Indians, Australians, and others who have to humour the quirks of this globetrotting word game.
That's it from me this week! Next week, we'll have a historical Weekly Chart from Datawrapper chairman Mirko Lorenz.