Speaking, singing, and clicking in whale

Hi, this is Marten, software developer at Datawrapper. In this Weekly Chart I’ll be showcasing a little feature that we launched last week. So listen closely!

Last week I stumbled upon a fascinating piece in the New Yorker about a group of computer scientists and marine biologists attempting to decipher “Whale.” They believe that by training a Large Language Model on enough recorded whale sounds (sperm whale sounds, to be exact), they’ll be able to create a model that understands and produces the whales’ language — similar to how ChatGPT learned to “speak” proper English without anyone ever directly teaching it English grammar.


To attempt this goal, the scientists will need data. A lot of data. The team estimates that a successful model would need to be trained on some four billion sperm whale sounds or clicks — forty thousand times more than the current largest collection of about one hundred thousand clicks collected off the coast of Dominica!

Unfortunately, whales aren’t posting comments and memes all over the internet just yet, so the scientists have to resort to more old-school means of collecting the necessary data: installing a network of underwater microphones and planting recording devices on the whales themselves.

Reading about this made me wonder if I could find a set of whale sound recordings already online. And it just so happens that I could! An extensive collection of whale sounds can be found in the Watkins Marine Mammal Sound Database maintained by the New Bedford Whaling Museum. The database contains about 2000 unique recordings of more than 60 species of marine mammals, recorded all over the world in a timeframe spanning from the 1940s to the 2000s.

Use the +/- buttons on the top right-hand side of the map to zoom in on individual markers. Clicking on a marker opens the tooltip, which allows you to play the sound files.

What I find fascinating about these sound recordings is their variety: they differ considerably from species to species. Interestingly, the sound that I most associated with whales (the very melodic “singing” of the humpback) isn’t all that common among marine mammals.

I heard a lot more clicks like these, from sperm whales:

And then other mammals, like this bearded seal, sound like they’re straight out of a science fiction movie:

And here’s another one, from the mythical-looking narwhal:

Now let’s take a look at how I created the map.

How I created this map

This data captured my imagination — so much so that it inspired me to add a brand-new feature for playing sound files in tooltips. That’s the nice thing about working as a developer at Datawrapper: if a feature is easy enough to implement and makes sense for the tool, you can just do it yourself. And I did!

The rest of the process went like this:

  1. After obtaining permission from the New Bedford Whaling Museum to use their database for this blog post, I collected the links to the sound files and data like location and date of recording.
  2. Each row in the resulting CSV represents a single date and location. As such, one single row may contain multiple sound recordings, represented as links to multiple sound files in columns named sound_0, sound_1, sound_2, etc.
  3. I uploaded the CSV file to Datawrapper as a symbol map. Some recordings, like the ones taken at the New York Aquarium, did not contain coordinates. For these recordings I used the in-app geocoding feature to determine a latitude and longitude for their named location.
  4. The final step was to customize the tooltips! Putting sound files in tooltips is now as simple as putting each file link into an HTML audio tag. This is what an individual sound embed looks like:
{{ sound_0 ? CONCAT('<audio src="', sound_0, '" controls style="width: 180px"></audio><br>') : ""}}

Different sites have different numbers of available recordings, so the double curly brackets contain a bit of if-else logic that only outputs the <audio> tag if the respective sound column has a value for that row. The tag itself is built with CONCAT, passing the link as the src attribute. I add the controls attribute to the tag so that the browser’s audio controls show up, add a bit of styling, and voilà, I can play sounds in tooltips!
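Because the same expression is repeated once per sound column, you could also generate it instead of typing it by hand. Here’s a minimal Python sketch of that idea. It is not part of Datawrapper or of how I built the map: the helper names `audio_expression` and `tooltip_audio_block` are made up for illustration, and the only thing taken from this post is the expression text itself.

```python
def audio_expression(column, width_px=180):
    """Build the tooltip expression for one sound column.

    The output mirrors the expression shown in this post: the <audio> tag
    is only emitted when the column has a value for that row.
    (Hypothetical helper, not a Datawrapper API.)
    """
    opening = "'<audio src=\"'"
    closing = f"'\" controls style=\"width: {width_px}px\"></audio><br>'"
    return "{{ " + f'{column} ? CONCAT({opening}, {column}, {closing}) : ""' + "}}"


def tooltip_audio_block(n_sounds):
    """Concatenate the expressions for sound_0 .. sound_{n_sounds - 1}."""
    return "".join(audio_expression(f"sound_{i}") for i in range(n_sounds))
```

Calling `tooltip_audio_block(3)` gives one long string covering sound_0, sound_1, and sound_2 that you can paste into the tooltip editor.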

Here’s how the whole tooltip looks:

<img src="{{ imgurl }}" style="width: 200px; margin-bottom: 12px"/>
<br>
{{ sound_0 ? CONCAT('<audio src="', sound_0, '" controls style="width: 180px"></audio><br>') : ""}}{{ sound_1 ? CONCAT('<audio src="', sound_1, '" controls style="width: 180px"></audio><br>') : ""}}{{ sound_2 ? CONCAT('<audio src="', sound_2, '" controls style="width: 180px"></audio><br>') : ""}}
<br> <i>Recorded on {{ FORMAT(date, "MMMM DD, YYYY") }} 
{{ location ? CONCAT('at ', location) : '' }}</i>

Some data points contain as many as 57 audio samples, but in the tooltips I only included a maximum of three. Check out the Watkins Marine Mammal Sound Database if you’re still hungry for whale sounds afterwards!
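If you want to prepare a similarly shaped CSV yourself, the reshaping from step 2 (one row per date and location, with the sound links spread over sound_0, sound_1, sound_2) could look roughly like the Python sketch below. It is an illustration under assumptions, not the script I used: the input field names `date`, `location`, and `url` and both function names are hypothetical, and the cap of three matches the tooltip above.

```python
import csv
import io
from collections import defaultdict


def to_wide_rows(recordings, max_sounds=3):
    """Group recordings by (date, location) and keep up to max_sounds links.

    `recordings` is a list of dicts with the assumed keys
    "date", "location", and "url" (hypothetical field names).
    """
    grouped = defaultdict(list)
    for rec in recordings:
        grouped[(rec["date"], rec["location"])].append(rec["url"])

    rows = []
    for (date, location), urls in grouped.items():
        row = {"date": date, "location": location}
        for i in range(max_sounds):
            # Leave the cell empty when a site has fewer recordings,
            # so the tooltip expression skips that audio player.
            row[f"sound_{i}"] = urls[i] if i < len(urls) else ""
        rows.append(row)
    return rows


def to_csv(rows, max_sounds=3):
    """Serialize the wide rows to CSV text with a fixed column order."""
    fieldnames = ["date", "location"] + [f"sound_{i}" for i in range(max_sounds)]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Any recordings beyond the third for a given date and location are simply dropped here, which is the same trade-off I made in the tooltips.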


And that’s it from me! I guess we won’t be speaking whale anytime soon. In the meantime, the thing we can look forward to is next week’s Weekly Chart, brought to you by Rose. Thanks for listening!
