How to split and extract text from data columns in Excel & Google Sheets

feature image

Sometimes, your data columns don’t just have one variable but multiple variables. Like a column with US states in the format US-TX instead of two columns – one with the country (US), the other one with the state ID (TX). Or a column with companies and the product they sell: Datawrapper (Software).

At first, this might not be a problem. But when you need the state IDs to create a choropleth map or want to analyze & visualize the data based on company products, you start to care. Good thing there are easy ways to separate the country from its states and the company from its product into two columns.

I’ll show two ways to make two (or more) columns out of one column. To do so, we’ll use Google Sheets – but everything I’ll explain in the following article should work with LibreOffice Calc, Excel or any other spreadsheet software.

The first method is the formula =SPLIT(B1,"-"):

Split columns with SPLIT()

  1. Create at least two columns next to the column with the content you want to split. You can do so, click on the header indices (A, B, C, etc.). Then click the little triangle and select “Insert 1 right”. Repeat to create a second free column. (If you see the error #REF! later on in your cell, you’ll need to create more columns.)
  2. In the first free column, write =SPLIT(B1,"-"), with B1 being the cell you want to split and - the character you want the cell to split on.
  3. To apply the changes to the cells below, drag down the blue little square in the bottom right of the selected cell(s). Double-click on the blue little square to fill all cells below.

Extract content from columns with LEFT()

Sometimes you don’t have clear separator characters, but just want to extract the first or last characters of a cell. Here, the formulas =LEFT(B1,2), =RIGHT(B1,8) and =MID(B1,2,4) become useful:

  1. Insert a new column again. (Or two. Or three! As many as you need.)
  2. In the new column(s), write
    =LEFT(B1,2) to extract the first 2 characters of the cell B1.
    =RIGHT(B1,8) to extract the last 8 characters of the cell B1.
    =MID(B1,4,2) to extract the 2 characters following the 4th character in B1.
  3. To apply the changes to the cells below, drag down the blue little square again.

Pro tips for the use of LEFT()

Pro tip 1: You can combine formulas to extract characters at all sorts of crazy positions. For example, the formula =LEN() gives back the number of characters in a cell. So =LEFT(A1,LEN(A1)-2) extracts the entire text in a cell but its last two characters.

Pro tip 2: Now that you learned to separate text, you can also bring it together again. To combine the column “US” from your cell A1 and “TX” from B1 with a hyphen, you can use ampersands and write =A1&"-"&B1.

Pro tip 3: You can extract content with LEFT(), RIGHT() and MID() not just from text cells, but also from number and date cells. If you want to apply formulas like LEFT() to your dates, it helps to transform them to text first. To do so, use the formula =TEXT(A1, "MM/DD/YYYY"). Instead of MM/DD/YYYY, you can use any combination of these date codes and /, -, a space, etc. For example, =TEXT(A1, "dd-mmm-yy") will transform the 1st of November 2019 to a text cell with the content 01-Nov-2019.

I hope this was helpful! If you need more help in cleaning your data to prepare it for a charting tool like Datawrapper, visit our article “How to prepare your data for analysis and charting in Excel & Google Sheets”. And if you have any questions, please leave a comment or write to me at [email protected].

We have updated our Privacy Policy to reflect the new EU regulations. Please give it a read (it is written with the goal of clarity) and click here to accept it.