Excel – Extracting Letters from a String
07/10/2023
In the previous post I covered how to extract numbers, including decimal numbers, from a string. The flip side of that is to extract the letters from the same string. The formula to do this are very similar to the extracting numbers formula but with the reverse sense. We don’t want numbers but non-numbers and we don’t want periods or decimals. So here is the formula to do this below and after it I will describe how it works.
=TRIM(TEXTJOIN("",TRUE,FILTER(MID(T4,SEQUENCE(1,LEN(T4)),1),NOT(ISNUMBER(VALUE(MID(T4,SEQUENCE(1,LEN(T4)),1))))*(MID(T4,SEQUENCE(1,LEN(T4)),1)<>"."),"")))
Formula Explanation
1. First we use the Mid function with an array generated by the sequence function in the start field and length of 1 extracts each character into an array
| Data | Formula Component | ||||||
| 2.5 TB | MID(A2,SEQUENCE(1,LEN(A2)),1) MID(A2,{1,2,3,4,5,6},1) | 2 | . | 5 | T | B |
2. The formula then uses the isnumber function against the array to return binary for each element of the array
| Data | Formula Components | 2 | . | 5 | T | B | |
| 2.5 TB | NOT(ISNUMBER( VALUE(MID(A2,SEQUENCE(1,LEN(A2)),1)))) | false | true | false | true | true | true |
3. Then the not equal to comparison operator (<>) is used to compare each members of the array returned in step 1 to check that the array elements are not equal to a period which will return false if a period is found. (Note: if the string has periods at the end of sentences these will be excluded)
| Data | Formula Components | 2 | . | 5 | T | B | |
| 2.5 TB | (MID(A2,SEQUENCE(1,LEN(A2)),1)<>”.”) | true | false | true | true | true | true |
4. The results of these two comparisons need to be combined in a bitwise and fashion which is accomplished by multiplying them.
| Data | Formula Components | 2 | . | 5 | T | B | |
| 2.5 TB | NOT(ISNUMBER( VALUE(MID(A2,SEQUENCE(1,LEN(A2)),1)))) *(MID(A2,SEQUENCE(1,LEN(A2)),1)<>”.”) | 0 | 0 | 0 | 1 | 1 | 1 |
5. Almost there, now the values returned in step 1 need to be filtered based upon the combined comparisons from step 4. The filter function takes in the values from the array and then in the next filed the array of comparison results is used to filter out any 0 results.
| Data | Formula Components | 2 | . | 5 | T | B | |
| 2.5 TB | FILTER(MID(A2,SEQUENCE(1,LEN(A2)),1), NOT(ISNUMBER( VALUE(MID(A2,SEQUENCE(1,LEN(A2),1)))) *(MID(A2,SEQUENCE(1,LEN(A2)),1)<>”.”),””) | T | B |
6. Finally use the textjoin function to combine the elements of the array into a single result and use trim function clean up any leading or trailing spaces.
| Data | Formula Components | 2 | . | 5 | T | B | |
| 2.5 TB | TRIM(TEXTJOIN(“”,TRUE, FILTER(MID(A2,SEQUENCE(1,LEN(A2)),1), NOT(ISNUMBER(VALUE (MID(A2,SEQUENCE(1,LEN(A2)),1))) *(MID(D5,SEQUENCE(1,LEN(A2)),1)<>”.”),””))) | TB |
Make it a Function
Rather than enter this in and copy it down to every cell you need it adjusting the reference you can actually make it into a function using the Name Manager under formulas with the Lambda function. You may be familiar with creating named ranges. Under the Formula menu item in Excel you can select a range of cells and choose define name so that you can refer to it else where. But what you can also do is create a name that behaves like a function. Open Name Manager and add a name GetLetters and paste into the reference field the formula below. Then you can reuse this formula against any reference just by using =GetLetters(Ref).
=LAMBDA(vX,TRIM(TEXTJOIN("",TRUE,FILTER(MID(vX,SEQUENCE(1,LEN(vX)),1),NOT(ISNUMBER(VALUE(MID(vX,SEQUENCE(1,LEN(vX)),1))))*(MID(vX,SEQUENCE(1,LEN(vX)),1)<>"."),""))))
# Then in any cell you can type the examples below and the formula will be applied to the reference supplied returning the extracted decimal number
=GetLetters({Reference})
=GetLetters(A2)