Excel Extracting Numbers from a String

I was looking for a way to extract numbers from a string in and excel cell. Years ago, I created a VBA module with regular expression functions which could do this kind of work. However, using VBA tends to decrease the portability of an excel spreadsheets depending upon the policies that might be in place. So I have been trying to use standard functions and formulas to improve my spreadsheet portability. I was searching the internet for a solution to this problem and indeed I found several examples of formulas which extracted numbers but what I eventually found is they left the decimals behind.

After more searching, I concluded that I was going to have to figure it out and write my own formula. I thought I should be able to use array-based formulas to solve the problem and a little while later I came up with this.

=VALUE(TEXTJOIN("",TRUE,FILTER(MID(A2,SEQUENCE(1,LEN(A2)),1),ISNUMBER(VALUE(MID(A2,SEQUENCE(1,LEN(A2)),1)))+(MID(A2,SEQUENCE(1,LEN(A2)),1)="."),"")))

Formula Explanation

To explain this formula it will be easiest if we break it down into its parts and describe each part and then put it all together.

1. First we use the Mid function with an array generated by the sequence function in the start field and length of 1 extracts each character into an array

DataFormula Component
2.5 TBMID(A2,SEQUENCE(1,LEN(A2)),1)
MID(A2,{1,2,3,4,5,6},1)
2.5 TB

2. The formula then uses the isnumber function against the array to return binary for each element of the array

DataFormula Components2.5TB
2.5 TBISNUMBER(
VALUE(MID(A2,SEQUENCE(1,LEN(A2)),1)))
truefalsetruefalsefalsefalse

3. Then a comparison operator is used to compare each members of the array returned in step 1 to look for the decimal or period

DataFormula Components2.5TB
2.5 TB(MID(A2,SEQUENCE(1,LEN(A2)),1)=”.”) falsetruefalsefalsefalsefalse

4. The results of these two comparisons need to be combined in a bitwise or fashion which is accomplished by adding them together

DataFormula Components 2 . 5 T B
2.5 TBISNUMBER(
VALUE(MID(A2,SEQUENCE(1,LEN(A2)),1)))
+(MID(A2,SEQUENCE(1,LEN(A2)),1)=”.”)
111000

5. Almost there, now the values returned in step 1 need to be filtered based upon the combined comparisons from step 4. The filter function takes in the values from the array and then in the next filed the array of comparison results is used to filter out any 0 results.

DataFormula Components 2 . 5 T B
2.5 TBFILTER(MID(A2,SEQUENCE(1,LEN(A2)),1),
ISNUMBER(
VALUE(MID(A2,SEQUENCE(1,LEN(A2),1)))
+(MID(A2,SEQUENCE(1,LEN(A2)),1)=”.”),””)
2.5

6. Finally use the textjoin function to combine the elements of the array into a single result and use value to ensure they data is converted to a number type.

DataFormula Components 2 . 5 T B
2.5 TBVALUE(TEXTJOIN(“”,TRUE,
FILTER(MID(A2,SEQUENCE(1,LEN(A2)),1),
ISNUMBER(VALUE(MID(A2,SEQUENCE(1,LEN(A2)),1)))
+(MID(D5,SEQUENCE(1,LEN(A2)),1)=”.”),””)))
2.5

Make It A Function

Rather than enter this in and copy it down to every cell you need it adjusting the reference you can actually make it into a function using the Name Manager under formulas with the Lambda function. You may be familiar with creating named ranges. Under the Formula menu item in Excel you can select a range of cells and choose define name so that you can refer to it else where. But what you can also do is create a name that behaves like a function. Open Name Manager and add a name GetNum and paste into the reference field the formula below. Then you can reuse this formula against any reference just by using =GetNum(Ref).

=LAMBDA(vX,VALUE(TEXTJOIN("",TRUE,FILTER(MID(vX,SEQUENCE(1,LEN(vX)),1),ISNUMBER(VALUE(MID(vX,SEQUENCE(1,LEN(vX)),1)))+(MID(vX,SEQUENCE(1,LEN(vX)),1)="."),""))))

# Then in any cell you can type the examples below and the formula will be applied to the reference supplied returning the extracted decimal number
=GetNum({Reference})
=GetNum(A2)

Comments are closed.

Design a site like this with WordPress.com
Get started