Cat On A Spreadsheet
If you’ve ever needed to extract patterns like email addresses, dates, product codes, or anything in between from messy data, you’ve probably hit a wall with Excel’s built-in formulas.
Enter Regular Expressions (RegEx) — a powerful pattern-matching language that makes complex text operations simple, elegant, and lightning-fast.
In this post, you’ll learn:
What RegEx is and why it's useful in Excel
How to use the VBScript.RegExp object in VBA
How to build reusable RegEx helper functions
Practical examples for cleaning and extracting data
Regular Expressions are a way to define search patterns for text. Think of them as formulas for finding and manipulating text that fits a rule. Instead of looking for exact words, RegEx looks for rules—like:
“any 5-digit number”
“a word that starts with a capital letter”
“an email address”
“a price that starts with a $ sign”
Examples:
\d+ → one or more digits
[A-Z]{2,4} → 2–4 uppercase letters
\b\w+@\w+\.\w+\b → a basic email format
In RegEx, these kinds of rules are written using special symbols.
Let’s look at some of the most common symbols you’ll use when building expressions:
Symbol | Meaning | Example | Matches |
---|---|---|---|
. | Any one character | a.c | abc, a7c, a_c |
\d | Any digit (0–9) | \d\d\d | 123, 456 |
\w | Any letter, number, or underscore | \w+ | hello, word123 |
\s | Any whitespace (space, tab, line break) | \s+ | ' ', \t, \n |
+ | One or more of the previous character | \d+ | 1, 99, 2024 |
* | Zero or more of the previous character | a* | "", a, aaaa |
? | Zero or one of the previous character | colou?r | color, colour |
[...] | Any one character from inside brackets | [aeiou] | a, e, i, o, u |
[^...] | Any character except inside brackets | [^0-9] | Any non-digit |
{n} | Exactly n times | \d{4} | 2023, 1234 |
` | ` | OR | `cat |
() | Grouping | `(Mr | Mrs). \w+` |
\b | Word boundary | \bword\b | Matches word not password |
VBA doesn’t have built-in RegEx functions — but it does support the Microsoft VBScript Regular Expressions 5.5 library.
In the VBA editor (Alt + F11):
Go to Tools > References
Check Microsoft VBScript Regular Expressions 5.5
Now you can use the RegExp object.
Usage:
=RegexIsMatch(A2, "\d{4}-\d{2}-\d{2}")→ Returns TRUE if the cell contains a date like 2023-05-17.
🔁 Extracting All Matches
🔄 Replacing Text with RegEx
Task | Pattern | Description |
---|---|---|
Extract 5-digit codes | \b\d{5}\b | Zip codes |
Find emails | \b\w+@\w+\.\w+\b | Basic email format |
Remove punctuation | [^\w\s] | Strip symbols |
Normalize spacing | \s+ | Replace multiple spaces with 1 |
Detect product codes | [A-Z]{2}\d{4} | Format like AB1234 |
If you're learning, test your expressions online at tools like:
These tools show matches in real time and help you learn syntax as you go.
Use re.Global = True to match all occurrences
Use \b for word boundaries to avoid partial matches
Combine RegEx with UDFs for dynamic extraction
Return CVErr(xlErrNA) if you want to signal no match in Excel cells
Cat On A Spreadsheet