19. Regular Expressions for Pattern Matching
Level: Advanced
Duration: 38m
Introduction to Regular Expressions
Regular expressions (regex) are powerful tools for pattern matching and text processing. They allow you to search for specific patterns, validate inputs, and extract data efficiently.
The `re` Module
Python provides the `re` module to work with regular expressions. Key functions include `match()`, `search()`, `findall()`, `sub()`, and `split()`.
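As a rough illustration of how these functions differ, the sketch below runs three of them against the same string. The sample text and variable names are made up for this example.

```python
import re

text = 'Order 66 shipped'

# match() only succeeds if the pattern matches at the very start of the string.
at_start = re.match(r'\d+', text)
print(at_start)            # None, because the text starts with 'Order'

# search() scans the whole string and returns the first match anywhere.
anywhere = re.search(r'\d+', text)
print(anywhere.group())    # 66

# findall() returns all non-overlapping matches as a list of strings.
words = re.findall(r'\w+', text)
print(words)               # ['Order', '66', 'shipped']
```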
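```python
import re

text = 'My phone number is 123-456-7890.'
pattern = r'\d{3}-\d{3}-\d{4}'  # 3 digits, 3 digits, 4 digits, separated by hyphens

# search() returns a match object for the first occurrence, or None if there is none.
match = re.search(pattern, text)
if match:
    print('Found:', match.group())  # Found: 123-456-7890
```
Basic Pattern Syntax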
- `\d` – digit, `\D` – non-digit
- `\w` – word character, `\W` – non-word character
- `\s` – whitespace, `\S` – non-whitespace
- `.` – any character except newline (unless `re.DOTALL` is set)
- `*` – 0 or more, `+` – 1 or more, `?` – 0 or 1
- `{n}` – exactly n times, `{n,m}` – between n and m times
- `^` – start of string, `$` – end of string (with `re.MULTILINE`, the start and end of each line)
- `[]` – character set, `|` – OR
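The syntax elements listed above can be tried out one at a time. The following is a minimal sketch; the sample strings are invented for illustration.

```python
import re

# \d with +: runs of one or more digits.
digits = re.findall(r'\d+', 'Room 12, floor 3')
print(digits)        # ['12', '3']

# ?: the 'u' is optional, so both spellings match.
spellings = re.findall(r'colou?r', 'color and colour')
print(spellings)     # ['color', 'colour']

# {n}: exactly four digits in a row.
years = re.findall(r'\d{4}', 'Born 1990, moved 2015')
print(years)         # ['1990', '2015']

# . matches any character except a newline, so 'b\nt' is skipped.
dot_matches = re.findall(r'b.t', 'bat bit b\nt')
print(dot_matches)   # ['bat', 'bit']

# ^ and $ anchor the pattern to the whole string.
all_digits = bool(re.search(r'^\d+$', '12345'))
not_all_digits = bool(re.search(r'^\d+$', '123a'))
print(all_digits, not_all_digits)   # True False

# [] matches any one character from the set; | chooses between alternatives.
vowels = re.findall(r'[aeiou]', 'regex')
pets = re.findall(r'cat|dog', 'cat, dog, bird')
print(vowels, pets)  # ['e', 'e'] ['cat', 'dog']
```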
Finding All Matches
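```python
import re

text = 'Emails: alice@example.com, bob@domain.org'
# One or more word characters, dots, or hyphens on each side of the '@'.
pattern = r'[\w.-]+@[\w.-]+'
emails = re.findall(pattern, text)
print(emails)  # ['alice@example.com', 'bob@domain.org']
```
Replacing Text with `sub`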
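```python
import re

text = 'My phone number is 123-456-7890.'
pattern = r'\d{3}-\d{3}-\d{4}'
# sub() replaces every match of the pattern with the replacement string.
masked = re.sub(pattern, 'XXX-XXX-XXXX', text)
print(masked)  # My phone number is XXX-XXX-XXXX.
```
Splitting Text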
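```python
import re

text = 'apple, banana; cherry|date'
# Split on ';', ',' or '|'. The surrounding \s* also consumes spaces next to
# the delimiter; without it the results would keep leading spaces (' banana').
pattern = r'\s*[;,|]\s*'
fruits = re.split(pattern, text)
print(fruits)  # ['apple', 'banana', 'cherry', 'date']
```
Best Practices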
- Use raw strings (`r'pattern'`) to avoid issues with escape characters.
- Test regex patterns with simple examples before using them in production.
- Keep patterns readable; complex regex can be hard to debug.
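- Use `re.compile()` for patterns you reuse many times. The `re` module already caches recently used patterns, so the speedup is usually modest, but a compiled pattern gives the regex a name and keeps it in one place.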
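Two of the practices above, raw strings and precompiled patterns, can be sketched as follows. The names `PHONE_RE` and `lines` are made up for this example, and the `re.VERBOSE` flag is one optional way to keep a longer pattern readable.

```python
import re

# Without the r prefix, '\b' is a backspace character, not a word boundary.
no_raw = re.findall('\bcat\b', 'a cat')
with_raw = re.findall(r'\bcat\b', 'a cat')
print(no_raw, with_raw)   # [] ['cat']

# Compile once and reuse. re.VERBOSE ignores whitespace in the pattern and
# allows comments, which makes each part easier to read.
PHONE_RE = re.compile(r'''
    \d{3}   # area code
    -
    \d{3}   # prefix
    -
    \d{4}   # line number
''', re.VERBOSE)

lines = [
    'Call 123-456-7890 today',
    'No number here',
    'Fax: 555-000-1111, desk: 555-000-2222',
]

# The compiled pattern exposes the same methods as the module-level functions.
found = [number for line in lines for number in PHONE_RE.findall(line)]
print(found)   # ['123-456-7890', '555-000-1111', '555-000-2222']
```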
Common Misconceptions
- Regex is not always the fastest solution for simple string operations.
- Overcomplicated regex can be error-prone; sometimes Python string methods suffice.
- Lookahead/lookbehind patterns are powerful but tricky; test them carefully.
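The points above can be illustrated with a short sketch: plain string methods for simple checks, then lookbehind and lookahead, including one way a negative lookahead can surprise you. The sample text is invented for this example.

```python
import re

text = 'Price: $40, discount: 15%, total: $34'

# Simple checks often need no regex at all.
has_discount = 'discount' in text
starts_with_price = text.startswith('Price')
print(has_discount, starts_with_price)   # True True

# Lookbehind (?<=\$): digits preceded by '$'; the '$' is not part of the match.
dollars = re.findall(r'(?<=\$)\d+', text)
print(dollars)    # ['40', '34']

# Lookahead (?=%): digits followed by '%'; the '%' is not part of the match.
percents = re.findall(r'\d+(?=%)', text)
print(percents)   # ['15']

# Pitfall: a negative lookahead lets the engine backtrack. '15' is followed
# by '%', so the regex settles for '1', which is followed by '5'.
naive = re.findall(r'\d+(?!%)', text)
print(naive)      # ['40', '1', '34']

# Also forbidding a following digit stops the partial match.
fixed = re.findall(r'\d+(?![\d%])', text)
print(fixed)      # ['40', '34']
```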
Python `re` Module Documentation
💡 Regular expressions are like supercharged find-and-replace. Learn the basics first, then practice with real-world patterns.
Mini Project Step
Write a program that extracts all phone numbers and email addresses from a block of text. Then mask the phone numbers using `sub` for privacy.
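One possible starting point is sketched below. It assumes phone numbers follow the `123-456-7890` format used earlier in this lesson and uses a deliberately loose email pattern. The function names `extract_contacts` and `mask_phones` and the sample text are hypothetical, chosen only for this example.

```python
import re

# Assumption: phone numbers look like 123-456-7890. The email pattern is
# intentionally loose and is not a full address validator.
PHONE_RE = re.compile(r'\d{3}-\d{3}-\d{4}')
EMAIL_RE = re.compile(r'[\w.-]+@[\w-]+(?:\.[\w-]+)+')


def extract_contacts(text):
    """Return (phones, emails) found in text, in order of appearance."""
    return PHONE_RE.findall(text), EMAIL_RE.findall(text)


def mask_phones(text):
    """Replace every phone number with a fixed placeholder."""
    return PHONE_RE.sub('XXX-XXX-XXXX', text)


sample = (
    'Contact Alice at alice@example.com or 123-456-7890. '
    'Bob is at bob@domain.org, phone 987-654-3210.'
)

phones, emails = extract_contacts(sample)
masked = mask_phones(sample)
print(phones)   # ['123-456-7890', '987-654-3210']
print(emails)   # ['alice@example.com', 'bob@domain.org']
print(masked)
```

Try extending the phone pattern to accept other separators, such as spaces or dots, and check that the masking still works.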