The Ultimate Regex Cheat Sheet with Real-World Examples (2026)
Regular expressions are one of those tools that every developer needs but nobody enjoys writing from scratch. You know the pattern exists somewhere in your brain — or more likely, somewhere on Stack Overflow — but translating "match an email address" into ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ never feels intuitive.
This cheat sheet is different. Instead of listing every metacharacter in alphabetical order (you can find that anywhere), we've organized patterns by real-world use case. Each section includes a production-ready pattern, a plain-English explanation, and notes on edge cases. Bookmark this page — you'll be back.
Quick Reference: Core Syntax
If you need a refresher on the basics, here's the essential syntax. Skip ahead to the real-world patterns if you're already comfortable with regex fundamentals.
| Pattern | Meaning | Example |
|---|---|---|
| . | Any character except newline | a.c → "abc", "a1c" |
| ^ | Start of string (or line with m flag) | ^Hello → "Hello world" |
| $ | End of string (or line with m flag) | world$ → "Hello world" |
| * | Zero or more of previous | ab*c → "ac", "abc", "abbc" |
| + | One or more of previous | ab+c → "abc", "abbc" (not "ac") |
| ? | Zero or one of previous (optional) | colou?r → "color", "colour" |
| {n,m} | Between n and m repetitions | a{2,4} → "aa", "aaa", "aaaa" |
| [abc] | Character class — any of a, b, c | [aeiou] → any vowel |
| [^abc] | Negated class — not a, b, or c | [^0-9] → any non-digit |
| \d | Digit [0-9] | \d{3} → "123", "456" |
| \w | Word character [a-zA-Z0-9_] | \w+ → "hello_world" |
| \s | Whitespace (space, tab, newline) | \s+ → " " |
| \b | Word boundary | \bcat\b → "cat" (not "catch") |
| (abc) | Capture group | (\d+)-(\d+) → groups "123" and "456" |
| (?:abc) | Non-capturing group | (?:ab)+ → "abab" (nothing captured) |
| a\|b | Alternation: a or b | cat\|dog → "cat" or "dog" |
| (?=abc) | Positive lookahead | \d+(?=px) → "12" in "12px" |
| (?<=abc) | Positive lookbehind | (?<=\$)\d+ → "50" in "$50" |
Real-World Patterns You'll Actually Use
Here's where it gets practical. These are the patterns developers reach for most often, tested against real data and edge cases.
Email Validation
📧 Email Address
Matches standard email formats. Handles dots, hyphens, and plus-addressing ([email protected]). Requires a TLD of at least 2 characters.
Matches: [email protected], [email protected]
Doesn't match: @domain.com, [email protected], user@domain
Caveat: No practical regex fully validates an address against RFC 5322. In production, send a confirmation email; that is the only real validation. This pattern covers the common address shapes you will see in practice, not every address the RFC allows.
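As a rough illustration, the email pattern quoted in the introduction can be compiled once and wrapped in a small helper. The function name `looks_like_email` and the sample addresses are assumptions made for this sketch.

```python
import re

# Pattern quoted in the introduction of this cheat sheet (anchors dropped
# because fullmatch() already requires the whole string to match).
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def looks_like_email(text: str) -> bool:
    """Return True if text has the basic local@domain.tld shape."""
    return EMAIL_RE.fullmatch(text) is not None

# Hypothetical sample inputs
print(looks_like_email("user+tag@example.com"))  # True
print(looks_like_email("@domain.com"))           # False
print(looks_like_email("user@domain"))           # False
```

The sketch uses `fullmatch` rather than `^...$` because Python's `$` also matches just before a trailing newline, which would let "user@example.com\n" through.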
URL Matching
🔗 URL (HTTP/HTTPS)
Matches HTTP and HTTPS URLs with optional www prefix. Handles query strings, fragments, and paths. Works for most common URL formats.
Matches: https://example.com/path?q=test#section, http://sub.domain.co.uk/page
Doesn't match: ftp://files.example.com, example.com (no protocol)
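One possible pattern with the behavior described above is sketched below. It is an illustrative assumption, not a canonical URL grammar: it does not handle ports, credentials, or IP-literal hosts.

```python
import re

# Illustrative HTTP/HTTPS URL pattern (an assumption for this sketch).
URL_RE = re.compile(
    r"https?://"                           # scheme: http or https only
    r"(?:www\.)?"                          # optional www prefix
    r"[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)+"   # host with at least one dot
    r"(?:/[^\s?#]*)?"                      # optional path
    r"(?:\?[^\s#]*)?"                      # optional query string
    r"(?:#\S*)?"                           # optional fragment
)

def is_http_url(text: str) -> bool:
    """Return True if the whole string looks like an http(s) URL."""
    return URL_RE.fullmatch(text) is not None

print(is_http_url("https://example.com/path?q=test#section"))  # True
print(is_http_url("ftp://files.example.com"))                  # False
```

For anything beyond quick filtering, `urllib.parse.urlsplit` from the standard library is usually a safer starting point than a hand-written pattern.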
IP Address Validation
🌐 IPv4 Address
Validates IPv4 addresses with proper octet range (0-255). Rejects values like 256.1.1.1 or 999.999.999.999 that simpler patterns would accept.
Why this matters: The naive pattern \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} matches "999.999.999.999", which isn't a valid IP. A range-checked pattern restricts every octet to 0-255.
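A minimal sketch of a range-checked pattern follows. The names `OCTET` and `is_ipv4` are assumptions for the example, and this version also rejects leading zeros such as "01.2.3.4", which is a design choice rather than a requirement.

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
IPV4_RE = re.compile(OCTET + r"(?:\." + OCTET + r"){3}")

def is_ipv4(text: str) -> bool:
    """Return True if text is a dotted-quad address with octets in 0-255."""
    return IPV4_RE.fullmatch(text) is not None

print(is_ipv4("192.168.1.1"))      # True
print(is_ipv4("256.1.1.1"))        # False
print(is_ipv4("999.999.999.999"))  # False
```

If a regex is not required, the standard library's `ipaddress.ip_address` performs the same check and also handles IPv6.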
Date Formats
📅 ISO 8601 Date (YYYY-MM-DD)
Matches dates in ISO 8601 format with basic month (01-12) and day (01-31) validation. Doesn't validate month-day combinations (e.g., Feb 31).
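The gap between "looks like a date" and "is a real date" can be closed by pairing the pattern with a calendar check. The sketch below assumes a pattern of the shape described above; `is_real_iso_date` is a hypothetical helper name.

```python
import re
from datetime import date

# Year, month 01-12, day 01-31; no month/day cross-checking.
ISO_DATE_RE = re.compile(r"([0-9]{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])")

def is_real_iso_date(text: str) -> bool:
    """Shape check with the regex, then calendar check with datetime."""
    if ISO_DATE_RE.fullmatch(text) is None:
        return False
    try:
        date.fromisoformat(text)  # raises ValueError for dates like Feb 31
    except ValueError:
        return False
    return True

print(ISO_DATE_RE.fullmatch("2026-02-31") is not None)  # True: shape is fine
print(is_real_iso_date("2026-02-31"))                   # False: no such day
```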
📅 Common Date Formats (MM/DD/YYYY or DD/MM/YYYY)
Matches dates with slash or hyphen separators. Accepts both DD/MM/YYYY and MM/DD/YYYY — you'll need application logic to disambiguate.
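The disambiguation step might look like the sketch below. The pattern, the `parse_date` helper, and its `day_first` flag are all assumptions for illustration; the backreference \2 forces both separators to be the same character.

```python
import re
from datetime import date
from typing import Optional

# Two digits, separator, two digits, same separator, four-digit year.
SLASH_DATE_RE = re.compile(r"([0-9]{2})([/-])([0-9]{2})\2([0-9]{4})")

def parse_date(text: str, day_first: bool) -> Optional[date]:
    """Parse DD/MM/YYYY (day_first=True) or MM/DD/YYYY (day_first=False)."""
    m = SLASH_DATE_RE.fullmatch(text)
    if m is None:
        return None
    first, second, year = int(m.group(1)), int(m.group(3)), int(m.group(4))
    day, month = (first, second) if day_first else (second, first)
    try:
        return date(year, month, day)  # rejects impossible combinations
    except ValueError:
        return None

print(parse_date("03/04/2026", day_first=True))   # 2026-04-03
print(parse_date("03/04/2026", day_first=False))  # 2026-03-04
```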
Password Strength
🔐 Strong Password
Requires at least 8 characters with: one lowercase, one uppercase, one digit, and one special character. Uses lookaheads to check each requirement independently.
How it works: Each (?=.*X) is a lookahead that asserts "somewhere in the string, there's an X" without consuming characters. This lets you check multiple conditions at the same position.
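The lookahead approach can be sketched as follows. This example assumes "special character" means anything that is not an ASCII letter or digit, so spaces and punctuation both count; adjust the last lookahead if your policy differs.

```python
import re

PASSWORD_RE = re.compile(
    r"(?=.*[a-z])"          # at least one lowercase letter
    r"(?=.*[A-Z])"          # at least one uppercase letter
    r"(?=.*[0-9])"          # at least one digit
    r"(?=.*[^A-Za-z0-9])"   # at least one non-alphanumeric character
    r".{8,}"                # eight or more characters in total
)

def is_strong(password: str) -> bool:
    """Return True if every lookahead passes and the length is 8 or more."""
    return PASSWORD_RE.fullmatch(password) is not None

print(is_strong("Str0ng!Pass"))  # True
print(is_strong("weakpass"))     # False
```

Each lookahead starts from position 0 and scans forward independently, which is why the order of the four checks does not matter.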
Phone Numbers
📱 International Phone (E.164)
Matches E.164 international phone format: plus sign, country code (1-3 digits), subscriber number. Total 2-15 digits after the plus.
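A minimal sketch of such a check is shown below, assuming the number has already been stripped of spaces and punctuation. The leading digit is restricted to 1-9 because country codes do not start with zero.

```python
import re

# Plus sign, a non-zero first digit, then 1 to 14 more digits (2-15 total).
E164_RE = re.compile(r"\+[1-9][0-9]{1,14}")

def is_e164(text: str) -> bool:
    """Return True if text has the E.164 shape described above."""
    return E164_RE.fullmatch(text) is not None

print(is_e164("+14155552671"))  # True
print(is_e164("14155552671"))   # False: missing plus sign
```

This only checks the shape. Whether the country code exists or the number is reachable needs a numbering-plan library or a verification message.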
📱 US Phone Number (Flexible)
Matches US phone numbers in various formats: (555) 123-4567, 555-123-4567, 5551234567, +1 555 123 4567.
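One way to accept the four formats listed above and reduce them to a single canonical form is sketched here. The pattern and the `normalize_us_phone` helper are illustrative assumptions; the pattern does not check that the area code is actually assigned.

```python
import re
from typing import Optional

US_PHONE_RE = re.compile(
    r"(?:\+?1[ .-]?)?"            # optional country code, with or without +
    r"(?:\([0-9]{3}\)|[0-9]{3})"  # area code, optionally in parentheses
    r"[ .-]?[0-9]{3}"             # exchange, optional separator before it
    r"[ .-]?[0-9]{4}"             # line number, optional separator before it
)

def normalize_us_phone(text: str) -> Optional[str]:
    """Return the 10-digit national number, or None if text does not match."""
    if US_PHONE_RE.fullmatch(text) is None:
        return None
    digits = re.sub(r"[^0-9]", "", text)
    return digits[-10:]  # drop the leading 1 when a country code was given

for sample in ("(555) 123-4567", "555-123-4567", "5551234567", "+1 555 123 4567"):
    print(normalize_us_phone(sample))  # 5551234567 each time
```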
Code & Development Patterns
🏷️ HTML Tags
Matches opening and closing HTML tag pairs. The \1 backreference ensures the closing tag matches the opening tag name. Use with caution — regex is not an HTML parser.
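The backreference idea can be sketched like this. The pattern is an assumption for illustration and, as the caution above suggests, it breaks on nested tags of the same name, comments, and self-closing elements.

```python
import re

# Group 1: tag name. Group 2: inner text. \1 must repeat the tag name.
TAG_PAIR_RE = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)</\1>", re.DOTALL)

html = '<p class="x">hi</p><b>bold</b>'
print(TAG_PAIR_RE.findall(html))             # [('p', 'hi'), ('b', 'bold')]
print(TAG_PAIR_RE.search("<b>text</i>"))     # None: closing tag differs
```

For real documents, `html.parser` from the standard library or a dedicated HTML parser is the more reliable choice.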
📝 Hex Color Code
Matches 3-digit (#fff) and 6-digit (#ffffff) hex color codes. Case-insensitive.
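A compact sketch of this check repeats a three-digit hex group once or twice, so only lengths 3 and 6 pass. The helper name `is_hex_color` is an assumption.

```python
import re

# "#" followed by one or two groups of exactly three hex digits.
HEX_COLOR_RE = re.compile(r"#(?:[0-9a-fA-F]{3}){1,2}")

def is_hex_color(text: str) -> bool:
    """Return True for #rgb or #rrggbb, in either letter case."""
    return HEX_COLOR_RE.fullmatch(text) is not None

print(is_hex_color("#fff"))     # True
print(is_hex_color("#FFFFFF"))  # True
print(is_hex_color("#ffff"))    # False: four digits
```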
🔢 Semantic Version (SemVer)
Full SemVer 2.0.0 validation including pre-release tags and build metadata. Matches "1.2.3", "1.0.0-alpha.1", "2.1.0+build.123".
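The sketch below is modeled on the regular expression suggested in the SemVer 2.0.0 specification, split into named pieces for readability. The helper name `parse_semver` and the building-block variable names are assumptions for this example.

```python
import re
from typing import Optional

_NUM = r"(?:0|[1-9][0-9]*)"                              # no leading zeros
_PRE = r"(?:0|[1-9][0-9]*|[0-9]*[a-zA-Z-][0-9a-zA-Z-]*)"  # pre-release identifier
_BUILD = r"[0-9a-zA-Z-]+"                                # build identifier

SEMVER_RE = re.compile(
    rf"(?P<major>{_NUM})\.(?P<minor>{_NUM})\.(?P<patch>{_NUM})"
    rf"(?:-(?P<prerelease>{_PRE}(?:\.{_PRE})*))?"
    rf"(?:\+(?P<build>{_BUILD}(?:\.{_BUILD})*))?"
)

def parse_semver(text: str) -> Optional[dict]:
    """Return the named parts of a version string, or None if invalid."""
    m = SEMVER_RE.fullmatch(text)
    return m.groupdict() if m else None

print(parse_semver("1.0.0-alpha.1")["prerelease"])  # alpha.1
print(parse_semver("2.1.0+build.123")["build"])     # build.123
print(parse_semver("01.2.3"))                       # None
```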
Log Parsing
📋 Common Log Format (Apache/Nginx)
Extracts IP, timestamp, method, path, status code, and bytes from standard web server access logs. Groups: (1) IP, (2) timestamp, (3) method, (4) path, (5) status, (6) bytes.
# Example log line:
# 192.168.1.1 - - [27/Feb/2026:10:15:32 +0000] "GET /api/users HTTP/1.1" 200 1234
#
# Captured groups:
# Group 1: 192.168.1.1
# Group 2: 27/Feb/2026:10:15:32 +0000
# Group 3: GET
# Group 4: /api/users
# Group 5: 200
# Group 6: 1234
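A pattern that produces those six groups for the example line could look like the following sketch. It assumes the plain Common Log Format; combined-format lines with referrer and user-agent fields would need two more quoted groups.

```python
import re

LOG_RE = re.compile(
    r'(\S+) \S+ \S+ '          # (1) client IP, then identity and user fields
    r'\[([^\]]+)\] '           # (2) timestamp inside square brackets
    r'"(\S+) (\S+) [^"]*" '    # (3) method, (4) path, then the protocol
    r'([0-9]{3}) ([0-9]+|-)'   # (5) status, (6) bytes ("-" when absent)
)

line = '192.168.1.1 - - [27/Feb/2026:10:15:32 +0000] "GET /api/users HTTP/1.1" 200 1234'
match = LOG_RE.fullmatch(line)
ip, timestamp, method, path, status, size = match.groups()
print(ip, method, path, status, size)  # 192.168.1.1 GET /api/users 200 1234
```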
📋 JSON Key-Value Extraction
Extracts string key-value pairs from JSON. Group 1 is the key, Group 2 is the value. For proper JSON parsing, use a JSON parser — but this works great for quick log grep.
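As a sketch of the quick-grep use case, the pattern below pulls out pairs whose value is a string and skips numbers, booleans, and nested objects. The pattern and the sample log line are assumptions for illustration.

```python
import re

# "key" : "value", allowing backslash escapes inside either string.
JSON_PAIR_RE = re.compile(r'"((?:[^"\\]|\\.)*)"\s*:\s*"((?:[^"\\]|\\.)*)"')

log_line = '{"level": "error", "msg": "disk full", "code": 507}'
pairs = JSON_PAIR_RE.findall(log_line)
print(pairs)  # [('level', 'error'), ('msg', 'disk full')]
```

Note that "code" is missing from the result because its value is a number. When every field matters, `json.loads` is the right tool.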
Advanced Techniques
Named Capture Groups
Instead of referencing groups by number, give them names for readable code:
# Python
import re
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
match = re.match(pattern, '2026-02-27')
print(match.group('year')) # 2026
print(match.group('month')) # 02
// JavaScript
const pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
const match = '2026-02-27'.match(pattern)
console.log(match.groups.year) // 2026
console.log(match.groups.month) // 02
Lazy vs. Greedy Quantifiers
By default, quantifiers are greedy — they match as much as possible. Add ? to make them lazy:
# Greedy: matches everything between FIRST < and LAST >
<.+> on "<b>bold</b>" → "<b>bold</b>"
# Lazy: matches between FIRST < and NEXT >
<.+?> on "<b>bold</b>" → "<b>", "</b>"
This is one of the most common regex gotchas. When extracting content between delimiters, you almost always want the lazy version.
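The difference is easy to confirm in Python. The third variant, a negated character class, is an alternative added for this sketch: it gives the same result here without relying on laziness.

```python
import re

text = "<b>bold</b>"

greedy = re.findall(r"<.+>", text)      # runs to the LAST ">"
lazy = re.findall(r"<.+?>", text)       # stops at the NEXT ">"
negated = re.findall(r"<[^>]+>", text)  # cannot cross a ">" at all

print(greedy)   # ['<b>bold</b>']
print(lazy)     # ['<b>', '</b>']
print(negated)  # ['<b>', '</b>']
```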
Atomic Groups & Possessive Quantifiers
For performance-critical regex (processing millions of lines), possessive quantifiers prevent catastrophic backtracking:
# Standard (can backtrack; slow on near-miss input such as "aaaa...b"):
^(a+)+$
# Possessive (gives nothing back, so it fails fast):
^(a++)++$ # Supported in Java, PCRE, Ruby, Python 3.11+ (not .NET or JavaScript)
# Atomic group equivalent:
^(?>a+)+$ # Supported in PCRE, .NET, Java, Ruby, Python 3.11+
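Rule of thumb: If your regex takes more than a second on a non-matching string, you likely have a catastrophic backtracking problem. Possessive quantifiers or atomic groups are a common fix where the engine supports them; otherwise, rewrite the pattern so that nested quantifiers cannot match the same text in more than one way.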
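To see the effect without possessive syntax, which Python only gained in 3.11, the sketch below times the nested pattern ^(a+)+$ against a flattened rewrite that accepts the same strings. This swaps in pattern rewriting rather than possessive quantifiers, and the input sizes are kept small on purpose so the slow case still finishes quickly.

```python
import re
import time

NESTED = re.compile(r"^(a+)+$")  # nested quantifier: many ways to split the a's
FLAT = re.compile(r"^a+$")       # same accepted strings, only one way to match

def time_match(pattern, text):
    """Return (matched, seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start

for n in (14, 16, 18, 20):
    subject = "a" * n + "b"  # near miss: fails only at the final character
    _, slow = time_match(NESTED, subject)
    _, fast = time_match(FLAT, subject)
    print(f"n={n}: nested {slow:.4f}s, flat {fast:.6f}s")
```

On a backtracking engine the nested timings should grow roughly fourfold per step of two characters, while the flat pattern stays effectively constant. Exact numbers depend on the machine.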
Common Mistakes to Avoid
- Forgetting to escape dots: . matches ANY character. Use \. for a literal dot. The pattern 192.168.1.1 also matches "192x168y1z1".
- Greedy matching in HTML: <.*> on <b>text</b> matches the entire string, not just <b>. Use <.*?> instead.
- Not anchoring patterns: Without ^ and $, your pattern matches substrings. \d{3} matches "123" inside "abc123def".
- Catastrophic backtracking: Nested quantifiers like (a+)+ can cause exponential runtime. Always test with non-matching input.
- Using regex for everything: Don't parse HTML, JSON, or XML with regex. Use a proper parser. Regex is for pattern matching, not structural parsing.
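The first and third mistakes are quick to reproduce. The sketch below uses only the standard `re` module; `re.escape` is shown as one way to build a literal pattern from a fixed string.

```python
import re

# Unescaped dots match any character, so this "IP" pattern is too loose.
loose = re.fullmatch(r"192.168.1.1", "192x168y1z1")
strict = re.fullmatch(r"192\.168\.1\.1", "192x168y1z1")
print(loose is not None)   # True
print(strict is not None)  # False

# re.escape() escapes the metacharacters for you.
literal = re.compile(re.escape("192.168.1.1"))
print(literal.fullmatch("192.168.1.1") is not None)  # True

# Unanchored search finds a substring; fullmatch demands the whole string.
print(re.search(r"\d{3}", "abc123def").group())            # 123
print(re.fullmatch(r"\d{3}", "abc123def") is not None)     # False
```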
Stop Writing Regex by Hand
Even with this cheat sheet, writing complex regex from scratch is error-prone. Our AI Regex Generator lets you describe what you want to match in plain English and generates a tested pattern instantly. It also explains each part of the pattern so you understand what it does.