What Is Regex? (Regular Expressions) Explained
A regular expression (regex or regexp) is a sequence of characters that forms a search pattern. Regex is used to match, find, or replace text based on rules — such as 'find any string that looks like an email address' or 'remove all non-numeric characters from this text'.
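As a minimal sketch of those three uses (match, find, replace) in JavaScript: the sample strings and variable names below are made up for illustration, and the email pattern is a deliberately simplified one.

```javascript
// Hypothetical sample inputs, for illustration only.
const contacts = "Contact: alice@example.com, bob@example.org";
const phone = "(555) 123-4567";

// Match: does the text contain something shaped like an email address?
const hasEmail = /[\w.-]+@[\w.-]+\.[a-z]{2,}/i.test(contacts);

// Find: collect every email-shaped substring (g = all matches).
const emails = contacts.match(/[\w.-]+@[\w.-]+\.[a-z]{2,}/gi);

// Replace: remove all non-numeric characters (\D = any non-digit).
const digitsOnly = phone.replace(/\D/g, "");

console.log(hasEmail);   // true
console.log(emails);     // [ 'alice@example.com', 'bob@example.org' ]
console.log(digitsOnly); // 5551234567
```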
Core Regex Syntax
Key metacharacters: . matches any single character except a newline; * matches the preceding element 0 or more times; + matches it 1 or more times; ? matches it 0 or 1 times; ^ anchors to the start of the input and $ to the end (or to the start and end of each line when the multiline flag is set); [] defines a character class ([a-z] = any lowercase letter); | is alternation (cat|dog = cat or dog); () groups and captures; \ escapes a special character. Quantifiers: {n} exactly n; {n,} n or more; {n,m} between n and m.
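The metacharacters above can be checked one at a time. The following is a small sketch in JavaScript; the `cases` table and its inputs are illustrative choices, not a complete test suite.

```javascript
// Each entry: [pattern, input, expected result of pattern.test(input)].
const cases = [
  [/c.t/, "cat", true],        // . matches any one character
  [/c.t/, "ct", false],        // ...but it must match exactly one
  [/ab*c/, "ac", true],        // * allows zero b's
  [/ab+c/, "ac", false],       // + requires at least one b
  [/colou?r/, "color", true],  // ? makes the u optional
  [/^cat/, "concat", false],   // ^ anchors to the start
  [/cat$/, "concat", true],    // $ anchors to the end
  [/^[a-z]+$/, "hello", true], // character class: lowercase letters only
  [/^[a-z]+$/, "Hello", false],
  [/^(cat|dog)$/, "dog", true],// alternation inside a group
  [/^a\.b$/, "a.b", true],     // \. is a literal dot
  [/^a\.b$/, "axb", false],
  [/^\d{3}$/, "123", true],    // {n}: exactly three digits
  [/^\d{2,4}$/, "12345", false], // {n,m}: two to four digits
];

const results = cases.map(([pattern, input]) => pattern.test(input));
const allAsExpected = cases.every(([, , expected], i) => results[i] === expected);
console.log(allAsExpected); // true

// Groups also capture: pull the year and month out of a date-like string.
const [, year, month] = "2024-03".match(/^(\d{4})-(\d{2})$/);
console.log(year, month); // 2024 03
```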
Common Regex Patterns
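Email validation (simplified; pair it with the i flag, since [a-z] alone rejects uppercase domains): ^[\w.-]+@[\w.-]+\.[a-z]{2,}$. URL: ^https?:\/\/[\w.-]+(\.[\w.-]+)+(\/.*)?$. Phone (US): ^\+?1?[\s.-]?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$. IPv4 (format only; it does not check that each octet is in the range 0-255, so 999.999.999.999 passes): ^(\d{1,3}\.){3}\d{1,3}$. Hex color: ^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$.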
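To see how these patterns behave, here is a sketch that writes them as JavaScript literals and tries a few inputs. The `patterns` object and the `isValidIPv4` helper are hypothetical names introduced for this example; the email pattern is given the i flag, the URL pattern ends in `(\/.*)?$` with no space before the `$`, and the sample inputs are made up.

```javascript
// The patterns from this section, as JavaScript regex literals.
const patterns = {
  email: /^[\w.-]+@[\w.-]+\.[a-z]{2,}$/i,
  url: /^https?:\/\/[\w.-]+(\.[\w.-]+)+(\/.*)?$/,
  phoneUS: /^\+?1?[\s.-]?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$/,
  ipv4: /^(\d{1,3}\.){3}\d{1,3}$/,
  hexColor: /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/,
};

console.log(patterns.email.test("user.name@example.com"));      // true
console.log(patterns.email.test("user@localhost"));             // false (no dot in domain)
console.log(patterns.url.test("https://example.com/path?q=1")); // true
console.log(patterns.url.test("ftp://example.com"));            // false (not http/https)
console.log(patterns.phoneUS.test("+1 (555) 123-4567"));        // true
console.log(patterns.phoneUS.test("555-1234"));                 // false (too few digits)
console.log(patterns.hexColor.test("#fff"));                    // true
console.log(patterns.hexColor.test("#ffff"));                   // false (4 hex digits)

// The IPv4 pattern checks shape only, so out-of-range octets still pass.
console.log(patterns.ipv4.test("999.999.999.999"));             // true

// Hypothetical helper: combine the shape check with a numeric range check.
function isValidIPv4(candidate) {
  return (
    patterns.ipv4.test(candidate) &&
    candidate.split(".").every((octet) => Number(octet) <= 255)
  );
}

console.log(isValidIPv4("192.168.0.1"));     // true
console.log(isValidIPv4("999.999.999.999")); // false
```

Splitting the check this way keeps the regex readable; encoding the 0-255 range inside the pattern itself is possible but makes it much longer.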
Regex Flags
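Common flags (JavaScript syntax): i (case-insensitive), g (global: find all matches, not just the first), m (multiline: ^ and $ match at line boundaries), s (dotall: . also matches newlines), u (unicode: treat the pattern and input as Unicode code points rather than UTF-16 code units, and enable \u{...} and \p{...} escapes), v (unicodeSets, which began shipping in engines in 2023: an upgraded Unicode mode that adds set notation inside character classes; it cannot be combined with u). In a regex literal, flags are appended after the closing slash: /pattern/gi.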
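The effect of each flag is easiest to see by running the same pattern with and without it. A short sketch, using a made-up three-line string (the v flag is left out because older runtimes reject it as a syntax error):

```javascript
// Hypothetical sample: three lines that differ only in case.
const text = "Cat\ncat\nCAT";

// No flags: only the first case-sensitive match is returned.
const first = text.match(/cat/);
console.log(first[0], first.index); // cat 4

// g + i: every match, ignoring case.
const all = text.match(/cat/gi);
console.log(all); // [ 'Cat', 'cat', 'CAT' ]

// m: without it, ^ and $ only match at the start and end of the whole input.
const wholeInput = text.match(/^cat$/gi);
const perLine = text.match(/^cat$/gim);
console.log(wholeInput);     // null
console.log(perLine.length); // 3

// s: without it, the dot refuses to cross the newline.
const dotDefault = /Cat.cat/.test(text);
const dotAll = /Cat.cat/s.test(text);
console.log(dotDefault, dotAll); // false true

// u: an emoji outside the Basic Multilingual Plane is two UTF-16 code units,
// so a single dot only covers it in unicode mode.
const emoji = "\u{1F600}";
const unitMode = /^.$/.test(emoji);
const codePointMode = /^.$/u.test(emoji);
console.log(unitMode, codePointMode); // false true
```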
Catastrophic Backtracking
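Poorly written regex with nested quantifiers can cause exponential backtracking on certain inputs, making the engine hang or crash. When an attacker can supply such input, this becomes a vulnerability called ReDoS (Regular Expression Denial of Service). Example: ^(a+)+$ on a string like 'aaaaaaaaab'. The trailing 'b' makes the overall match fail, so the engine tries every way of splitting the run of a's between the inner and outer + before giving up, and the work roughly doubles with each additional 'a'. Always test regex against worst-case inputs and prefer atomic groups or possessive quantifiers when the engine supports them.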
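The growth is easy to observe on a small scale. The sketch below times a nested-quantifier pattern against a rewritten one on inputs that are forced to fail; `timeMatch`, `vulnerable`, and `safe` are hypothetical names for this example, and the input lengths are kept small so the script finishes quickly. JavaScript's RegExp has no atomic groups or possessive quantifiers, so the sketch removes the nesting instead, which is one possible fix rather than the only one.

```javascript
// Nested quantifiers: many ways to split a run of a's between the two +'s.
const vulnerable = /^(a+)+$/;
// Same set of accepted strings, but only one way to match each of them.
const safe = /^a+$/;

// Hypothetical helper: run one test and report the elapsed milliseconds.
function timeMatch(pattern, input) {
  const start = performance.now();
  const matched = pattern.test(input);
  return { matched, ms: performance.now() - start };
}

// A trailing "b" guarantees failure, which is what triggers the backtracking.
for (const n of [14, 18, 22]) {
  const input = "a".repeat(n) + "b";
  const slow = timeMatch(vulnerable, input);
  const fast = timeMatch(safe, input);
  console.log(
    `n=${n} nested: ${slow.ms.toFixed(2)} ms, rewritten: ${fast.ms.toFixed(2)} ms`
  );
}
```

On a backtracking engine the time for the nested pattern tends to climb steeply as n grows while the rewritten pattern stays nearly flat; exact figures depend on the engine and machine, so none are quoted here.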