Regular Expressions
Match and extract patterns in text: validate phone numbers, find data, and clean strings with preg_match and preg_replace.
What you will learn
- Understand what a regular expression is
- Match and validate text with preg_match
- Find, extract and replace patterns
What is a regular expression?
A regular expression (regex for short) is a tiny pattern language for describing the *shape* of text: "a 10-digit number", "something that looks like an email", "a word followed by a space". You write the pattern once, and PHP can then check whether text matches it, pull pieces out, or replace them. Most programming languages ship with regex support, which makes it a common tool for validation and text parsing.
In PHP, a pattern is a string wrapped in delimiters, most commonly two slashes, like /.../. A few building blocks cover most needs:
| Piece | Matches |
|---|---|
| \d | Any digit 0-9 |
| \w | A word character (letter, digit, underscore) |
| + | One or more of the previous piece |
| {10} | Exactly 10 of the previous piece |
| ^ ... $ | Start and end of the text |
| [a-z] | Any one lowercase letter |
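
To make the table concrete, here is a minimal sketch that tries each building block with preg_match. The sample strings and variable names are made up for illustration; the comments show what each call returns.

```php
<?php
// Sample inputs are invented for this sketch.
$hasDigit     = preg_match('/\d/', 'Room 7');       // 1: "7" is a digit
$wordRun      = preg_match('/\w+/', 'hello_42');    // 1: one or more word characters
$tenDigits    = preg_match('/\d{10}/', '12345');    // 0: only five digits in a row
$onlyLower    = preg_match('/^[a-z]+$/', 'hello');  // 1: lowercase letters from start to end
$notOnlyLower = preg_match('/^[a-z]+$/', 'Hello');  // 0: "H" is not in a-z

var_dump($hasDigit, $wordRun, $tenDigits, $onlyLower, $notOnlyLower);
```

The patterns are written in single quotes so that PHP passes the backslashes and the $ sign to the regex engine unchanged.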
Validating with preg_match
preg_match($pattern, $text) returns 1 if the text matches the pattern and 0 if it does not — perfect for validation. Here we check that a string is exactly a 10-digit Indian mobile number.
<?php
$phone = "9876543210";
// single quotes keep \d and $ exactly as written
$pattern = '/^\d{10}$/'; // start, exactly 10 digits, end
if (preg_match($pattern, $phone)) {
    echo "Valid phone number";
} else {
    echo "Invalid: must be 10 digits";
}
?>

Read the pattern /^\d{10}$/ piece by piece: the ^ anchors to the start of the text, \d{10} demands exactly ten digits, and $ anchors to the end. Together they mean "nothing but ten digits, start to finish". preg_match returns 1 for "9876543210", so the if condition is true and the script prints "Valid phone number".
Note: Output (in the browser):
Valid phone number
Without the ^ and $ anchors, a value like "abc9876543210xyz" would also pass, because the ten digits exist *somewhere* inside. The anchors force the whole string to be just those digits.
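
The effect of the anchors can be checked directly. The sketch below compares the unanchored and anchored patterns on the padded value, and also shows one caveat worth knowing: by default, $ also matches just before a single trailing newline, so \z (the very end of the string) is the stricter choice when input might carry a newline. Variable names here are illustrative only.

```php
<?php
$padded = 'abc9876543210xyz';

$loose  = preg_match('/\d{10}/', $padded);   // 1: ten digits exist somewhere inside
$strict = preg_match('/^\d{10}$/', $padded); // 0: the whole string must be ten digits

// Caveat: $ tolerates one trailing newline, \z does not.
$withNewline = "9876543210\n";
$dollar = preg_match('/^\d{10}$/', $withNewline);  // 1
$slashZ = preg_match('/^\d{10}\z/', $withNewline); // 0

var_dump($loose, $strict, $dollar, $slashZ);
```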
Extracting and replacing
Round brackets (...) in a pattern capture the part they wrap, so you can pull it out of the $matches array that preg_match fills in through its third argument. preg_replace swaps every match for new text.
<?php
// extract the year from a date string
$date = "Published: 2026-06-13";
if (preg_match('/(\d{4})-\d{2}-\d{2}/', $date, $matches)) {
    echo "Year: " . $matches[1];
}

// replace all digits with a hash
$card = "Card 4242 4242";
echo "<br>" . preg_replace('/\d/', "#", $card);
?>

In the first part, the pattern matches a YYYY-MM-DD date, and the brackets around (\d{4}) capture the four-digit year. When preg_match finds a match, $matches[1] holds that captured year (slot [0] is always the whole match, [1] the first captured group); the if guard skips the echo when nothing matches. In the second part, preg_replace('/\d/', "#", $card) finds every single digit and replaces it with #, masking the numbers.
Note: Output (in the browser):
Year: 2026
Card #### ####
Capturing groups let you *extract* a specific slice; preg_replace lets you *transform* text wholesale. Together they handle most parsing and cleaning jobs.
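
As a small extension of the date example, the sketch below captures all three parts and then reuses them in a replacement. It assumes the same "Published: 2026-06-13" input; the $1, $2 and $3 placeholders in the replacement string are how preg_replace refers to captured groups, and the variable names are illustrative.

```php
<?php
$date = 'Published: 2026-06-13';

// Three groups: year, month, day. $matches[0] is the whole match.
if (preg_match('/(\d{4})-(\d{2})-(\d{2})/', $date, $matches)) {
    [$whole, $year, $month, $day] = $matches;
    echo "Year: $year, month: $month, day: $day";
}

// In the replacement, $1/$2/$3 stand for the captured groups.
// Single quotes stop PHP from treating them as variables.
$reordered = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $date);
echo "<br>" . $reordered; // Published: 13/06/2026
```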
Tip: Regex is powerful but easy to over-use. For emails, PHP’s filter_var($email, FILTER_VALIDATE_EMAIL) is more reliable than a hand-written pattern — reach for regex when there is no built-in tool.
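
A minimal sketch of the built-in approach from the tip. The helper name isValidEmail and the sample addresses are hypothetical; filter_var returns the address itself on success and false on failure, which is why the result is compared against false.

```php
<?php
// Hypothetical helper wrapping PHP's built-in email filter.
function isValidEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(isValidEmail('asha@example.com')); // bool(true)
var_dump(isValidEmail('not-an-email'));     // bool(false)
```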
Q. What does preg_match return when the text matches the pattern?
✍️ Practice
- Validate that a PIN code is exactly 6 digits with preg_match.
- Use preg_replace to remove all spaces from a string.
🏠 Homework
- Write a function that extracts every hashtag (a word starting with #) from a sentence using a regex.