LogoReturn to Home RegEx
Integrations
All integrations
AWS API
AWS Lambda
DynamoDB
Oracle
Redshift
Snowflake
GraphQL
Supabase
Twilio
Azure Blob Storage
Slack
SendGrid
Generic HTTP API
AWS S3
Stripe
Microsoft SQL
Salesforce
PostgreSQL
MySQL
MongoDB
HubSpot
Google Sheets
Google BigQuery
Firebase
Airtable
Integrations
About UI Bakery
Log in
Request UI Bakery demo
RegEx library
Email regex PHP
Phone number regex PHP
IP address regex PHP
Date regex PHP
URL regex PHP
Numbers only regex (digits only) PHP
UUID regex PHP
Regex match words PHP
ZIP code regex PHP
GUID regex PHP
Password regex PHP
HTML regex PHP
SSN regex PHP
XML regex PHP
Mac address regex PHP
Street address regex PHP

HTML regex PHP

HTML stands for HyperText Markup Language and is used to display information in the browser. HTML regular expressions can be used to find tags in text, extract them, or remove them. Parsing HTML with regex is generally not a good idea, but a limited, known set of HTML can sometimes be handled this way.


Match all HTML tags

Below is a simple regex that matches HTML tags in a string. It can then be used to remove all tags and leave only the text.

"/<(?:\"[^\"]*\"['\"]*|'[^']*'['\"]*|[^'\">])+>/"
Example code in PHP:

// Remove all HTML tags from a string
$html_pattern = "/<(?:\"[^\"]*\"['\"]*|'[^']*'['\"]*|[^'\">])+>/";
$string_to_match = '<html><body>Hello, <b>world</b>!<br /></body></html>';
// preg_replace takes (pattern, replacement, subject) and returns the result
$output = preg_replace($html_pattern, '', $string_to_match);
echo $output; // prints 'Hello, world!'
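
To show why the pattern handles quoted attribute values separately, the sketch below compares it with a simpler `/<[^>]*>/` pattern on a tag whose attribute contains a `>` character. The helper name `strip_tags_regex` and the sample input are assumptions made for illustration; they are not part of the original example.

```php
<?php
// Hypothetical helper (the name is an assumption): removes tags with the pattern above.
function strip_tags_regex(string $html): string
{
    $html_pattern = "/<(?:\"[^\"]*\"['\"]*|'[^']*'['\"]*|[^'\">])+>/";
    // preg_replace returns null on error, so fall back to the unchanged input.
    return preg_replace($html_pattern, '', $html) ?? $html;
}

// Sample input (an assumption): the title attribute contains a '>' character.
$sample = '<a title="a > b">link</a>';

$robust = strip_tags_regex($sample);
$naive = preg_replace('/<[^>]*>/', '', $sample);

echo $robust, PHP_EOL; // link
echo $naive, PHP_EOL;  // the naive pattern ends the tag at the '>' inside the attribute
```

For plain tag removal, PHP also ships a built-in `strip_tags()` function, which may be a simpler choice than a hand-written pattern.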

Extract text between certain tags

One of the most common operations with HTML and regex is extracting the text between certain tags (a.k.a. scraping). The following regular expressions can be used for this.

$pattern1 = '/<div>(.*?)<\/div>/'; // Tag only
$pattern2 = '/(?:<div.*?class="some-class".*?>)(.*?)(?:<\/div>)/'; // Tag and class

Example code in PHP:

// Extract the text inside a specific HTML tag
$extract_from_html_pattern = '/(?:<div.*?class="some-class".*?>)(.*?)(?:<\/div>)/';
$string_to_match = '<html><body>Probably.<div class="some-class">Hello, world!</div><br />Today</body></html>';
preg_match_all($extract_from_html_pattern, $string_to_match, $matches);
// $matches[0] holds the full matches, $matches[1] holds the first capture group
print_r($matches[1]); // $matches[1] is ['Hello, world!']
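
The pattern above relies on `.`, which does not match newline characters by default, so content that spans several lines is not captured. The sketch below, under the assumption that such input is expected, adds the `s` modifier and uses `[^>]*` so the attribute match stays inside a single opening tag. The sample input and variable names are illustrative only.

```php
<?php
// Sample input (an assumption): the content of the first div spans two lines.
$html = "<div class=\"some-class\">Hello,\nworld!</div><div class=\"other\">skip</div>";

// [^>]* cannot cross a '>', so the class check is limited to one opening tag.
$single_line = '/<div[^>]*class="some-class"[^>]*>(.*?)<\/div>/';
// The same pattern with the "s" modifier, which lets "." match newlines too.
$multi_line = $single_line . 's';

$count_without = preg_match_all($single_line, $html, $without_s);
$count_with = preg_match_all($multi_line, $html, $with_s);

echo $count_without, PHP_EOL; // number of matches without the modifier
print_r($with_s[1]);          // captured text when "." may match newlines
```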



Notes on HTML regex

You should never use regular expressions to fully parse HTML documents, because regular expressions are not designed for nested markup. Instead, use an HTML or XML document parser, which can also validate the document while parsing it.
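
As one possible illustration of the parser-based approach, the sketch below uses PHP's `DOMDocument` and `DOMXPath` classes to extract the same text as the regex example. It assumes the DOM extension is enabled (it usually is by default); the sample input and the XPath query are chosen for illustration.

```php
<?php
// Sample input (an assumption), reused from the extraction example.
$html = '<html><body>Probably.<div class="some-class">Hello, world!</div><br />Today</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every div whose class attribute is exactly "some-class".
$xpath = new DOMXPath($doc);
$texts = [];
foreach ($xpath->query('//div[@class="some-class"]') as $node) {
    $texts[] = $node->textContent;
}

print_r($texts); // text content of each matching div
```

Unlike the regex version, this approach also copes with nested tags inside the selected element, since `textContent` concatenates the text of all descendants.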
