Constructing hard regular expressions with VerbalExpressions


Many novice (and some seasoned) programmers have difficulty constructing regular expressions. Often one needs to create a Regexp quickly to test a particular piece of code, and not being comfortable with Regexps can be a problem. VerbalExpressions is a PHP library that lets you construct regular expressions using natural-language-like constructs. Think of it as a DSL for building Regexps.

Below is sample PHP code that constructs a regular expression to test whether a given string is a valid URL.

<?php
 
include_once('VerbalExpressions.php');
 
$regex = new VerEx;
 
$regex  ->startOfLine()
        ->then("http")
        ->maybe("s")
        ->then("://")
        ->maybe("www.")
        ->anythingBut(" ")
        ->endOfLine();
 
 
if($regex->test("http://www.codediesel.com"))
    echo "valid url";
else
    echo "invalid url";
 
?>

The main part of the code is the DSL-like interface that helps you build a Regexp.

$regex  ->startOfLine()
        ->then("http")
        ->maybe("s")
        ->then("://")
        ->maybe("www.")
        ->anythingBut(" ")
        ->endOfLine();
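
To give a feel for how such a chain can work, here is a minimal sketch of a fluent builder. The class MiniVerEx is hypothetical and written only for this illustration; it is not the library's implementation, and it assumes that each method simply appends a quoted fragment and returns $this so that calls can be chained.

```php
<?php
// Hypothetical illustration only: MiniVerEx is NOT the VerbalExpressions code.
class MiniVerEx
{
    private $prefix = '';
    private $source = '';
    private $suffix = '';

    // Anchor the pattern at the start of a line.
    public function startOfLine()
    {
        $this->prefix = '^';
        return $this; // returning $this is what makes chaining possible
    }

    // Require a literal string.
    public function then($value)
    {
        $this->source .= '(' . preg_quote($value, '/') . ')';
        return $this;
    }

    // Allow an optional literal string.
    public function maybe($value)
    {
        $this->source .= '(' . preg_quote($value, '/') . ')?';
        return $this;
    }

    // Match any run of characters not in $chars (possibly empty).
    public function anythingBut($chars)
    {
        $this->source .= '([^' . preg_quote($chars, '/') . ']*)';
        return $this;
    }

    // Anchor the pattern at the end of a line.
    public function endOfLine()
    {
        $this->suffix = '$';
        return $this;
    }

    // Assemble the final pattern with delimiters and the multiline flag.
    public function getRegex()
    {
        return '/' . $this->prefix . $this->source . $this->suffix . '/m';
    }
}

$mini = new MiniVerEx();
$pattern = $mini->startOfLine()
                ->then('http')
                ->maybe('s')
                ->then('://')
                ->maybe('www.')
                ->anythingBut(' ')
                ->endOfLine()
                ->getRegex();

echo preg_match($pattern, 'http://www.codediesel.com') ? 'valid url' : 'invalid url';
```

The real library offers many more modifiers; the sketch only covers the five calls used in this post.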

If you want to see what regular expression the code has built, you can use the getRegex function of the class.

<?php
 
include_once('VerbalExpressions.php');
 
$regex = new VerEx;
 
$regex  ->startOfLine()
        ->then("http")
        ->maybe("s")
        ->then("://")
        ->maybe("www.")
        ->anythingBut(" ")
        ->endOfLine();
 
echo $regex->getRegex();

This will print the Regexp given below.

/^(http)(s)?(\:\/\/)(www\.)?([^ ]*)$/m
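
Each chained call ends up as one parenthesized group in this pattern. As a rough illustration, the sketch below uses the third argument of preg_match to show which part of a URL each group captures; the sample URL with "https" is chosen here only so that every optional group is filled.

```php
<?php
$pattern = '/^(http)(s)?(\:\/\/)(www\.)?([^ ]*)$/m';

preg_match($pattern, 'https://www.codediesel.com', $matches);

// $matches[0] is the whole match; the remaining entries follow the chain:
//   [1] then("http")      -> "http"
//   [2] maybe("s")        -> "s"
//   [3] then("://")       -> "://"
//   [4] maybe("www.")     -> "www."
//   [5] anythingBut(" ")  -> "codediesel.com"
print_r($matches);
```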

We can now use the above Regexp in our code to accomplish the same thing as above.

$myRegexp = '/^(http)(s)?(\:\/\/)(www\.)?([^ ]*)$/m';
 
if (preg_match($myRegexp, 'http://www.codediesel.com')) {
    echo 'valid url';
} else {
    echo 'invalid url';
}
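
It is worth knowing how permissive this particular pattern is before relying on it for validation. The sketch below runs it against a few inputs picked for this illustration; the observations concern the generated pattern itself and are not a statement about what the library intends.

```php
<?php
$pattern = '/^(http)(s)?(\:\/\/)(www\.)?([^ ]*)$/m';

$samples = array(
    'ordinary URL'        => 'http://www.codediesel.com',
    'scheme only'         => 'http://',
    'URL on second line'  => "some text\nhttp://example.com",
    'space inside URL'    => 'http://example.com/a b',
);

$results = array();
foreach ($samples as $label => $input) {
    // preg_match returns 1 on a match and 0 otherwise.
    $results[$label] = preg_match($pattern, $input);
    echo $label . ': ' . ($results[$label] ? 'match' : 'no match') . "\n";
}
```

The "scheme only" input matches because anythingBut(" ") accepts an empty run, and the "URL on second line" input matches because the m modifier lets ^ and $ anchor at line boundaries. Only the input containing a space is rejected.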

We can also use the $regex object given in the above example directly in our code where the particular Regexp is required.

include_once('VerbalExpressions.php');
 
$regex = new VerEx;
 
$regex  ->startOfLine()
        ->then("http")
        ->maybe("s")
        ->then("://")
        ->maybe("www.")
        ->anythingBut(" ")
        ->endOfLine();
 
if (preg_match($regex, 'http://www.codediesel.com')) {
    echo 'valid url';
} else {
    echo 'invalid url';
}
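
preg_match expects its pattern as a string, so passing an object works only when PHP can convert that object to a string. The usual mechanism for this is the __toString magic method; that VerEx relies on it is an assumption here, not something checked against the library source. The sketch below uses a hypothetical PatternHolder class to show the mechanism in isolation.

```php
<?php
// Hypothetical stand-in for a regex-builder object; not part of VerbalExpressions.
class PatternHolder
{
    private $pattern;

    public function __construct($pattern)
    {
        $this->pattern = $pattern;
    }

    // PHP calls this whenever the object is used where a string is expected.
    public function __toString()
    {
        return $this->pattern;
    }
}

$holder = new PatternHolder('/^(http)(s)?(\:\/\/)(www\.)?([^ ]*)$/m');

// The object is converted to its pattern string before matching.
$matched = preg_match($holder, 'http://www.codediesel.com');

echo $matched === 1 ? 'valid url' : 'invalid url';
```

Note that this implicit conversion depends on PHP's default coercive typing; a file that declares strict_types=1 would need an explicit (string) cast.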

The PHP version of VerbalExpressions is a port of the original JavaScript version, JSVerbalExpressions. Additional modifiers that will help you build regular expressions can be found here.

This site is a digital habitat of Sameer Borate, a freelance web developer working in PHP, MySQL and WordPress. I also provide web scraping services, website design and development, and integration of various open source APIs. Contact me at metapix[at]gmail.com for any new project requirements and price quotes.

3 Responses

1

Steven Wade

August 14th, 2013 at 10:14 am

Great article! VerbalExpressions is a great library and helps me out a lot.

I built a little GUI that uses the JS version of the library to help build regular expressions.

http://buildregex.com

2

A PHP regular expression syntax closer to natural language

August 15th, 2013 at 7:10 am

[...] Article excerpt: constructing-hard-regular-expressions-with-verbalexpressions [...]

3

PHP Digest №1 ( 11.08.2013 — 25.08.2013 ) | FASIGN Blog

August 28th, 2013 at 10:40 am

[...] Building regular expressions with the help of VerbalExpressions: VerbalExpressions is a PHP library that lets you put together regular expressions in plain language without being familiar with regex, e.g. to match a URL: $regex->startOfLine()->then("http")->maybe("s")->then("://")->maybe("www.") ->anythingBut(" ")->endOfLine(); [...]
