Do You Know What Your Regular Expressions Are Doing?

By Mark Joseph - September 6, 2010 @ 8:53 am

It can be difficult to derive the correct regular expression for the problem you are trying to solve. Many internet sites (e.g., Regular Expression Tester) and PC applications help you construct what seems to be the right expression. But how do you know that the expression is working properly in a production system? Sure, a program can log each input string that a regex matched or failed on, but wouldn’t it be better to see exactly what the regular expression was doing, step by step, against each input string? For this reason, we have added a regex-oriented tracing feature to P6R’s RGX™ 1.0 Regular Expression Engine.
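For comparison, the coarse pass/fail logging mentioned above might look like the following minimal sketch. It uses the standard C++ std::regex library rather than the RGX engine, and the helper names describeOutcome and datePattern are hypothetical, introduced only for illustration. The pattern is the date expression used later in this article.

```cpp
#include <regex>
#include <string>

// The date pattern from this article, compiled with std::regex
// (ECMAScript syntax). This is a stand-in, not the RGX engine.
const std::regex& datePattern()
{
    static const std::regex re(
        "[a-zA-Z09]+,?[ ]+([0-9][0-9])[ ]+([a-zA-z]+)[ ]+"
        "([0-9][0-9][0-9][0-9])[ ]+([0-9][0-9]):([0-9][0-9]):([0-9][0-9])[ ]+"
        "([a-zA-z0-9+-]+)");
    return re;
}

// Builds the one-line log entry a program could write for each input.
// It records only whether the pattern matched somewhere in the input,
// not which part of the pattern consumed which character.
std::string describeOutcome(const std::regex& re, const std::string& input)
{
    const bool matched = std::regex_search(input, re);
    return std::string(matched ? "MATCH: '" : "NO MATCH: '") + input + "'";
}
```

Calling describeOutcome(datePattern(), "not a date") yields only the entry NO MATCH: 'not a date'. That says the input was rejected, but nothing about where in the pattern the match attempt stopped, which is the gap a step-by-step trace is meant to fill.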


Using the following code snippet:

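p6IWRegex* pRegex = NULL;
P6UINT32 offset = 0;
P6UINT32 strLength = 0;
P6ERR err = eOk;

// * * * *
// (Component creation is elided here; pRegex must point to a valid
//  p6IWRegex instance before any of the calls below are made.)

// Enable basic tracing for the calls that follow.
pRegex->setTrace( P6WREGEX_TRACE_BASIC );

// The pattern is a single string literal and must stay on one source line.
err = pRegex->compile( P6TEXT( "[a-zA-Z09]+,?[ ]+([0-9][0-9])[ ]+([a-zA-z]+)[ ]+([0-9][0-9][0-9][0-9])[ ]+([0-9][0-9]):([0-9][0-9]):([0-9][0-9])[ ]+([a-zA-z0-9+-]+)" ), P6MOD_NULL );

// Search only if the pattern compiled; offset and strLength are output parameters.
if (err == eOk) {
    pRegex->search( P6TEXT("Wed, 02 Oct 2002 13:00:00 GMT"), P6MOD_NULL, &offset, &strLength );
}

// Disable tracing again once the traced calls are done.
pRegex->setTrace( P6WREGEX_TRACE_OFF );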


we get the following trace written into a log for the successful match:

09/06/2010-15:26:05 WRegex regex: '[a-zA-Z09]+,?[ ]+([0-9][0-9])[ ]+([a-zA-z]+)[ ]+([0-9][0-9][0-9][0-9])[ ]+([0-9][0-9]):([0-9][0-9]):([0-9][0-9])[ ]+([a-zA-z0-9+-]+)' flavor: Perl

09/06/2010-15:26:05 WRegex function search in input string: 'Wed, 02 Oct 2002 13:00:00 GMT', using engine: Perl backtracking

09/06/2010-15:26:05 WRegex found '[a-zA-Z09]' at offset 0
09/06/2010-15:26:05 WRegex found '[a-zA-Z09]' at offset 1
09/06/2010-15:26:05 WRegex found '[a-zA-Z09]' at offset 2
09/06/2010-15:26:05 WRegex found ',' at offset 3
09/06/2010-15:26:05 WRegex found '[ ]' at offset 4
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 5
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 6
09/06/2010-15:26:05 WRegex found '[ ]' at offset 7
09/06/2010-15:26:05 WRegex found '[a-zA-z]' at offset 8
09/06/2010-15:26:05 WRegex found '[a-zA-z]' at offset 9
09/06/2010-15:26:05 WRegex found '[a-zA-z]' at offset 10
09/06/2010-15:26:05 WRegex found '[ ]' at offset 11
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 12
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 13
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 14
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 15
09/06/2010-15:26:05 WRegex found '[ ]' at offset 16
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 17
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 18
09/06/2010-15:26:05 WRegex found ':' at offset 19
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 20
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 21
09/06/2010-15:26:05 WRegex found ':' at offset 22
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 23
09/06/2010-15:26:05 WRegex found '[0-9]' at offset 24
09/06/2010-15:26:05 WRegex found '[ ]' at offset 25
09/06/2010-15:26:05 WRegex found '[a-zA-z0-9+-]' at offset 26
09/06/2010-15:26:05 WRegex found '[a-zA-z0-9+-]' at offset 27
09/06/2010-15:26:05 WRegex found '[a-zA-z0-9+-]' at offset 28


This tracing feature can be turned on when the regular expression component is created, and it can also be turned on or off through the component method p6IWRegex::setTrace(). Tracing can therefore be controlled programmatically, based on whatever conditions the developer or QA engineer chooses. This dynamic tracing is useful for automated test tools: it helps detect a bug and gives the developer detailed information to aid in its repair.
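
One way to tie tracing to a runtime condition is a small scope guard that switches tracing on for one block of code and off again when the block exits. The sketch below is illustrative only: TraceableRegex, TraceLevel, and ScopedTrace are hypothetical stand-ins that mimic the shape of a setTrace() call, not the actual P6R headers or the p6IWRegex interface.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; the real RGX constants and
// interface are defined by the P6R headers and are not reproduced here.
enum TraceLevel : std::uint32_t { TRACE_OFF = 0, TRACE_BASIC = 1 };

struct TraceableRegex {
    TraceLevel level = TRACE_OFF;
    void setTrace(TraceLevel newLevel) { level = newLevel; }
};

// Enables basic tracing for the lifetime of the guard, but only when
// `enabled` is true, and switches tracing off again on scope exit.
class ScopedTrace {
public:
    ScopedTrace(TraceableRegex& rx, bool enabled)
        : rx_(rx), enabled_(enabled)
    {
        if (enabled_) rx_.setTrace(TRACE_BASIC);
    }

    ~ScopedTrace()
    {
        if (enabled_) rx_.setTrace(TRACE_OFF);
    }

    // Non-copyable: exactly one guard owns each on/off pair.
    ScopedTrace(const ScopedTrace&) = delete;
    ScopedTrace& operator=(const ScopedTrace&) = delete;

private:
    TraceableRegex& rx_;
    bool enabled_;
};
```

A test harness could construct the guard with a condition such as "the previous run of this test case failed", so that only the suspicious inputs produce a detailed trace and tracing is switched off again even if the traced block returns early.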

All P6R products are designed with the developer and QA engineer in mind. Our goal is to make development of complex software systems easier. The inclusion of a detailed, regex-oriented tracing feature in our regular expression engine helps to achieve this goal.

"Do You Know What Your Regular Expressions Are Doing?" was published on September 6th, 2010 and is listed in Unique Product Features.
