01.02.01 Definition of Regular Expressions

What does „Regular Expression“ mean?

Regex are not only used in TB! You can find them in quite a lot of UNIX tools (e.g. grep), in programming languages like PERL (Practical Extraction and Report Language, sometimes called ‚Pathologically Eclectic Rubbish Lister‘ <bg>), PHP and JavaScript, and even in various editors like UltraEdit or jEdit.

Laura Lemay wrote in her book „PERL in 21 days“ that the term „Regular Expression“ makes no sense at first sight (to be honest: even at second sight it still makes no sense to me), because these are not real expressions and, furthermore, no one can really explain why they are „regular“! Well, let’s ignore this; let’s simply accept that the term „Regular Expression“ has its origin in formal language theory and that regular expressions are indeed part of mathematics.

The easiest and most convenient way to define „Regular Expression“ is to say: „They are search patterns to match characters in strings.“
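
To make that definition a bit more tangible, here is a minimal sketch in Python (not TB syntax; the sample sentence and the pattern are made up for illustration only). The pattern describes "some characters, an @ sign, some more characters" and is used to find the matching piece of a string.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Please send the report to anna@example.com by Friday."

# Pattern: one or more characters that are neither whitespace nor "@",
# then a literal "@", then again one or more such characters.
pattern = re.compile(r"[^\s@]+@[^\s@]+")

match = pattern.search(text)
found = match.group(0) if match else None
print(found)  # anna@example.com
```

The pattern is deliberately crude and is not meant as a real check for valid e-mail addresses; it only shows the idea of "a search pattern that matches characters in a string".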

Those of you who have tried to find files using the DOS command line or the search function in the Explorer may have used patterns like:

dir *.doc
copy *.??t c:\temp

These examples show patterns that consist of letters, asterisks, question marks and other characters to define which files should be listed or copied. In the first example only files that have the suffix „doc“ are listed. In the second example only files that have a three-character suffix with a „t“ as its last character are copied.

But these patterns are merely wildcards! They are in no way as mighty as „Regular Expressions“. One can’t compare them to real regex, which offer much more than placeholders for characters.
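
The difference can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the file names are invented, and Python's `fnmatch` module stands in for the DOS wildcard matching, which it resembles but does not reproduce in every detail. The last pattern is a regular expression that states something the two wildcards cannot express in one pattern: "a name made of letters only, followed by either doc or txt".

```python
import fnmatch
import re

# Hypothetical file names, invented for this example.
files = ["letter.doc", "draft2.doc", "notes.txt",
         "setup.bat", "photo.jpeg", "report.DOC"]

# Wildcards: "*" stands for any run of characters, "?" for exactly one.
# Names are lowercased first because DOS ignores case in file names.
doc_files = [f for f in files if fnmatch.fnmatchcase(f.lower(), "*.doc")]
t_files = [f for f in files if fnmatch.fnmatchcase(f.lower(), "*.??t")]

# Regex: letters only in the name, then ".doc" or ".txt", ignoring case.
name_rule = re.compile(r"[a-z]+\.(doc|txt)", re.IGNORECASE)
regex_hits = [f for f in files if name_rule.fullmatch(f)]

print(doc_files)   # ['letter.doc', 'draft2.doc', 'report.DOC']
print(t_files)     # ['notes.txt', 'setup.bat']
print(regex_hits)  # ['letter.doc', 'notes.txt', 'report.DOC']
```

Note how the regex rejects „draft2.doc“ because of the digit in its name, and accepts two different suffixes at once. Character classes like `[a-z]` and alternatives like `(doc|txt)` are exactly the kind of thing plain wildcards do not offer.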
