

8. Regular expressions

Exim uses the PCRE regular expression library, which provides regular expression matching that is compatible with Perl 5. The syntax and semantics of these regular expressions are discussed in many Perl reference books, and also in Jeffrey Friedl's Mastering Regular Expressions (O'Reilly, ISBN 1-56592-257-3).

The documentation for PCRE, in plain text and HTML, is included in the doc directory of the Exim distribution. This describes the features of the regular expressions that PCRE supports, so no further description is included here. The PCRE functions are called from Exim using the default option settings, except that the PCRE_CASELESS option is set when the matching is required to be independent of the case of letters.
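
The effect of a caseless setting can be illustrated with a short sketch. It uses Python's re module as a stand-in, on the assumption that its re.IGNORECASE flag behaves like PCRE_CASELESS for a simple ASCII pattern; it is not Exim code and does not call PCRE. The pattern and the function name matches are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern for illustration only; it does not come from Exim.
PATTERN = r"^postmaster@example\.com$"


def matches(subject: str, caseless: bool) -> bool:
    """Return True if subject matches PATTERN.

    When caseless is True, re.IGNORECASE is set, which is assumed here to
    play the role that PCRE_CASELESS plays for PCRE.
    """
    flags = re.IGNORECASE if caseless else 0
    return re.search(PATTERN, subject, flags) is not None


if __name__ == "__main__":
    subject = "Postmaster@Example.COM"
    print("default :", matches(subject, caseless=False))
    print("caseless:", matches(subject, caseless=True))
```

With the default settings the mixed-case subject is rejected because the pattern is written in lower case; with the caseless flag set the same subject is accepted.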

8.1 Testing regular expressions

A program called pcretest forms part of the PCRE distribution and is built with PCRE during the process of building Exim. It is primarily intended for testing PCRE itself, but it can also be used for experimenting with regular expressions. The binary can be found in the util sub-directory of the Exim build directory. There is documentation of various options in doc/pcretest.txt, but for simple testing, none are needed. This is the output of a sample run of pcretest:

  re> /^([^@]+)@.+\.(ac|edu)\.(?!kr)[a-z]{2}$/
data> x@y.ac.uk
 0: x@y.ac.uk
 1: x
 2: ac
data> x@y.ac.kr
No match
data> x@y.edu.com
No match
data> x@y.edu.co
 0: x@y.edu.co
 1: x
 2: edu

After the `re>' prompt, a regular expression enclosed in delimiters is expected. If this compiles without error, `data>' prompts are given for strings against which the expression is matched. An empty data line causes a new regular expression to be read. If the match is successful, the captured substring values (that is, what would be in the variables $0, $1, $2, etc.) are shown, with $0 being the whole of the matched string. The above example tests for an email address whose domain ends with either `ac' or `edu' followed by a top-level domain of two lower-case letters that is not `kr'. The local part is captured in $1 and the `ac' or `edu' in $2.

