The perltest program
--------------------

The perltest.pl script tests Perl's regular expressions; it has the same
specification as pcretest, and so can be given identical input, except that
input patterns can be followed only by Perl's lower case modifiers and certain
other pcretest modifiers that are either handled or ignored:

  /+   recognized and handled by perltest
  /++  the second + is ignored
  /8   recognized and handled by perltest
  /J   ignored
  /K   ignored
  /W   ignored
  /S   ignored
  /SS  ignored
  /Y   ignored

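As an illustration of the list above, the following Python sketch sorts a
modifier string into the groups that the list describes. It is not taken from
perltest.pl: the function name classify_modifiers, the group names, and the
treatment of every lower case ASCII letter as a candidate Perl modifier are
assumptions made for this example.

```python
# Hypothetical helper, not part of perltest.pl.
IGNORED = set("JKWSY")  # upper case pcretest modifiers listed as ignored


def classify_modifiers(mods: str) -> dict:
    """Group the characters that follow a pattern's closing delimiter."""
    result = {"perl": [], "handled": [], "ignored": [], "unsupported": []}
    plus_count = 0
    for ch in mods:
        if ch == "+":
            plus_count += 1
            if plus_count == 1:
                result["handled"].append(ch)      # /+ is handled
            elif plus_count == 2:
                result["ignored"].append(ch)      # second + is ignored
            else:
                result["unsupported"].append(ch)  # not covered by the list
        elif ch == "8":
            result["handled"].append(ch)          # /8 is handled
        elif ch in IGNORED:
            result["ignored"].append(ch)          # covers /S and /SS too
        elif ch.isascii() and ch.islower():
            # Assumption: lower case letters are handed to Perl unchecked.
            result["perl"].append(ch)
        else:
            result["unsupported"].append(ch)
    return result
```

For example, classify_modifiers("i8++SS") places "i" in the Perl group, "8" and
the first "+" in the handled group, and the second "+" and both "S" characters
in the ignored group, while a modifier such as "A" ends up as unsupported.
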
The pcretest \Y escape in data lines is removed before matching. The data lines
are processed as Perl double-quoted strings, so if they contain ", $, or @
characters, these have to be escaped. For this reason, all such characters in
the Perl-compatible testinput1 file are escaped so that they can be used for
perltest as well as for pcretest. The special upper case pattern modifiers such
as /A that pcretest recognizes, and its special data line escapes, are not used
in the Perl-compatible test file. The output should be identical, apart from
the initial identifying banner.

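The escaping requirement can be sketched in Python as follows. This is only an
illustration for preparing new data lines by hand; it is not how perltest.pl or
the test files are produced. The function name escape_data_line is hypothetical,
and the sketch assumes that any existing backslash escape in the line should be
left untouched.

```python
# Hypothetical helper, not part of the PCRE distribution.
def escape_data_line(line: str) -> str:
    """Backslash-escape ", $ and @ so that Perl does not interpolate them."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Assumption: keep an existing escape pair exactly as written.
            out.append(line[i:i + 2])
            i += 2
            continue
        if ch in '"$@':
            out.append("\\" + ch)
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

With this sketch, a data line such as user@example becomes user\@example, while
a line that already contains \$ is returned unchanged.
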
The perltest.pl script can also test UTF-8 features. It recognizes the special
modifier /8 that pcretest uses to invoke UTF-8 functionality. The testinput4
and testinput6 files can be fed to perltest to run compatible UTF-8 tests.
However, it is necessary to add "use utf8; require Encode" to the script to
make this work correctly. I have not managed to find a way to handle this
automatically.

The other testinput files are not suitable for feeding to perltest.pl, since
they make use of the special upper case modifiers and escapes that pcretest
uses to test certain features of PCRE. Some of these files also contain
malformed regular expressions, in order to check that PCRE diagnoses them
correctly.

Philip Hazel
January 2012