| 71 |
-S size On Unix-like systems, set the size of the run-time stack to |
-S size On Unix-like systems, set the size of the run-time stack to |
| 72 |
size megabytes. |
size megabytes. |
| 73 |
|
|
| 74 |
-s Behave as if each pattern has the /S modifier; in other |
-s or -s+ Behave as if each pattern has the /S modifier; in other |
| 75 |
words, force each pattern to be studied. If the /I or /D |
words, force each pattern to be studied. If -s+ is used, the |
| 76 |
option is present on a pattern (requesting output about the |
PCRE_STUDY_JIT_COMPILE flag is passed to pcre_study(), caus- |
| 77 |
compiled pattern), information about the result of studying |
ing just-in-time optimization to be set up if it is avail- |
| 78 |
is not included when studying is caused only by -s and nei- |
able. If the /I or /D option is present on a pattern |
| 79 |
ther -i nor -d is present on the command line. This behaviour |
(requesting output about the compiled pattern), information |
| 80 |
means that the output from tests that are run with and with- |
about the result of studying is not included when studying is |
| 81 |
out -s should be identical, except when options that output |
caused only by -s and neither -i nor -d is present on the |
| 82 |
information about the actual running of a match are set. The |
command line. This behaviour means that the output from tests |
| 83 |
-M, -t, and -tm options, which give information about |
that are run with and without -s should be identical, except |
| 84 |
resources used, are likely to produce different output with |
when options that output information about the actual running |
| 85 |
and without -s. Output may also differ if the /C option is |
of a match are set. The -M, -t, and -tm options, which give |
| 86 |
present on an individual pattern. This uses callouts to trace |
information about resources used, are likely to produce dif- |
| 87 |
the matching process, and this may be different between |
ferent output with and without -s. Output may also differ if |
| 88 |
studied and non-studied patterns. If the pattern contains |
the /C option is present on an individual pattern. This uses |
| 89 |
(*MARK) items there may also be differences, for the same |
callouts to trace the matching process, and this may be |
| 90 |
reason. The -s command line option can be overridden for spe- |
different between studied and non-studied patterns. If the |
| 91 |
cific patterns that should never be studied (see the /S |
pattern contains (*MARK) items there may also be differences, |
| 92 |
option below). |
for the same reason. The -s command line option can be over- |
| 93 |
|
ridden for specific patterns that should never be studied |
| 94 |
|
(see the /S pattern modifier below). |
| 95 |
|
|
| 96 |
-t Run each compile, study, and match many times with a timer, |
-t Run each compile, study, and match many times with a timer, |
| 97 |
and output resulting time per compile or match (in millisec- |
and output resulting time per compile or match (in millisec- |
| 247 |
subject contains multiple copies of the same substring. If the + modi- |
subject contains multiple copies of the same substring. If the + modi- |
| 248 |
fier appears twice, the same action is taken for captured substrings. |
fier appears twice, the same action is taken for captured substrings. |
| 249 |
In each case the remainder is output on the following line with a plus |
In each case the remainder is output on the following line with a plus |
| 250 |
character following the capture number. |
character following the capture number. Note that this modifier must |
| 251 |
|
not immediately follow the /S modifier because /S+ has another meaning. |
| 252 |
|
|
| 253 |
The /= modifier requests that the values of all potential captured |
The /= modifier requests that the values of all potential captured |
| 254 |
parentheses be output after a match by pcre_exec(). By default, only |
parentheses be output after a match by pcre_exec(). By default, only |
| 255 |
those up to the highest one actually used in the match are output (cor- |
those up to the highest one actually used in the match are output (cor- |
| 256 |
responding to the return code from pcre_exec()). Values in the offsets |
responding to the return code from pcre_exec()). Values in the offsets |
| 257 |
vector corresponding to higher numbers should be set to -1, and these |
vector corresponding to higher numbers should be set to -1, and these |
| 258 |
are output as "<unset>". This modifier gives a way of checking that |
are output as "<unset>". This modifier gives a way of checking that |
| 259 |
this is happening. |
this is happening. |
| 260 |
|
|
| 261 |
The /B modifier is a debugging feature. It requests that pcretest out- |
The /B modifier is a debugging feature. It requests that pcretest out- |
| 262 |
put a representation of the compiled byte code after compilation. Nor- |
put a representation of the compiled byte code after compilation. Nor- |
| 263 |
mally this information contains length and offset values; however, if |
mally this information contains length and offset values; however, if |
| 264 |
/Z is also present, this data is replaced by spaces. This is a special |
/Z is also present, this data is replaced by spaces. This is a special |
| 265 |
feature for use in the automatic test scripts; it ensures that the same |
feature for use in the automatic test scripts; it ensures that the same |
| 266 |
output is generated for different internal link sizes. |
output is generated for different internal link sizes. |
| 267 |
|
|
| 268 |
The /D modifier is a PCRE debugging feature, and is equivalent to /BI, |
The /D modifier is a PCRE debugging feature, and is equivalent to /BI, |
| 269 |
that is, both the /B and the /I modifiers. |
that is, both the /B and the /I modifiers. |
| 270 |
|
|
| 271 |
The /F modifier causes pcretest to flip the byte order of the fields in |
The /F modifier causes pcretest to flip the byte order of the fields in |
| 272 |
the compiled pattern that contain 2-byte and 4-byte numbers. This |
the compiled pattern that contain 2-byte and 4-byte numbers. This |
| 273 |
facility is for testing the feature in PCRE that allows it to execute |
facility is for testing the feature in PCRE that allows it to execute |
| 274 |
patterns that were compiled on a host with a different endianness. This |
patterns that were compiled on a host with a different endianness. This |
| 275 |
feature is not available when the POSIX interface to PCRE is being |
feature is not available when the POSIX interface to PCRE is being |
| 276 |
used, that is, when the /P pattern modifier is specified. See also the |
used, that is, when the /P pattern modifier is specified. See also the |
| 277 |
section about saving and reloading compiled patterns below. |
section about saving and reloading compiled patterns below. |
| 278 |
|
|
| 279 |
The /I modifier requests that pcretest output information about the |
The /I modifier requests that pcretest output information about the |
| 280 |
compiled pattern (whether it is anchored, has a fixed first character, |
compiled pattern (whether it is anchored, has a fixed first character, |
| 281 |
and so on). It does this by calling pcre_fullinfo() after compiling a |
and so on). It does this by calling pcre_fullinfo() after compiling a |
| 282 |
pattern. If the pattern is studied, the results of that are also out- |
pattern. If the pattern is studied, the results of that are also out- |
| 283 |
put. |
put. |
| 284 |
|
|
| 285 |
The /K modifier requests pcretest to show names from backtracking con- |
The /K modifier requests pcretest to show names from backtracking con- |
| 286 |
trol verbs that are returned from calls to pcre_exec(). It causes |
trol verbs that are returned from calls to pcre_exec(). It causes |
| 287 |
pcretest to create a pcre_extra block if one has not already been cre- |
pcretest to create a pcre_extra block if one has not already been cre- |
| 288 |
ated by a call to pcre_study(), and to set the PCRE_EXTRA_MARK flag and |
ated by a call to pcre_study(), and to set the PCRE_EXTRA_MARK flag and |
| 289 |
the mark field within it, every time that pcre_exec() is called. If the |
the mark field within it, every time that pcre_exec() is called. If the |
| 290 |
variable that the mark field points to is non-NULL for a match, non- |
variable that the mark field points to is non-NULL for a match, non- |
| 291 |
match, or partial match, pcretest prints the string to which it points. |
match, or partial match, pcretest prints the string to which it points. |
| 292 |
For a match, this is shown on a line by itself, tagged with "MK:". For |
For a match, this is shown on a line by itself, tagged with "MK:". For |
| 293 |
a non-match it is added to the message. |
a non-match it is added to the message. |
| 294 |
|
|
| 295 |
The /L modifier must be followed directly by the name of a locale, for |
The /L modifier must be followed directly by the name of a locale, for |
| 296 |
example, |
example, |
| 297 |
|
|
| 298 |
/pattern/Lfr_FR |
/pattern/Lfr_FR |
| 299 |
|
|
| 300 |
For this reason, it must be the last modifier. The given locale is set, |
For this reason, it must be the last modifier. The given locale is set, |
| 301 |
pcre_maketables() is called to build a set of character tables for the |
pcre_maketables() is called to build a set of character tables for the |
| 302 |
locale, and this is then passed to pcre_compile() when compiling the |
locale, and this is then passed to pcre_compile() when compiling the |
| 303 |
regular expression. Without an /L (or /T) modifier, NULL is passed as |
regular expression. Without an /L (or /T) modifier, NULL is passed as |
| 304 |
the tables pointer; that is, /L applies only to the expression on which |
the tables pointer; that is, /L applies only to the expression on which |
| 305 |
it appears. |
it appears. |
| 306 |
|
|
| 307 |
The /M modifier causes the size of memory block used to hold the com- |
The /M modifier causes the size of memory block used to hold the com- |
| 308 |
piled pattern to be output. |
piled pattern to be output. |
| 309 |
|
|
| 310 |
If the /S modifier appears once, it causes pcre_study() to be called |
If the /S modifier appears once, it causes pcre_study() to be called |
| 311 |
after the expression has been compiled, and the results used when the |
after the expression has been compiled, and the results used when the |
| 312 |
expression is matched. If /S appears twice, it suppresses studying, |
expression is matched. If /S appears twice, it suppresses studying, |
| 313 |
even if it was requested externally by the -s command line option. This |
even if it was requested externally by the -s command line option. This |
| 314 |
makes it possible to specify that certain patterns are always studied, |
makes it possible to specify that certain patterns are always studied, |
| 315 |
and others are never studied, independently of -s. This feature is used |
and others are never studied, independently of -s. This feature is used |
| 316 |
in the test files in a few cases where the output is different when the |
in the test files in a few cases where the output is different when the |
| 317 |
pattern is studied. |
pattern is studied. |
| 318 |
|
|
| 319 |
|
If the /S modifier is immediately followed by a + character, the call |
| 320 |
|
to pcre_study() is made with the PCRE_STUDY_JIT_COMPILE option, |
| 321 |
|
requesting just-in-time optimization support if it is available. Note |
| 322 |
|
that there is also a /+ modifier; it must not be given immediately |
| 323 |
|
after /S because this will be misinterpreted. If JIT studying is suc- |
| 324 |
|
cessful, it will automatically be used when pcre_exec() is run, except |
| 325 |
|
when incompatible run-time options are specified. These include the |
| 326 |
|
partial matching options; a complete list is given in the pcrejit docu- |
| 327 |
|
mentation. See also the \J escape sequence below for a way of setting |
| 328 |
|
the size of the JIT stack. |
| 329 |
|
|
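The interaction between the -s and -s+ command line options and the /S, /SS, and /S+ pattern modifiers described above can be summarised as a small decision function. This is an illustrative sketch in Python, not pcretest's own code: the name study_mode and its parameters are hypothetical, and the treatment of a plain /S combined with -s+ is an assumption, since the text does not spell that case out.

```python
def study_mode(s_count, s_plus=False, cmd_s=False, cmd_s_plus=False):
    """Return "none", "study", or "study+jit" for one pattern.

    s_count    -- how many times /S appears on the pattern (0, 1 or 2)
    s_plus     -- True if the pattern uses /S+ (implies s_count == 1)
    cmd_s      -- True if -s was given on the command line
    cmd_s_plus -- True if -s+ was given on the command line
    """
    if s_count >= 2:
        # /S given twice suppresses studying, even under -s or -s+.
        return "none"
    studied = s_count == 1 or cmd_s or cmd_s_plus
    if not studied:
        return "none"
    # Assumption: a JIT request from either the pattern or the
    # command line is honoured.
    jit = s_plus or cmd_s_plus
    return "study+jit" if jit else "study"
```

For example, study_mode(2, cmd_s=True) models a pattern written with /SS in a run that uses -s, which the text says is never studied.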
| 330 |
The /T modifier must be followed by a single digit. It causes a spe- |
The /T modifier must be followed by a single digit. It causes a spe- |
| 331 |
cific set of built-in character tables to be passed to pcre_compile(). |
cific set of built-in character tables to be passed to pcre_compile(). |
| 332 |
It is used in the standard PCRE tests to check behaviour with different |
It is used in the standard PCRE tests to check behaviour with different |
| 406 |
\Gname call pcre_get_named_substring() for substring |
\Gname call pcre_get_named_substring() for substring |
| 407 |
"name" after a successful match (name termin- |
"name" after a successful match (name termin- |
| 408 |
ated by next non-alphanumeric character) |
ated by next non-alphanumeric character) |
| 409 |
|
\Jdd set up a JIT stack of dd kilobytes maximum (any |
| 410 |
|
number of digits) |
| 411 |
\L call pcre_get_substringlist() after a |
\L call pcre_get_substringlist() after a |
| 412 |
successful match |
successful match |
| 413 |
\M discover the minimum MATCH_LIMIT and |
\M discover the minimum MATCH_LIMIT and |
| 460 |
way of passing an empty line as data, since a real empty line termi- |
way of passing an empty line as data, since a real empty line termi- |
| 461 |
nates the data input. |
nates the data input. |
| 462 |
|
|
| 463 |
If \M is present, pcretest calls pcre_exec() several times, with dif- |
The \J escape provides a way of setting the maximum stack size that is |
| 464 |
ferent values in the match_limit and match_limit_recursion fields of |
used by the just-in-time optimization code. It is ignored if JIT opti- |
| 465 |
the pcre_extra data structure, until it finds the minimum numbers for |
mization is not being used. Providing a stack that is larger than the |
| 466 |
each parameter that allow pcre_exec() to complete. The match_limit num- |
default 32K is necessary only for very complicated patterns. |
| 467 |
ber is a measure of the amount of backtracking that takes place, and |
|
| 468 |
checking it out can be instructive. For most simple matches, the number |
If \M is present, pcretest calls pcre_exec() several times, with dif- |
| 469 |
is quite small, but for patterns with very large numbers of matching |
ferent values in the match_limit and match_limit_recursion fields of |
| 470 |
possibilities, it can become large very quickly with increasing length |
the pcre_extra data structure, until it finds the minimum numbers for |
| 471 |
of subject string. The match_limit_recursion number is a measure of how |
each parameter that allow pcre_exec() to complete without error. |
| 472 |
much stack (or, if PCRE is compiled with NO_RECURSE, how much heap) |
Because this is testing a specific feature of the normal interpretive |
| 473 |
memory is needed to complete the match attempt. |
pcre_exec() execution, the use of any JIT optimization that might have |
| 474 |
|
been set up by the /S+ qualifier or -s+ option is disabled. |
| 475 |
|
|
| 476 |
|
The match_limit number is a measure of the amount of backtracking that |
| 477 |
|
takes place, and checking it out can be instructive. For most simple |
| 478 |
|
matches, the number is quite small, but for patterns with very large |
| 479 |
|
numbers of matching possibilities, it can become large very quickly |
| 480 |
|
with increasing length of subject string. The match_limit_recursion |
| 481 |
|
number is a measure of how much stack (or, if PCRE is compiled with |
| 482 |
|
NO_RECURSE, how much heap) memory is needed to complete the match |
| 483 |
|
attempt. |
| 484 |
|
|
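The search that \M performs can be pictured as finding the smallest limit at which a match attempt still completes. The sketch below is in Python and shows one way to do such a search (a binary search); it is not necessarily how pcretest itself proceeds. The predicate completes is hypothetical and stands in for a call to pcre_exec() with the given match_limit. The sketch also assumes that a match which completes at some limit completes at every larger limit.

```python
def find_min_limit(completes, upper=10_000_000):
    """Smallest limit in 1..upper for which completes(limit) is true.

    Returns None if the attempt does not complete even at `upper`.
    Assumes completes() is monotonic: once true, true for larger limits.
    """
    if not completes(upper):
        return None
    lo, hi = 1, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if completes(mid):
            hi = mid          # mid is enough; look for something smaller
        else:
            lo = mid + 1      # mid is too small
    return lo
```

The same search would be run once for match_limit and once for match_limit_recursion, since the text treats them as separate parameters.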
| 485 |
When \O is used, the value specified may be higher or lower than the |
When \O is used, the value specified may be higher or lower than the |
| 486 |
size set by the -O command line option (or defaulted to 45); \O applies |
size set by the -O command line option (or defaulted to 45); \O applies |
| 746 |
/pattern/im >/some/file |
/pattern/im >/some/file |
| 747 |
|
|
| 748 |
See the pcreprecompile documentation for a discussion about saving and |
See the pcreprecompile documentation for a discussion about saving and |
| 749 |
re-using compiled patterns. |
re-using compiled patterns. Note that if the pattern was successfully |
| 750 |
|
studied with JIT optimization, the JIT data cannot be saved. |
| 751 |
|
|
| 752 |
The data that is written is binary. The first eight bytes are the |
The data that is written is binary. The first eight bytes are the |
| 753 |
length of the compiled pattern data followed by the length of the |
length of the compiled pattern data followed by the length of the |
| 754 |
optional study data, each written as four bytes in big-endian order |
optional study data, each written as four bytes in big-endian order |
| 755 |
(most significant byte first). If there is no study data (either the |
(most significant byte first). If there is no study data (either the |
| 756 |
pattern was not studied, or studying did not return any data), the sec- |
pattern was not studied, or studying did not return any data), the sec- |
| 757 |
ond length is zero. The lengths are followed by an exact copy of the |
ond length is zero. The lengths are followed by an exact copy of the |
| 758 |
compiled pattern. If there is additional study data, this follows imme- |
compiled pattern. If there is additional study data, this (excluding |
| 759 |
diately after the compiled pattern. After writing the file, pcretest |
any JIT data) follows immediately after the compiled pattern. After |
| 760 |
expects to read a new pattern. |
writing the file, pcretest expects to read a new pattern. |
| 761 |
|
|
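The file layout described above (two four-byte big-endian lengths, then the compiled pattern, then any study data) can be split apart as follows. This is a minimal sketch in Python; the function name split_saved_pattern is hypothetical, and treating the lengths as unsigned is an assumption. It only separates the two blocks and does not interpret their contents, which are internal PCRE structures.

```python
import struct

def split_saved_pattern(data):
    """Split the bytes of a file written by pcretest into (pattern, study)."""
    if len(data) < 8:
        raise ValueError("file too short for the two length fields")
    # Two 4-byte lengths, most significant byte first (assumed unsigned).
    pattern_len, study_len = struct.unpack(">II", data[:8])
    end_pattern = 8 + pattern_len
    end_study = end_pattern + study_len
    if len(data) < end_study:
        raise ValueError("file shorter than its header claims")
    # study is b"" when the second length is zero (no study data).
    return data[8:end_pattern], data[end_pattern:end_study]
```

A file saved from an unstudied pattern yields an empty second element, matching the "No study data" message that pcretest prints on reload.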
| 762 |
A saved pattern can be reloaded into pcretest by specifying < and a |
A saved pattern can be reloaded into pcretest by specifying < and a |
| 763 |
file name instead of a pattern. The name of the file must not contain a |
file name instead of a pattern. The name of the file must not contain a |
| 764 |
< character, as otherwise pcretest will interpret the line as a pattern |
< character, as otherwise pcretest will interpret the line as a pattern |
| 765 |
delimited by < characters. For example: |
delimited by < characters. For example: |
| 768 |
Compiled pattern loaded from /some/file |
Compiled pattern loaded from /some/file |
| 769 |
No study data |
No study data |
| 770 |
|
|
| 771 |
When the pattern has been loaded, pcretest proceeds to read data lines |
If the pattern was previously studied with the JIT optimization, the |
| 772 |
in the usual way. |
JIT information cannot be saved and restored, and so is lost. When the |
| 773 |
|
pattern has been loaded, pcretest proceeds to read data lines in the |
| 774 |
You can copy a file written by pcretest to a different host and reload |
usual way. |
| 775 |
it there, even if the new host has opposite endianness to the one on |
|
| 776 |
which the pattern was compiled. For example, you can compile on an i86 |
You can copy a file written by pcretest to a different host and reload |
| 777 |
|
it there, even if the new host has opposite endianness to the one on |
| 778 |
|
which the pattern was compiled. For example, you can compile on an i86 |
| 779 |
machine and run on a SPARC machine. |
machine and run on a SPARC machine. |
| 780 |
|
|
| 781 |
File names for saving and reloading can be absolute or relative, but |
File names for saving and reloading can be absolute or relative, but |
| 782 |
note that the shell facility of expanding a file name that starts with |
note that the shell facility of expanding a file name that starts with |
| 783 |
a tilde (~) is not available. |
a tilde (~) is not available. |
| 784 |
|
|
| 785 |
The ability to save and reload files in pcretest is intended for test- |
The ability to save and reload files in pcretest is intended for test- |
| 786 |
ing and experimentation. It is not intended for production use because |
ing and experimentation. It is not intended for production use because |
| 787 |
only a single pattern can be written to a file. Furthermore, there is |
only a single pattern can be written to a file. Furthermore, there is |
| 788 |
no facility for supplying custom character tables for use with a |
no facility for supplying custom character tables for use with a |
| 789 |
reloaded pattern. If the original pattern was compiled with custom |
reloaded pattern. If the original pattern was compiled with custom |
| 790 |
tables, an attempt to match a subject string using a reloaded pattern |
tables, an attempt to match a subject string using a reloaded pattern |
| 791 |
is likely to cause pcretest to crash. Finally, if you attempt to load |
is likely to cause pcretest to crash. Finally, if you attempt to load |
| 792 |
a file that is not in the correct format, the result is undefined. |
a file that is not in the correct format, the result is undefined. |
| 793 |
|
|
| 794 |
|
|
| 795 |
SEE ALSO |
SEE ALSO |
| 796 |
|
|
| 797 |
pcre(3), pcreapi(3), pcrecallout(3), pcrematching(3), pcrepartial(3), |
pcre(3), pcreapi(3), pcrecallout(3), pcrejit(3), pcrematching(3), pcrepar- |
| 798 |
pcrepattern(3), pcreprecompile(3). |
tial(3), pcrepattern(3), pcreprecompile(3). |
| 799 |
|
|
| 800 |
|
|
| 801 |
AUTHOR |
AUTHOR |
| 807 |
|
|
| 808 |
REVISION |
REVISION |
| 809 |
|
|
| 810 |
Last updated: 01 August 2011 |
Last updated: 26 August 2011 |
| 811 |
Copyright (c) 1997-2011 University of Cambridge. |
Copyright (c) 1997-2011 University of Cambridge. |