| 137 |
.SH "PCRE API OVERVIEW" |
.SH "PCRE API OVERVIEW" |
| 138 |
.rs |
.rs |
| 139 |
.sp |
.sp |
| 140 |
PCRE has its own native API, which is described in this document. There is |
PCRE has its own native API, which is described in this document. There are |
| 141 |
also a set of wrapper functions that correspond to the POSIX regular expression |
also some wrapper functions that correspond to the POSIX regular expression |
| 142 |
API. These are described in the |
API. These are described in the |
| 143 |
.\" HREF |
.\" HREF |
| 144 |
\fBpcreposix\fP |
\fBpcreposix\fP |
| 170 |
A second matching function, \fBpcre_dfa_exec()\fP, which is not |
A second matching function, \fBpcre_dfa_exec()\fP, which is not |
| 171 |
Perl-compatible, is also provided. This uses a different algorithm for the |
Perl-compatible, is also provided. This uses a different algorithm for the |
| 172 |
matching. The alternative algorithm finds all possible matches (at a given |
matching. The alternative algorithm finds all possible matches (at a given |
| 173 |
point in the subject). However, this algorithm does not return captured |
point in the subject), and scans the subject just once. However, this algorithm |
| 174 |
substrings. A description of the two matching algorithms and their advantages |
does not return captured substrings. A description of the two matching |
| 175 |
and disadvantages is given in the |
algorithms and their advantages and disadvantages is given in the |
| 176 |
.\" HREF |
.\" HREF |
| 177 |
\fBpcrematching\fP |
\fBpcrematching\fP |
| 178 |
.\" |
.\" |
| 244 |
. |
. |
| 245 |
. |
. |
| 246 |
.SH NEWLINES |
.SH NEWLINES |
| 247 |
PCRE supports three different conventions for indicating line breaks in |
.rs |
|
strings: a single CR character, a single LF character, or the two-character |
|
|
sequence CRLF. All three are used as "standard" by different operating systems. |
|
|
When PCRE is built, a default can be specified. The default default is LF, |
|
|
which is the Unix standard. When PCRE is run, the default can be overridden, |
|
|
either when a pattern is compiled, or when it is matched. |
|
| 248 |
.sp |
.sp |
| 249 |
|
PCRE supports four different conventions for indicating line breaks in |
| 250 |
|
strings: a single CR (carriage return) character, a single LF (linefeed) |
| 251 |
|
character, the two-character sequence CRLF, or any Unicode newline sequence. |
| 252 |
|
The Unicode newline sequences are the three just mentioned, plus the single |
| 253 |
|
characters VT (vertical tab, U+000B), FF (formfeed, U+000C), NEL (next line, |
| 254 |
|
U+0085), LS (line separator, U+2028), and PS (paragraph separator, U+2029). |
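The list of Unicode newline sequences above can be sketched in C as follows. This is only an illustration: the function name any_newline_length and the choice to work on an array of already-decoded code points are assumptions made for the example, and are not part of the PCRE API.

```c
#include <stddef.h>

/* Hypothetical helper, not a PCRE function. Returns the length, in code
   points, of the newline sequence that starts at s[pos] under the "any
   Unicode newline" convention, or 0 if no newline starts there. */
size_t any_newline_length(const unsigned int *s, size_t len, size_t pos)
{
  if (pos >= len) return 0;
  switch (s[pos])
    {
    case 0x000D:                      /* CR, possibly followed by LF */
      return (pos + 1 < len && s[pos + 1] == 0x000A) ? 2 : 1;
    case 0x000A:                      /* LF */
    case 0x000B:                      /* VT */
    case 0x000C:                      /* FF */
    case 0x0085:                      /* NEL */
    case 0x2028:                      /* LS */
    case 0x2029:                      /* PS */
      return 1;
    default:
      return 0;
    }
}
```

CR followed by LF is counted as a single two-character sequence, so a caller that steps through a subject would move past both characters at once.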
| 255 |
|
.P |
| 256 |
|
Each of the first three conventions is used by at least one operating system as |
| 257 |
|
its standard newline sequence. When PCRE is built, a default can be specified. |
| 258 |
|
If none is specified, the default is LF, which is the Unix standard. When PCRE is run, the |
| 259 |
|
default can be overridden, either when a pattern is compiled, or when it is |
| 260 |
|
matched. |
| 261 |
|
.P |
| 262 |
In the PCRE documentation the word "newline" is used to mean "the character or |
In the PCRE documentation the word "newline" is used to mean "the character or |
| 263 |
pair of characters that indicate a line break". |
pair of characters that indicate a line break". The choice of newline |
| 264 |
|
convention affects the handling of the dot, circumflex, and dollar |
| 265 |
|
metacharacters, the handling of #-comments in /x mode, and, when CRLF is a |
| 266 |
|
recognized line ending sequence, the match position advancement for a |
| 267 |
|
non-anchored pattern. The choice of newline convention does not affect the |
| 268 |
|
interpretation of the \en or \er escape sequences. |
| 269 |
. |
. |
| 270 |
. |
. |
| 271 |
.SH MULTITHREADING |
.SH MULTITHREADING |
| 321 |
PCRE_CONFIG_NEWLINE |
PCRE_CONFIG_NEWLINE |
| 322 |
.sp |
.sp |
| 323 |
The output is an integer whose value specifies the default character sequence |
The output is an integer whose value specifies the default character sequence |
| 324 |
that is recognized as meaning "newline". The three values that are supported |
that is recognized as meaning "newline". The four values that are supported |
| 325 |
are: 10 for LF, 13 for CR, and 3338 for CRLF. The default should normally be |
are: 10 for LF, 13 for CR, 3338 for CRLF, and -1 for ANY. The default should |
| 326 |
the standard sequence for your operating system. |
normally be the standard sequence for your operating system. |
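The four values can be turned into readable names with a small helper such as the one below. The function name newline_config_name is hypothetical; in a real program the integer would be obtained by calling \fBpcre_config()\fP with PCRE_CONFIG_NEWLINE, which is not shown here so that the sketch has no dependency on the library.

```c
#include <stddef.h>

/* Hypothetical helper: name the value that PCRE_CONFIG_NEWLINE yields.
   The four values are those listed in the text; anything else gives NULL. */
const char *newline_config_name(int value)
{
  switch (value)
    {
    case 10:   return "LF";
    case 13:   return "CR";
    case 3338: return "CRLF";         /* 13 * 256 + 10 */
    case -1:   return "ANY";
    default:   return NULL;           /* not a documented value */
    }
}
```

The value for CRLF is simply the two character codes packed into one integer, which is why it is 3338.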
| 327 |
.sp |
.sp |
| 328 |
PCRE_CONFIG_LINK_SIZE |
PCRE_CONFIG_LINK_SIZE |
| 329 |
.sp |
.sp |
| 400 |
fully relocatable, because it may contain a copy of the \fItableptr\fP |
fully relocatable, because it may contain a copy of the \fItableptr\fP |
| 401 |
argument, which is an address (see below). |
argument, which is an address (see below). |
| 402 |
.P |
.P |
| 403 |
The \fIoptions\fP argument contains independent bits that affect the |
The \fIoptions\fP argument contains various bit settings that affect the |
| 404 |
compilation. It should be zero if no options are required. The available |
compilation. It should be zero if no options are required. The available |
| 405 |
options are described below. Some of them, in particular, those that are |
options are described below. Some of them, in particular, those that are |
| 406 |
compatible with Perl, can also be set and unset from within the pattern (see |
compatible with Perl, can also be set and unset from within the pattern (see |
| 493 |
including those that indicate newline. Without it, a dot does not match when |
including those that indicate newline. Without it, a dot does not match when |
| 494 |
the current position is at a newline. This option is equivalent to Perl's /s |
the current position is at a newline. This option is equivalent to Perl's /s |
| 495 |
option, and it can be changed within a pattern by a (?s) option setting. A |
option, and it can be changed within a pattern by a (?s) option setting. A |
| 496 |
negative class such as [^a] always matches newlines, independent of the setting |
negative class such as [^a] always matches newline characters, independent of |
| 497 |
of this option. |
the setting of this option. |
| 498 |
.sp |
.sp |
| 499 |
PCRE_DUPNAMES |
PCRE_DUPNAMES |
| 500 |
.sp |
.sp |
| 557 |
PCRE_NEWLINE_CR |
PCRE_NEWLINE_CR |
| 558 |
PCRE_NEWLINE_LF |
PCRE_NEWLINE_LF |
| 559 |
PCRE_NEWLINE_CRLF |
PCRE_NEWLINE_CRLF |
| 560 |
|
PCRE_NEWLINE_ANY |
| 561 |
.sp |
.sp |
| 562 |
These options override the default newline definition that was chosen when PCRE |
These options override the default newline definition that was chosen when PCRE |
| 563 |
was built. Setting the first or the second specifies that a newline is |
was built. Setting the first or the second specifies that a newline is |
| 564 |
indicated by a single character (CR or LF, respectively). Setting both of them |
indicated by a single character (CR or LF, respectively). Setting |
| 565 |
specifies that a newline is indicated by the two-character CRLF sequence. For |
PCRE_NEWLINE_CRLF specifies that a newline is indicated by the two-character |
| 566 |
convenience, PCRE_NEWLINE_CRLF is defined to contain both bits. The only time |
CRLF sequence. Setting PCRE_NEWLINE_ANY specifies that any Unicode newline |
| 567 |
that a line break is relevant when compiling a pattern is if PCRE_EXTENDED is |
sequence should be recognized. The Unicode newline sequences are the three just |
| 568 |
set, and an unescaped # outside a character class is encountered. This |
mentioned, plus the single characters VT (vertical tab, U+000B), FF (formfeed, |
| 569 |
indicates a comment that lasts until after the next newline. |
U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and PS |
| 570 |
|
(paragraph separator, U+2029). The last two are recognized only in UTF-8 mode. |
| 571 |
|
.P |
| 572 |
|
The newline setting in the options word uses three bits that are treated |
| 573 |
|
as a number, giving eight possibilities. Currently only five are used (default |
| 574 |
|
plus the four values above). This means that if you set more than one newline |
| 575 |
|
option, the combination may or may not be sensible. For example, |
| 576 |
|
PCRE_NEWLINE_CR with PCRE_NEWLINE_LF is equivalent to PCRE_NEWLINE_CRLF, but |
| 577 |
|
other combinations yield unused numbers and cause an error. |
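The three-bit newline field can be illustrated with the sketch below. The NL_ constants are assumed values modelled on the pcre.h of this release, given local names so that they cannot be mistaken for the real macros; the helper newline_bits_valid is hypothetical. Check your own header rather than relying on these numbers.

```c
/* Assumed bit values, modelled on the pcre.h of this release. */
#define NL_CR    0x00100000
#define NL_LF    0x00200000
#define NL_CRLF  0x00300000
#define NL_ANY   0x00400000
#define NL_MASK  0x00700000           /* the three-bit newline field */

/* Hypothetical helper: returns 1 if the newline field of options holds a
   number that is in use (the default or one of the four settings), and 0
   if it holds one of the unused numbers. */
int newline_bits_valid(unsigned long options)
{
  switch (options & NL_MASK)
    {
    case 0:                           /* no newline option: use the default */
    case NL_CR:
    case NL_LF:
    case NL_CRLF:                     /* same bits as NL_CR | NL_LF */
    case NL_ANY:
      return 1;
    default:                          /* unused number, e.g. NL_CR | NL_ANY */
      return 0;
    }
}
```

Because the field is read as a number and not as independent flags, setting NL_CR together with NL_ANY gives the number 5, which is one of the unused values.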
| 578 |
|
.P |
| 579 |
|
The only time that a line break is specially recognized when compiling a |
| 580 |
|
pattern is if PCRE_EXTENDED is set, and an unescaped # outside a character |
| 581 |
|
class is encountered. This indicates a comment that lasts until after the next |
| 582 |
|
line break sequence. In other circumstances, line break sequences are treated |
| 583 |
|
as literal data, except that in PCRE_EXTENDED mode, both CR and LF are treated |
| 584 |
|
as whitespace characters and are therefore ignored. |
| 585 |
.P |
.P |
| 586 |
The newline option set at compile time becomes the default that is used for |
The newline option that is set at compile time becomes the default that is used |
| 587 |
\fBpcre_exec()\fP and \fBpcre_dfa_exec()\fP, but it can be overridden. |
for \fBpcre_exec()\fP and \fBpcre_dfa_exec()\fP, but it can be overridden. |
| 588 |
.sp |
.sp |
| 589 |
PCRE_NO_AUTO_CAPTURE |
PCRE_NO_AUTO_CAPTURE |
| 590 |
.sp |
.sp |
| 635 |
.sp |
.sp |
| 636 |
The following table lists the error codes that may be returned by
The following table lists the error codes that may be returned by |
| 637 |
\fBpcre_compile2()\fP, along with the error messages that may be returned by |
\fBpcre_compile2()\fP, along with the error messages that may be returned by |
| 638 |
both compiling functions. |
both compiling functions. As PCRE has developed, some error codes have fallen |
| 639 |
|
out of use. To avoid confusion, they have not been re-used. |
| 640 |
.sp |
.sp |
| 641 |
0 no error |
0 no error |
| 642 |
1 \e at end of pattern |
1 \e at end of pattern |
| 648 |
7 invalid escape sequence in character class |
7 invalid escape sequence in character class |
| 649 |
8 range out of order in character class |
8 range out of order in character class |
| 650 |
9 nothing to repeat |
9 nothing to repeat |
| 651 |
10 operand of unlimited repeat could match the empty string |
10 [this code is not in use] |
| 652 |
11 internal error: unexpected repeat |
11 internal error: unexpected repeat |
| 653 |
12 unrecognized character after (? |
12 unrecognized character after (? |
| 654 |
13 POSIX named classes are supported only within a class |
13 POSIX named classes are supported only within a class |
| 657 |
16 erroffset passed as NULL |
16 erroffset passed as NULL |
| 658 |
17 unknown option bit(s) set |
17 unknown option bit(s) set |
| 659 |
18 missing ) after comment |
18 missing ) after comment |
| 660 |
19 parentheses nested too deeply |
19 [this code is not in use] |
| 661 |
20 regular expression too large |
20 regular expression too large |
| 662 |
21 failed to get memory |
21 failed to get memory |
| 663 |
22 unmatched parentheses |
22 unmatched parentheses |
| 671 |
30 unknown POSIX class name |
30 unknown POSIX class name |
| 672 |
31 POSIX collating elements are not supported |
31 POSIX collating elements are not supported |
| 673 |
32 this version of PCRE is not compiled with PCRE_UTF8 support |
32 this version of PCRE is not compiled with PCRE_UTF8 support |
| 674 |
33 spare error |
33 [this code is not in use] |
| 675 |
34 character value in \ex{...} sequence is too large |
34 character value in \ex{...} sequence is too large |
| 676 |
35 invalid condition (?(0) |
35 invalid condition (?(0) |
| 677 |
36 \eC not allowed in lookbehind assertion |
36 \eC not allowed in lookbehind assertion |
| 680 |
39 closing ) for (?C expected |
39 closing ) for (?C expected |
| 681 |
40 recursive call could loop indefinitely |
40 recursive call could loop indefinitely |
| 682 |
41 unrecognized character after (?P |
41 unrecognized character after (?P |
| 683 |
42 syntax error after (?P |
42 syntax error in subpattern name (missing terminator) |
| 684 |
43 two named subpatterns have the same name |
43 two named subpatterns have the same name |
| 685 |
44 invalid UTF-8 string |
44 invalid UTF-8 string |
| 686 |
45 support for \eP, \ep, and \eX has not been compiled |
45 support for \eP, \ep, and \eX has not been compiled |
| 690 |
49 too many named subpatterns (maximum 10,000) |
49 too many named subpatterns (maximum 10,000) |
| 691 |
50 repeated subpattern is too long |
50 repeated subpattern is too long |
| 692 |
51 octal value is greater than \e377 (not in UTF-8 mode) |
51 octal value is greater than \e377 (not in UTF-8 mode) |
| 693 |
|
52 internal error: overran compiling workspace |
| 694 |
|
53 internal error: previously-checked referenced subpattern not found |
| 695 |
|
54 DEFINE group contains more than one branch |
| 696 |
|
55 repeating a DEFINE group is not allowed |
| 697 |
|
56 inconsistent NEWLINE options |
| 698 |
. |
. |
| 699 |
. |
. |
| 700 |
.SH "STUDYING A PATTERN" |
.SH "STUDYING A PATTERN" |
| 862 |
still recognized for backwards compatibility.) |
still recognized for backwards compatibility.) |
| 863 |
.P |
.P |
| 864 |
If there is a fixed first byte, for example, from a pattern such as |
If there is a fixed first byte, for example, from a pattern such as |
| 865 |
(cat|cow|coyote). Otherwise, if either |
(cat|cow|coyote), its value is returned. Otherwise, if either |
| 866 |
.sp |
.sp |
| 867 |
(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch |
(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch |
| 868 |
starts with "^", or |
starts with "^", or |
| 917 |
PCRE_EXTENDED is set, so white space - including newlines - is ignored): |
PCRE_EXTENDED is set, so white space - including newlines - is ignored): |
| 918 |
.sp |
.sp |
| 919 |
.\" JOIN |
.\" JOIN |
| 920 |
(?P<date> (?P<year>(\ed\ed)?\ed\ed) - |
(?<date> (?<year>(\ed\ed)?\ed\ed) - |
| 921 |
(?P<month>\ed\ed) - (?P<day>\ed\ed) ) |
(?<month>\ed\ed) - (?<day>\ed\ed) ) |
| 922 |
.sp |
.sp |
| 923 |
There are four named subpatterns, so the table has four entries, and each entry |
There are four named subpatterns, so the table has four entries, and each entry |
| 924 |
in the table is eight bytes long. The table is as follows, with non-printing |
in the table is eight bytes long. The table is as follows, with non-printing |
| 1166 |
PCRE_NEWLINE_CR |
PCRE_NEWLINE_CR |
| 1167 |
PCRE_NEWLINE_LF |
PCRE_NEWLINE_LF |
| 1168 |
PCRE_NEWLINE_CRLF |
PCRE_NEWLINE_CRLF |
| 1169 |
|
PCRE_NEWLINE_ANY |
| 1170 |
.sp |
.sp |
| 1171 |
These options override the newline definition that was chosen or defaulted when |
These options override the newline definition that was chosen or defaulted when |
| 1172 |
the pattern was compiled. For details, see the description \fBpcre_compile()\fP |
the pattern was compiled. For details, see the description of |
| 1173 |
above. During matching, the newline choice affects the behaviour of the dot, |
\fBpcre_compile()\fP above. During matching, the newline choice affects the |
| 1174 |
circumflex, and dollar metacharacters. |
behaviour of the dot, circumflex, and dollar metacharacters. It may also alter |
| 1175 |
|
the way the match position is advanced after a match failure for an unanchored |
| 1176 |
|
pattern. When PCRE_NEWLINE_CRLF or PCRE_NEWLINE_ANY is set, and a match attempt |
| 1177 |
|
fails when the current position is at a CRLF sequence, the match position is |
| 1178 |
|
advanced by two characters instead of one, in other words, to after the CRLF. |
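The advancement rule can be sketched as follows. The function name next_start_after_failure and its crlf_is_newline flag are hypothetical, and the sketch assumes a subject in which every character is one byte (that is, not UTF-8 mode). It illustrates the rule only and is not PCRE's internal code.

```c
#include <stddef.h>

/* Hypothetical helper: a match attempt starting at pos has failed; return
   the position at which the next attempt should start. crlf_is_newline is
   non-zero when CRLF is a recognized newline sequence (the CRLF or ANY
   convention is in force). */
size_t next_start_after_failure(const char *subject, size_t length,
                                size_t pos, int crlf_is_newline)
{
  if (crlf_is_newline && pos + 1 < length &&
      subject[pos] == '\r' && subject[pos + 1] == '\n')
    return pos + 2;                   /* step over the whole CRLF */
  return pos + 1;                     /* the usual one-character advance */
}
```

Stepping over both characters avoids starting a match attempt between the CR and the LF of a single line ending.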
| 1179 |
.sp |
.sp |
| 1180 |
PCRE_NOTBOL |
PCRE_NOTBOL |
| 1181 |
.sp |
.sp |
| 1376 |
other endianness. This is the error that PCRE gives when the magic number is |
other endianness. This is the error that PCRE gives when the magic number is |
| 1377 |
not present. |
not present. |
| 1378 |
.sp |
.sp |
| 1379 |
PCRE_ERROR_UNKNOWN_NODE (-5) |
PCRE_ERROR_UNKNOWN_OPCODE (-5) |
| 1380 |
.sp |
.sp |
| 1381 |
While running the pattern match, an unknown item was encountered in the |
While running the pattern match, an unknown item was encountered in the |
| 1382 |
compiled pattern. This error could be caused by a bug in PCRE or by overwriting |
compiled pattern. This error could be caused by a bug in PCRE or by overwriting |
| 1402 |
\fBpcre_extra\fP structure (or defaulted) was reached. See the description |
\fBpcre_extra\fP structure (or defaulted) was reached. See the description |
| 1403 |
above. |
above. |
| 1404 |
.sp |
.sp |
|
PCRE_ERROR_RECURSIONLIMIT (-21) |
|
|
.sp |
|
|
The internal recursion limit, as specified by the \fImatch_limit_recursion\fP |
|
|
field in a \fBpcre_extra\fP structure (or defaulted) was reached. See the |
|
|
description above. |
|
|
.sp |
|
| 1405 |
PCRE_ERROR_CALLOUT (-9) |
PCRE_ERROR_CALLOUT (-9) |
| 1406 |
.sp |
.sp |
| 1407 |
This error is never generated by \fBpcre_exec()\fP itself. It is provided for |
This error is never generated by \fBpcre_exec()\fP itself. It is provided for |
| 1445 |
PCRE_ERROR_BADCOUNT (-15) |
PCRE_ERROR_BADCOUNT (-15) |
| 1446 |
.sp |
.sp |
| 1447 |
This error is given if the value of the \fIovecsize\fP argument is negative. |
This error is given if the value of the \fIovecsize\fP argument is negative. |
| 1448 |
|
.sp |
| 1449 |
|
PCRE_ERROR_RECURSIONLIMIT (-21) |
| 1450 |
|
.sp |
| 1451 |
|
The internal recursion limit, as specified by the \fImatch_limit_recursion\fP |
| 1452 |
|
field in a \fBpcre_extra\fP structure (or defaulted) was reached. See the |
| 1453 |
|
description above. |
| 1454 |
|
.sp |
| 1455 |
|
PCRE_ERROR_NULLWSLIMIT (-22) |
| 1456 |
|
.sp |
| 1457 |
|
When a group that can match an empty substring is repeated with an unbounded |
| 1458 |
|
upper limit, the subject position at the start of the group must be remembered, |
| 1459 |
|
so that a test for an empty string can be made when the end of the group is |
| 1460 |
|
reached. Some workspace is required for this; if it runs out, this error is |
| 1461 |
|
given. |
| 1462 |
|
.sp |
| 1463 |
|
PCRE_ERROR_BADNEWLINE (-23) |
| 1464 |
|
.sp |
| 1465 |
|
An invalid combination of PCRE_NEWLINE_\fIxxx\fP options was given. |
| 1466 |
|
.P |
| 1467 |
|
Error numbers -16 to -20 are not used by \fBpcre_exec()\fP. |
| 1468 |
. |
. |
| 1469 |
. |
. |
| 1470 |
.SH "EXTRACTING CAPTURED SUBSTRINGS BY NUMBER" |
.SH "EXTRACTING CAPTURED SUBSTRINGS BY NUMBER" |
| 1522 |
\fIbuffersize\fP, while for \fBpcre_get_substring()\fP a new block of memory is |
\fIbuffersize\fP, while for \fBpcre_get_substring()\fP a new block of memory is |
| 1523 |
obtained via \fBpcre_malloc\fP, and its address is returned via |
obtained via \fBpcre_malloc\fP, and its address is returned via |
| 1524 |
\fIstringptr\fP. The yield of the function is the length of the string, not |
\fIstringptr\fP. The yield of the function is the length of the string, not |
| 1525 |
including the terminating zero, or one of |
including the terminating zero, or one of these error codes: |
| 1526 |
.sp |
.sp |
| 1527 |
PCRE_ERROR_NOMEMORY (-6) |
PCRE_ERROR_NOMEMORY (-6) |
| 1528 |
.sp |
.sp |
| 1538 |
memory that is obtained via \fBpcre_malloc\fP. The address of the memory block |
memory that is obtained via \fBpcre_malloc\fP. The address of the memory block |
| 1539 |
is returned via \fIlistptr\fP, which is also the start of the list of string |
is returned via \fIlistptr\fP, which is also the start of the list of string |
| 1540 |
pointers. The end of the list is marked by a NULL pointer. The yield of the |
pointers. The end of the list is marked by a NULL pointer. The yield of the |
| 1541 |
function is zero if all went well, or |
function is zero if all went well, or the error code |
| 1542 |
.sp |
.sp |
| 1543 |
PCRE_ERROR_NOMEMORY (-6) |
PCRE_ERROR_NOMEMORY (-6) |
| 1544 |
.sp |
.sp |
| 1590 |
To extract a substring by name, you first have to find the associated number.
To extract a substring by name, you first have to find the associated number. |
| 1591 |
For example, for this pattern |
For example, for this pattern |
| 1592 |
.sp |
.sp |
| 1593 |
(a+)b(?P<xxx>\ed+)... |
(a+)b(?<xxx>\ed+)... |
| 1594 |
.sp |
.sp |
| 1595 |
the number of the subpattern called "xxx" is 2. If the name is known to be |
the number of the subpattern called "xxx" is 2. If the name is known to be |
| 1596 |
unique (PCRE_DUPNAMES was not set), you can find the number from the name by |
unique (PCRE_DUPNAMES was not set), you can find the number from the name by |
| 1644 |
fourth are pointers to variables which are updated by the function. After it |
fourth are pointers to variables which are updated by the function. After it |
| 1645 |
has run, they point to the first and last entries in the name-to-number table |
has run, they point to the first and last entries in the name-to-number table |
| 1646 |
for the given name. The function itself returns the length of each entry, or |
for the given name. The function itself returns the length of each entry, or |
| 1647 |
PCRE_ERROR_NOSUBSTRING if there are none. The format of the table is described |
PCRE_ERROR_NOSUBSTRING (-7) if there are none. The format of the table is |
| 1648 |
above in the section entitled \fIInformation about a pattern\fP. Given all the |
described above in the section entitled \fIInformation about a pattern\fP. |
| 1649 |
relevant entries for the name, you can extract each of their numbers, and hence |
Given all the relevant entries for the name, you can extract each of their |
| 1650 |
the captured data, if any. |
numbers, and hence the captured data, if any. |
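Extracting the numbers from the entries might look like the sketch below. The helper collect_group_numbers is hypothetical. It assumes the entry layout given in the section on information about a pattern, namely two bytes holding the group number, most significant byte first, followed by the zero-terminated name; if your table differs, the sketch does not apply.

```c
/* Hypothetical helper, not a PCRE function. first and last point at the
   first and last name-table entries for one name, and entry_size is the
   length of each entry. Assumes each entry starts with the group number in
   two bytes, most significant byte first. Stores up to max numbers in out
   and returns how many were stored. */
int collect_group_numbers(const char *first, const char *last,
                          int entry_size, int *out, int max)
{
  const char *entry;
  int count = 0;
  for (entry = first; entry <= last && count < max; entry += entry_size)
    {
    const unsigned char *p = (const unsigned char *)entry;
    out[count++] = (p[0] << 8) | p[1];
    }
  return count;
}
```

Each number collected this way identifies a pair of offsets in the output vector, so the captured data can then be extracted by number.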
| 1651 |
. |
. |
| 1652 |
. |
. |
| 1653 |
.SH "FINDING ALL POSSIBLE MATCHES" |
.SH "FINDING ALL POSSIBLE MATCHES" |
| 1685 |
.B int *\fIworkspace\fP, int \fIwscount\fP); |
.B int *\fIworkspace\fP, int \fIwscount\fP); |
| 1686 |
.P |
.P |
| 1687 |
The function \fBpcre_dfa_exec()\fP is called to match a subject string against |
The function \fBpcre_dfa_exec()\fP is called to match a subject string against |
| 1688 |
a compiled pattern, using a "DFA" matching algorithm. This has different |
a compiled pattern, using a matching algorithm that scans the subject string |
| 1689 |
characteristics to the normal algorithm, and is not compatible with Perl. Some |
just once, and does not backtrack. This has different characteristics to the |
| 1690 |
of the features of PCRE patterns are not supported. Nevertheless, there are |
normal algorithm, and is not compatible with Perl. Some of the features of PCRE |
| 1691 |
times when this kind of matching can be useful. For a discussion of the two |
patterns are not supported. Nevertheless, there are times when this kind of |
| 1692 |
matching algorithms, see the |
matching can be useful. For a discussion of the two matching algorithms, see |
| 1693 |
|
the |
| 1694 |
.\" HREF |
.\" HREF |
| 1695 |
\fBpcrematching\fP |
\fBpcrematching\fP |
| 1696 |
.\" |
.\" |
| 1746 |
PCRE_DFA_SHORTEST |
PCRE_DFA_SHORTEST |
| 1747 |
.sp |
.sp |
| 1748 |
Setting the PCRE_DFA_SHORTEST option causes the matching algorithm to stop as |
Setting the PCRE_DFA_SHORTEST option causes the matching algorithm to stop as |
| 1749 |
soon as it has found one match. Because of the way the DFA algorithm works, |
soon as it has found one match. Because of the way the alternative algorithm |
| 1750 |
this is necessarily the shortest possible match at the first possible matching |
works, this is necessarily the shortest possible match at the first possible |
| 1751 |
point in the subject string. |
matching point in the subject string. |
| 1752 |
.sp |
.sp |
| 1753 |
PCRE_DFA_RESTART |
PCRE_DFA_RESTART |
| 1754 |
.sp |
.sp |
| 1787 |
On success, the yield of the function is a number greater than zero, which is |
On success, the yield of the function is a number greater than zero, which is |
| 1788 |
the number of matched substrings. The substrings themselves are returned in |
the number of matched substrings. The substrings themselves are returned in |
| 1789 |
\fIovector\fP. Each string uses two elements; the first is the offset to the |
\fIovector\fP. Each string uses two elements; the first is the offset to the |
| 1790 |
start, and the second is the offset to the end. All the strings have the same |
start, and the second is the offset to the end. In fact, all the strings have |
| 1791 |
start offset. (Space could have been saved by giving this only once, but it was |
the same start offset. (Space could have been saved by giving this only once, |
| 1792 |
decided to retain some compatibility with the way \fBpcre_exec()\fP returns |
but it was decided to retain some compatibility with the way \fBpcre_exec()\fP |
| 1793 |
data, even though the meaning of the strings is different.) |
returns data, even though the meaning of the strings is different.) |
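A caller might walk the offset pairs as in the sketch below. The helper print_dfa_matches is hypothetical; rc stands for a positive return value from \fBpcre_dfa_exec()\fP, and the offsets are assumed to have been filled in by a successful call. The call itself is omitted so that the sketch does not depend on the library.

```c
#include <stdio.h>

/* Hypothetical helper: print every match reported in ovector. rc is the
   number of matched substrings, and ovector holds rc pairs of start and end
   offsets into subject. Returns the length of the first match listed, or 0
   if rc is not positive. */
int print_dfa_matches(const char *subject, const int *ovector, int rc)
{
  int i;
  for (i = 0; i < rc; i++)
    {
    int start = ovector[2*i];         /* the same for every pair */
    int end = ovector[2*i + 1];
    printf("match %d: %.*s\n", i, end - start, subject + start);
    }
  return (rc > 0) ? ovector[1] - ovector[0] : 0;
}
```

Unlike the pairs returned by \fBpcre_exec()\fP, these pairs are alternative matches at one starting point and not captured substrings.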
| 1794 |
.P |
.P |
| 1795 |
The strings are returned in reverse order of length; that is, the longest |
The strings are returned in reverse order of length; that is, the longest |
| 1796 |
matching string is given first. If there were too many matches to fit into |
matching string is given first. If there were too many matches to fit into |
| 1817 |
.sp |
.sp |
| 1818 |
PCRE_ERROR_DFA_UCOND (-17) |
PCRE_ERROR_DFA_UCOND (-17) |
| 1819 |
.sp |
.sp |
| 1820 |
This return is given if \fBpcre_dfa_exec()\fP encounters a condition item in a |
This return is given if \fBpcre_dfa_exec()\fP encounters a condition item that |
| 1821 |
pattern that uses a back reference for the condition. This is not supported. |
uses a back reference for the condition, or a test for recursion in a specific |
| 1822 |
|
group. These are not supported. |
| 1823 |
.sp |
.sp |
| 1824 |
PCRE_ERROR_DFA_UMLIMIT (-18) |
PCRE_ERROR_DFA_UMLIMIT (-18) |
| 1825 |
.sp |
.sp |
| 1838 |
recursively, using private vectors for \fIovector\fP and \fIworkspace\fP. This |
recursively, using private vectors for \fIovector\fP and \fIworkspace\fP. This |
| 1839 |
error is given if the output vector is not large enough. This should be |
error is given if the output vector is not large enough. This should be |
| 1840 |
extremely rare, as a vector of size 1000 is used. |
extremely rare, as a vector of size 1000 is used. |
| 1841 |
|
. |
| 1842 |
|
. |
| 1843 |
|
.SH "SEE ALSO" |
| 1844 |
|
.rs |
| 1845 |
|
.sp |
| 1846 |
|
\fBpcrebuild\fP(3), \fBpcrecallout\fP(3), \fBpcrecpp\fP(3), |
| 1847 |
|
\fBpcrematching\fP(3), \fBpcrepartial\fP(3), \fBpcreposix\fP(3), |
| 1848 |
|
\fBpcreprecompile\fP(3), \fBpcresample\fP(3), \fBpcrestack\fP(3). |
| 1849 |
.P |
.P |
| 1850 |
.in 0 |
.in 0 |
| 1851 |
Last updated: 08 June 2006 |
Last updated: 30 November 2006 |
| 1852 |
.br |
.br |
| 1853 |
Copyright (c) 1997-2006 University of Cambridge. |
Copyright (c) 1997-2006 University of Cambridge. |