Either of the functions \fBpcre_compile()\fP or \fBpcre_compile2()\fP can be
called to compile a pattern into an internal form. The only difference between
the two interfaces is that \fBpcre_compile2()\fP has an additional argument,
\fIerrorcodeptr\fP, via which a numerical error code can be returned. To avoid
too much repetition, we refer just to \fBpcre_compile()\fP below, but the
information applies equally to \fBpcre_compile2()\fP.
.P
The pattern is a C string terminated by a binary zero, and is passed in the
.\"
documentation). For those options that can be different in different parts of
the pattern, the contents of the \fIoptions\fP argument specify their
settings at the start of compilation and execution. The PCRE_ANCHORED,
PCRE_BSR_\fIxxx\fP, and PCRE_NEWLINE_\fIxxx\fP options can be set at the time
of matching as well as at compile time.
.P
.P
If studying the pattern does not produce any useful information,
\fBpcre_study()\fP returns NULL. In that circumstance, if the calling program
wants to pass any of the other fields to \fBpcre_exec()\fP or
\fBpcre_dfa_exec()\fP, it must set up its own \fBpcre_extra\fP block.
.P
The second argument of \fBpcre_study()\fP contains option bits. At present, no
&error); /* set to NULL or points to a message */
.sp
Studying a pattern does two things: first, a lower bound for the length of
subject string that is needed to match the pattern is computed. This does not
mean that there are any strings of that length that match, but it does
guarantee that no shorter strings match. The value is used by
\fBpcre_exec()\fP and \fBpcre_dfa_exec()\fP to avoid wasting time by trying to
match strings that are shorter than the lower bound. You can find out the value
in a calling program via the \fBpcre_fullinfo()\fP function.
.P
Studying a pattern is also useful for non-anchored patterns that do not have a
single fixed starting character. A bitmap of possible starting bytes is
created. This speeds up finding a position in the subject at which to start
matching.
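.P
The idea of a starting-byte bitmap can be illustrated with the simplified C
sketch below. It is not PCRE's internal code: the type \fBstart_bitmap\fP and
the three functions are hypothetical names, and the sketch only shows how a
256-bit map lets a scanner skip positions that cannot begin a match.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative 256-bit map: bit c is set if a match can start with the
   byte value c. All names here are hypothetical. */
typedef struct {
    unsigned char bits[32];
} start_bitmap;

void bitmap_clear(start_bitmap *map)
{
    memset(map->bits, 0, sizeof(map->bits));
}

void bitmap_add(start_bitmap *map, unsigned char c)
{
    map->bits[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Return the first offset at or after start whose byte is in the map,
   or length if there is no such offset. */
size_t first_candidate(const start_bitmap *map, const char *subject,
                       size_t length, size_t start)
{
    size_t i;
    for (i = start; i < length; i++) {
        unsigned char c = (unsigned char)subject[i];
        if ((map->bits[c / 8] & (1u << (c % 8))) != 0) return i;
    }
    return length;
}
```

A matcher built this way would attempt a full match only at the offsets that
\fBfirst_candidate()\fP returns, one cheap table lookup per skipped byte.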
.
.
length of the longest name. PCRE_INFO_NAMETABLE returns a pointer to the first
entry of the table (a pointer to \fBchar\fP). The first two bytes of each entry
are the number of the capturing parenthesis, most significant byte first. The
rest of the entry is the corresponding name, zero terminated.
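.P
The entry layout just described can be decoded with a few lines of C. The
sketch below is illustrative only: the three function names are hypothetical,
and it assumes the caller has already obtained the table pointer with
PCRE_INFO_NAMETABLE and the fixed entry size with PCRE_INFO_NAMEENTRYSIZE.

```c
#include <stddef.h>

/* Hypothetical helpers (not part of PCRE) for walking a name table. */

/* Return a pointer to entry i, given the table pointer and the fixed
   size of each entry. */
const char *table_entry(const char *table, int entry_size, int i)
{
    return table + (size_t)i * (size_t)entry_size;
}

/* Decode the parenthesis number held in the first two bytes of an entry,
   most significant byte first. */
int entry_group_number(const char *entry)
{
    const unsigned char *p = (const unsigned char *)entry;
    return (p[0] << 8) | p[1];
}

/* The zero-terminated name follows the two number bytes. */
const char *entry_name(const char *entry)
{
    return entry + 2;
}
```

The bytes are read through an \fBunsigned char\fP pointer so that a byte value
above 127 is not sign-extended when the two halves are combined.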
.P
The names are in alphabetical order. Duplicate names may appear if (?| is used
to create multiple groups with the same number, as described in the
.\" HREF
\fBpcrepattern\fP
.\"
page. Duplicate names for subpatterns with different numbers are permitted only
if PCRE_DUPNAMES is set. In all cases of duplicate names, they appear in the
table in the order in which they were found in the pattern. In the absence of
(?| this is the order of increasing number; when (?| is used this is not
necessarily the case because later subpatterns may have lower numbers.
.P
As a simple example of the name/number table, consider the following pattern
.sp
PCRE_NOTEMPTY_ATSTART
.sp
This is like PCRE_NOTEMPTY, except that an empty string match that is not at
the start of the subject is permitted. If the pattern is anchored, such a match
can occur only if the pattern contains \eK.
.P
subject, or a value of \fIstartoffset\fP that does not point to the start of a
UTF-8 character, is undefined. Your program may crash.
.sp
PCRE_PARTIAL_HARD
PCRE_PARTIAL_SOFT
.sp
These options turn on the partial matching feature. For backwards
.sp
This code is no longer in use. It was formerly returned when the PCRE_PARTIAL
option was used with a compiled pattern containing items that were not
supported for partial matching. From release 8.00 onwards, there are no
restrictions on partial matching.
.sp
PCRE_ERROR_INTERNAL (-14)
just once, and does not backtrack. This has different characteristics to the
normal algorithm, and is not compatible with Perl. Some of the features of PCRE
patterns are not supported. Nevertheless, there are times when this kind of
matching can be useful. For a discussion of the two matching algorithms, and a
list of features that \fBpcre_dfa_exec()\fP does not support, see the
.\" HREF
\fBpcrematching\fP
for \fBpcre_exec()\fP, so their description is not repeated here.
.sp
PCRE_PARTIAL_HARD
PCRE_PARTIAL_SOFT
.sp
These have the same general effect as they do for \fBpcre_exec()\fP, but the
details are slightly different. When PCRE_PARTIAL_HARD is set for