Diff of /code/trunk/doc/pcrepattern.3

revision 247 by ph10, Mon Sep 17 09:38:32 2007 UTC
revision 341 by ph10, Sat Apr 19 16:41:04 2008 UTC
# Line 9  are described in detail below. There is Line 9  are described in detail below. There is
9  .\" HREF  .\" HREF
10  \fBpcresyntax\fP  \fBpcresyntax\fP
11  .\"  .\"
12  page. Perl's regular expressions are described in its own documentation, and  page. PCRE tries to match Perl syntax and semantics as closely as it can. PCRE
13    also supports some alternative regular expression syntax (which does not
14    conflict with the Perl syntax) in order to provide some compatibility with
15    regular expressions in Python, .NET, and Oniguruma.
16    .P
17    Perl's regular expressions are described in its own documentation, and
18  regular expressions in general are covered in a number of books, some of which  regular expressions in general are covered in a number of books, some of which
19  have copious examples. Jeffrey Friedl's "Mastering Regular Expressions",  have copious examples. Jeffrey Friedl's "Mastering Regular Expressions",
20  published by O'Reilly, covers regular expressions in great detail. This  published by O'Reilly, covers regular expressions in great detail. This
# Line 310  parenthesized subpatterns. Line 315  parenthesized subpatterns.
315  .\"  .\"
316  .  .
317  .  .
318    .SS "Absolute and relative subroutine calls"
319    .rs
320    .sp
321    For compatibility with Oniguruma, the non-Perl syntax \eg followed by a name or
322    a number enclosed either in angle brackets or single quotes, is an alternative
323    syntax for referencing a subpattern as a "subroutine". Details are discussed
324    .\" HTML <a href="#onigurumasubroutines">
325    .\" </a>
326    later.
327    .\"
328    Note that \eg{...} (Perl syntax) and \eg<...> (Oniguruma syntax) are \fInot\fP
329    synonymous. The former is a back reference; the latter is a subroutine call.
330    .
331    .
332  .SS "Generic character types"  .SS "Generic character types"
333  .rs  .rs
334  .sp  .sp
# Line 1034  matches "ab", "aB", "c", and "C", even t Line 1053  matches "ab", "aB", "c", and "C", even t
1053  branch is abandoned before the option setting. This is because the effects of  branch is abandoned before the option setting. This is because the effects of
1054  option settings happen at compile time. There would be some very weird  option settings happen at compile time. There would be some very weird
1055  behaviour otherwise.  behaviour otherwise.
1056    .P
1057    \fBNote:\fP There are other PCRE-specific options that can be set by the
1058    application when the compile or match functions are called. In some cases the
1059    pattern can contain special leading sequences to override what the application
1060    has set or what has been defaulted. Details are given in the section entitled
1061    .\" HTML <a href="#newlineseq">
1062    .\" </a>
1063    "Newline sequences"
1064    .\"
1065    above.
1066  .  .
1067  .  .
1068  .\" HTML <a name="subpattern"></a>  .\" HTML <a name="subpattern"></a>
# Line 1230  support is available, \eX{3} matches thr Line 1259  support is available, \eX{3} matches thr
1259  which may be several bytes long (and they may be of different lengths).  which may be several bytes long (and they may be of different lengths).
1260  .P  .P
1261  The quantifier {0} is permitted, causing the expression to behave as if the  The quantifier {0} is permitted, causing the expression to behave as if the
1262  previous item and the quantifier were not present.  previous item and the quantifier were not present. This may be useful for
1263    subpatterns that are referenced as
1264    .\" HTML <a href="#subpatternsassubroutines">
1265    .\" </a>
1266    subroutines
1267    .\"
1268    from elsewhere in the pattern. Items other than subpatterns that have a {0}
1269    quantifier are omitted from the compiled pattern.
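
The effect of the {0} quantifier described in the added text can be sketched outside PCRE. This is a minimal illustration that assumes Python's standard `re` module as a stand-in, since it accepts the same `{0}` syntax. It is not PCRE itself, and it does not support calling a subpattern as a subroutine, so only the "behaves as if not present" part is shown. The helper name `matches` is hypothetical.

```python
import re


def matches(pattern: str, subject: str) -> bool:
    """Return True if the whole subject matches the pattern."""
    return re.fullmatch(pattern, subject) is not None


# An item quantified with {0} behaves as if the item and its
# quantifier were absent: "ab{0}c" accepts the same input as "ac".
zeroed_item_accepts = matches(r"ab{0}c", "ac")
zeroed_item_rejects = matches(r"ab{0}c", "abc")

# A parenthesized subpattern with {0} is never entered during
# matching, so its capture group stays unset.
zeroed_group = re.fullmatch(r"a(xyz){0}c", "ac")
```

In PCRE the zero-quantified subpattern would still be available for a subroutine call from elsewhere in the pattern. Python's `re` has no equivalent, so that part is left out here.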
1270  .P  .P
1271  For convenience, the three most common quantifiers have single-character  For convenience, the three most common quantifiers have single-character
1272  abbreviations:  abbreviations:
# Line 2013  It matches "abcabc". It does not match " Line 2049  It matches "abcabc". It does not match "
2049  processing option does not affect the called subpattern.  processing option does not affect the called subpattern.
2050  .  .
2051  .  .
2052    .\" HTML <a name="onigurumasubroutines"></a>
2053    .SH "ONIGURUMA SUBROUTINE SYNTAX"
2054    .rs
2055    .sp
2056    For compatibility with Oniguruma, the non-Perl syntax \eg followed by a name or
2057    a number enclosed either in angle brackets or single quotes, is an alternative
2058    syntax for referencing a subpattern as a subroutine, possibly recursively. Here
2059    are two of the examples used above, rewritten using this syntax:
2060    .sp
2061      (?<pn> \e( ( (?>[^()]+) | \eg<pn> )* \e) )
2062      (sens|respons)e and \eg'1'ibility
2063    .sp
2064    PCRE supports an extension to Oniguruma: if a number is preceded by a
2065    plus or a minus sign it is taken as a relative reference. For example:
2066    .sp
2067      (abc)(?i:\eg<-1>)
2068    .sp
2069    Note that \eg{...} (Perl syntax) and \eg<...> (Oniguruma syntax) are \fInot\fP
2070    synonymous. The former is a back reference; the latter is a subroutine call.
2071    .
2072    .
2073  .SH CALLOUTS  .SH CALLOUTS
2074  .rs  .rs
2075  .sp  .sp
# Line 2058  or removal in a future version of Perl". Line 2115  or removal in a future version of Perl".
2115  production code should be noted to avoid problems during upgrades." The same  production code should be noted to avoid problems during upgrades." The same
2116  remarks apply to the PCRE features described in this section.  remarks apply to the PCRE features described in this section.
2117  .P  .P
2118  Since these verbs are specifically related to backtracking, they can be used  Since these verbs are specifically related to backtracking, most of them can be
2119  only when the pattern is to be matched using \fBpcre_exec()\fP, which uses a  used only when the pattern is to be matched using \fBpcre_exec()\fP, which uses
2120  backtracking algorithm. They cause an error if encountered by  a backtracking algorithm. With the exception of (*FAIL), which behaves like a
2121    failing negative assertion, they cause an error if encountered by
2122  \fBpcre_dfa_exec()\fP.  \fBpcre_dfa_exec()\fP.
2123  .P  .P
2124  The new verbs make use of what was previously invalid syntax: an opening  The new verbs make use of what was previously invalid syntax: an opening
# Line 2182  Cambridge CB2 3QH, England. Line 2240  Cambridge CB2 3QH, England.
2240  .rs  .rs
2241  .sp  .sp
2242  .nf  .nf
2243  Last updated: 14 September 2007  Last updated: 19 April 2008
2244  Copyright (c) 1997-2007 University of Cambridge.  Copyright (c) 1997-2008 University of Cambridge.
2245  .fi  .fi
