Diff of /code/trunk/doc/pcrepattern.3

revision 227 by ph10, Tue Aug 21 15:00:15 2007 UTC revision 251 by ph10, Mon Sep 17 10:33:48 2007 UTC
# Line 79  example, on a Unix system where LF is th Line 79  example, on a Unix system where LF is th
79  changes the convention to CR. That pattern matches "a\enb" because LF is no  changes the convention to CR. That pattern matches "a\enb" because LF is no
80  longer a newline. Note that these special settings, which are not  longer a newline. Note that these special settings, which are not
81  Perl-compatible, are recognized only at the very start of a pattern, and that  Perl-compatible, are recognized only at the very start of a pattern, and that
82  they must be in upper case.  they must be in upper case. If more than one of them is present, the last one
83    is used.
84    .P
85    The newline convention does not affect what the \eR escape sequence matches. By
86    default, this is any Unicode newline sequence, for Perl compatibility. However,
87    this can be changed; see the description of \eR in the section entitled
88    .\" HTML <a href="#newlineseq">
89    .\" </a>
90    "Newline sequences"
91    .\"
92    below. A change of \eR setting can be combined with a change of newline
93    convention.
94  .  .
95  .  .
96  .SH "CHARACTERS AND METACHARACTERS"  .SH "CHARACTERS AND METACHARACTERS"
# Line 388  accented letters, and these are matched Line 399  accented letters, and these are matched
399  is discouraged.  is discouraged.
400  .  .
401  .  .
402    .\" HTML <a name="newlineseq"></a>
403  .SS "Newline sequences"  .SS "Newline sequences"
404  .rs  .rs
405  .sp  .sp
406  Outside a character class, the escape sequence \eR matches any Unicode newline  Outside a character class, by default, the escape sequence \eR matches any
407  sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is equivalent to  Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is
408  the following:  equivalent to the following:
409  .sp  .sp
410    (?>\er\en|\en|\ex0b|\ef|\er|\ex85)    (?>\er\en|\en|\ex0b|\ef|\er|\ex85)
411  .sp  .sp
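The alternation in this hunk can be tried outside PCRE. The following is a minimal sketch in Python, whose `re` module has no \R escape; the name `BSR_NON_UTF8` and the helper `newline_sequences` are hypothetical, and the lookahead-plus-backreference form is a common stand-in for the atomic group `(?>...)` rather than PCRE's own implementation.

```python
import re

# Emulation of the non-UTF-8 \R alternation shown in the diff.
# (?=(...))\1 stands in for the atomic group (?>...): the lookahead
# picks one alternative and the engine does not backtrack into it.
BSR_NON_UTF8 = re.compile(r"(?=(\r\n|\n|\x0b|\f|\r|\x85))\1")


def newline_sequences(text):
    """Return every newline sequence found in text, in order."""
    return [m.group(0) for m in BSR_NON_UTF8.finditer(text)]


if __name__ == "__main__":
    sample = "a\r\nb\nc\x0bd\fe\rf\x85g"
    # CRLF is reported as one two-character sequence, not as CR then LF.
    print([repr(s) for s in newline_sequences(sample)])
```

Because CRLF is listed first and the group is atomic, a pattern equivalent to \R followed by \en cannot match the two-character string CR LF: the group consumes both characters and does not give the LF back.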
# Line 413  are added: LS (line separator, U+2028) a Line 425  are added: LS (line separator, U+2028) a
425  Unicode character property support is not needed for these characters to be  Unicode character property support is not needed for these characters to be
426  recognized.  recognized.
427  .P  .P
428    It is possible to restrict \eR to match only CR, LF, or CRLF (instead of the
429    complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF
430    either at compile time or when the pattern is matched. (BSR is an abbreviation
431    for "backslash R".) This can be made the default when PCRE is built; if this is
432    the case, the other behaviour can be requested via the PCRE_BSR_UNICODE option.
433    It is also possible to specify these settings by starting a pattern string with
434    one of the following sequences:
435    .sp
436      (*BSR_ANYCRLF)   CR, LF, or CRLF only
437      (*BSR_UNICODE)   any Unicode newline sequence
438    .sp
439    These override the default and the options given to \fBpcre_compile()\fP, but
440    they can be overridden by options given to \fBpcre_exec()\fP. Note that these
441    special settings, which are not Perl-compatible, are recognized only at the
442    very start of a pattern, and that they must be in upper case. If more than one
443    of them is present, the last one is used. They can be combined with a change of
444    newline convention, for example, a pattern can start with:
445    .sp
446      (*ANY)(*BSR_ANYCRLF)
447    .sp
448  Inside a character class, \eR matches the letter "R".  Inside a character class, \eR matches the letter "R".
449  .  .
450  .  .
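The rules for the leading sequences (recognized only at the very start, upper case only, last one wins, \eR and newline settings combinable) can be sketched as a small parser. This is an illustration in Python, not PCRE's code: `leading_settings` is a hypothetical helper, and the newline names other than (*ANY) are assumed from the PCRE documentation rather than taken from this diff.

```python
import re

# Only (*ANY), (*BSR_ANYCRLF) and (*BSR_UNICODE) appear in the diff itself;
# the other newline names are an assumption.
NEWLINE_NAMES = {"CR", "LF", "CRLF", "ANYCRLF", "ANY"}
BSR_NAMES = {"BSR_ANYCRLF", "BSR_UNICODE"}
LEADING = re.compile(r"\(\*([A-Z_]+)\)")  # upper case only


def leading_settings(pattern):
    """Return (newline, bsr, rest) for the settings at the start of pattern."""
    newline = bsr = None
    pos = 0
    while True:
        m = LEADING.match(pattern, pos)
        if m is None or m.group(1) not in NEWLINE_NAMES | BSR_NAMES:
            break  # first non-setting item ends the leading run
        if m.group(1) in BSR_NAMES:
            bsr = m.group(1)      # a later \R setting replaces an earlier one
        else:
            newline = m.group(1)  # likewise for the newline convention
        pos = m.end()
    return newline, bsr, pattern[pos:]


if __name__ == "__main__":
    print(leading_settings(r"(*ANY)(*BSR_ANYCRLF)a\Rb"))
```

A sequence in lower case, or one that appears after the first ordinary pattern character, is left in the returned remainder untouched, which mirrors the "very start" and "upper case" restrictions in the text.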
# Line 960  alternative in the subpattern. Line 992  alternative in the subpattern.
992  .rs  .rs
993  .sp  .sp
994  The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and  The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and
995  PCRE_EXTENDED options can be changed from within the pattern by a sequence of  PCRE_EXTENDED options (which are Perl-compatible) can be changed from within
996  Perl option letters enclosed between "(?" and ")". The option letters are  the pattern by a sequence of Perl option letters enclosed between "(?" and ")".
997    The option letters are
998  .sp  .sp
999    i  for PCRE_CASELESS    i  for PCRE_CASELESS
1000    m  for PCRE_MULTILINE    m  for PCRE_MULTILINE
# Line 975  PCRE_MULTILINE while unsetting PCRE_DOTA Line 1008  PCRE_MULTILINE while unsetting PCRE_DOTA
1008  permitted. If a letter appears both before and after the hyphen, the option is  permitted. If a letter appears both before and after the hyphen, the option is
1009  unset.  unset.
1010  .P  .P
1011    The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be
1012    changed in the same way as the Perl-compatible options by using the characters
1013    J, U and X respectively.
1014    .P
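How a letter sequence such as "im-sx" sets and unsets options can be sketched as follows. This is an illustrative Python model, not PCRE's implementation; `apply_option_letters` is a hypothetical helper, and the letters s and x for PCRE_DOTALL and PCRE_EXTENDED are the usual Perl letters, which this diff elides.

```python
# Letter-to-option table: i, m, s, x are the Perl-compatible letters;
# J, U, X are the PCRE-specific ones named in the added paragraph.
OPTION_LETTERS = {
    "i": "PCRE_CASELESS",
    "m": "PCRE_MULTILINE",
    "s": "PCRE_DOTALL",
    "x": "PCRE_EXTENDED",
    "J": "PCRE_DUPNAMES",
    "U": "PCRE_UNGREEDY",
    "X": "PCRE_EXTRA",
}


def apply_option_letters(letters, current=frozenset()):
    """Apply the letters found between "(?" and ")" to a set of option names."""
    set_part, _, unset_part = letters.partition("-")
    options = set(current)
    options |= {OPTION_LETTERS[c] for c in set_part}
    # Unsetting is applied last, so a letter on both sides ends up unset.
    options -= {OPTION_LETTERS[c] for c in unset_part}
    return options


if __name__ == "__main__":
    print(sorted(apply_option_letters("im-sx", {"PCRE_DOTALL"})))
```

Applying the unset part after the set part is what makes a letter that appears on both sides of the hyphen end up unset, as the text states.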
1015  When an option change occurs at top level (that is, not inside subpattern  When an option change occurs at top level (that is, not inside subpattern
1016  parentheses), the change applies to the remainder of the pattern that follows.  parentheses), the change applies to the remainder of the pattern that follows.
1017  If the change is placed right at the start of a pattern, PCRE extracts it into  If the change is placed right at the start of a pattern, PCRE extracts it into
# Line 998  branch is abandoned before the option se Line 1035  branch is abandoned before the option se
1035  option settings happen at compile time. There would be some very weird  option settings happen at compile time. There would be some very weird
1036  behaviour otherwise.  behaviour otherwise.
1037  .P  .P
1038  The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be  \fBNote:\fP There are other PCRE-specific options that can be set by the
1039  changed in the same way as the Perl-compatible options by using the characters  application when the compile or match functions are called. In some cases the
1040  J, U and X respectively.  pattern can contain special leading sequences to override what the application
1041    has set or what has been defaulted. Details are given in the section entitled
1042    .\" HTML <a href="#newlineseq">
1043    .\" </a>
1044    "Newline sequences"
1045    .\"
1046    above.
1047  .  .
1048  .  .
1049  .\" HTML <a name="subpattern"></a>  .\" HTML <a name="subpattern"></a>
# Line 2016  description of the interface to the call Line 2059  description of the interface to the call
2059  documentation.  documentation.
2060  .  .
2061  .  .
2062  .SH "BACTRACKING CONTROL"  .SH "BACKTRACKING CONTROL"
2063  .rs  .rs
2064  .sp  .sp
2065  Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which  Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which
# Line 2149  Cambridge CB2 3QH, England. Line 2192  Cambridge CB2 3QH, England.
2192  .rs  .rs
2193  .sp  .sp
2194  .nf  .nf
2195  Last updated: 21 August 2007  Last updated: 17 September 2007
2196  Copyright (c) 1997-2007 University of Cambridge.  Copyright (c) 1997-2007 University of Cambridge.
2197  .fi  .fi

