Diff of /code/trunk/doc/pcrepattern.3

revision 227 by ph10, Tue Aug 21 15:00:15 2007 UTC revision 231 by ph10, Tue Sep 11 11:15:33 2007 UTC
# Line 79  example, on a Unix system where LF is th Line 79  example, on a Unix system where LF is th
 changes the convention to CR. That pattern matches "a\enb" because LF is no
 longer a newline. Note that these special settings, which are not
 Perl-compatible, are recognized only at the very start of a pattern, and that
-they must be in upper case.
+they must be in upper case. If more than one of them is present, the last one
+is used.
+.P
+The newline convention does not affect what the \eR escape sequence matches. By
+default, this is any Unicode newline sequence, for Perl compatibility. However,
+this can be changed; see the description of \eR in the section entitled
+.\" HTML <a href="#newlineseq">
+.\" </a>
+"Newline sequences"
+.\"
+below.
 .
 .
 .SH "CHARACTERS AND METACHARACTERS"
# Line 388  accented letters, and these are matched Line 398  accented letters, and these are matched
 is discouraged.
 .
 .
+.\" HTML <a name="newlineseq"></a>
 .SS "Newline sequences"
 .rs
 .sp
-Outside a character class, the escape sequence \eR matches any Unicode newline
-sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is equivalent to
-the following:
+Outside a character class, by default, the escape sequence \eR matches any
+Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is
+equivalent to the following:
 .sp
   (?>\er\en|\en|\ex0b|\ef|\er|\ex85)
 .sp
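As a rough illustration of the non-UTF-8 equivalence quoted in this hunk, the same alternation can be tried in Python's re module. This is only a sketch under stated assumptions: Python has no \R escape, the roff escapes (\er, \en and so on) are written as ordinary regex escapes, and a non-capturing group stands in for PCRE's atomic group, which changes backtracking behaviour but not what a single match consumes here.

```python
import re

# Assumed translation of (?>\r\n|\n|\x0b|\f|\r|\x85) from the man page text.
# "(?:" replaces PCRE's atomic "(?>" so the sketch does not depend on atomic-group support.
NEWLINE_SEQ = re.compile(r"(?:\r\n|\n|\x0b|\f|\r|\x85)")

sample = "one\r\ntwo\nthree\x85four\x0bfive"

# CRLF is listed first in the alternation, so it is consumed as a single unit
# rather than as a CR followed by a separate LF.
found = NEWLINE_SEQ.findall(sample)
pieces = NEWLINE_SEQ.split(sample)
```

Listing the two-character CRLF alternative before the single-character ones is what keeps "\r\n" from being counted as two line endings.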
# Line 413  are added: LS (line separator, U+2028) a Line 424  are added: LS (line separator, U+2028) a
 Unicode character property support is not needed for these characters to be
 recognized.
 .P
+It is possible to restrict \eR to match only CR, LF, or CRLF (instead of the
+complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF
+either at compile time or when the pattern is matched. This can be made the
+default when PCRE is built; if this is the case, the other behaviour can be
+requested via the PCRE_BSR_UNICODE option. It is also possible to specify these
+settings by starting a pattern string with one of the following sequences:
+.sp
+  (*BSR_ANYCRLF)   CR, LF, or CRLF only
+  (*BSR_UNICODE)   any Unicode newline sequence
+.sp
+These override the default and the options given to \fBpcre_compile()\fP, but
+they can be overridden by options given to \fBpcre_exec()\fP. Note that these
+special settings, which are not Perl-compatible, are recognized only at the
+very start of a pattern, and that they must be in upper case. If more than one
+of them is present, the last one is used.
+.P
 Inside a character class, \eR matches the letter "R".
 .
 .
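The precedence described in this hunk (build default, then pcre_compile() options, then a leading pattern sequence, then pcre_exec() options) can be modelled in a few lines. The following is a minimal sketch, not PCRE code: effective_bsr and its parameters are hypothetical names invented for illustration, the settings are plain strings rather than the real PCRE_BSR_* option bits, and only the two (*BSR_...) sequences are handled.

```python
import re

# Hypothetical model of the rules in the text; this is not part of the PCRE API.
# Matching is case-sensitive because the sequences must be in upper case.
_BSR_PREFIX = re.compile(r"\(\*BSR_(ANYCRLF|UNICODE)\)")

def effective_bsr(pattern, build_default="UNICODE", compile_opt=None, exec_opt=None):
    """Return (setting, remaining_pattern) under the documented precedence."""
    # A compile-time option overrides the build-time default.
    setting = compile_opt or build_default
    pos = 0
    # Sequences count only at the very start; if several appear, the last one wins.
    while True:
        m = _BSR_PREFIX.match(pattern, pos)
        if m is None:
            break
        setting = m.group(1)
        pos = m.end()
    # An option given at match time overrides everything else.
    if exec_opt is not None:
        setting = exec_opt
    return setting, pattern[pos:]

last_wins = effective_bsr("(*BSR_ANYCRLF)(*BSR_UNICODE)x", compile_opt="ANYCRLF")
exec_wins = effective_bsr("(*BSR_UNICODE)x", exec_opt="ANYCRLF")
lower_case = effective_bsr("(*bsr_anycrlf)x")
```

In this model a lower-case sequence is left in the pattern untouched; what PCRE itself does with such text is not covered by the sketch.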
# Line 960  alternative in the subpattern. Line 987  alternative in the subpattern.
 .rs
 .sp
 The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and
-PCRE_EXTENDED options can be changed from within the pattern by a sequence of
-Perl option letters enclosed between "(?" and ")". The option letters are
+PCRE_EXTENDED options (which are Perl-compatible) can be changed from within
+the pattern by a sequence of Perl option letters enclosed between "(?" and ")".
+The option letters are
 .sp
   i  for PCRE_CASELESS
   m  for PCRE_MULTILINE
# Line 975  PCRE_MULTILINE while unsetting PCRE_DOTA Line 1003  PCRE_MULTILINE while unsetting PCRE_DOTA
 permitted. If a letter appears both before and after the hyphen, the option is
 unset.
 .P
+The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be
+changed in the same way as the Perl-compatible options by using the characters
+J, U and X respectively.
+.P
 When an option change occurs at top level (that is, not inside subpattern
 parentheses), the change applies to the remainder of the pattern that follows.
 If the change is placed right at the start of a pattern, PCRE extracts it into
# Line 997  matches "ab", "aB", "c", and "C", even t Line 1029  matches "ab", "aB", "c", and "C", even t
 branch is abandoned before the option setting. This is because the effects of
 option settings happen at compile time. There would be some very weird
 behaviour otherwise.
-.P
-The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be
-changed in the same way as the Perl-compatible options by using the characters
-J, U and X respectively.
 .
 .
 .\" HTML <a name="subpattern"></a>
# Line 2149  Cambridge CB2 3QH, England. Line 2177  Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 21 August 2007
+Last updated: 11 September 2007
 Copyright (c) 1997-2007 University of Cambridge.
 .fi