Diff of /code/trunk/doc/pcrepattern.3

revision 406 by ph10, Mon Mar 23 12:05:43 2009 UTC revision 412 by ph10, Sat Apr 11 10:34:37 2009 UTC
# Line 23  description of PCRE's regular expression Line 23  description of PCRE's regular expression
23  The original operation of PCRE was on strings of one-byte characters. However,  The original operation of PCRE was on strings of one-byte characters. However,
24  there is now also support for UTF-8 character strings. To use this, you must  there is now also support for UTF-8 character strings. To use this, you must
25  build PCRE to include UTF-8 support, and then call \fBpcre_compile()\fP with  build PCRE to include UTF-8 support, and then call \fBpcre_compile()\fP with
26  the PCRE_UTF8 option. How this affects pattern matching is mentioned in several  the PCRE_UTF8 option. There is also a special sequence that can be given at the
27  places below. There is also a summary of UTF-8 features in the  start of a pattern:
28    .sp
29      (*UTF8)
30    .sp
31    Starting a pattern with this sequence is equivalent to setting the PCRE_UTF8
32    option. This feature is not Perl-compatible. How setting UTF-8 mode affects
33    pattern matching is mentioned in several places below. There is also a summary
34    of UTF-8 features in the
35  .\" HTML <a href="pcre.html#utf8support">  .\" HTML <a href="pcre.html#utf8support">
36  .\" </a>  .\" </a>
37  section on UTF-8 support  section on UTF-8 support
# Line 1032  The PCRE-specific options PCRE_DUPNAMES, Line 1039  The PCRE-specific options PCRE_DUPNAMES,
1039  changed in the same way as the Perl-compatible options by using the characters  changed in the same way as the Perl-compatible options by using the characters
1040  J, U and X respectively.  J, U and X respectively.
1041  .P  .P
1042  When an option change occurs at top level (that is, not inside subpattern  When one of these option changes occurs at top level (that is, not inside
1043  parentheses), the change applies to the remainder of the pattern that follows.  subpattern parentheses), the change applies to the remainder of the pattern
1044  If the change is placed right at the start of a pattern, PCRE extracts it into  that follows. If the change is placed right at the start of a pattern, PCRE
1045  the global options (and it will therefore show up in data extracted by the  extracts it into the global options (and it will therefore show up in data
1046  \fBpcre_fullinfo()\fP function).  extracted by the \fBpcre_fullinfo()\fP function).
1047  .P  .P
1048  An option change within a subpattern (see below for a description of  An option change within a subpattern (see below for a description of
1049  subpatterns) affects only that part of the current pattern that follows it, so  subpatterns) affects only that part of the current pattern that follows it, so
# Line 1057  behaviour otherwise. Line 1064  behaviour otherwise.
1064  .P  .P
1065  \fBNote:\fP There are other PCRE-specific options that can be set by the  \fBNote:\fP There are other PCRE-specific options that can be set by the
1066  application when the compile or match functions are called. In some cases the  application when the compile or match functions are called. In some cases the
1067  pattern can contain special leading sequences to override what the application  pattern can contain special leading sequences such as (*CRLF) to override what
1068  has set or what has been defaulted. Details are given in the section entitled  the application has set or what has been defaulted. Details are given in the
1069    section entitled
1070  .\" HTML <a href="#newlineseq">  .\" HTML <a href="#newlineseq">
1071  .\" </a>  .\" </a>
1072  "Newline sequences"  "Newline sequences"
1073  .\"  .\"
1074  above.  above. There is also the (*UTF8) leading sequence that can be used to set UTF-8
1075    mode; this is equivalent to setting the PCRE_UTF8 option.
1076  .  .
1077  .  .
1078  .\" HTML <a name="subpattern"></a>  .\" HTML <a name="subpattern"></a>
# Line 2245  Cambridge CB2 3QH, England. Line 2254  Cambridge CB2 3QH, England.
2254  .rs  .rs
2255  .sp  .sp
2256  .nf  .nf
2257  Last updated: 18 March 2009  Last updated: 11 April 2009
2258  Copyright (c) 1997-2009 University of Cambridge.  Copyright (c) 1997-2009 University of Cambridge.
2259  .fi  .fi
