Diff of /code/trunk/doc/pcrepattern.3

revision 75 by nigel, Sat Feb 24 21:40:37 2007 UTC
revision 77 by nigel, Sat Feb 24 21:40:45 2007 UTC
# Line 26  in the main Line 26  in the main
26  .\"  .\"
27  page.  page.
28  .P  .P
29    The remainder of this document discusses the patterns that are supported by
30    PCRE when its main matching function, \fBpcre_exec()\fP, is used.
31    From release 6.0, PCRE offers a second matching function,
32    \fBpcre_dfa_exec()\fP, which matches using a different algorithm that is not
33    Perl-compatible. The advantages and disadvantages of the alternative function,
34    and how it differs from the normal function, are discussed in the
35    .\" HREF
36    \fBpcrematching\fP
37    .\"
38    page.
39    .P
40  A regular expression is a pattern that is matched against a subject string from  A regular expression is a pattern that is matched against a subject string from
41  left to right. Most characters stand for themselves in a pattern, and match the  left to right. Most characters stand for themselves in a pattern, and match the
42  corresponding characters in the subject. As a trivial example, the pattern  corresponding characters in the subject. As a trivial example, the pattern
43  .sp  .sp
44    The quick brown fox    The quick brown fox
45  .sp  .sp
46  matches a portion of a subject string that is identical to itself. The power of  matches a portion of a subject string that is identical to itself. When
47  regular expressions comes from the ability to include alternatives and  caseless matching is specified (the PCRE_CASELESS option), letters are matched
48  repetitions in the pattern. These are encoded in the pattern by the use of  independently of case. In UTF-8 mode, PCRE always understands the concept of
49    case for characters whose values are less than 128, so caseless matching is
50    always possible. For characters with higher values, the concept of case is
51    supported if PCRE is compiled with Unicode property support, but not otherwise.
52    If you want to use caseless matching for characters 128 and above, you must
53    ensure that PCRE is compiled with Unicode property support as well as with
54    UTF-8 support.
55    .P
56    The power of regular expressions comes from the ability to include alternatives
57    and repetitions in the pattern. These are encoded in the pattern by the use of
58  \fImetacharacters\fP, which do not stand for themselves but instead are  \fImetacharacters\fP, which do not stand for themselves but instead are
59  interpreted in some special way.  interpreted in some special way.
60  .P  .P
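
Note on the hunk above: the paragraph added in revision 77 says that a literal pattern matches a portion of the subject identical to itself, and that with caseless matching letters are matched independently of case. A minimal sketch of that behaviour follows. It assumes Python's built-in re module as a stand-in for PCRE, with re.IGNORECASE playing the role of PCRE_CASELESS. The subject string is made up for illustration. Unlike PCRE, Python's case handling for characters above 127 does not depend on a build option, so that part of the paragraph is not modelled here.

```python
import re

# Hypothetical subject string, chosen so the literal text appears in lower case.
SUBJECT = "I saw the quick brown fox jump."
PATTERN = r"The quick brown fox"

# Caseful (default) matching: "The" in the pattern must match "The" exactly.
caseful = re.search(PATTERN, SUBJECT)

# Caseless matching: letters are matched independently of case.
caseless = re.search(PATTERN, SUBJECT, re.IGNORECASE)

print("caseful match:", caseful)
print("caseless match:", caseless.group(0) if caseless else None)
```

The caseful search finds nothing because the subject contains "the" rather than "The"; the caseless search returns the lower-case portion of the subject.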
# Line 527  class as a literal string of bytes, or b Line 547  class as a literal string of bytes, or b
547  When caseless matching is set, any letters in a class represent both their  When caseless matching is set, any letters in a class represent both their
548  upper case and lower case versions, so for example, a caseless [aeiou] matches  upper case and lower case versions, so for example, a caseless [aeiou] matches
549  "A" as well as "a", and a caseless [^aeiou] does not match "A", whereas a  "A" as well as "a", and a caseless [^aeiou] does not match "A", whereas a
550  caseful version would. When running in UTF-8 mode, PCRE supports the concept of  caseful version would. In UTF-8 mode, PCRE always understands the concept of
551  case for characters with values greater than 128 only when it is compiled with  case for characters whose values are less than 128, so caseless matching is
552  Unicode property support.  always possible. For characters with higher values, the concept of case is
553    supported if PCRE is compiled with Unicode property support, but not otherwise.
554    If you want to use caseless matching for characters 128 and above, you must
555    ensure that PCRE is compiled with Unicode property support as well as with
556    UTF-8 support.
557  .P  .P
558  The newline character is never treated in any special way in character classes,  The newline character is never treated in any special way in character classes,
559  whatever the setting of the PCRE_DOTALL or PCRE_MULTILINE options is. A class  whatever the setting of the PCRE_DOTALL or PCRE_MULTILINE options is. A class
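
Note on the hunk above: the character-class statement (a caseless [aeiou] matches "A" as well as "a", and a caseless [^aeiou] does not match "A", whereas a caseful version would) can be checked with a short sketch. As an assumption, it uses Python's re module rather than PCRE, with re.IGNORECASE in place of PCRE_CASELESS; the variable names are illustrative only.

```python
import re

# Caseless classes: each letter stands for both its upper and lower case forms.
caseless_class = re.compile(r"[aeiou]", re.IGNORECASE)
caseless_negated = re.compile(r"[^aeiou]", re.IGNORECASE)

# Caseful negated class, for comparison.
caseful_negated = re.compile(r"[^aeiou]")

results = {
    "caseless [aeiou] vs 'A'": caseless_class.fullmatch("A") is not None,
    "caseless [aeiou] vs 'a'": caseless_class.fullmatch("a") is not None,
    "caseless [^aeiou] vs 'A'": caseless_negated.fullmatch("A") is not None,
    "caseful [^aeiou] vs 'A'": caseful_negated.fullmatch("A") is not None,
}

for description, matched in results.items():
    print(description, "->", matched)
```

Only the caseless negated class rejects "A"; the other three cases match, in line with the text.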
# Line 1451  description of the interface to the call Line 1475  description of the interface to the call
1475  documentation.  documentation.
1476  .P  .P
1477  .in 0  .in 0
1478  Last updated: 09 September 2004  Last updated: 28 February 2005
1479  .br  .br
1480  Copyright (c) 1997-2004 University of Cambridge.  Copyright (c) 1997-2005 University of Cambridge.
