Diff of /code/trunk/doc/pcrecompat.3

revision 74 by nigel, Sat Feb 24 21:40:30 2007 UTC
revision 75 by nigel, Sat Feb 24 21:40:37 2007 UTC
# Line 1  Line 1 
1  .TH PCRE 3  .TH PCRE 3
2  .SH NAME  .SH NAME
3  PCRE - Perl-compatible regular expressions  PCRE - Perl-compatible regular expressions
4  .SH DIFFERENCES FROM PERL  .SH "DIFFERENCES BETWEEN PCRE AND PERL"
5  .rs  .rs
6  .sp  .sp
7  This document describes the differences in the ways that PCRE and Perl handle  This document describes the differences in the ways that PCRE and Perl handle
8  regular expressions. The differences described here are with respect to Perl  regular expressions. The differences described here are with respect to Perl
9  5.8.  5.8.
10    .P
11  1. PCRE does not have full UTF-8 support. Details of what it does have are  1. PCRE does not have full UTF-8 support. Details of what it does have are
12  given in the  given in the
13  .\" HTML <a href="pcre.html#utf8support">  .\" HTML <a href="pcre.html#utf8support">
# Line 16  section on UTF-8 support Line 16  section on UTF-8 support
16  .\"  .\"
17  in the main  in the main
18  .\" HREF  .\" HREF
19  \fBpcre\fR  \fBpcre\fP
20  .\"  .\"
21  page.  page.
22    .P
23  2. PCRE does not allow repeat quantifiers on lookahead assertions. Perl permits  2. PCRE does not allow repeat quantifiers on lookahead assertions. Perl permits
24  them, but they do not mean what you might think. For example, (?!a){3} does  them, but they do not mean what you might think. For example, (?!a){3} does
25  not assert that the next three characters are not "a". It just asserts that the  not assert that the next three characters are not "a". It just asserts that the
26  next character is not "a" three times.  next character is not "a" three times.
27    .P
28  3. Capturing subpatterns that occur inside negative lookahead assertions are  3. Capturing subpatterns that occur inside negative lookahead assertions are
29  counted, but their entries in the offsets vector are never set. Perl sets its  counted, but their entries in the offsets vector are never set. Perl sets its
30  numerical variables from any such patterns that are matched before the  numerical variables from any such patterns that are matched before the
31  assertion fails to match something (thereby succeeding), but only if the  assertion fails to match something (thereby succeeding), but only if the
32  negative lookahead assertion contains just one branch.  negative lookahead assertion contains just one branch.
33    .P
34  4. Though binary zero characters are supported in the subject string, they are  4. Though binary zero characters are supported in the subject string, they are
35  not allowed in a pattern string because it is passed as a normal C string,  not allowed in a pattern string because it is passed as a normal C string,
36  terminated by zero. The escape sequence "\\0" can be used in the pattern to  terminated by zero. The escape sequence \e0 can be used in the pattern to
37  represent a binary zero.  represent a binary zero.
38    .P
39  5. The following Perl escape sequences are not supported: \\l, \\u, \\L,  5. The following Perl escape sequences are not supported: \el, \eu, \eL,
40  \\U, \\P, \\p, \\N, and \\X. In fact these are implemented by Perl's general  \eU, and \eN. In fact these are implemented by Perl's general string-handling
41  string-handling and are not part of its pattern matching engine. If any of  and are not part of its pattern matching engine. If any of these are
42  these are encountered by PCRE, an error is generated.  encountered by PCRE, an error is generated.
43    .P
44  6. PCRE does support the \\Q...\\E escape for quoting substrings. Characters in  6. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE is
45    built with Unicode character property support. The properties that can be
46    tested with \ep and \eP are limited to the general category properties such as
47    Lu and Nd.
48    .P
49    7. PCRE does support the \eQ...\eE escape for quoting substrings. Characters in
50  between are treated as literals. This is slightly different from Perl in that $  between are treated as literals. This is slightly different from Perl in that $
51  and @ are also handled as literals inside the quotes. In Perl, they cause  and @ are also handled as literals inside the quotes. In Perl, they cause
52  variable interpolation (but of course PCRE does not have variables). Note the  variable interpolation (but of course PCRE does not have variables). Note the
53  following examples:  following examples:
54    .sp
55      Pattern            PCRE matches      Perl matches      Pattern            PCRE matches      Perl matches
56    .sp
57      \\Qabc$xyz\\E        abc$xyz           abc followed by the  .\" JOIN
58        \eQabc$xyz\eE        abc$xyz           abc followed by the
59                                             contents of $xyz                                             contents of $xyz
60      \\Qabc\\$xyz\\E       abc\\$xyz          abc\\$xyz      \eQabc\e$xyz\eE       abc\e$xyz          abc\e$xyz
61      \\Qabc\\E\\$\\Qxyz\\E   abc$xyz           abc$xyz      \eQabc\eE\e$\eQxyz\eE   abc$xyz           abc$xyz
62    .sp
63  The \\Q...\\E sequence is recognized both inside and outside character classes.  The \eQ...\eE sequence is recognized both inside and outside character classes.
64    .P
65  7. Fairly obviously, PCRE does not support the (?{code}) and (?p{code})  8. Fairly obviously, PCRE does not support the (?{code}) and (?p{code})
66  constructions. However, there is some experimental support for recursive  constructions. However, there is support for recursive patterns using the
67  patterns using the non-Perl items (?R), (?number) and (?P>name). Also, the PCRE  non-Perl items (?R), (?number), and (?P>name). Also, the PCRE "callout" feature
68  "callout" feature allows an external function to be called during pattern  allows an external function to be called during pattern matching. See the
69  matching.  .\" HREF
70    \fBpcrecallout\fP
71  8. There are some differences that are concerned with the settings of captured  .\"
72    documentation for details.
73    .P
74    9. There are some differences that are concerned with the settings of captured
75  strings when part of a pattern is repeated. For example, matching "aba" against  strings when part of a pattern is repeated. For example, matching "aba" against
76  the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".  the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".
77    .P
78  9. PCRE provides some extensions to the Perl regular expression facilities:  10. PCRE provides some extensions to the Perl regular expression facilities:
79    .sp
80  (a) Although lookbehind assertions must match fixed length strings, each  (a) Although lookbehind assertions must match fixed length strings, each
81  alternative branch of a lookbehind assertion can match a different length of  alternative branch of a lookbehind assertion can match a different length of
82  string. Perl requires them all to have the same length.  string. Perl requires them all to have the same length.
83    .sp
84  (b) If PCRE_DOLLAR_ENDONLY is set and PCRE_MULTILINE is not set, the $  (b) If PCRE_DOLLAR_ENDONLY is set and PCRE_MULTILINE is not set, the $
85  meta-character matches only at the very end of the string.  meta-character matches only at the very end of the string.
86    .sp
87  (c) If PCRE_EXTRA is set, a backslash followed by a letter with no special  (c) If PCRE_EXTRA is set, a backslash followed by a letter with no special
88  meaning is faulted.  meaning is faulted.
89    .sp
90  (d) If PCRE_UNGREEDY is set, the greediness of the repetition quantifiers is  (d) If PCRE_UNGREEDY is set, the greediness of the repetition quantifiers is
91  inverted, that is, by default they are not greedy, but if followed by a  inverted, that is, by default they are not greedy, but if followed by a
92  question mark they are.  question mark they are.
93    .sp
94  (e) PCRE_ANCHORED can be used to force a pattern to be tried only at the first  (e) PCRE_ANCHORED can be used at matching time to force a pattern to be tried
95  matching position in the subject string.  only at the first matching position in the subject string.
96    .sp
97  (f) The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAPTURE  (f) The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAPTURE
98  options for \fBpcre_exec()\fR have no Perl equivalents.  options for \fBpcre_exec()\fP have no Perl equivalents.
99    .sp
100  (g) The (?R), (?number), and (?P>name) constructs allows for recursive pattern  (g) The (?R), (?number), and (?P>name) constructs allows for recursive pattern
101  matching (Perl can do this using the (?p{code}) construct, which PCRE cannot  matching (Perl can do this using the (?p{code}) construct, which PCRE cannot
102  support.)  support.)
103    .sp
104  (h) PCRE supports named capturing substrings, using the Python syntax.  (h) PCRE supports named capturing substrings, using the Python syntax.
105    .sp
106  (i) PCRE supports the possessive quantifier "++" syntax, taken from Sun's Java  (i) PCRE supports the possessive quantifier "++" syntax, taken from Sun's Java
107  package.  package.
108    .sp
109  (j) The (R) condition, for testing recursion, is a PCRE extension.  (j) The (R) condition, for testing recursion, is a PCRE extension.
110    .sp
111  (k) The callout facility is PCRE-specific.  (k) The callout facility is PCRE-specific.
112    .sp
113    (l) The partial matching facility is PCRE-specific.
114    .sp
115    (m) Patterns compiled by PCRE can be saved and re-used at a later time, even on
116    different hosts that have the other endianness.
117    .P
118  .in 0  .in 0
119  Last updated: 09 December 2003  Last updated: 09 September 2004
120  .br  .br
121  Copyright (c) 1997-2003 University of Cambridge.  Copyright (c) 1997-2004 University of Cambridge.
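
Two of the points in the diffed text lend themselves to a quick experiment: item 9 (r75 numbering), on captures inside a repeated group, and extension (h), on named capturing substrings in the Python syntax. The sketch below is an illustration only. It uses Python's standard re module as a stand-in, which is an assumption of this example and not PCRE itself; re is chosen because it is where the (?P<name>...) syntax comes from, and because it also keeps the inner capture from an earlier iteration, matching what the text says PCRE does rather than what it says Perl does.

```python
import re

# Item 9: a capturing group nested inside a repeated group.
# The text states that Perl leaves $2 unset for this match, while PCRE
# sets it to "b". Python's re (a stand-in here, not PCRE) also keeps the
# value captured in the first iteration.
m = re.match(r"^(a(b)?)+$", "aba")
outer = m.group(1)   # text matched by the last iteration of the outer group
inner = m.group(2)   # capture left over from the first iteration

# Extension (h): named capturing groups written in the Python syntax.
date = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2004-09")
year = date.group("year")
month = date.group("month")

print(outer, inner, year, month)
```

Code that must behave the same under Perl and PCRE is safer if it does not rely on the value of a capture nested inside a repeated group.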
