.\" Source: /code/trunk/doc/pcrecpp.3, revision 79,
.\" Sat Feb 24 21:40:52 2007 UTC, by nigel.
.\" Log message: Load pcre-6.1 into code/trunk.
.TH PCRECPP 3
.SH NAME
PCRE - Perl-compatible regular expressions.
.SH "SYNOPSIS OF C++ WRAPPER"
.rs
.sp
.B #include <pcrecpp.h>
.PP
.SM
.br
.SH DESCRIPTION
.rs
.sp
The C++ wrapper for PCRE was provided by Google Inc. This brief man page was
constructed from the notes in the \fIpcrecpp.h\fP file, which should be
consulted for further details.
.
.
.SH "MATCHING INTERFACE"
.rs
.sp
The "FullMatch" operation checks that the supplied text matches the supplied
pattern exactly, from the first character to the last. If pointer arguments
are supplied, it copies the sub-strings matched by the sub-patterns into them.
.sp
  Example: successful match:
     pcrecpp::RE re("h.*o");
     re.FullMatch("hello");
.sp
  Example: unsuccessful match (requires full match):
     pcrecpp::RE re("e");
     !re.FullMatch("hello");
.sp
  Example: creating a temporary RE object:
     pcrecpp::RE("h.*o").FullMatch("hello");
.sp
You can pass in a "const char*" or a "string" for "text". The examples below
tend to use a const char*. As the examples above show, you can either store
the RE object in a variable or use a temporary RE object. The examples below
use one form or the other arbitrarily; either works for any of them.
.P
You must supply extra pointer arguments to extract matched subpieces.
.sp
  Example: extracts "ruby" into "s" and 1234 into "i":
     int i;
     string s;
     pcrecpp::RE re("(\e\ew+):(\e\ed+)");
     re.FullMatch("ruby:1234", &s, &i);
.sp
  Example: does not try to extract any extra sub-patterns:
     re.FullMatch("ruby:1234", &s);
.sp
  Example: does not try to extract into NULL:
     re.FullMatch("ruby:1234", NULL, &i);
.sp
  Example: integer overflow causes failure:
     !re.FullMatch("ruby:1234567891234", NULL, &i);
.sp
  Example: fails because there aren't enough sub-patterns:
     !pcrecpp::RE("\e\ew+:\e\ed+").FullMatch("ruby:1234", &s);
.sp
  Example: fails because string cannot be stored in integer:
     !pcrecpp::RE("(.*)").FullMatch("ruby", &i);
.sp
The provided pointer arguments can be pointers to any scalar numeric
type, or one of:
.sp
   string        (matched piece is copied to string)
   StringPiece   (StringPiece is mutated to point to matched piece)
   T             (where "bool T::ParseFrom(const char*, int)" exists)
   NULL          (the corresponding matched sub-pattern is not copied)
.sp
The function returns true if and only if all of the following conditions are
satisfied:
.sp
  a. "text" matches "pattern" exactly;
.sp
  b. The number of matched sub-patterns is >= the number of supplied
     pointers;
.sp
  c. The "i"th argument has a suitable type for holding the
     string captured as the "i"th sub-pattern. If you pass in
     NULL for the "i"th argument, or pass fewer arguments than
     the number of sub-patterns, the "i"th captured sub-pattern
     is ignored.
.sp
The matching interface supports at most 16 arguments per call.
If you need more, consider using the more general interface
\fBpcrecpp::RE::DoMatch\fP. See \fBpcrecpp.h\fP for the signature for
\fBDoMatch\fP.
.
.SH "PARTIAL MATCHES"
.rs
.sp
You can use the "PartialMatch" operation when you want the pattern
to match any substring of the text.
.sp
  Example: simple search for a string:
     pcrecpp::RE("ell").PartialMatch("hello");
.sp
  Example: find first number in a string:
     int number;
     pcrecpp::RE re("(\e\ed+)");
     re.PartialMatch("x*100 + 20", &number);
     assert(number == 100);
.
.
.SH "UTF-8 AND THE MATCHING INTERFACE"
.rs
.sp
By default, pattern and text are plain text, one byte per character. The UTF8
flag, passed to the constructor, causes both pattern and string to be treated
as UTF-8 text, still a byte stream but potentially multiple bytes per
character. In practice, the text is likelier to be UTF-8 than the pattern, but
the match returned may depend on the UTF8 flag, so always use it when matching
UTF-8 text. For example, "." matches one byte normally, but with UTF8 set it
matches a whole character, which may occupy several bytes.
.sp
  Example:
     pcrecpp::RE_Options options;
     options.set_utf8();
     pcrecpp::RE re(utf8_pattern, options);
     re.FullMatch(utf8_string);
.sp
  Example: using the convenience function UTF8():
     pcrecpp::RE re(utf8_pattern, pcrecpp::UTF8());
     re.FullMatch(utf8_string);
.sp
NOTE: The UTF8 flag is ignored if pcre was not configured with the
      --enable-utf8 flag.
.
.
.SH "SCANNING TEXT INCREMENTALLY"
.rs
.sp
The "Consume" operation may be useful if you want to repeatedly
match regular expressions at the front of a string and skip over
them as they match. This requires use of the "StringPiece" type,
which represents a sub-range of a real string. Like RE, StringPiece
is defined in the pcrecpp namespace.
.sp
  Example: read lines of the form "var = value" from a string:
     string contents = ...;                 // Fill string somehow
     pcrecpp::StringPiece input(contents);  // Wrap in a StringPiece
.sp
     string var;
     int value;
     pcrecpp::RE re("(\e\ew+) = (\e\ed+)\en");
     while (re.Consume(&input, &var, &value)) {
       ...;
     }
.sp
Each successful call to "Consume" sets "var" and "value", and also
advances "input" so it points past the matched text.
.P
The "FindAndConsume" operation is similar to "Consume" but does not
anchor your match at the beginning of the string. For example, you
could extract all words from a string by repeatedly calling
.sp
     pcrecpp::RE("(\e\ew+)").FindAndConsume(&input, &word)
.
.
.SH "PARSING HEX/OCTAL/C-RADIX NUMBERS"
.rs
.sp
By default, if you pass a pointer to a numeric value, the
corresponding text is interpreted as a base-10 number. You can
instead wrap the pointer with a call to one of the operators Hex(),
Octal(), or CRadix() to interpret the text in another base. The
CRadix operator interprets C-style "0" (base-8) and "0x" (base-16)
prefixes, but defaults to base-10.
.sp
  Example:
     int a, b, c, d;
     pcrecpp::RE re("(.*) (.*) (.*) (.*)");
     re.FullMatch("100 40 0100 0x40",
                  pcrecpp::Octal(&a), pcrecpp::Hex(&b),
                  pcrecpp::CRadix(&c), pcrecpp::CRadix(&d));
.sp
will leave 64 in each of a, b, c, and d, because 100 in octal, 40 in hex,
0100 read as C-style octal, and 0x40 read as C-style hex all equal 64.
.
.
.SH "REPLACING PARTS OF STRINGS"
.rs
.sp
You can replace the first match of "pattern" in "str" with "rewrite".
Within "rewrite", backslash-escaped digits (\e1 to \e9) can be
used to insert the text matching the corresponding parenthesized group
from the pattern. \e0 in "rewrite" refers to the entire matching
text. For example:
.sp
     string s = "yabba dabba doo";
     pcrecpp::RE("b+").Replace("d", &s);
.sp
will leave "s" containing "yada dabba doo". The result is true if the pattern
matches and a replacement occurs, false otherwise.
.P
\fBGlobalReplace\fP is like \fBReplace\fP except that it replaces all
occurrences of the pattern in the string with the rewrite. Replacements are
not subject to re-matching. For example:
.sp
     string s = "yabba dabba doo";
     pcrecpp::RE("b+").GlobalReplace("d", &s);
.sp
will leave "s" containing "yada dada doo". It returns the number of
replacements made.
.P
\fBExtract\fP is like \fBReplace\fP, except that if the pattern matches,
"rewrite" is copied into "out" (an additional argument) with substitutions.
The non-matching portions of "text" are ignored. It returns true if and only
if a match occurred and the extraction succeeded; if no match occurs, "out"
is left unaffected.
.
.
.SH AUTHOR
.rs
.sp
The C++ wrapper was contributed by Google Inc.
.br
Copyright (c) 2005 Google Inc.
