| 7 |
<p> |
<p> |
| 8 |
Return to the <a href="index.html">PCRE index page</a>. |
Return to the <a href="index.html">PCRE index page</a>. |
| 9 |
</p> |
</p> |
| 10 |
<p> |
<p> |
| 11 |
This page is part of the PCRE HTML documentation. It was generated automatically |
This page is part of the PCRE HTML documentation. It was generated automatically |
| 12 |
from the original man page. If there is any nonsense in it, please consult the |
from the original man page. If there is any nonsense in it, please consult the |
| 13 |
man page, in case the conversion went wrong. |
man page, in case the conversion went wrong. |
| 14 |
<br> |
<br> |
| 15 |
<ul> |
<ul> |
| 16 |
<li><a name="TOC1" href="#SEC1">SYNOPSIS OF C++ WRAPPER</a> |
<li><a name="TOC1" href="#SEC1">SYNOPSIS OF C++ WRAPPER</a> |
| 17 |
<li><a name="TOC2" href="#SEC2">DESCRIPTION</a> |
<li><a name="TOC2" href="#SEC2">DESCRIPTION</a> |
| 18 |
<li><a name="TOC3" href="#SEC3">MATCHING INTERFACE</a> |
<li><a name="TOC3" href="#SEC3">MATCHING INTERFACE</a> |
| 19 |
<li><a name="TOC4" href="#SEC4">PARTIAL MATCHES</a> |
<li><a name="TOC4" href="#SEC4">QUOTING METACHARACTERS</a> |
| 20 |
<li><a name="TOC5" href="#SEC5">UTF-8 AND THE MATCHING INTERFACE</a> |
<li><a name="TOC5" href="#SEC5">PARTIAL MATCHES</a> |
| 21 |
<li><a name="TOC6" href="#SEC6">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a> |
<li><a name="TOC6" href="#SEC6">UTF-8 AND THE MATCHING INTERFACE</a> |
| 22 |
<li><a name="TOC7" href="#SEC7">SCANNING TEXT INCREMENTALLY</a> |
<li><a name="TOC7" href="#SEC7">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a> |
| 23 |
<li><a name="TOC8" href="#SEC8">PARSING HEX/OCTAL/C-RADIX NUMBERS</a> |
<li><a name="TOC8" href="#SEC8">SCANNING TEXT INCREMENTALLY</a> |
| 24 |
<li><a name="TOC9" href="#SEC9">REPLACING PARTS OF STRINGS</a> |
<li><a name="TOC9" href="#SEC9">PARSING HEX/OCTAL/C-RADIX NUMBERS</a> |
| 25 |
<li><a name="TOC10" href="#SEC10">AUTHOR</a> |
<li><a name="TOC10" href="#SEC10">REPLACING PARTS OF STRINGS</a> |
| 26 |
|
<li><a name="TOC11" href="#SEC11">AUTHOR</a> |
| 27 |
|
<li><a name="TOC12" href="#SEC12">REVISION</a> |
| 28 |
</ul> |
</ul> |
| 29 |
<br><a name="SEC1" href="#TOC1">SYNOPSIS OF C++ WRAPPER</a><br> |
<br><a name="SEC1" href="#TOC1">SYNOPSIS OF C++ WRAPPER</a><br> |
| 30 |
<P> |
<P> |
| 31 |
<b>#include <pcrecpp.h></b> |
<b>#include <pcrecpp.h></b> |
| 32 |
</P> |
</P> |
|
<P> |
|
|
</P> |
|
| 33 |
<br><a name="SEC2" href="#TOC1">DESCRIPTION</a><br> |
<br><a name="SEC2" href="#TOC1">DESCRIPTION</a><br> |
| 34 |
<P> |
<P> |
| 35 |
The C++ wrapper for PCRE was provided by Google Inc. Some additional |
The C++ wrapper for PCRE was provided by Google Inc. Some additional |
| 105 |
number of sub-patterns, "i"th captured sub-pattern is |
number of sub-patterns, "i"th captured sub-pattern is |
| 106 |
ignored. |
ignored. |
| 107 |
</pre> |
</pre> |
| 108 |
|
CAVEAT: An optional sub-pattern that does not exist in the matched |
| 109 |
|
string is assigned the empty string. Therefore, the following will |
| 110 |
|
return false (because the empty string is not a valid number): |
| 111 |
|
<pre> |
| 112 |
|
int number; |
| 113 |
|
pcrecpp::RE("[a-z]+(\\d+)?").FullMatch("abc", &number); |
| 114 |
|
</pre> |
| 115 |
The matching interface supports at most 16 arguments per call. |
The matching interface supports at most 16 arguments per call. |
| 116 |
If you need more, consider using the more general interface |
If you need more, consider using the more general interface |
| 117 |
<b>pcrecpp::RE::DoMatch</b>. See <b>pcrecpp.h</b> for the signature for |
<b>pcrecpp::RE::DoMatch</b>. See <b>pcrecpp.h</b> for the signature for |
| 118 |
<b>DoMatch</b>. |
<b>DoMatch</b>. |
| 119 |
</P> |
</P> |
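<P>
One way to cope with the caveat about optional sub-patterns is to look at the captured text first and convert it to a number only when the group actually took part in the match. The sketch below illustrates that idea with std::regex from the C++ standard library as a stand-in, purely so that it builds without PCRE; the function name and its parameters are hypothetical and are not part of pcrecpp.
</P>

```cpp
#include <cassert>
#include <regex>
#include <string>

// Illustration only: std::regex is used as a stand-in for the wrapper
// (an assumption made so the sketch compiles without PCRE installed).
// Returns true if the whole text matches. *has_number reports whether the
// optional digits were present; *number is written only in that case.
// Note: std::stoi throws std::out_of_range for very long digit runs.
bool match_with_optional_number(const std::string& text,
                                int* number, bool* has_number) {
    static const std::regex pattern("[a-z]+(\\d+)?");
    std::smatch m;
    if (!std::regex_match(text, m, pattern)) {
        return false;  // no full match at all
    }
    *has_number = m[1].matched;  // false when the optional group was skipped
    if (*has_number) {
        *number = std::stoi(m[1].str());
    }
    return true;
}
```

<P>
With the wrapper itself, a comparable approach would presumably be to capture into a string and convert it afterwards, so that an empty capture is not treated as a failed number parse.
</P>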
| 120 |
<br><a name="SEC4" href="#TOC1">PARTIAL MATCHES</a><br> |
<br><a name="SEC4" href="#TOC1">QUOTING METACHARACTERS</a><br> |
| 121 |
|
<P> |
| 122 |
|
You can use the "QuoteMeta" operation to insert backslashes before all |
| 123 |
|
potentially meaningful characters in a string. The returned string, used as a |
| 124 |
|
regular expression, will exactly match the original string. |
| 125 |
|
<pre> |
| 126 |
|
Example: |
| 127 |
|
string quoted = RE::QuoteMeta(unquoted); |
| 128 |
|
</pre> |
| 129 |
|
Note that it's legal to escape a character even if it has no special meaning in |
| 130 |
|
a regular expression -- so this function does that. (This also makes it |
| 131 |
|
identical to the perl function of the same name; see "perldoc -f quotemeta".) |
| 132 |
|
For example, "1.5-2.0?" becomes "1\.5\-2\.0\?". |
| 133 |
|
</P> |
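<P>
The quoting rule described above can be sketched in a few lines of standard C++. This is a hypothetical re-implementation for illustration, not the code behind QuoteMeta: it assumes plain ASCII input in the default "C" locale and simply puts a backslash before every character that is not a letter, a digit, or an underscore.
</P>

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the behaviour described in the text, not the
// pcrecpp implementation. Every character other than an ASCII letter,
// digit, or underscore gets a backslash in front of it, whether or not it
// is special in a regular expression.
std::string quote_meta_sketch(const std::string& unquoted) {
    std::string quoted;
    quoted.reserve(unquoted.size() * 2);  // worst case: every char escaped
    for (unsigned char c : unquoted) {
        if (!std::isalnum(c) && c != '_') {
            quoted += '\\';
        }
        quoted += static_cast<char>(c);
    }
    return quoted;
}
```

<P>
Applied to the example from the text, quote_meta_sketch("1.5-2.0?") yields the same escaped form, with a backslash before each dot, the hyphen, and the question mark.
</P>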
| 134 |
|
<br><a name="SEC5" href="#TOC1">PARTIAL MATCHES</a><br> |
| 135 |
<P> |
<P> |
| 136 |
You can use the "PartialMatch" operation when you want the pattern |
You can use the "PartialMatch" operation when you want the pattern |
| 137 |
to match any substring of the text. |
to match any substring of the text. |
| 146 |
assert(number == 100); |
assert(number == 100); |
| 147 |
</PRE> |
</PRE> |
| 148 |
</P> |
</P> |
| 149 |
<br><a name="SEC5" href="#TOC1">UTF-8 AND THE MATCHING INTERFACE</a><br> |
<br><a name="SEC6" href="#TOC1">UTF-8 AND THE MATCHING INTERFACE</a><br> |
| 150 |
<P> |
<P> |
| 151 |
By default, pattern and text are plain text, one byte per character. The UTF8 |
By default, pattern and text are plain text, one byte per character. The UTF8 |
| 152 |
flag, passed to the constructor, causes both pattern and string to be treated |
flag, passed to the constructor, causes both pattern and string to be treated |
| 171 |
--enable-utf8 flag. |
--enable-utf8 flag. |
| 172 |
</PRE> |
</PRE> |
| 173 |
</P> |
</P> |
| 174 |
<br><a name="SEC6" href="#TOC1">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a><br> |
<br><a name="SEC7" href="#TOC1">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a><br> |
| 175 |
<P> |
<P> |
| 176 |
PCRE defines some modifiers to change the behavior of the regular expression |
PCRE defines some modifiers to change the behavior of the regular expression |
| 177 |
engine. The C++ wrapper defines an auxiliary class, RE_Options, as a vehicle to |
engine. The C++ wrapper defines an auxiliary class, RE_Options, as a vehicle to |
| 209 |
<pre> |
<pre> |
| 210 |
RE_Options & set_caseless(bool) |
RE_Options & set_caseless(bool) |
| 211 |
</pre> |
</pre> |
| 212 |
which sets or unsets the modifier. Moreover, PCRE_CONFIG_MATCH_LIMIT can be |
which sets or unsets the modifier. Moreover, PCRE_EXTRA_MATCH_LIMIT can be |
| 213 |
accessed through the <b>set_match_limit()</b> and <b>match_limit()</b> member |
accessed through the <b>set_match_limit()</b> and <b>match_limit()</b> member |
| 214 |
functions. Setting <i>match_limit</i> to a non-zero value will limit the |
functions. Setting <i>match_limit</i> to a non-zero value will limit the |
| 215 |
execution of pcre to keep it from doing bad things like blowing the stack or |
execution of pcre to keep it from doing bad things like blowing the stack or |
| 216 |
taking an eternity to return a result. A value of 5000 is good enough to stop |
taking an eternity to return a result. A value of 5000 is good enough to stop |
| 217 |
stack blowup in a 2MB thread stack. Setting <i>match_limit</i> to zero disables |
stack blowup in a 2MB thread stack. Setting <i>match_limit</i> to zero disables |
| 218 |
match limiting. |
match limiting. Alternatively, you can call <b>match_limit_recursion()</b> |
| 219 |
|
which uses PCRE_EXTRA_MATCH_LIMIT_RECURSION to limit how much PCRE |
| 220 |
|
recurses. <b>match_limit()</b> limits the number of matches PCRE does; |
| 221 |
|
<b>match_limit_recursion()</b> limits the depth of internal recursion, and |
| 222 |
|
therefore the amount of stack that is used. |
| 223 |
</P> |
</P> |
| 224 |
<P> |
<P> |
| 225 |
Normally, to pass one or more modifiers to a RE class, you declare |
Normally, to pass one or more modifiers to a RE class, you declare |
| 265 |
|
|
| 266 |
</PRE> |
</PRE> |
| 267 |
</P> |
</P> |
| 268 |
<br><a name="SEC7" href="#TOC1">SCANNING TEXT INCREMENTALLY</a><br> |
<br><a name="SEC8" href="#TOC1">SCANNING TEXT INCREMENTALLY</a><br> |
| 269 |
<P> |
<P> |
| 270 |
The "Consume" operation may be useful if you want to repeatedly |
The "Consume" operation may be useful if you want to repeatedly |
| 271 |
match regular expressions at the front of a string and skip over |
match regular expressions at the front of a string and skip over |
| 298 |
pcrecpp::RE("(\\w+)").FindAndConsume(&input, &word) |
pcrecpp::RE("(\\w+)").FindAndConsume(&input, &word) |
| 299 |
</PRE> |
</PRE> |
| 300 |
</P> |
</P> |
| 301 |
<br><a name="SEC8" href="#TOC1">PARSING HEX/OCTAL/C-RADIX NUMBERS</a><br> |
<br><a name="SEC9" href="#TOC1">PARSING HEX/OCTAL/C-RADIX NUMBERS</a><br> |
| 302 |
<P> |
<P> |
| 303 |
By default, if you pass a pointer to a numeric value, the |
By default, if you pass a pointer to a numeric value, the |
| 304 |
corresponding text is interpreted as a base-10 number. You can |
corresponding text is interpreted as a base-10 number. You can |
| 316 |
</pre> |
</pre> |
| 317 |
will leave 64 in a, b, c, and d. |
will leave 64 in a, b, c, and d. |
| 318 |
</P> |
</P> |
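<P>
To see how one value can be written in several radixes, the sketch below parses text with std::strtol from the C++ standard library rather than with the wrapper. The helper name is hypothetical, and the sample inputs in the note that follows are illustrative choices that all evaluate to 64; base 0 asks strtol to infer the radix from a C-style prefix, which is assumed here to be close to what a C-radix parse means.
</P>

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of pcrecpp: parse the whole of `text` as an
// integer in the given base. Base 0 lets strtol pick octal for a leading
// "0", hexadecimal for a leading "0x", and decimal otherwise.
bool parse_with_radix(const std::string& text, int base, long* out) {
    const char* begin = text.c_str();
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(begin, &end, base);
    if (end == begin || *end != '\0' || errno != 0) {
        return false;  // nothing parsed, trailing junk, or out of range
    }
    *out = value;
    return true;
}
```

<P>
For instance, "100" in base 8, "40" in base 16, and both "0100" and "0x40" in base 0 all parse to 64, while "100" in base 10 parses to 100.
</P>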
| 319 |
<br><a name="SEC9" href="#TOC1">REPLACING PARTS OF STRINGS</a><br> |
<br><a name="SEC10" href="#TOC1">REPLACING PARTS OF STRINGS</a><br> |
| 320 |
<P> |
<P> |
| 321 |
You can replace the first match of "pattern" in "str" with "rewrite". |
You can replace the first match of "pattern" in "str" with "rewrite". |
| 322 |
Within "rewrite", backslash-escaped digits (\1 to \9) can be |
Within "rewrite", backslash-escaped digits (\1 to \9) can be |
| 348 |
occurred and the extraction happened successfully; if no match occurs, the |
occurred and the extraction happened successfully; if no match occurs, the |
| 349 |
string is left unaffected. |
string is left unaffected. |
| 350 |
</P> |
</P> |
| 351 |
<br><a name="SEC10" href="#TOC1">AUTHOR</a><br> |
<br><a name="SEC11" href="#TOC1">AUTHOR</a><br> |
| 352 |
<P> |
<P> |
| 353 |
The C++ wrapper was contributed by Google Inc. |
The C++ wrapper was contributed by Google Inc. |
| 354 |
<br> |
<br> |
| 355 |
Copyright © 2005 Google Inc. |
Copyright © 2006 Google Inc. |
| 356 |
|
<br> |
| 357 |
|
</P> |
| 358 |
|
<br><a name="SEC12" href="#TOC1">REVISION</a><br> |
| 359 |
|
<P> |
| 360 |
|
Last updated: 06 March 2007 |
| 361 |
|
<br> |
| 362 |
<p> |
<p> |
| 363 |
Return to the <a href="index.html">PCRE index page</a>. |
Return to the <a href="index.html">PCRE index page</a>. |
| 364 |
</p> |
</p> |