| 16 |
<li><a name="TOC1" href="#SEC1">SYNOPSIS OF C++ WRAPPER</a> |
<li><a name="TOC1" href="#SEC1">SYNOPSIS OF C++ WRAPPER</a> |
| 17 |
<li><a name="TOC2" href="#SEC2">DESCRIPTION</a> |
<li><a name="TOC2" href="#SEC2">DESCRIPTION</a> |
| 18 |
<li><a name="TOC3" href="#SEC3">MATCHING INTERFACE</a> |
<li><a name="TOC3" href="#SEC3">MATCHING INTERFACE</a> |
| 19 |
<li><a name="TOC4" href="#SEC4">PARTIAL MATCHES</a> |
<li><a name="TOC4" href="#SEC4">QUOTING METACHARACTERS</a> |
| 20 |
<li><a name="TOC5" href="#SEC5">UTF-8 AND THE MATCHING INTERFACE</a> |
<li><a name="TOC5" href="#SEC5">PARTIAL MATCHES</a> |
| 21 |
<li><a name="TOC6" href="#SEC6">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a> |
<li><a name="TOC6" href="#SEC6">UTF-8 AND THE MATCHING INTERFACE</a> |
| 22 |
<li><a name="TOC7" href="#SEC7">SCANNING TEXT INCREMENTALLY</a> |
<li><a name="TOC7" href="#SEC7">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a> |
| 23 |
<li><a name="TOC8" href="#SEC8">PARSING HEX/OCTAL/C-RADIX NUMBERS</a> |
<li><a name="TOC8" href="#SEC8">SCANNING TEXT INCREMENTALLY</a> |
| 24 |
<li><a name="TOC9" href="#SEC9">REPLACING PARTS OF STRINGS</a> |
<li><a name="TOC9" href="#SEC9">PARSING HEX/OCTAL/C-RADIX NUMBERS</a> |
| 25 |
<li><a name="TOC10" href="#SEC10">AUTHOR</a> |
<li><a name="TOC10" href="#SEC10">REPLACING PARTS OF STRINGS</a> |
| 26 |
|
<li><a name="TOC11" href="#SEC11">AUTHOR</a> |
| 27 |
</ul> |
</ul> |
| 28 |
<br><a name="SEC1" href="#TOC1">SYNOPSIS OF C++ WRAPPER</a><br> |
<br><a name="SEC1" href="#TOC1">SYNOPSIS OF C++ WRAPPER</a><br> |
| 29 |
<P> |
<P> |
| 106 |
number of sub-patterns, "i"th captured sub-pattern is |
number of sub-patterns, "i"th captured sub-pattern is |
| 107 |
ignored. |
ignored. |
| 108 |
</pre> |
</pre> |
| 109 |
|
CAVEAT: An optional sub-pattern that does not exist in the matched |
| 110 |
|
string is assigned the empty string. Therefore, the following will |
| 111 |
|
return false (because the empty string is not a valid number): |
| 112 |
|
<pre> |
| 113 |
|
int number; |
| 114 |
|
pcrecpp::RE::FullMatch("abc", "[a-z]+(\\d+)?", &number); |
| 115 |
|
</pre> |
| 116 |
The matching interface supports at most 16 arguments per call. |
The matching interface supports at most 16 arguments per call. |
| 117 |
If you need more, consider using the more general interface |
If you need more, consider using the more general interface |
| 118 |
<b>pcrecpp::RE::DoMatch</b>. See <b>pcrecpp.h</b> for the signature for |
<b>pcrecpp::RE::DoMatch</b>. See <b>pcrecpp.h</b> for the signature for |
| 119 |
<b>DoMatch</b>. |
<b>DoMatch</b>. |
| 120 |
</P> |
</P> |
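The caveat about optional sub-patterns can be illustrated without the wrapper itself. The following is a minimal sketch using the standard library's std::regex rather than pcrecpp; the helper names parse_int_sketch and full_match_int_sketch are hypothetical and are not part of pcrecpp.h. It assumes, as the text states, that an unmatched optional group yields the empty string and that the empty string is rejected as a number.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <regex>
#include <string>

// Hypothetical helper: parse the whole string as a base-10 int.
// Returns false for the empty string, trailing garbage, or overflow.
bool parse_int_sketch(const std::string& s, int* out) {
    assert(out != nullptr);
    if (s.empty()) return false;  // the case the caveat describes
    errno = 0;
    char* end = nullptr;
    const long v = std::strtol(s.c_str(), &end, 10);
    if (*end != '\0' || errno == ERANGE) return false;
    if (v < INT_MIN || v > INT_MAX) return false;
    *out = static_cast<int>(v);
    return true;
}

// Hypothetical stand-in for a full match with a single int argument:
// the pattern must match the entire text, and capture group 1 must
// convert to an int.
bool full_match_int_sketch(const std::string& text,
                           const std::string& pattern, int* out) {
    std::smatch m;
    if (!std::regex_match(text, m, std::regex(pattern))) return false;
    // An optional group that did not participate yields "".
    return parse_int_sketch(m[1].str(), out);
}
```

With this sketch, matching "abc" against "[a-z]+(\\d+)?" fails at the conversion step even though the pattern itself matches, while "abc123" succeeds and stores 123. One way to avoid the failure is to extract into a string and convert only when it is non-empty.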
| 121 |
<br><a name="SEC4" href="#TOC1">PARTIAL MATCHES</a><br> |
<br><a name="SEC4" href="#TOC1">QUOTING METACHARACTERS</a><br> |
| 122 |
|
<P> |
| 123 |
|
You can use the "QuoteMeta" operation to insert backslashes before all |
| 124 |
|
potentially meaningful characters in a string. The returned string, used as a |
| 125 |
|
regular expression, will exactly match the original string. |
| 126 |
|
<pre> |
| 127 |
|
Example: |
| 128 |
|
string quoted = RE::QuoteMeta(unquoted); |
| 129 |
|
</pre> |
| 130 |
|
Note that it's legal to escape a character even if it has no special meaning in |
| 131 |
|
a regular expression -- so this function does that. (This also makes it |
| 132 |
|
identical to the perl function of the same name; see "perldoc -f quotemeta".) |
| 133 |
|
For example, "1.5-2.0?" becomes "1\.5\-2\.0\?". |
| 134 |
|
</P> |
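The quoting rule described above can be sketched in a few lines of standard C++. This is an illustration of the stated behavior (a backslash before every character that is not a letter, digit, or underscore), not the library's actual implementation; the name quote_meta_sketch is hypothetical, and the handling of NUL bytes or bytes with the high bit set in the real RE::QuoteMeta may differ.

```cpp
#include <string>

// Hypothetical sketch: backslash-escape every byte that is not
// an ASCII letter, an ASCII digit, or an underscore.
std::string quote_meta_sketch(const std::string& unquoted) {
    std::string result;
    result.reserve(unquoted.size() * 2);  // worst case: every byte escaped
    for (unsigned char c : unquoted) {
        const bool is_word = (c >= 'a' && c <= 'z') ||
                             (c >= 'A' && c <= 'Z') ||
                             (c >= '0' && c <= '9') || c == '_';
        if (!is_word) result += '\\';
        result += static_cast<char>(c);
    }
    return result;
}
```

Applied to "1.5-2.0?", the sketch produces "1\.5\-2\.0\?", which agrees with the example in the text; a string made only of letters, digits, and underscores is returned unchanged.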
| 135 |
|
<br><a name="SEC5" href="#TOC1">PARTIAL MATCHES</a><br> |
| 136 |
<P> |
<P> |
| 137 |
You can use the "PartialMatch" operation when you want the pattern |
You can use the "PartialMatch" operation when you want the pattern |
| 138 |
to match any substring of the text. |
to match any substring of the text. |
| 147 |
assert(number == 100); |
assert(number == 100); |
| 148 |
</PRE> |
</PRE> |
| 149 |
</P> |
</P> |
| 150 |
<br><a name="SEC5" href="#TOC1">UTF-8 AND THE MATCHING INTERFACE</a><br> |
<br><a name="SEC6" href="#TOC1">UTF-8 AND THE MATCHING INTERFACE</a><br> |
| 151 |
<P> |
<P> |
| 152 |
By default, pattern and text are plain text, one byte per character. The UTF8 |
By default, pattern and text are plain text, one byte per character. The UTF8 |
| 153 |
flag, passed to the constructor, causes both pattern and string to be treated |
flag, passed to the constructor, causes both pattern and string to be treated |
| 172 |
--enable-utf8 flag. |
--enable-utf8 flag. |
| 173 |
</PRE> |
</PRE> |
| 174 |
</P> |
</P> |
| 175 |
<br><a name="SEC6" href="#TOC1">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a><br> |
<br><a name="SEC7" href="#TOC1">PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE</a><br> |
| 176 |
<P> |
<P> |
| 177 |
PCRE defines some modifiers to change the behavior of the regular expression |
PCRE defines some modifiers to change the behavior of the regular expression |
| 178 |
engine. The C++ wrapper defines an auxiliary class, RE_Options, as a vehicle to |
engine. The C++ wrapper defines an auxiliary class, RE_Options, as a vehicle to |
| 266 |
|
|
| 267 |
</PRE> |
</PRE> |
| 268 |
</P> |
</P> |
| 269 |
<br><a name="SEC7" href="#TOC1">SCANNING TEXT INCREMENTALLY</a><br> |
<br><a name="SEC8" href="#TOC1">SCANNING TEXT INCREMENTALLY</a><br> |
| 270 |
<P> |
<P> |
| 271 |
The "Consume" operation may be useful if you want to repeatedly |
The "Consume" operation may be useful if you want to repeatedly |
| 272 |
match regular expressions at the front of a string and skip over |
match regular expressions at the front of a string and skip over |
| 299 |
pcrecpp::RE("(\\w+)").FindAndConsume(&input, &word) |
pcrecpp::RE("(\\w+)").FindAndConsume(&input, &word) |
| 300 |
</PRE> |
</PRE> |
| 301 |
</P> |
</P> |
| 302 |
<br><a name="SEC8" href="#TOC1">PARSING HEX/OCTAL/C-RADIX NUMBERS</a><br> |
<br><a name="SEC9" href="#TOC1">PARSING HEX/OCTAL/C-RADIX NUMBERS</a><br> |
| 303 |
<P> |
<P> |
| 304 |
By default, if you pass a pointer to a numeric value, the |
By default, if you pass a pointer to a numeric value, the |
| 305 |
corresponding text is interpreted as a base-10 number. You can |
corresponding text is interpreted as a base-10 number. You can |
| 317 |
</pre> |
</pre> |
| 318 |
will leave 64 in a, b, c, and d. |
will leave 64 in a, b, c, and d. |
| 319 |
</P> |
</P> |
| 320 |
<br><a name="SEC9" href="#TOC1">REPLACING PARTS OF STRINGS</a><br> |
<br><a name="SEC10" href="#TOC1">REPLACING PARTS OF STRINGS</a><br> |
| 321 |
<P> |
<P> |
| 322 |
You can replace the first match of "pattern" in "str" with "rewrite". |
You can replace the first match of "pattern" in "str" with "rewrite". |
| 323 |
Within "rewrite", backslash-escaped digits (\1 to \9) can be |
Within "rewrite", backslash-escaped digits (\1 to \9) can be |
| 349 |
occurred and the extraction happened successfully; if no match occurs, the |
occurred and the extraction happened successfully; if no match occurs, the |
| 350 |
string is left unaffected. |
string is left unaffected. |
| 351 |
</P> |
</P> |
| 352 |
<br><a name="SEC10" href="#TOC1">AUTHOR</a><br> |
<br><a name="SEC11" href="#TOC1">AUTHOR</a><br> |
| 353 |
<P> |
<P> |
| 354 |
The C++ wrapper was contributed by Google Inc. |
The C++ wrapper was contributed by Google Inc. |
| 355 |
<br> |
<br> |
| 356 |
Copyright © 2005 Google Inc. |
Copyright © 2006 Google Inc. |
| 357 |
<p> |
<p> |
| 358 |
Return to the <a href="index.html">PCRE index page</a>. |
Return to the <a href="index.html">PCRE index page</a>. |
| 359 |
</p> |
</p> |