
Diff of /code/trunk/doc/html/pcreapi.html


revision 345 by ph10, Mon Apr 28 15:10:02 2008 UTC revision 371 by ph10, Mon Aug 25 18:28:05 2008 UTC
# Line 1376  The string to be matched by pcre_exec Line 1376  The string to be matched by pcre_exec
1376  </b><br>  </b><br>
1377  <P>  <P>
1378  The subject string is passed to <b>pcre_exec()</b> as a pointer in  The subject string is passed to <b>pcre_exec()</b> as a pointer in
1379  <i>subject</i>, a length in <i>length</i>, and a starting byte offset in  <i>subject</i>, a length (in bytes) in <i>length</i>, and a starting byte offset
1380  <i>startoffset</i>. In UTF-8 mode, the byte offset must point to the start of a  in <i>startoffset</i>. In UTF-8 mode, the byte offset must point to the start of
1381  UTF-8 character. Unlike the pattern string, the subject may contain binary zero  a UTF-8 character. Unlike the pattern string, the subject may contain binary
1382  bytes. When the starting offset is zero, the search for a match starts at the  zero bytes. When the starting offset is zero, the search for a match starts at
1383  beginning of the subject, and this is by far the most common case.  the beginning of the subject, and this is by far the most common case.
1384  </P>  </P>
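
The UTF-8 requirement on <i>startoffset</i> described above can be checked before calling <b>pcre_exec()</b>. The following is a minimal sketch, not part of PCRE: the helper name utf8_offset_is_char_start is hypothetical, and it only tests that the byte at the offset is not a UTF-8 continuation byte (10xxxxxx). It does not validate the subject as a whole.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PCRE function).
   Returns 1 if startoffset could be the start of a UTF-8 character in
   subject, or is exactly the end of the subject; returns 0 otherwise.
   length and startoffset are both counted in bytes. */
int utf8_offset_is_char_start(const char *subject, int length, int startoffset)
{
    assert(subject != NULL);
    if (startoffset < 0 || startoffset > length)
        return 0;                      /* outside the subject */
    if (startoffset == length)
        return 1;                      /* end of subject: nothing to split */
    /* UTF-8 continuation bytes have the bit pattern 10xxxxxx. */
    return (((unsigned char)subject[startoffset]) & 0xC0) != 0x80;
}
```

For the 6-byte subject "h\xC3\xA9llo", offsets 0, 1 and 3 are character starts, while offset 2 falls in the middle of the two-byte character.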
1385  <P>  <P>
1386  A non-zero starting offset is useful when searching for another match in the  A non-zero starting offset is useful when searching for another match in the
# Line 1418  a fragment of a pattern that picks out a Line 1418  a fragment of a pattern that picks out a
1418  kinds of parenthesized subpattern that do not cause substrings to be captured.  kinds of parenthesized subpattern that do not cause substrings to be captured.
1419  </P>  </P>
1420  <P>  <P>
1421  Captured substrings are returned to the caller via a vector of integer offsets  Captured substrings are returned to the caller via a vector of integers whose
1422  whose address is passed in <i>ovector</i>. The number of elements in the vector  address is passed in <i>ovector</i>. The number of elements in the vector is
1423  is passed in <i>ovecsize</i>, which must be a non-negative number. <b>Note</b>:  passed in <i>ovecsize</i>, which must be a non-negative number. <b>Note</b>: this
1424  this argument is NOT the size of <i>ovector</i> in bytes.  argument is NOT the size of <i>ovector</i> in bytes.
1425  </P>  </P>
1426  <P>  <P>
1427  The first two-thirds of the vector is used to pass back captured substrings,  The first two-thirds of the vector is used to pass back captured substrings,
1428  each substring using a pair of integers. The remaining third of the vector is  each substring using a pair of integers. The remaining third of the vector is
1429  used as workspace by <b>pcre_exec()</b> while matching capturing subpatterns,  used as workspace by <b>pcre_exec()</b> while matching capturing subpatterns,
1430  and is not available for passing back information. The length passed in  and is not available for passing back information. The number passed in
1431  <i>ovecsize</i> should always be a multiple of three. If it is not, it is  <i>ovecsize</i> should always be a multiple of three. If it is not, it is
1432  rounded down.  rounded down.
1433  </P>  </P>
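
The sizing rules above (element count rather than bytes, rounding down to a multiple of three, two-thirds used for pairs) reduce to simple integer arithmetic. The sketch below illustrates that arithmetic only. The names OVECTOR_ELEMENTS, usable_pairs and ovecsize_for_captures are assumptions made for this example and are not PCRE identifiers.

```c
#include <assert.h>

/* Number of int elements in an array: this, not sizeof(array),
   is the kind of value that ovecsize expects. */
#define OVECTOR_ELEMENTS(v) ((int)(sizeof(v) / sizeof((v)[0])))

/* Hypothetical helper: how many offset pairs can be passed back for a
   given ovecsize. The size is rounded down to a multiple of three and
   two-thirds of that is used for pairs, so the result is ovecsize / 3. */
int usable_pairs(int ovecsize)
{
    assert(ovecsize >= 0);
    return ovecsize / 3;
}

/* Hypothetical helper: smallest ovecsize that leaves room for the whole
   match plus ncaptures capturing subpatterns. */
int ovecsize_for_captures(int ncaptures)
{
    assert(ncaptures >= 0);
    return (ncaptures + 1) * 3;
}
```

Under these assumptions an <i>ovecsize</i> of 32 behaves like 30 and yields 10 usable pairs, and a pattern with two capturing subpatterns needs an <i>ovecsize</i> of at least 9.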
1434  <P>  <P>
1435  When a match is successful, information about captured substrings is returned  When a match is successful, information about captured substrings is returned
1436  in pairs of integers, starting at the beginning of <i>ovector</i>, and  in pairs of integers, starting at the beginning of <i>ovector</i>, and
1437  continuing up to two-thirds of its length at the most. The first element of a  continuing up to two-thirds of its length at the most. The first element of
1438  pair is set to the offset of the first character in a substring, and the second  each pair is set to the byte offset of the first character in a substring, and
1439  is set to the offset of the first character after the end of a substring. The  the second is set to the byte offset of the first character after the end of a
1440  first pair, <i>ovector[0]</i> and <i>ovector[1]</i>, identify the portion of the  substring. <b>Note</b>: these values are always byte offsets, even in UTF-8
1441  subject string matched by the entire pattern. The next pair is used for the  mode. They are not character counts.
1442  first capturing subpattern, and so on. The value returned by <b>pcre_exec()</b>  </P>
1443  is one more than the highest numbered pair that has been set. For example, if  <P>
1444  two substrings have been captured, the returned value is 3. If there are no  The first pair of integers, <i>ovector[0]</i> and <i>ovector[1]</i>, identify the
1445  capturing subpatterns, the return value from a successful match is 1,  portion of the subject string matched by the entire pattern. The next pair is
1446  indicating that just the first pair of offsets has been set.  used for the first capturing subpattern, and so on. The value returned by
1447    <b>pcre_exec()</b> is one more than the highest numbered pair that has been set.
1448    For example, if two substrings have been captured, the returned value is 3. If
1449    there are no capturing subpatterns, the return value from a successful match is
1450    1, indicating that just the first pair of offsets has been set.
1451  </P>  </P>
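
Because the pairs are byte offsets, a captured substring can be copied out of the subject with plain pointer arithmetic. The sketch below assumes an <i>ovector</i> that has already been filled in and the value returned by <b>pcre_exec()</b> in rc. The helper copy_capture is hypothetical and is written out here only to show how the offsets are used; it does not call PCRE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function).
   Copies capture number n (0 = whole match) from subject into buf and
   NUL-terminates it. rc is the value returned by a successful match,
   that is, one more than the highest numbered pair that was set.
   Returns the substring length in bytes, or -1 if the pair is not
   available or buf is too small. */
int copy_capture(const char *subject, const int *ovector, int rc, int n,
                 char *buf, size_t bufsize)
{
    assert(subject != NULL && ovector != NULL && buf != NULL);
    if (n < 0 || n >= rc)
        return -1;                     /* pair n was not set */

    int start = ovector[2 * n];        /* byte offset of first character */
    int end = ovector[2 * n + 1];      /* byte offset just past the end */
    if (start < 0 || end < start)
        return -1;                     /* defensive: not a usable pair */

    size_t len = (size_t)(end - start);
    if (len + 1 > bufsize)
        return -1;                     /* no room for the terminator */

    memcpy(buf, subject + start, len); /* subject may contain zero bytes */
    buf[len] = '\0';
    return (int)len;
}
```

With the subject "caf\xC3\xA9 au lait" and a hand-filled vector {0, 8, 0, 5, 6, 8, ...} and rc equal to 3, capture 1 is 5 bytes long even though it holds 4 characters, which is the distinction between byte offsets and character counts drawn above. Since the subject may contain binary zeros, the returned length rather than the terminator is the reliable measure of the substring.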
1452  <P>  <P>
1453  If a capturing subpattern is matched repeatedly, it is the last portion of the  If a capturing subpattern is matched repeatedly, it is the last portion of the
# Line 1452  string that it matched that is returned. Line 1456  string that it matched that is returned.
1456  <P>  <P>
1457  If the vector is too small to hold all the captured substring offsets, it is  If the vector is too small to hold all the captured substring offsets, it is
1458  used as far as possible (up to two-thirds of its length), and the function  used as far as possible (up to two-thirds of its length), and the function
1459  returns a value of zero. In particular, if the substring offsets are not of  returns a value of zero. If the substring offsets are not of interest,
1460  interest, <b>pcre_exec()</b> may be called with <i>ovector</i> passed as NULL and  <b>pcre_exec()</b> may be called with <i>ovector</i> passed as NULL and
1461  <i>ovecsize</i> as zero. However, if the pattern contains back references and  <i>ovecsize</i> as zero. However, if the pattern contains back references and
1462  the <i>ovector</i> is not big enough to remember the related substrings, PCRE  the <i>ovector</i> is not big enough to remember the related substrings, PCRE
1463  has to get additional memory for use during matching. Thus it is usually  has to get additional memory for use during matching. Thus it is usually
# Line 1972  Cambridge CB2 3QH, England. Line 1976  Cambridge CB2 3QH, England.
1976  </P>  </P>
1977  <br><a name="SEC22" href="#TOC1">REVISION</a><br>  <br><a name="SEC22" href="#TOC1">REVISION</a><br>
1978  <P>  <P>
1979  Last updated: 12 April 2008  Last updated: 24 August 2008
1980  <br>  <br>
1981  Copyright &copy; 1997-2008 University of Cambridge.  Copyright &copy; 1997-2008 University of Cambridge.
1982  <br>  <br>
