Diff of /code/trunk/doc/html/pcreposix.html

revision 77 by nigel, Sat Feb 24 21:40:45 2007 UTC revision 453 by ph10, Fri Sep 18 19:12:35 2009 UTC
# Line 21  man page, in case the conversion went wr Line 21  man page, in case the conversion went wr
21  <li><a name="TOC6" href="#SEC6">ERROR MESSAGES</a>  <li><a name="TOC6" href="#SEC6">ERROR MESSAGES</a>
22  <li><a name="TOC7" href="#SEC7">MEMORY USAGE</a>  <li><a name="TOC7" href="#SEC7">MEMORY USAGE</a>
23  <li><a name="TOC8" href="#SEC8">AUTHOR</a>  <li><a name="TOC8" href="#SEC8">AUTHOR</a>
24    <li><a name="TOC9" href="#SEC9">REVISION</a>
25  </ul>  </ul>
26  <br><a name="SEC1" href="#TOC1">SYNOPSIS OF POSIX API</a><br>  <br><a name="SEC1" href="#TOC1">SYNOPSIS OF POSIX API</a><br>
27  <P>  <P>
# Line 58  command for linking an application that Line 59  command for linking an application that
59  call the native ones, it is also necessary to add <b>-lpcre</b>.  call the native ones, it is also necessary to add <b>-lpcre</b>.
60  </P>  </P>
61  <P>  <P>
62  I have implemented only those option bits that can be reasonably mapped to PCRE  I have implemented only those POSIX option bits that can be reasonably mapped
63  native options. In addition, the options REG_EXTENDED and REG_NOSUB are defined  to PCRE native options. In addition, the option REG_EXTENDED is defined with
64  with the value zero. They have no effect, but since programs that are written  the value zero. This has no effect, but since programs that are written to the
65  to the POSIX interface often use them, this makes it easier to slot in PCRE as  POSIX interface often use it, this makes it easier to slot in PCRE as a
66  a replacement library. Other POSIX options are not even defined.  replacement library. Other POSIX options are not even defined.
67    </P>
68    <P>
69    There are also some other options that are not defined by POSIX. These have
70    been added at the request of users who want to make use of certain
71    PCRE-specific features via the POSIX calling interface.
72  </P>  </P>
73  <P>  <P>
74  When PCRE is called via these functions, it is only the API that is POSIX-like  When PCRE is called via these functions, it is only the API that is POSIX-like
# Line 89  The function regcomp() is called Line 95  The function regcomp() is called
95  internal form. The pattern is a C string terminated by a binary zero, and  internal form. The pattern is a C string terminated by a binary zero, and
96  is passed in the argument <i>pattern</i>. The <i>preg</i> argument is a pointer  is passed in the argument <i>pattern</i>. The <i>preg</i> argument is a pointer
97  to a <b>regex_t</b> structure that is used as a base for storing information  to a <b>regex_t</b> structure that is used as a base for storing information
98  about the compiled expression.  about the compiled regular expression.
99  </P>  </P>
100  <P>  <P>
101  The argument <i>cflags</i> is either zero, or contains one or more of the bits  The argument <i>cflags</i> is either zero, or contains one or more of the bits
# Line 97  defined by the following macros: Line 103  defined by the following macros:
103  <pre>  <pre>
104    REG_DOTALL    REG_DOTALL
105  </pre>  </pre>
106  The PCRE_DOTALL option is set when the expression is passed for compilation to  The PCRE_DOTALL option is set when the regular expression is passed for
107  the native function. Note that REG_DOTALL is not part of the POSIX standard.  compilation to the native function. Note that REG_DOTALL is not part of the
108    POSIX standard.
109  <pre>  <pre>
110    REG_ICASE    REG_ICASE
111  </pre>  </pre>
112  The PCRE_CASELESS option is set when the expression is passed for compilation  The PCRE_CASELESS option is set when the regular expression is passed for
113  to the native function.  compilation to the native function.
114  <pre>  <pre>
115    REG_NEWLINE    REG_NEWLINE
116  </pre>  </pre>
117  The PCRE_MULTILINE option is set when the expression is passed for compilation  The PCRE_MULTILINE option is set when the regular expression is passed for
118  to the native function. Note that this does <i>not</i> mimic the defined POSIX  compilation to the native function. Note that this does <i>not</i> mimic the
119  behaviour for REG_NEWLINE (see the following section).  defined POSIX behaviour for REG_NEWLINE (see the following section).
120    <pre>
121      REG_NOSUB
122    </pre>
123    The PCRE_NO_AUTO_CAPTURE option is set when the regular expression is passed
124    for compilation to the native function. In addition, when a pattern that is
125    compiled with this flag is passed to <b>regexec()</b> for matching, the
126    <i>nmatch</i> and <i>pmatch</i> arguments are ignored, and no captured strings
127    are returned.
128    <pre>
129      REG_UNGREEDY
130    </pre>
131    The PCRE_UNGREEDY option is set when the regular expression is passed for
132    compilation to the native function. Note that REG_UNGREEDY is not part of the
133    POSIX standard.
134    <pre>
135      REG_UTF8
136    </pre>
137    The PCRE_UTF8 option is set when the regular expression is passed for
138    compilation to the native function. This causes the pattern itself and all data
139    strings used for matching it to be treated as UTF-8 strings. Note that REG_UTF8
140    is not part of the POSIX standard.
141  </P>  </P>
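A minimal sketch of combining the compile-time flags described above, assuming the system `<regex.h>` header (with PCRE's wrapper you would include `pcreposix.h` instead and link with `-lpcreposix -lpcre`). The helper name `matches_ignoring_case` is hypothetical. Because REG_NOSUB is set, the <i>nmatch</i> and <i>pmatch</i> arguments of regexec() are ignored, so the sketch passes 0 and NULL.

```c
#include <assert.h>
#include <regex.h>   /* assumption: system header; PCRE's wrapper uses pcreposix.h */
#include <stddef.h>

/* Hypothetical helper: returns 1 if subject matches pattern without regard
 * to case, 0 if it does not match, and -1 on a compile or matching error. */
static int matches_ignoring_case(const char *pattern, const char *subject)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && subject != NULL);

    /* REG_ICASE: caseless matching. REG_NOSUB: no captured strings wanted. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;          /* re must not be used after a failed compile */

    /* nmatch and pmatch are ignored for a pattern compiled with REG_NOSUB. */
    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

A caller only needs the yes/no answer, for example `matches_ignoring_case("hello", "Say HELLO")`.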
142  <P>  <P>
143  In the absence of these flags, no options are passed to the native function.  In the absence of these flags, no options are passed to the native function.
# Line 117  This means that the regex is compiled wit Line 145  This means that the regex is compiled wit
145  particular, the way it handles newline characters in the subject string is the  particular, the way it handles newline characters in the subject string is the
146  Perl way, not the POSIX way. Note that setting PCRE_MULTILINE has only  Perl way, not the POSIX way. Note that setting PCRE_MULTILINE has only
147  <i>some</i> of the effects specified for REG_NEWLINE. It does not affect the way  <i>some</i> of the effects specified for REG_NEWLINE. It does not affect the way
148  newlines are matched by . (they aren't) or by a negative class such as [^a]  newlines are matched by . (they are not) or by a negative class such as [^a]
149  (they are).  (they are).
150  </P>  </P>
151  <P>  <P>
# Line 126  The yield of regcomp() is zero on Line 154  The yield of regcomp() is zero on
154  is public: <i>re_nsub</i> contains the number of capturing subpatterns in  is public: <i>re_nsub</i> contains the number of capturing subpatterns in
155  the regular expression. Various error codes are defined in the header file.  the regular expression. Various error codes are defined in the header file.
156  </P>  </P>
157    <P>
158    NOTE: If the yield of <b>regcomp()</b> is non-zero, you must not attempt to
159    use the contents of the <i>preg</i> structure. If, for example, you pass it to
160    <b>regexec()</b>, the result is undefined and your program is likely to crash.
161    </P>
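The NOTE above can be sketched as a small wrapper that checks the yield of regcomp() before anything else touches the structure. This is an illustration only: it assumes the system `<regex.h>` header (or `pcreposix.h` when building against PCRE's wrapper), and `compile_or_report` is a hypothetical name. On failure the only thing done with <i>preg</i> is to hand it to regerror(), which is the standard error-reporting path.

```c
#include <assert.h>
#include <regex.h>   /* assumption: system header; PCRE's wrapper uses pcreposix.h */
#include <stddef.h>

/* Hypothetical helper: compile pattern into *re.
 * Returns 0 on success. On failure, stores a message in errbuf and returns
 * the non-zero regcomp() code; the caller must then not pass *re to
 * regexec() or regfree(). */
static int compile_or_report(regex_t *re, const char *pattern, int cflags,
                             char *errbuf, size_t errlen)
{
    int rc;

    assert(re != NULL && pattern != NULL && errbuf != NULL && errlen > 0);

    errbuf[0] = '\0';
    rc = regcomp(re, pattern, cflags);
    if (rc != 0) {
        /* Translate the error code into text for the caller. */
        regerror(rc, re, errbuf, errlen);
    }
    return rc;
}
```

On success, the public field `re_nsub` holds the number of capturing subpatterns and the caller is responsible for calling regfree().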
162  <br><a name="SEC4" href="#TOC1">MATCHING NEWLINE CHARACTERS</a><br>  <br><a name="SEC4" href="#TOC1">MATCHING NEWLINE CHARACTERS</a><br>
163  <P>  <P>
164  This area is not simple, because POSIX and Perl take different views of things.  This area is not simple, because POSIX and Perl take different views of things.
# Line 163  REG_NEWLINE action. Line 196  REG_NEWLINE action.
196  <br><a name="SEC5" href="#TOC1">MATCHING A PATTERN</a><br>  <br><a name="SEC5" href="#TOC1">MATCHING A PATTERN</a><br>
197  <P>  <P>
198  The function <b>regexec()</b> is called to match a compiled pattern <i>preg</i>  The function <b>regexec()</b> is called to match a compiled pattern <i>preg</i>
199  against a given <i>string</i>, which is terminated by a zero byte, subject to  against a given <i>string</i>, which is by default terminated by a zero byte
200  the options in <i>eflags</i>. These can be:  (but see REG_STARTEND below), subject to the options in <i>eflags</i>. These can
201    be:
202  <pre>  <pre>
203    REG_NOTBOL    REG_NOTBOL
204  </pre>  </pre>
205  The PCRE_NOTBOL option is set when calling the underlying PCRE matching  The PCRE_NOTBOL option is set when calling the underlying PCRE matching
206  function.  function.
207  <pre>  <pre>
208      REG_NOTEMPTY
209    </pre>
210    The PCRE_NOTEMPTY option is set when calling the underlying PCRE matching
211    function. Note that REG_NOTEMPTY is not part of the POSIX standard. However,
212    setting this option can give more POSIX-like behaviour in some situations.
213    <pre>
214    REG_NOTEOL    REG_NOTEOL
215  </pre>  </pre>
216  The PCRE_NOTEOL option is set when calling the underlying PCRE matching  The PCRE_NOTEOL option is set when calling the underlying PCRE matching
217  function.  function.
218  </P>  <pre>
219  <P>    REG_STARTEND
220  The portion of the string that was matched, and also any captured substrings,  </pre>
221  are returned via the <i>pmatch</i> argument, which points to an array of  The string is considered to start at <i>string</i> + <i>pmatch[0].rm_so</i> and
222  <i>nmatch</i> structures of type <i>regmatch_t</i>, containing the members  to have a terminating NUL located at <i>string</i> + <i>pmatch[0].rm_eo</i>
223  <i>rm_so</i> and <i>rm_eo</i>. These contain the offset to the first character of  (there need not actually be a NUL at that location), regardless of the value of
224  each substring and the offset to the first character after the end of each  <i>nmatch</i>. This is a BSD extension, compatible with but not specified by
225  substring, respectively. The 0th element of the vector relates to the entire  IEEE Standard 1003.2 (POSIX.2), and should be used with caution in software
226  portion of <i>string</i> that was matched; subsequent elements relate to the  intended to be portable to other systems. Note that a non-zero <i>rm_so</i> does
227  capturing subpatterns of the regular expression. Unused entries in the array  not imply REG_NOTBOL; REG_STARTEND affects only the location of the string, not
228  have both structure members set to -1.  how it is matched.
229    </P>
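A hedged sketch of REG_STARTEND: the window to search is passed in through <i>pmatch[0]</i>, and the buffer does not need a NUL at the end of the window. It assumes a `<regex.h>` that defines REG_STARTEND (it is a BSD extension, so not every system provides it), and it assumes that the offsets written back are relative to the start of <i>string</i> rather than to the start of the window; check this against the library you link with. The helper name `match_in_window` is hypothetical.

```c
#include <assert.h>
#include <regex.h>   /* assumption: this header defines REG_STARTEND */
#include <stddef.h>

/* Hypothetical helper: search buf[start..end) with a compiled pattern.
 * Returns the regexec() result; on success *mstart and *mend receive the
 * offsets of the match (assumed to be relative to buf, not to buf + start). */
static int match_in_window(const regex_t *re, const char *buf,
                           size_t start, size_t end,
                           size_t *mstart, size_t *mend)
{
    regmatch_t pm[1];
    int rc;

    assert(re != NULL && buf != NULL && mstart != NULL && mend != NULL);
    assert(start <= end);

    /* With REG_STARTEND, pmatch[0] is an input as well as an output. */
    pm[0].rm_so = (regoff_t)start;
    pm[0].rm_eo = (regoff_t)end;

    rc = regexec(re, buf, 1, pm, REG_STARTEND);
    if (rc == 0) {
        *mstart = (size_t)pm[0].rm_so;
        *mend = (size_t)pm[0].rm_eo;
    }
    return rc;
}
```

Since a non-zero <i>rm_so</i> does not imply REG_NOTBOL, a caller that wants the window start not to count as a line start would have to add REG_NOTBOL to the flags itself.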
230    <P>
231    If the pattern was compiled with the REG_NOSUB flag, no data about any matched
232    strings is returned. The <i>nmatch</i> and <i>pmatch</i> arguments of
233    <b>regexec()</b> are ignored.
234    </P>
235    <P>
236    If the value of <i>nmatch</i> is zero, or if the value of <i>pmatch</i> is NULL,
237    no data about any matched strings is returned.
238    </P>
239    <P>
240    Otherwise, the portion of the string that was matched, and also any captured
241    substrings, are returned via the <i>pmatch</i> argument, which points to an
242    array of <i>nmatch</i> structures of type <i>regmatch_t</i>, containing the
243    members <i>rm_so</i> and <i>rm_eo</i>. These contain the offset to the first
244    character of each substring and the offset to the first character after the end
245    of each substring, respectively. The 0th element of the vector relates to the
246    entire portion of <i>string</i> that was matched; subsequent elements relate to
247    the capturing subpatterns of the regular expression. Unused entries in the
248    array have both structure members set to -1.
249  </P>  </P>
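The <i>pmatch</i> layout described above can be sketched as follows. This is an illustration under stated assumptions: it uses the system `<regex.h>` header (or `pcreposix.h` with PCRE's wrapper), the helper name `copy_group` and the limit `MAX_GROUPS` are hypothetical, and the pattern is recompiled on every call only to keep the example short.

```c
#include <assert.h>
#include <regex.h>   /* assumption: system header; PCRE's wrapper uses pcreposix.h */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy substring number `group` (0 = whole match) of
 * the first match of pattern in subject into out. Returns the length
 * copied, or -1 on compile error, no match, unset group, or short buffer. */
static int copy_group(const char *pattern, const char *subject, size_t group,
                      char *out, size_t outlen)
{
    enum { MAX_GROUPS = 10 };            /* illustrative upper bound */
    regmatch_t pm[MAX_GROUPS];
    regex_t re;
    int result = -1;

    assert(pattern != NULL && subject != NULL && out != NULL);
    assert(outlen > 0 && group < MAX_GROUPS);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Unused entries have rm_so and rm_eo set to -1. */
    if (regexec(&re, subject, MAX_GROUPS, pm, 0) == 0 && pm[group].rm_so != -1) {
        /* rm_so: offset of first character; rm_eo: offset just past the end. */
        size_t len = (size_t)(pm[group].rm_eo - pm[group].rm_so);
        if (len < outlen) {
            memcpy(out, subject + pm[group].rm_so, len);
            out[len] = '\0';
            result = (int)len;
        }
    }
    regfree(&re);
    return result;
}
```

Element 0 gives the whole match and elements 1 onwards give the capturing subpatterns, so `copy_group("([0-9]+)-([0-9]+)", "range 10-25", 2, buf, sizeof buf)` would copy the second number.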
250  <P>  <P>
251  A successful match yields a zero return; various error codes are defined in the  A successful match yields a zero return; various error codes are defined in the
# Line 210  memory, after which preg may no l Line 270  memory, after which preg may no l
270  <P>  <P>
271  Philip Hazel  Philip Hazel
272  <br>  <br>
273  University Computing Service,  University Computing Service
274    <br>
275    Cambridge CB2 3QH, England.
276  <br>  <br>
 Cambridge CB2 3QG, England.  
277  </P>  </P>
278    <br><a name="SEC9" href="#TOC1">REVISION</a><br>
279  <P>  <P>
280  Last updated: 28 February 2005  Last updated: 02 September 2009
281    <br>
282    Copyright &copy; 1997-2009 University of Cambridge.
283  <br>  <br>
 Copyright &copy; 1997-2005 University of Cambridge.  
284  <p>  <p>
285  Return to the <a href="index.html">PCRE index page</a>.  Return to the <a href="index.html">PCRE index page</a>.
286  </p>  </p>

