Diff of /code/trunk/doc/pcreposix.3

revision 49 by nigel, Sat Feb 24 21:39:33 2007 UTC revision 63 by nigel, Sat Feb 24 21:40:03 2007 UTC
# Line 1  Line 1 
1  .TH PCRE 3  .TH PCRE 3
2  .SH NAME  .SH NAME
3  pcreposix - POSIX API for Perl-compatible regular expressions.  PCRE - Perl-compatible regular expressions.
4  .SH SYNOPSIS  .SH SYNOPSIS OF POSIX API
5  .B #include <pcreposix.h>  .B #include <pcreposix.h>
6  .PP  .PP
7  .SM  .SM
# Line 23  pcreposix - POSIX API for Perl-compatibl Line 23  pcreposix - POSIX API for Perl-compatibl
23  .br  .br
24  .B void regfree(regex_t *\fIpreg\fR);  .B void regfree(regex_t *\fIpreg\fR);
25    
   
26  .SH DESCRIPTION  .SH DESCRIPTION
27    .rs
28    .sp
29  This set of functions provides a POSIX-style API to the PCRE regular expression  This set of functions provides a POSIX-style API to the PCRE regular expression
30  package. See the \fBpcre\fR documentation for a description of the native API,  package. See the
31  which contains additional functionality.  .\" HREF
32    \fBpcreapi\fR
33    .\"
34    documentation for a description of the native API, which contains additional
35    functionality.
36    
37  The functions described here are just wrapper functions that ultimately call  The functions described here are just wrapper functions that ultimately call
38  the native API. Their prototypes are defined in the \fBpcreposix.h\fR header  the PCRE native API. Their prototypes are defined in the \fBpcreposix.h\fR
39  file, and on Unix systems the library itself is called \fBpcreposix.a\fR, so  header file, and on Unix systems the library itself is called
40  can be accessed by adding \fB-lpcreposix\fR to the command for linking an  \fBpcreposix.a\fR, so can be accessed by adding \fB-lpcreposix\fR to the
41  application which uses them. Because the POSIX functions call the native ones,  command for linking an application which uses them. Because the POSIX functions
42  it is also necessary to add \fR-lpcre\fR.  call the native ones, it is also necessary to add \fR-lpcre\fR.
43    
44  I have implemented only those option bits that can be reasonably mapped to PCRE  I have implemented only those option bits that can be reasonably mapped to PCRE
45  native options. In addition, the options REG_EXTENDED and REG_NOSUB are defined  native options. In addition, the options REG_EXTENDED and REG_NOSUB are defined
# Line 55  structure types, \fIregex_t\fR for compi Line 60  structure types, \fIregex_t\fR for compi
60  constants whose names start with "REG_"; these are used for setting options and  constants whose names start with "REG_"; these are used for setting options and
61  identifying error codes.  identifying error codes.
62    
   
63  .SH COMPILING A PATTERN  .SH COMPILING A PATTERN
64    .rs
65    .sp
66  The function \fBregcomp()\fR is called to compile a pattern into an  The function \fBregcomp()\fR is called to compile a pattern into an
67  internal form. The pattern is a C string terminated by a binary zero, and  internal form. The pattern is a C string terminated by a binary zero, and
68  is passed in the argument \fIpattern\fR. The \fIpreg\fR argument is a pointer  is passed in the argument \fIpattern\fR. The \fIpreg\fR argument is a pointer
# Line 75  to the native function. Line 80  to the native function.
80    REG_NEWLINE    REG_NEWLINE
81    
82  The PCRE_MULTILINE option is set when the expression is passed for compilation  The PCRE_MULTILINE option is set when the expression is passed for compilation
83  to the native function.  to the native function. Note that this does \fInot\fR mimic the defined POSIX
84    behaviour for REG_NEWLINE (see the following section).
85    
86  In the absence of these flags, no options are passed to the native function.  In the absence of these flags, no options are passed to the native function.
87  This means the regex is compiled with PCRE default semantics. In  This means the regex is compiled with PCRE default semantics. In
88  particular, the way it handles newline characters in the subject string is the  particular, the way it handles newline characters in the subject string is the
89  Perl way, not the POSIX way. Note that setting PCRE_MULTILINE has only  Perl way, not the POSIX way. Note that setting PCRE_MULTILINE has only
90  \fIsome\fR of the effects specified for REG_NEWLINE. It does not affect the way  \fIsome\fR of the effects specified for REG_NEWLINE. It does not affect the way
91  newlines are matched by . (they aren't) or a negative class such as [^a] (they  newlines are matched by . (they aren't) or by a negative class such as [^a]
92  are).  (they are).
93    
94  The yield of \fBregcomp()\fR is zero on success, and non-zero otherwise. The  The yield of \fBregcomp()\fR is zero on success, and non-zero otherwise. The
95  \fIpreg\fR structure is filled in on success, and one member of the structure  \fIpreg\fR structure is filled in on success, and one member of the structure
96  is publicized: \fIre_nsub\fR contains the number of capturing subpatterns in  is public: \fIre_nsub\fR contains the number of capturing subpatterns in
97  the regular expression. Various error codes are defined in the header file.  the regular expression. Various error codes are defined in the header file.
98    
99    .SH MATCHING NEWLINE CHARACTERS
100    .rs
101    .sp
102    This area is not simple, because POSIX and Perl take different views of things.
103    It is not possible to get PCRE to obey POSIX semantics, but then PCRE was never
104    intended to be a POSIX engine. The following table lists the different
105    possibilities for matching newline characters in PCRE:
106    
107                              Default   Change with
108    
109      . matches newline          no     PCRE_DOTALL
110      newline matches [^a]       yes    not changeable
111      $ matches \\n at end        yes    PCRE_DOLLARENDONLY
112      $ matches \\n in middle     no     PCRE_MULTILINE
113      ^ matches \\n in middle     no     PCRE_MULTILINE
114    
115    This is the equivalent table for POSIX:
116    
117                              Default   Change with
118    
119      . matches newline          yes      REG_NEWLINE
120      newline matches [^a]       yes      REG_NEWLINE
121      $ matches \\n at end        no       REG_NEWLINE
122      $ matches \\n in middle     no       REG_NEWLINE
123      ^ matches \\n in middle     no       REG_NEWLINE
124    
125    PCRE's behaviour is the same as Perl's, except that there is no equivalent for
126    PCRE_DOLLARENDONLY in Perl. In both PCRE and Perl, there is no way to stop
127    newline from matching [^a].
128    
129    The default POSIX newline handling can be obtained by setting PCRE_DOTALL and
130    PCRE_DOLLARENDONLY, but there is no way to make PCRE behave exactly as for the
131    REG_NEWLINE action.
132    
133  .SH MATCHING A PATTERN  .SH MATCHING A PATTERN
134    .rs
135    .sp
136  The function \fBregexec()\fR is called to match a pre-compiled pattern  The function \fBregexec()\fR is called to match a pre-compiled pattern
137  \fIpreg\fR against a given \fIstring\fR, which is terminated by a zero byte,  \fIpreg\fR against a given \fIstring\fR, which is terminated by a zero byte,
138  subject to the options in \fIeflags\fR. These can be:  subject to the options in \fIeflags\fR. These can be:
# Line 119  have both structure members set to -1. Line 160  have both structure members set to -1.
160  A successful match yields a zero return; various error codes are defined in the  A successful match yields a zero return; various error codes are defined in the
161  header file, of which REG_NOMATCH is the "expected" failure code.  header file, of which REG_NOMATCH is the "expected" failure code.
162    
   
163  .SH ERROR MESSAGES  .SH ERROR MESSAGES
164    .rs
165    .sp
166  The \fBregerror()\fR function maps a non-zero errorcode from either  The \fBregerror()\fR function maps a non-zero errorcode from either
167  \fBregcomp\fR or \fBregexec\fR to a printable message. If \fIpreg\fR is not  \fBregcomp()\fR or \fBregexec()\fR to a printable message. If \fIpreg\fR is not
168  NULL, the error should have arisen from the use of that structure. A message  NULL, the error should have arisen from the use of that structure. A message
169  terminated by a binary zero is placed in \fIerrbuf\fR. The length of the  terminated by a binary zero is placed in \fIerrbuf\fR. The length of the
170  message, including the zero, is limited to \fIerrbuf_size\fR. The yield of the  message, including the zero, is limited to \fIerrbuf_size\fR. The yield of the
171  function is the size of buffer needed to hold the whole message.  function is the size of buffer needed to hold the whole message.
172    
   
173  .SH STORAGE  .SH STORAGE
174    .rs
175    .sp
176  Compiling a regular expression causes memory to be allocated and associated  Compiling a regular expression causes memory to be allocated and associated
177  with the \fIpreg\fR structure. The function \fBregfree()\fR frees all such  with the \fIpreg\fR structure. The function \fBregfree()\fR frees all such
178  memory, after which \fIpreg\fR may no longer be used as a compiled expression.  memory, after which \fIpreg\fR may no longer be used as a compiled expression.
179    
   
180  .SH AUTHOR  .SH AUTHOR
181    .rs
182    .sp
183  Philip Hazel <ph10@cam.ac.uk>  Philip Hazel <ph10@cam.ac.uk>
184  .br  .br
185  University Computing Service,  University Computing Service,
186  .br  .br
 New Museums Site,  
 .br  
187  Cambridge CB2 3QG, England.  Cambridge CB2 3QG, England.
 .br  
 Phone: +44 1223 334714  
188    
189  Copyright (c) 1997-2000 University of Cambridge.  .in 0
190    Last updated: 03 February 2003
191    .br
192    Copyright (c) 1997-2003 University of Cambridge.
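
The call sequence described in the man page text above (regcomp(), then regexec(), with regerror() for diagnostics and regfree() for cleanup) can be sketched in C as follows. This is a minimal illustration, not code taken from the PCRE distribution. The helper name match_once and the build macro USE_PCREPOSIX are hypothetical and exist only for this example. Without that macro the sketch falls back to the system <regex.h>, which has the same call shapes but POSIX rather than Perl newline semantics.

```c
/* Sketch only: USE_PCREPOSIX is a hypothetical build macro, not part of PCRE. */
#ifdef USE_PCREPOSIX
#include <pcreposix.h>   /* wrapper API; link with -lpcreposix -lpcre */
#else
#include <regex.h>       /* system POSIX implementation, same call shapes */
#endif
#include <stdio.h>

/* Hypothetical helper: compile `pattern`, match it once against `subject`.
   Returns 1 on a match, 0 on REG_NOMATCH, -1 on any other error.
   On a match, *so and *eo receive the offsets of the whole match. */
static int match_once(const char *pattern, const char *subject,
                      int *so, int *eo)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    /* regcomp() yields zero on success and non-zero otherwise. */
    rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0) {
        char msg[128];
        /* regerror() writes a zero-terminated message limited to the
           buffer size and returns the size needed for the full text. */
        regerror(rc, &re, msg, sizeof msg);
        fprintf(stderr, "regcomp failed: %s\n", msg);
        return -1;
    }

    /* A zero return from regexec() means a successful match;
       REG_NOMATCH is the "expected" failure code. */
    rc = regexec(&re, subject, 1, m, 0);
    if (rc == 0) {
        *so = (int)m[0].rm_so;
        *eo = (int)m[0].rm_eo;
    }

    /* Release the memory that regcomp() associated with the structure. */
    regfree(&re);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

A caller might use match_once("b+", "aabbbc", &so, &eo) and then read the match offsets from so and eo. The compiled regex_t is freed before the helper returns, so the sketch recompiles the pattern on every call; code that matches repeatedly would normally keep the regex_t around and call regfree() once at the end.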
