Revision 41, Sat Feb 24 21:39:17 2007 UTC, by nigel: Load pcre-2.08a into code/trunk.

NAME
pcreposix - POSIX API for Perl-compatible regular expressions.



SYNOPSIS
#include <pcreposix.h>

int regcomp(regex_t *preg, const char *pattern,
     int cflags);

int regexec(regex_t *preg, const char *string,
     size_t nmatch, regmatch_t pmatch[], int eflags);

size_t regerror(int errcode, const regex_t *preg,
     char *errbuf, size_t errbuf_size);

void regfree(regex_t *preg);



DESCRIPTION
This set of functions provides a POSIX-style API to the PCRE
regular expression package. See the pcre documentation for a
description of the native API, which contains additional
functionality.

The functions described here are just wrapper functions that
ultimately call the native API. Their prototypes are defined
in the pcreposix.h header file, and on Unix systems the
library itself is called pcreposix.a, so it can be accessed by
adding -lpcreposix to the command for linking an application
that uses them. Because the POSIX functions call the native
ones, it is also necessary to add -lpcre.

As I am pretty ignorant about POSIX, these functions must be
considered experimental. I have implemented only those
option bits that can be reasonably mapped to PCRE native
options. Other POSIX options are not even defined. It may be
that it is useful to define, but ignore, other options.
Feedback from more knowledgeable folk may cause this kind of
detail to change.

When PCRE is called via these functions, it is only the API
that is POSIX-like in style. The syntax and semantics of the
regular expressions themselves are still those of Perl,
subject to the setting of various PCRE options, as described
below.

The header for these functions is supplied as pcreposix.h to
avoid any potential clash with other POSIX libraries. It
can, of course, be renamed or aliased as regex.h, which is
the "correct" name. It provides two structure types, regex_t
for compiled internal forms, and regmatch_t for returning
captured substrings. It also defines some constants whose
names start with "REG_"; these are used for setting options
and identifying error codes.



COMPILING A PATTERN
The function regcomp() is called to compile a pattern into
an internal form. The pattern is a C string terminated by a
binary zero, and is passed in the argument pattern. The preg
argument is a pointer to a regex_t structure which is used
as a base for storing information about the compiled
expression.

The argument cflags is either zero, or contains one or more
of the bits defined by the following macros:

  REG_ICASE

The PCRE_CASELESS option is set when the expression is
passed for compilation to the native function.

  REG_NEWLINE

The PCRE_MULTILINE option is set when the expression is
passed for compilation to the native function.

The yield of regcomp() is zero on success, and non-zero
otherwise. The preg structure is filled in on success, and one
member of the structure is publicized: re_nsub contains the
number of capturing subpatterns in the regular expression.
Various error codes are defined in the header file.



MATCHING A PATTERN
The function regexec() is called to match a pre-compiled
pattern preg against a given string, which is terminated by
a zero byte, subject to the options in eflags. These can be:

  REG_NOTBOL

The PCRE_NOTBOL option is set when calling the underlying
PCRE matching function.

  REG_NOTEOL

The PCRE_NOTEOL option is set when calling the underlying
PCRE matching function.

The portion of the string that was matched, and also any
captured substrings, are returned via the pmatch argument,
which points to an array of nmatch structures of type
regmatch_t, containing the members rm_so and rm_eo. These
contain the offset to the first character of each substring
and the offset to the first character after the end of each
substring, respectively. The 0th element of the vector
relates to the entire portion of string that was matched;
subsequent elements relate to the capturing subpatterns of
the regular expression. Unused entries in the array have
both structure members set to -1.

A successful match yields a zero return; various error codes
are defined in the header file, of which REG_NOMATCH is the
"expected" failure code.



ERROR MESSAGES
The regerror() function maps a non-zero error code from
either regcomp() or regexec() to a printable message. If preg is
not NULL, the error should have arisen from the use of that
structure. A message terminated by a binary zero is placed
in errbuf. The length of the message, including the zero, is
limited to errbuf_size. The yield of the function is the
size of the buffer needed to hold the whole message.



STORAGE
Compiling a regular expression causes memory to be allocated
and associated with the preg structure. The function
regfree() frees all such memory, after which preg may no longer
be used as a compiled expression.



AUTHOR
Philip Hazel <ph10@cam.ac.uk>
University Computing Service,
New Museums Site,
Cambridge CB2 3QG, England.
Phone: +44 1223 334714

Copyright (c) 1997-1999 University of Cambridge.
