Diff of /code/trunk/README

revision 23 by nigel, Sat Feb 24 21:38:41 2007 UTC revision 25 by nigel, Sat Feb 24 21:38:45 2007 UTC
# Line 8  README file for PCRE (Perl-compatible re Line 8  README file for PCRE (Perl-compatible re
8  * ovector is required at matching time, to provide some additional workspace. *  * ovector is required at matching time, to provide some additional workspace. *
9  * The new man page has details. This change was necessary in order to support *  * The new man page has details. This change was necessary in order to support *
10  * some of the new functionality in Perl 5.005.                                *  * some of the new functionality in Perl 5.005.                                *
11    *                                                                             *
12    *           IMPORTANT FOR THOSE UPGRADING FROM VERSION 2.00                   *
13    *                                                                             *
14    * Another (I hope this is the last!) change has been made to the API for the  *
15    * pcre_compile() function. An additional argument has been added to make it   *
16    * possible to pass over a pointer to character tables built in the current    *
17    * locale by pcre_maketables(). To use the default tables, this new argument   *
18    * should be passed as NULL.                                                   *
19  *******************************************************************************  *******************************************************************************
20    
21  The distribution should contain the following files:  The distribution should contain the following files:
# Line 19  The distribution should contain the foll Line 27  The distribution should contain the foll
27    Tech.Notes        notes on the encoding    Tech.Notes        notes on the encoding
28    pcre.3            man page for the functions    pcre.3            man page for the functions
29    pcreposix.3       man page for the POSIX wrapper API    pcreposix.3       man page for the POSIX wrapper API
30    maketables.c      auxiliary program for building chartables.c    deftables.c       auxiliary program for building chartables.c
31      maketables.c      )
32    study.c           ) source of    study.c           ) source of
33    pcre.c            )   the functions    pcre.c            )   the functions
34    pcreposix.c       )    pcreposix.c       )
# Line 33  The distribution should contain the foll Line 42  The distribution should contain the foll
42    testinput         test data, compatible with Perl 5.004 and 5.005    testinput         test data, compatible with Perl 5.004 and 5.005
43    testinput2        test data for error messages and non-Perl things    testinput2        test data for error messages and non-Perl things
44    testinput3        test data, compatible with Perl 5.005    testinput3        test data, compatible with Perl 5.005
45      testinput4        test data for locale-specific tests
46    testoutput        test results corresponding to testinput    testoutput        test results corresponding to testinput
47    testoutput2       test results corresponding to testinput2    testoutput2       test results corresponding to testinput2
48    testoutput3       test results corresponding to testinpug3    testoutput3       test results corresponding to testinput3
49      testoutput4       test results corresponding to testinput4
50    
51  To build PCRE, edit Makefile for your system (it is a fairly simple make file,  To build PCRE, edit Makefile for your system (it is a fairly simple make file,
52  and there are some comments at the top) and then run it. It builds two  and there are some comments at the top) and then run it. It builds two
# Line 61  widespread, these two test files may get Line 72  widespread, these two test files may get
72  The second set of tests check pcre_info(), pcre_study(), error detection and  The second set of tests check pcre_info(), pcre_study(), error detection and
73  run-time flags that are specific to PCRE, as well as the POSIX wrapper API.  run-time flags that are specific to PCRE, as well as the POSIX wrapper API.
74    
75    The fourth set of tests checks pcre_maketables(), the facility for building a
76    set of character tables for a specific locale and using them instead of the
77    default tables. The tests make use of the "fr" (French) locale. Before running
78    the test, the script checks for the presence of this locale by running the
79    "locale" command. If that command fails, or if it doesn't include "fr" in the
80    list of available locales, the fourth test cannot be run, and a comment is
81    output to say why. If running this test produces instances of the error
82    
83      ** Failed to set locale "fr"
84    
85    in the comparison output, it means that locale is not available on your system,
86    despite being listed by "locale". This does not mean that PCRE is broken.
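
The locale check described above can be sketched in C. This is only an illustration, not code from the distribution: the helper name locale_is_usable() is hypothetical, and it relies solely on the standard setlocale() call, which returns NULL when a locale cannot be selected. That is the condition behind the "Failed to set locale" message.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper (not part of PCRE): returns 1 if the named locale can
   be selected for character classification, 0 otherwise. The previous
   LC_CTYPE setting is restored before returning. */
int locale_is_usable(const char *name)
{
    char saved[256];
    const char *current;
    int ok;

    assert(name != NULL);

    /* Passing NULL only queries the current setting. Copy it, because the
       returned string may be overwritten by the next setlocale() call. */
    current = setlocale(LC_CTYPE, NULL);
    strncpy(saved, current != NULL ? current : "C", sizeof(saved) - 1);
    saved[sizeof(saved) - 1] = '\0';

    /* setlocale() returns NULL if the request cannot be honoured. */
    ok = setlocale(LC_CTYPE, name) != NULL;

    setlocale(LC_CTYPE, saved);
    return ok;
}
```

A test driver could call locale_is_usable("fr") and skip the locale-specific tests when it returns 0, even if the "locale" command listed the name.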
87    
88  To install PCRE, copy libpcre.a to any suitable library directory (e.g.  To install PCRE, copy libpcre.a to any suitable library directory (e.g.
89  /usr/local/lib), pcre.h to any suitable include directory (e.g.  /usr/local/lib), pcre.h to any suitable include directory (e.g.
90  /usr/local/include), and pcre.3 to any suitable man directory (e.g.  /usr/local/include), and pcre.3 to any suitable man directory (e.g.
# Line 83  uses the POSIX API, it will have to be r Line 107  uses the POSIX API, it will have to be r
107  Character tables  Character tables
108  ----------------  ----------------
109    
110  PCRE uses four tables for manipulating and identifying characters. These are  PCRE uses four tables for manipulating and identifying characters. The final
111  compiled from a source file called chartables.c. This is not supplied in  argument of the pcre_compile() function is a pointer to a block of memory
112  the distribution, but is built by the program maketables (compiled from  containing the concatenated tables. A call to pcre_maketables() is used to
113  maketables.c), which uses the ANSI C character handling functions such as  generate a set of tables in the current locale. However, if the final argument
114  isalnum(), isalpha(), isupper(), islower(), etc. to build the table sources.  is passed as NULL, a set of default tables that is built into the binary is
115  This means that the default C locale set in your system may affect the contents  used.
116  of the tables. You can change the tables by editing chartables.c and then  
117  re-building PCRE. If you do this, you should probably also edit Makefile to  The source file called chartables.c contains the default set of tables. This is
118  ensure that the file doesn't ever get re-generated.  not supplied in the distribution, but is built by the program deftables
119    (compiled from deftables.c), which uses the ANSI C character handling functions
120  The first two tables pcre_lcc[] and pcre_fcc[] provide lower casing and a  such as isalnum(), isalpha(), isupper(), islower(), etc. to build the table
121  case flipping functions, respectively. The pcre_cbits[] table consists of four  sources. This means that the default C locale set in your system will control the
122  32-byte bit maps which identify digits, letters, "word" characters, and white  contents of the tables. You can change the default tables by editing
123  space, respectively. These are used when building 32-byte bit maps that  chartables.c and then re-building PCRE. If you do this, you should probably
124  represent character classes.  also edit Makefile to ensure that the file doesn't ever get re-generated.
125    
126    The first two 256-byte tables provide lower casing and case flipping functions,
127    respectively. The next table consists of three 32-byte bit maps which identify
128    digits, "word" characters, and white space, respectively. These are used when
129    building 32-byte bit maps that represent character classes.
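
As a rough sketch of how such tables can be derived from the ANSI C character handling functions, the code below builds two 256-byte tables and three 32-byte bit maps for the current locale. The struct layout, the names, the bit order within each byte, and the treatment of "_" as a "word" character are assumptions made for illustration; this is not the source of deftables.c or pcre_maketables().

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical layout for illustration; PCRE concatenates its tables into
   one block of memory and its real offsets may differ. */
typedef struct {
    unsigned char lower[256];      /* lower-casing table */
    unsigned char flip[256];       /* case-flipping table */
    unsigned char digit_bits[32];  /* one bit per character: digits */
    unsigned char word_bits[32];   /* "word" characters */
    unsigned char space_bits[32];  /* white space */
} char_tables;

/* Fill the tables using the current locale's <ctype.h> behaviour. */
void build_tables(char_tables *t)
{
    int c;

    assert(t != NULL);
    memset(t, 0, sizeof(*t));

    for (c = 0; c < 256; c++) {
        unsigned char bit = (unsigned char)(1u << (c % 8));

        t->lower[c] = (unsigned char)tolower(c);
        t->flip[c] = (unsigned char)(islower(c) ? toupper(c) : tolower(c));

        if (isdigit(c)) t->digit_bits[c / 8] |= bit;
        if (isalnum(c) || c == '_') t->word_bits[c / 8] |= bit;
        if (isspace(c)) t->space_bits[c / 8] |= bit;
    }
}

/* Return 1 if character c is a member of the 32-byte bit map. */
int bitmap_test(const unsigned char *bits, int c)
{
    assert(bits != NULL && c >= 0 && c < 256);
    return (bits[c / 8] >> (c % 8)) & 1;
}
```

Because every entry comes from the locale-sensitive functions, calling setlocale() before build_tables() changes the result, which is the effect the README describes for the default tables.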
130    
131  The pcre_ctypes[] table has bits indicating various character types, as  The final 256-byte table has bits indicating various character types, as
132  follows:  follows:
133    
134      1   white space character      1   white space character
# Line 138  same effect as they do in Perl. Line 167  same effect as they do in Perl.
167    
168  There are also some upper case options that do not match Perl options: /A, /E,  There are also some upper case options that do not match Perl options: /A, /E,
169  and /X set PCRE_ANCHORED, PCRE_DOLLAR_ENDONLY, and PCRE_EXTRA respectively.  and /X set PCRE_ANCHORED, PCRE_DOLLAR_ENDONLY, and PCRE_EXTRA respectively.
170  The /D option is a PCRE debugging feature. It causes the internal form of  
171  compiled regular expressions to be output after compilation. The /S option  The /L option must be followed directly by the name of a locale, for example,
172  causes pcre_study() to be called after the expression has been compiled, and  
173  the results used when the expression is matched.    /pattern/Lfr
174    
175    For this reason, it must be the last option letter. The given locale is set,
176    pcre_maketables() is called to build a set of character tables for the locale,
177    and this is then passed to pcre_compile() when compiling the regular
178    expression. Without an /L option, NULL is passed as the tables pointer; that
179    is, /L applies only to the expression on which it appears.
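
The reason /L has to come last can be seen in a small parsing sketch. This is not pcretest's code: parse_options() and its parameters are hypothetical, and the sketch assumes only what the text states, namely that everything following the L is taken as the locale name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option scanner. opts is the text that follows the closing
   delimiter of a pattern, e.g. "iLfr". Single-letter options are copied to
   flags; if an L is found, *locale is set to point at the rest of the
   string and scanning stops. Without an L, *locale is left as NULL. */
void parse_options(const char *opts, char *flags, size_t flags_size,
                   const char **locale)
{
    size_t n = 0;

    assert(opts != NULL && flags != NULL && flags_size > 0 && locale != NULL);
    *locale = NULL;

    while (*opts != '\0') {
        if (*opts == 'L') {
            /* Everything after L is the locale name, so no further
               option letters can be recognized. */
            *locale = opts + 1;
            break;
        }
        if (n + 1 < flags_size) flags[n++] = *opts;
        opts++;
    }
    flags[n] = '\0';
}
```

With this scheme "iLfr" yields the flag "i" and the locale "fr", whereas "LfrS" would yield the locale "frS" and no /S flag. A NULL locale corresponds to passing NULL as the tables pointer.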
180    
181    The /I option requests that pcretest output information about the compiled
182    expression (whether it is anchored, has a fixed first character, and so on). It
183    does this by calling pcre_info() after compiling an expression, and outputting
184    the information it gets back. If the pattern is studied, the results of that
185    are also output.
186    
187    The /D option is a PCRE debugging feature, which also assumes /I. It causes the
188    internal form of compiled regular expressions to be output after compilation.
189    
190    The /S option causes pcre_study() to be called after the expression has been
191    compiled, and the results used when the expression is matched.
192    
193  Finally, the /P option causes pcretest to call PCRE via the POSIX wrapper API  Finally, the /P option causes pcretest to call PCRE via the POSIX wrapper API
194  rather than its native API. When this is done, all other options except /i and  rather than its native API. When this is done, all other options except /i and
# Line 206  following flags has any effect in this c Line 253  following flags has any effect in this c
253  If the option -d is given to pcretest, it is equivalent to adding /D to each  If the option -d is given to pcretest, it is equivalent to adding /D to each
254  regular expression: the internal form is output after compilation.  regular expression: the internal form is output after compilation.
255    
256  If the option -i (for "information") is given to pcretest, it calls pcre_info()  If the option -i is given to pcretest, it is equivalent to adding /I to each
257  after compiling an expression, and outputs the information it gets back. If the  regular expression: information about the compiled pattern is given after
258  pattern is studied, the results of that are also output.  compilation.
259    
260  If the option -s is given to pcretest, it outputs the size of each compiled  If the option -s is given to pcretest, it outputs the size of each compiled
261  pattern after it has been compiled.  pattern after it has been compiled.
# Line 237  for pcretest, and the special upper case Line 284  for pcretest, and the special upper case
284  recognizes are not used in this file. The output should be identical, apart  recognizes are not used in this file. The output should be identical, apart
285  from the initial identifying banner.  from the initial identifying banner.
286    
287  The testinput2 file is not suitable for feeding to Perltest, since it does  The testinput2 and testinput4 files are not suitable for feeding to Perltest,
288  make use of the special upper case options and escapes that pcretest uses to  since they do make use of the special upper case options and escapes that
289  test some features of PCRE. It also contains malformed regular expressions, in  pcretest uses to test some features of PCRE. The first of these files also
290  order to check that PCRE diagnoses them correctly.  contains malformed regular expressions, in order to check that PCRE diagnoses
291    them correctly.
292    
293  Philip Hazel <ph10@cam.ac.uk>  Philip Hazel <ph10@cam.ac.uk>
294  September 1998  October 1998
