
Diff of /code/trunk/README


revision 22 by nigel, Sat Feb 24 21:38:09 2007 UTC | revision 23 by nigel, Sat Feb 24 21:38:41 2007 UTC
# Line 1  Line 1 
1  README file for PCRE (Perl-compatible regular expressions)  README file for PCRE (Perl-compatible regular expressions)
2  ----------------------------------------------------------  ----------------------------------------------------------
3    
4    *******************************************************************************
5    *           IMPORTANT FOR THOSE UPGRADING FROM VERSIONS BEFORE 2.00           *
6    *                                                                             *
7    * Please note that there has been a change in the API such that a larger      *
8    * ovector is required at matching time, to provide some additional workspace. *
9    * The new man page has details. This change was necessary in order to support *
10    * some of the new functionality in Perl 5.005.                                *
11    *******************************************************************************
12    
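The notice above says only that a larger ovector is needed; the exact rule is in the man page. As a rough sketch, assuming the rule documented in later PCRE man pages (two integers per returned substring, counting the whole match as one, plus a further third of the vector reserved as workspace), the size could be computed as below. The helper name ovector_size is hypothetical and not part of PCRE; check pcre.3 for the rule that applies to your release.

```c
#include <assert.h>

/* Hypothetical helper, not part of the PCRE API.
 * Assumption: the vector holds one pair of ints for the whole match and
 * one pair per capturing group (two thirds of the vector), and one more
 * third is needed as workspace, so the total is a multiple of three. */
int ovector_size(int capture_groups)
{
    assert(capture_groups >= 0);
    return 3 * (capture_groups + 1);
}
```

Under that assumption, a pattern with one capturing group, such as /^abc(\d+)/, would need a vector of six ints.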
13  The distribution should contain the following files:  The distribution should contain the following files:
14    
15    ChangeLog         log of changes to the code    ChangeLog         log of changes to the code
16    Makefile          for building PCRE    Makefile          for building PCRE
   Performance       notes on performance  
17    README            this file    README            this file
18      RunTest           a shell script for running tests
19    Tech.Notes        notes on the encoding    Tech.Notes        notes on the encoding
20    pcre.3            man page for the functions    pcre.3            man page for the functions
21    pcreposix.3       man page for the POSIX wrapper API    pcreposix.3       man page for the POSIX wrapper API
# Line 21  The distribution should contain the foll Line 30  The distribution should contain the foll
30    pgrep.1           man page for pgrep    pgrep.1           man page for pgrep
31    pgrep.c           source of a grep utility that uses PCRE    pgrep.c           source of a grep utility that uses PCRE
32    perltest          Perl test program    perltest          Perl test program
33    testinput         test data, compatible with Perl    testinput         test data, compatible with Perl 5.004 and 5.005
34    testinput2        test data for error messages and non-Perl things    testinput2        test data for error messages and non-Perl things
35      testinput3        test data, compatible with Perl 5.005
36    testoutput        test results corresponding to testinput    testoutput        test results corresponding to testinput
37    testoutput2       test results corresponding to testinput2    testoutput2       test results corresponding to testinput2
38      testoutput3       test results corresponding to testinput3
39    
40  To build PCRE, edit Makefile for your system (it is a fairly simple make file)  To build PCRE, edit Makefile for your system (it is a fairly simple make file,
41  and then run it. It builds a two libraries called libpcre.a and libpcreposix.a,  and there are some comments at the top) and then run it. It builds two
42  a test program called pcretest, and the pgrep command.  libraries called libpcre.a and libpcreposix.a, a test program called pcretest,
43    and the pgrep command.
44  To test PCRE, run pcretest on the file testinput, and compare the output with  
45  the contents of testoutput. There should be no differences. For example:  To test PCRE, run the RunTest script in the pcre directory. This runs pcretest
46    on each of the testinput files in turn, and compares the output with the
47    pcretest testinput some.file  contents of the corresponding testoutput file. A file called testtry is used to
48    diff some.file testoutput  hold the output from pcretest (which is documented below).
49    
50  Do the same with testinput2, comparing the output with testoutput2, but this  To run pcretest on just one of the test files, give its number as an argument
51  time using the -i flag for pcretest, i.e.  to RunTest, for example:
52    
53    pcretest -i testinput2 some.file    RunTest 3
54    diff some.file testoutput2  
55    The first and third test files can also be fed directly into the perltest
56  The make target "runtest" runs both these tests, using the file "testtry" to  program to check that Perl gives the same results. The third file requires the
57  store the intermediate output, deleting it at the end if all goes well.  additional features of release 5.005, which is why it is kept separate from the
58    main test input, which needs only Perl 5.004. In the long run, when 5.005 is
59    widespread, these two test files may get amalgamated.
60    
61  There are two sets of tests because the first set can also be fed directly into  The second set of tests check pcre_info(), pcre_study(), error detection and
62  the perltest program to check that Perl gives the same results. The second set  run-time flags that are specific to PCRE, as well as the POSIX wrapper API.
 of tests check pcre_info(), pcre_study(), error detection and run-time flags  
 that are specific to PCRE, as well as the POSIX wrapper API.  
63    
64  To install PCRE, copy libpcre.a to any suitable library directory (e.g.  To install PCRE, copy libpcre.a to any suitable library directory (e.g.
65  /usr/local/lib), pcre.h to any suitable include directory (e.g.  /usr/local/lib), pcre.h to any suitable include directory (e.g.
# Line 66  themselves still follow Perl syntax and Line 77  themselves still follow Perl syntax and
77  for the POSIX-style functions is called pcreposix.h. The official POSIX name is  for the POSIX-style functions is called pcreposix.h. The official POSIX name is
78  regex.h, but I didn't want to risk possible problems with existing files of  regex.h, but I didn't want to risk possible problems with existing files of
79  that name by distributing it that way. To use it with an existing program that  that name by distributing it that way. To use it with an existing program that
80  uses the POSIX API it will have to be renamed or pointed at by a link.  uses the POSIX API, it will have to be renamed or pointed at by a link.
81    
82    
83  Character tables  Character tables
# Line 130  and /X set PCRE_ANCHORED, PCRE_DOLLAR_EN Line 141  and /X set PCRE_ANCHORED, PCRE_DOLLAR_EN
141  The /D option is a PCRE debugging feature. It causes the internal form of  The /D option is a PCRE debugging feature. It causes the internal form of
142  compiled regular expressions to be output after compilation. The /S option  compiled regular expressions to be output after compilation. The /S option
143  causes pcre_study() to be called after the expression has been compiled, and  causes pcre_study() to be called after the expression has been compiled, and
144  the results used when the expression is matched. If /I is present as well as  the results used when the expression is matched.
 /S, then pcre_study() is called with the PCRE_CASELESS option.  
145    
146  Finally, the /P option causes pcretest to call PCRE via the POSIX wrapper API  Finally, the /P option causes pcretest to call PCRE via the POSIX wrapper API
147  rather than its native API. When this is done, all other options except /i and  rather than its native API. When this is done, all other options except /i and
# Line 140  is present. The wrapper functions force Line 150  is present. The wrapper functions force
150  PCRE_DOTALL unless REG_NEWLINE is set.  PCRE_DOTALL unless REG_NEWLINE is set.
151    
152  A regular expression can extend over several lines of input; the newlines are  A regular expression can extend over several lines of input; the newlines are
153  included in it. See the testinput file for many examples.  included in it. See the testinput files for many examples.
154    
155  Before each data line is passed to pcre_exec(), leading and trailing whitespace  Before each data line is passed to pcre_exec(), leading and trailing whitespace
156  is removed, and it is then scanned for \ escapes. The following are recognized:  is removed, and it is then scanned for \ escapes. The following are recognized:
# Line 158  is removed, and it is then scanned for \ Line 168  is removed, and it is then scanned for \
168    
169    \A     pass the PCRE_ANCHORED option to pcre_exec()    \A     pass the PCRE_ANCHORED option to pcre_exec()
170    \B     pass the PCRE_NOTBOL option to pcre_exec()    \B     pass the PCRE_NOTBOL option to pcre_exec()
   \E     pass the PCRE_DOLLAR_ENDONLY option to pcre_exec()  
   \I     pass the PCRE_CASELESS option to pcre_exec()  
   \M     pass the PCRE_MULTILINE option to pcre_exec()  
   \S     pass the PCRE_DOTALL option to pcre_exec()  
171    \Odd   set the size of the output vector passed to pcre_exec() to dd    \Odd   set the size of the output vector passed to pcre_exec() to dd
172             (any number of decimal digits)             (any number of decimal digits)
173    \Z     pass the PCRE_NOTEOL option to pcre_exec()    \Z     pass the PCRE_NOTEOL option to pcre_exec()
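The escape handling listed above can be sketched as a small scanner. This is an illustration only, not the code in pcretest.c: the function name and the SKETCH_* constants are hypothetical stand-ins for the pcre.h option bits (their values here are arbitrary), and only \A, \B, \Odd and \Z from the revision 23 list are handled.

```c
#include <ctype.h>

/* Hypothetical stand-ins for the pcre.h option bits; values are arbitrary. */
enum {
    SKETCH_ANCHORED = 0x01,   /* set by \A */
    SKETCH_NOTBOL   = 0x02,   /* set by \B */
    SKETCH_NOTEOL   = 0x04    /* set by \Z */
};

/* Scan a data line for \A, \B, \Z and \Odd. Option bits are ORed into
 * *options; \Odd stores the decimal number dd in *ovecsize (0 if no
 * digits follow). Any other escape is skipped, not interpreted. */
void scan_data_escapes(const char *line, int *options, int *ovecsize)
{
    const char *p = line;

    while (*p != '\0') {
        if (*p != '\\' || p[1] == '\0') {   /* ordinary char or lone trailing \ */
            p++;
            continue;
        }
        switch (p[1]) {
        case 'A': *options |= SKETCH_ANCHORED; p += 2; break;
        case 'B': *options |= SKETCH_NOTBOL;   p += 2; break;
        case 'Z': *options |= SKETCH_NOTEOL;   p += 2; break;
        case 'O': {
            int n = 0;
            p += 2;
            while (isdigit((unsigned char)*p))   /* any number of decimal digits */
                n = n * 10 + (*p++ - '0');
            *ovecsize = n;
            break;
        }
        default:
            p += 2;                              /* escape not covered by this sketch */
            break;
        }
    }
}
```

The sketch does not build the unescaped subject string or guard against overflow in very long digit runs; a real scanner would need both.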
# Line 182  whole pattern. Here is an example of an Line 188  whole pattern. Here is an example of an
188    Testing Perl-Compatible Regular Expressions    Testing Perl-Compatible Regular Expressions
189    PCRE version 0.90 08-Sep-1997    PCRE version 0.90 08-Sep-1997
190    
191        re> /^abc(\d+)/      re> /^abc(\d+)/
192      data> abc123    data> abc123
193     0: abc123      0: abc123
194     1: 123      1: 123
195      data> xyz    data> xyz
196    No match    No match
197    
198  Note that while patterns can be continued over several lines (a plain ">"  Note that while patterns can be continued over several lines (a plain ">"
# Line 207  pattern is studied, the results of that Line 213  pattern is studied, the results of that
213  If the option -s is given to pcretest, it outputs the size of each compiled  If the option -s is given to pcretest, it outputs the size of each compiled
214  pattern after it has been compiled.  pattern after it has been compiled.
215    
216  If the -t option is given, each compile, study, and match is run 2000 times  If the -t option is given, each compile, study, and match is run 10000 times
217  while being timed, and the resulting time per compile or match is output in  while being timed, and the resulting time per compile or match is output in
218  milliseconds. Do not set -t with -s, because you will then get the size output  milliseconds. Do not set -t with -s, because you will then get the size output
219  2000 times and the timing will be distorted.  10000 times and the timing will be distorted. If you want to change the number
220    of repetitions used for timing, edit the definition of LOOPREPEAT at the top of
221    pcretest.c.
222    
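The timing scheme described above (repeat an operation a fixed number of times and report the time per repetition in milliseconds) can be sketched as follows. Only the name LOOPREPEAT and the count of 10000 come from the README; time_per_call_ms, sample_work and work_counter are hypothetical, and this is not the code used in pcretest.c.

```c
#include <time.h>

#define LOOPREPEAT 10000   /* repetition count named in the README */

/* Hypothetical workload standing in for a compile, study or match. */
volatile unsigned long work_counter = 0;
void sample_work(void) { work_counter++; }

/* Run fn LOOPREPEAT times and return the average processor time per
 * call in milliseconds, measured with the standard clock() function. */
double time_per_call_ms(void (*fn)(void))
{
    clock_t start = clock();

    for (int i = 0; i < LOOPREPEAT; i++)
        fn();

    clock_t elapsed = clock() - start;
    return ((double)elapsed * 1000.0 / CLOCKS_PER_SEC) / LOOPREPEAT;
}
```

Any per-iteration output, such as the size printed by -s, would be executed inside the loop and counted in the measurement, which is why the README warns against combining -t with -s.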
223    
224    
# Line 219  The perltest program Line 227  The perltest program
227    
228  The perltest program tests Perl's regular expressions; it has the same  The perltest program tests Perl's regular expressions; it has the same
229  specification as pcretest, and so can be given identical input, except that  specification as pcretest, and so can be given identical input, except that
230  input patterns can be followed only by Perl's lower case options.  input patterns can be followed only by Perl's lower case options. The contents
231    of testinput and testinput3 meet this condition.
232    
233  The data lines are processed as Perl strings, so if they contain $ or @  The data lines are processed as Perl strings, so if they contain $ or @
234  characters, these have to be escaped. For this reason, all such characters in  characters, these have to be escaped. For this reason, all such characters in
# Line 230  from the initial identifying banner. Line 239  from the initial identifying banner.
239    
240  The testinput2 file is not suitable for feeding to perltest, since it does  The testinput2 file is not suitable for feeding to perltest, since it does
241  make use of the special upper case options and escapes that pcretest uses to  make use of the special upper case options and escapes that pcretest uses to
242  test additional features of PCRE.  test some features of PCRE. It also contains malformed regular expressions, in
243    order to check that PCRE diagnoses them correctly.
244    
245  Philip Hazel <ph10@cam.ac.uk>  Philip Hazel <ph10@cam.ac.uk>
246  October 1997  September 1998
