Diff of /code/trunk/README

revision 131 by ph10, Mon Mar 26 15:10:12 2007 UTC revision 489 by ph10, Tue Jan 19 16:42:21 2010 UTC
# Line 1  Line 1 
1  README file for PCRE (Perl-compatible regular expression library)  README file for PCRE (Perl-compatible regular expression library)
2  -----------------------------------------------------------------  -----------------------------------------------------------------
3    
4  The latest release of PCRE is always available from  The latest release of PCRE is always available in three alternative formats
5    from:
6    
7    ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz    ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
8      ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.bz2
9      ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.zip
10    
11  There is a mailing list for discussion about the development of PCRE at  There is a mailing list for discussion about the development of PCRE at
12    
# Line 21  The contents of this README file are: Line 24  The contents of this README file are:
24    Shared libraries on Unix-like systems    Shared libraries on Unix-like systems
25    Cross-compiling on Unix-like systems    Cross-compiling on Unix-like systems
26    Using HP's ANSI C++ compiler (aCC)    Using HP's ANSI C++ compiler (aCC)
27      Using PCRE from MySQL
28    Making new tarballs    Making new tarballs
29    Testing PCRE    Testing PCRE
30    Character tables    Character tables
# Line 82  documentation is supplied in two other f Line 86  documentation is supplied in two other f
86       in various ways, and rooted in a file called index.html, is distributed in       in various ways, and rooted in a file called index.html, is distributed in
87       doc/html and installed in <prefix>/share/doc/pcre/html.       doc/html and installed in <prefix>/share/doc/pcre/html.
88    
89    Users of PCRE have contributed files containing the documentation for various
90    releases in CHM format. These can be found in the Contrib directory of the FTP
91    site (see next section).
92    
93    
94  Contributions by users of PCRE  Contributions by users of PCRE
95  ------------------------------  ------------------------------
# Line 103  Building PCRE on non-Unix systems Line 111  Building PCRE on non-Unix systems
111    
112  For a non-Unix system, please read the comments in the file NON-UNIX-USE,  For a non-Unix system, please read the comments in the file NON-UNIX-USE,
113  though if your system supports the use of "configure" and "make" you may be  though if your system supports the use of "configure" and "make" you may be
114  able to build PCRE in the same way as for Unix-like systems.  able to build PCRE in the same way as for Unix-like systems. PCRE can also be
115    configured in many platform environments using the GUI facility provided by
116    CMake's cmake-gui command. This creates Makefiles, solution files, etc.
117    
118  PCRE has been compiled on many different operating systems. It should be  PCRE has been compiled on many different operating systems. It should be
119  straightforward to build PCRE on any system that has a Standard C compiler and  straightforward to build PCRE on any system that has a Standard C compiler and
# Line 116  Building PCRE on Unix-like systems Line 126  Building PCRE on Unix-like systems
126  If you are using HP's ANSI C++ compiler (aCC), please see the special note  If you are using HP's ANSI C++ compiler (aCC), please see the special note
127  in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.  in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.
128    
129    The following instructions assume the use of the widely used "configure, make,
130    make install" process. There is also support for CMake in the PCRE
131    distribution; there are some comments about using CMake in the NON-UNIX-USE
132    file, though it can also be used in Unix-like systems.
133    
134  To build PCRE on a Unix-like system, first run the "configure" command from the  To build PCRE on a Unix-like system, first run the "configure" command from the
135  PCRE distribution directory, with your current directory set to the directory  PCRE distribution directory, with your current directory set to the directory
136  where you want the files to be created. This command is a standard GNU  where you want the files to be created. This command is a standard GNU
# Line 151  library. You can read more about them in Line 166  library. You can read more about them in
166    it will try to find a C++ compiler and C++ header files, and if it succeeds,    it will try to find a C++ compiler and C++ header files, and if it succeeds,
167    it will try to build the C++ wrapper.    it will try to build the C++ wrapper.
168    
169  . If you want to make use of the support for UTF-8 character strings in PCRE,  . If you want to make use of the support for UTF-8 Unicode character strings in
170    you must add --enable-utf8 to the "configure" command. Without it, the code    PCRE, you must add --enable-utf8 to the "configure" command. Without it, the
171    for handling UTF-8 is not included in the library. (Even when included, it    code for handling UTF-8 is not included in the library. Even when included,
172    still has to be enabled by an option at run time.)    it still has to be enabled by an option at run time. When PCRE is compiled
173      with this option, its input can only be either ASCII or UTF-8, even when
174      running on EBCDIC platforms. It is not possible to use both --enable-utf8 and
175      --enable-ebcdic at the same time.
176    
177  . If, in addition to support for UTF-8 character strings, you want to include  . If, in addition to support for UTF-8 character strings, you want to include
178    support for the \P, \p, and \X sequences that recognize Unicode character    support for the \P, \p, and \X sequences that recognize Unicode character
# Line 164  library. You can read more about them in Line 182  library. You can read more about them in
182    supported.    supported.
183    
184  . You can build PCRE to recognize either CR or LF or the sequence CRLF or any  . You can build PCRE to recognize either CR or LF or the sequence CRLF or any
185    of the Unicode newline sequences as indicating the end of a line. Whatever    of the preceding, or any of the Unicode newline sequences as indicating the
186    you specify at build time is the default; the caller of PCRE can change the    end of a line. Whatever you specify at build time is the default; the caller
187    selection at run time. The default newline indicator is a single LF character    of PCRE can change the selection at run time. The default newline indicator
188    (the Unix standard). You can specify the default newline indicator by adding    is a single LF character (the Unix standard). You can specify the default
189    --newline-is-cr or --newline-is-lf or --newline-is-crlf or --newline-is-any    newline indicator by adding --enable-newline-is-cr or --enable-newline-is-lf
190    to the "configure" command, respectively.    or --enable-newline-is-crlf or --enable-newline-is-anycrlf or
191      --enable-newline-is-any to the "configure" command, respectively.
192    If you specify --newline-is-cr or --newline-is-crlf, some of the standard  
193    tests will fail, because the lines in the test files end with LF. Even if    If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
194    the files are edited to change the line endings, there are likely to be some    the standard tests will fail, because the lines in the test files end with
195    failures. With --newline-is-any, many tests should succeed, but there may be    LF. Even if the files are edited to change the line endings, there are likely
196    some failures.    to be some failures. With --enable-newline-is-anycrlf or
197      --enable-newline-is-any, many tests should succeed, but there may be some
198      failures.
199    
200    . By default, the sequence \R in a pattern matches any Unicode line ending
201      sequence. This is independent of the option specifying what PCRE considers to
202      be the end of a line (see above). However, the caller of PCRE can restrict \R
203      to match only CR, LF, or CRLF. You can make this the default by adding
204      --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R").
205    
206  . When called via the POSIX interface, PCRE uses malloc() to get additional  . When called via the POSIX interface, PCRE uses malloc() to get additional
207    storage for processing capturing parentheses if there are more than 10 of    storage for processing capturing parentheses if there are more than 10 of
# Line 237  library. You can read more about them in Line 263  library. You can read more about them in
263    pcre_chartables.c.dist. See "Character tables" below for further information.    pcre_chartables.c.dist. See "Character tables" below for further information.
264    
265  . It is possible to compile PCRE for use on systems that use EBCDIC as their  . It is possible to compile PCRE for use on systems that use EBCDIC as their
266    default character code (as opposed to ASCII) by specifying    character code (as opposed to ASCII) by specifying
267    
268    --enable-ebcdic    --enable-ebcdic
269    
270    This automatically implies --enable-rebuild-chartables (see above).    This automatically implies --enable-rebuild-chartables (see above). However,
271      when PCRE is built this way, it always operates in EBCDIC. It cannot support
272      both EBCDIC and UTF-8.
273    
274    . It is possible to compile pcregrep to use libz and/or libbz2, in order to
275      read .gz and .bz2 files (respectively), by specifying one or both of
276    
277      --enable-pcregrep-libz
278      --enable-pcregrep-libbz2
279    
280      Of course, the relevant libraries must be installed on your system.
281    
282    . It is possible to compile pcretest so that it links with the libreadline
283      library, by specifying
284    
285      --enable-pcretest-libreadline
286    
287      If this is done, when pcretest's input is from a terminal, it reads it using
288      the readline() function. This provides line-editing and history facilities.
289      Note that libreadline is GPL-licenced, so if you distribute a binary of
290      pcretest linked in this way, there may be licensing issues.
291    
292      Setting this option causes the -lreadline option to be added to the pcretest
293      build. In many operating environments with a system-installed readline
294      library this is sufficient. However, in some environments (e.g. if an
295      unmodified distribution version of readline is in use), it may be necessary
296      to specify something like LIBS="-lncurses" as well. This is because, to quote
297      the readline INSTALL, "Readline uses the termcap functions, but does not link
298      with the termcap or curses library itself, allowing applications which link
299      with readline the to choose an appropriate library." If you get error
300      messages about missing functions tgetstr, tgetent, tputs, tgetflag, or tgoto,
301      this is the problem, and linking with the ncurses library should fix it.
302    
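The options described above can be combined on one "configure" command line. The following is a minimal sketch, not part of the PCRE distribution: the function name build_configure_cmd and its short feature keywords are hypothetical, and only the --enable-* flags themselves come from the text above. It prints the command rather than running it, and it rejects the one combination that the text rules out (--enable-utf8 together with --enable-ebcdic).

```shell
#!/bin/sh
# Hypothetical helper: turn short feature keywords into a "configure"
# command line. Only the --enable-* flags are taken from the README;
# the keywords and the function name are illustrative assumptions.
build_configure_cmd() {
    cmd="./configure"
    have_utf8=no
    have_ebcdic=no
    for feature in "$@"; do
        case "$feature" in
            utf8)        have_utf8=yes;   cmd="$cmd --enable-utf8" ;;
            ebcdic)      have_ebcdic=yes; cmd="$cmd --enable-ebcdic" ;;
            newline=cr|newline=lf|newline=crlf|newline=anycrlf|newline=any)
                         # strip the "newline=" prefix to form the flag
                         cmd="$cmd --enable-newline-is-${feature#newline=}" ;;
            bsr-anycrlf) cmd="$cmd --enable-bsr-anycrlf" ;;
            grep-libz)   cmd="$cmd --enable-pcregrep-libz" ;;
            grep-libbz2) cmd="$cmd --enable-pcregrep-libbz2" ;;
            readline)    cmd="$cmd --enable-pcretest-libreadline" ;;
            *)           echo "unknown feature: $feature" >&2; return 2 ;;
        esac
    done
    # The README states that these two options cannot be used together.
    if [ "$have_utf8" = yes ] && [ "$have_ebcdic" = yes ]; then
        echo "--enable-utf8 and --enable-ebcdic cannot be combined" >&2
        return 1
    fi
    printf '%s\n' "$cmd"
}

# Example: UTF-8 support, ANYCRLF newlines, and \R restricted to CR, LF, CRLF.
build_configure_cmd utf8 newline=anycrlf bsr-anycrlf
```

If the readline option is selected and the link step reports missing termcap functions, the text above suggests adding something like LIBS="-lncurses" to the printed command.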
303  The "configure" script builds the following files for the basic C library:  The "configure" script builds the following files for the basic C library:
304    
# Line 254  The "configure" script builds the follow Line 311  The "configure" script builds the follow
311  . RunTest is a script for running tests on the basic C library  . RunTest is a script for running tests on the basic C library
312  . RunGrepTest is a script for running tests on the pcregrep command  . RunGrepTest is a script for running tests on the pcregrep command
313    
314  Versions of config.h and pcre.h are distributed in the PCRE tarballs under  Versions of config.h and pcre.h are distributed in the PCRE tarballs under the
315  the names config.h.generic and pcre.h.generic. These are provided for the  names config.h.generic and pcre.h.generic. These are provided for those who
316  benefit of those who have to build PCRE without the benefit of "configure". If  have to build PCRE without using "configure" or CMake. If you use "configure"
317  you use "configure", the .generic versions are not used.  or CMake, the .generic versions are not used.
318    
319  If a C++ compiler is found, the following files are also built:  If a C++ compiler is found, the following files are also built:
320    
# Line 270  script that can be run to recreate the c Line 327  script that can be run to recreate the c
327  contains compiler output from tests that "configure" runs.  contains compiler output from tests that "configure" runs.
328    
329  Once "configure" has run, you can run "make". It builds two libraries, called  Once "configure" has run, you can run "make". It builds two libraries, called
330  libpcre and libpcreposix, a test program called pcretest, a demonstration  libpcre and libpcreposix, a test program called pcretest, and the pcregrep
331  program called pcredemo, and the pcregrep command. If a C++ compiler was found  command. If a C++ compiler was found on your system, "make" also builds the C++
332  on your system, "make" also builds the C++ wrapper library, which is called  wrapper library, which is called libpcrecpp, and some test programs called
333  libpcrecpp, and some test programs called pcrecpp_unittest,  pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest.
334  pcre_scanner_unittest, and pcre_stringpiece_unittest. Building the C++ wrapper  Building the C++ wrapper can be disabled by adding --disable-cpp to the
335  can be disabled by adding --disable-cpp to the "configure" command.  "configure" command.
336    
337  The command "make check" runs all the appropriate tests. Details of the PCRE  The command "make check" runs all the appropriate tests. Details of the PCRE
338  tests are given below in a separate section of this document.  tests are given below in a separate section of this document.
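
The "configure", "make", "make check" sequence described above can be sketched as a small shell function. This is an illustration only: the names run, build_pcre, and DRY_RUN are hypothetical, and the sketch assumes that the current directory is the PCRE distribution directory, so that "./configure" is the right path. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical wrapper: print each command when DRY_RUN is 1 (the
# default in this sketch), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Configure with the given options, build, then run the tests.
# Each step runs only if the previous one succeeded.
build_pcre() {
    run ./configure "$@" &&
    run make &&
    run make check
}

# Example: build without the C++ wrapper, with UTF-8 support.
build_pcre --enable-utf8 --disable-cpp
```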
# Line 327  system. The following are installed (fil Line 384  system. The following are installed (fil
384      pcretest.txt   the pcretest man page      pcretest.txt   the pcretest man page
385      pcregrep.txt   the pcregrep man page      pcregrep.txt   the pcregrep man page
386    
 Note that the pcredemo program that is built by "configure" is *not* installed  
 anywhere. It is a demonstration for programmers wanting to use PCRE.  
   
387  If you want to remove PCRE from your system, you can run "make uninstall".  If you want to remove PCRE from your system, you can run "make uninstall".
388  This removes all the files that "make install" installed. However, it does not  This removes all the files that "make install" installed. However, it does not
389  remove any directories, because these are often shared with other programs.  remove any directories, because these are often shared with other programs.
# Line 425  running the "configure" script: Line 479  running the "configure" script:
479    CXXLDFLAGS="-lstd_v2 -lCsup_v2"    CXXLDFLAGS="-lstd_v2 -lCsup_v2"
480    
481    
482    Using Sun's compilers for Solaris
483    ---------------------------------
484    
485    A user reports that the following configurations work on Solaris 9 sparcv9 and
486    Solaris 9 x86 (32-bit):
487    
488      Solaris 9 sparcv9: ./configure --disable-cpp CC=/bin/cc CFLAGS="-m64 -g"
489      Solaris 9 x86:     ./configure --disable-cpp CC=/bin/cc CFLAGS="-g"
490    
491    
492    Using PCRE from MySQL
493    ---------------------
494    
495    On systems where both PCRE and MySQL are installed, it is possible to make use
496    of PCRE from within MySQL, as an alternative to the built-in pattern matching.
497    There is a web page that tells you how to do this:
498    
499      http://www.mysqludf.org/lib_mysqludf_preg/index.php
500    
501    
502  Making new tarballs  Making new tarballs
503  -------------------  -------------------
504    
505  The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and  The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and
506  zip formats. However, if you have modified any of the man page sources in the  zip formats. The command "make distcheck" does the same, but then does a trial
507  doc directory, you should first run the PrepareRelease script. This re-creates  build of the new distribution to ensure that it works.
508  the .txt and HTML forms of the documentation from the man pages.  
509    If you have modified any of the man page sources in the doc directory, you
510    should first run the PrepareRelease script before making a distribution. This
511    script creates the .txt and HTML forms of the documentation from the man pages.
512    
513    
514  Testing PCRE  Testing PCRE
# Line 489  is output to say why. If running this te Line 566  is output to say why. If running this te
566  in the comparison output, it means that locale is not available on your system,  in the comparison output, it means that locale is not available on your system,
567  despite being listed by "locale". This does not mean that PCRE is broken.  despite being listed by "locale". This does not mean that PCRE is broken.
568    
569    [If you are trying to run this test on Windows, you may be able to get it to
570    work by changing "fr_FR" to "french" everywhere it occurs. Alternatively, use
571    RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
572    Windows versions of test 2. More info on using RunTest.bat is included in the
573    document entitled NON-UNIX-USE.]
574    
575  The fourth test checks the UTF-8 support. It is not run automatically unless  The fourth test checks the UTF-8 support. It is not run automatically unless
576  PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when  PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when
577  running "configure". This file can also be fed directly to the perltest script,  running "configure". This file can also be fed directly to the perltest.pl
578  provided you are running Perl 5.8 or higher. (For Perl 5.6, a small patch,  script, provided you are running Perl 5.8 or higher.
 commented in the script, can be used.)  
579    
580  The fifth test checks error handling with UTF-8 encoding, and internal UTF-8  The fifth test checks error handling with UTF-8 encoding, and internal UTF-8
581  features of PCRE that are not relevant to Perl.  features of PCRE that are not relevant to Perl.
582    
583  The sixth test checks the support for Unicode character properties. It is not  The sixth test (which is Perl-5.10 compatible) checks the support for Unicode
584  run automatically unless PCRE is built with Unicode property support. To do  character properties. It is not run automatically unless PCRE is built with
585  this you must set --enable-unicode-properties when running "configure".  Unicode property support. To do this you must set --enable-unicode-properties
586    when running "configure".
587    
588  The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative  The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative
589  matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode  matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode
590  property support, respectively. The eighth and ninth tests are not run  property support, respectively. The eighth and ninth tests are not run
591  automatically unless PCRE is built with the relevant support.  automatically unless PCRE is built with the relevant support.
592    
593    The tenth test checks some internal offsets and code size features; it is run
594    only when the default "link size" of 2 is set (in other cases the sizes
595    change).
596    
597    The eleventh test checks out features that are new in Perl 5.10, and the
598    twelfth test checks a number of internals and non-Perl features concerned with
599    Unicode property support. It is not run automatically unless PCRE is built with
600    Unicode property support. To do this you must set --enable-unicode-properties
601    when running "configure".
602    
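The conditions under which tests 4 to 12 run can be summarized in a short sketch. The function selected_tests below is hypothetical and is not how RunTest itself is written. It also makes two assumptions that the text above does not state outright: that test 5 needs UTF-8 support in the same way as test 4, and that building with Unicode property support implies UTF-8 support.

```shell
#!/bin/sh
# Hypothetical summary of which of tests 4-12 are run automatically.
#   $1 = yes/no  built with --enable-utf8
#   $2 = yes/no  built with --enable-unicode-properties
#   $3 = link size (defaults to 2)
selected_tests() {
    utf8=$1
    ucp=$2
    link_size=${3:-2}
    # Assumption: property support builds on UTF-8 support.
    if [ "$ucp" = yes ]; then utf8=yes; fi

    out=""
    # Assumption: test 5 (UTF-8 error handling) is gated like test 4.
    if [ "$utf8" = yes ]; then out="$out 4 5"; fi
    if [ "$ucp" = yes ]; then out="$out 6"; fi
    out="$out 7"                                   # DFA, non-UTF-8 mode
    if [ "$utf8" = yes ]; then out="$out 8"; fi    # DFA, UTF-8 mode
    if [ "$ucp" = yes ]; then out="$out 9"; fi     # DFA, UTF-8 with properties
    if [ "$link_size" = 2 ]; then out="$out 10"; fi
    out="$out 11"                                  # Perl 5.10 features
    if [ "$ucp" = yes ]; then out="$out 12"; fi
    echo "${out# }"                                # drop the leading space
}

# Example: UTF-8 support only, default link size.
selected_tests yes no 2
```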
603    
604  Character tables  Character tables
605  ----------------  ----------------
# Line 592  The distribution should contain the foll Line 685  The distribution should contain the foll
685    pcre_study.c            )    pcre_study.c            )
686    pcre_tables.c           )    pcre_tables.c           )
687    pcre_try_flipped.c      )    pcre_try_flipped.c      )
688    pcre_ucp_searchfuncs.c  )    pcre_ucd.c              )
689    pcre_valid_utf8.c       )    pcre_valid_utf8.c       )
690    pcre_version.c          )    pcre_version.c          )
691    pcre_xclass.c           )    pcre_xclass.c           )
# Line 601  The distribution should contain the foll Line 694  The distribution should contain the foll
694    pcre.h.in               template for pcre.h when built by "configure"    pcre.h.in               template for pcre.h when built by "configure"
695    pcreposix.h             header for the external POSIX wrapper API    pcreposix.h             header for the external POSIX wrapper API
696    pcre_internal.h         header for internal use    pcre_internal.h         header for internal use
697    ucp.h                   ) headers concerned with    ucp.h                   header for Unicode property handling
   ucpinternal.h           )   Unicode property handling  
   ucptable.h              ) (this one is the data table)  
698    
699    config.h.in             template for config.h, which is built by "configure"    config.h.in             template for config.h, which is built by "configure"
700    
# Line 642  The distribution should contain the foll Line 733  The distribution should contain the foll
733    NON-UNIX-USE            notes on building PCRE on non-Unix systems    NON-UNIX-USE            notes on building PCRE on non-Unix systems
734    PrepareRelease          script to make preparations for "make dist"    PrepareRelease          script to make preparations for "make dist"
735    README                  this file    README                  this file
736    RunTest.in              template for a Unix shell script for running tests    RunTest                 a Unix shell script for running tests
737    RunGrepTest.in          template for a Unix shell script for pcregrep tests    RunGrepTest             a Unix shell script for pcregrep tests
738    aclocal.m4              m4 macros (generated by "aclocal")    aclocal.m4              m4 macros (generated by "aclocal")
739    config.guess            ) files used by libtool,    config.guess            ) files used by libtool,
740    config.sub              )   used only when building a shared library    config.sub              )   used only when building a shared library
# Line 652  The distribution should contain the foll Line 743  The distribution should contain the foll
743                            )   "configure" and config.h                            )   "configure" and config.h
744    depcomp                 ) script to find program dependencies, generated by    depcomp                 ) script to find program dependencies, generated by
745                            )   automake                            )   automake
746    doc/*.3                 man page sources for the PCRE functions    doc/*.3                 man page sources for PCRE
747    doc/*.1                 man page sources for pcregrep and pcretest    doc/*.1                 man page sources for pcregrep and pcretest
748    doc/index.html.src      the base HTML page    doc/index.html.src      the base HTML page
749    doc/html/*              HTML documentation    doc/html/*              HTML documentation
# Line 661  The distribution should contain the foll Line 752  The distribution should contain the foll
752    doc/perltest.txt        plain text documentation of Perl test program    doc/perltest.txt        plain text documentation of Perl test program
753    install-sh              a shell script for installing files    install-sh              a shell script for installing files
754    libpcre.pc.in           template for libpcre.pc for pkg-config    libpcre.pc.in           template for libpcre.pc for pkg-config
755      libpcreposix.pc.in      template for libpcreposix.pc for pkg-config
756    libpcrecpp.pc.in        template for libpcrecpp.pc for pkg-config    libpcrecpp.pc.in        template for libpcrecpp.pc for pkg-config
757    ltmain.sh               file used to build a libtool script    ltmain.sh               file used to build a libtool script
758    missing                 ) common stub for a few missing GNU programs while    missing                 ) common stub for a few missing GNU programs while
# Line 677  The distribution should contain the foll Line 769  The distribution should contain the foll
769    
770  (D) Auxiliary files for cmake support  (D) Auxiliary files for cmake support
771    
772      cmake/COPYING-CMAKE-SCRIPTS
773      cmake/FindPackageHandleStandardArgs.cmake
774      cmake/FindReadline.cmake
775    CMakeLists.txt    CMakeLists.txt
776    config-cmake.h.in    config-cmake.h.in
777    
778  (E) Auxiliary files for VPASCAL  (E) Auxiliary files for VPASCAL
779    
780    makevp.bat    makevp.bat
781    makevp-c.txt    makevp_c.txt
782    makevp-l.txt    makevp_l.txt
783    pcregexp.pas    pcregexp.pas
784    
785  (F) Auxiliary files for building PCRE "by hand"  (F) Auxiliary files for building PCRE "by hand"
# Line 701  The distribution should contain the foll Line 796  The distribution should contain the foll
796  Philip Hazel  Philip Hazel
797  Email local part: ph10  Email local part: ph10
798  Email domain: cam.ac.uk  Email domain: cam.ac.uk
799  Last updated: 26 March 2007  Last updated: 19 January 2010