Diff of /code/trunk/doc/pcregrep.txt

revision 86 by nigel, Sat Feb 24 21:40:52 2007 UTC revision 87 by nigel, Sat Feb 24 21:41:21 2007 UTC
# Line 6  NAME Line 6  NAME
6    
7    
8  SYNOPSIS  SYNOPSIS
9         pcregrep [options] [long options] [pattern] [file1 file2 ...]         pcregrep [options] [long options] [pattern] [path1 path2 ...]
10    
11    
12  DESCRIPTION  DESCRIPTION
# Line 17  DESCRIPTION Line 17  DESCRIPTION
17         Perl 5. See pcrepattern for a full description of syntax and  semantics         Perl 5. See pcrepattern for a full description of syntax and  semantics
18         of the regular expressions that PCRE supports.         of the regular expressions that PCRE supports.
19    
20         A pattern must be specified on the command line unless the -f option is         Patterns,  whether  supplied on the command line or in a separate file,
21         used (see below).         are given without delimiters. For example:
22    
23             pcregrep Thursday /etc/motd
24    
25           If you attempt to use delimiters (for example, by surrounding a pattern
26           with  slashes,  as  is common in Perl scripts), they are interpreted as
27           part of the pattern. Quotes can of course be used on the  command  line
28           because they are interpreted by the shell, and indeed they are required
29           if a pattern contains white space or shell metacharacters.
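The point about delimiters can be sketched in a few lines of Python. This uses Python's re module as a stand-in for PCRE (an assumption; the two engines agree on this simple case), and the sample line is invented for illustration.

```python
import re

# Hypothetical input line, standing in for a line of /etc/motd.
line = "Meeting moved to Thursday at noon"

# Without delimiters, the pattern is just the word itself.
plain = re.search(r"Thursday", line)

# With Perl-style slashes, the slashes are literal characters that
# must appear in the line, so this pattern does not match.
delimited = re.search(r"/Thursday/", line)

print(plain is not None)
print(delimited is not None)
```

The second search fails only because the line contains no slash characters; a line such as "see /Thursday/ notes" would match it.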
30    
31           The first argument that follows any option settings is treated  as  the
32           single  pattern  to be matched when neither -e nor -f is present.  Con-
33           versely, when one or both of these options are  used  to  specify  pat-
34           terns, all arguments are treated as path names. At least one of -e, -f,
35           or an argument pattern must be provided.
36    
37         If no files are specified, pcregrep reads the standard input. The stan-         If no files are specified, pcregrep reads the standard input. The stan-
38         dard  input  can  also  be  referenced by a name consisting of a single         dard  input  can  also  be  referenced by a name consisting of a single
# Line 27  DESCRIPTION Line 41  DESCRIPTION
41           pcregrep some-pattern /file1 - /file3           pcregrep some-pattern /file1 - /file3
42    
43         By default, each line that matches the pattern is copied to  the  stan-         By default, each line that matches the pattern is copied to  the  stan-
44         dard  output,  and  if  there  is  more than one file, the file name is         dard  output, and if there is more than one file, the file name is out-
45         printed before each line of output. However, there are options that can         put at the start of each line. However,  there  are  options  that  can
46         change how pcregrep behaves. In particular, the -M option makes it pos-         change how pcregrep behaves. In particular, the -M option makes it pos-
47         sible to search for patterns that span line boundaries.         sible to search for patterns that span line boundaries.
48    
49         Patterns are limited to 8K  or  BUFSIZ  characters,  whichever  is  the         Patterns are limited to 8K  or  BUFSIZ  characters,  whichever  is  the
50         greater.  BUFSIZ is defined in <stdio.h>.         greater.  BUFSIZ is defined in <stdio.h>.
51    
52           If  the  LC_ALL  or LC_CTYPE environment variable is set, pcregrep uses
53           the value to set a locale when calling the PCRE library.  The  --locale
54           option can be used to override this.
55    
56    
57  OPTIONS  OPTIONS
58    
59         --        This terminates the list of options. It is useful if the next         --        This terminates the list of options. It is useful if the next
60                   item on the command line starts with a hyphen, but is not  an                   item on the command line starts with a hyphen but is  not  an
61                   option.                   option.  This allows for the processing of patterns and file-
62                     names that start with hyphens.
63    
64           -A number, --after-context=number
65                     Output number lines of context after each matching  line.  If
66                     filenames and/or line numbers are being output, a hyphen sep-
67                     arator is used instead of a colon for the  context  lines.  A
68                     line  containing  "--" is output between each group of lines,
69                     unless they are in fact contiguous in  the  input  file.  The
70                     value  of number is expected to be relatively small. However,
71                     pcregrep guarantees to have up to 8K of following text avail-
72                     able for context output.
73    
74         -A number Print  number  lines  of context after each matching line. If         -B number, --before-context=number
75                   file names and/or line numbers are being  printed,  a  hyphen                   Output  number lines of context before each matching line. If
76                   separator is used instead of a colon for the context lines. A                   filenames and/or line numbers are being output, a hyphen sep-
77                   line containing "--" is printed between each group of  lines,                   arator  is  used  instead of a colon for the context lines. A
78                     line containing "--" is output between each group  of  lines,
79                   unless  they  are  in  fact contiguous in the input file. The                   unless  they  are  in  fact contiguous in the input file. The
80                   value of number is expected to be relatively small.  However,                   value of number is expected to be relatively small.  However,
                  pcregrep guarantees to have up to 8K of following text avail-  
                  able for context printing.  
   
        -B number Print number lines of context before each matching  line.  If  
                  file  names  and/or  line numbers are being printed, a hyphen  
                  separator is used instead of a colon for the context lines. A  
                  line  containing "--" is printed between each group of lines,  
                  unless they are in fact contiguous in  the  input  file.  The  
                  value  of number is expected to be relatively small. However,  
81                   pcregrep guarantees to have up to 8K of preceding text avail-                   pcregrep guarantees to have up to 8K of preceding text avail-
82                   able for context printing.                   able for context output.
83    
84         -C number Print  number  lines  of  context  both before and after each         -C number, --context=number
85                   matching line.  This is equivalent to setting both -A and  -B                   Output number lines of context both  before  and  after  each
86                     matching  line.  This is equivalent to setting both -A and -B
87                   to the same value.                   to the same value.
88    
89         -c        Do  not print individual lines; instead just print a count of         -c, --count
90                   the number of lines that would otherwise have  been  printed.                   Do not output individual lines; instead just output  a  count
91                   If  several  files  are given, a count is printed for each of                   of the number of lines that would otherwise have been output.
92                   them.                   If several files are given, a count is  output  for  each  of
93                     them. In this mode, the -A, -B, and -C options are ignored.
94    
95           --colour, --color
96                     If this option is given without any data, it is equivalent to
97                     "--colour=auto".  If data is required, it must  be  given  in
98                     the same shell item, separated by an equals sign.
99    
100           --colour=value, --color=value
101                     This  option specifies under what circumstances the part of a
102                     line that matched a pattern should be coloured in the output.
103                     The  value may be "never" (the default), "always", or "auto".
104                     In the latter case, colouring happens only  if  the  standard
105                     output  is  connected to a terminal. The colour can be speci-
106                     fied by setting the environment variable  PCREGREP_COLOUR  or
107                     PCREGREP_COLOR. The value of this variable should be a string
108                     of two numbers, separated by a semicolon.   They  are  copied
109                     directly into the control string for setting colour on a ter-
110                     minal, so it is your responsibility to ensure that they  make
111                     sense.  If  neither  of the environment variables is set, the
112                     default is "1;31", which gives red.
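As a rough illustration of how the colour value might end up in the output, here is a Python sketch. The text above says only that the value is copied into the terminal control string; the exact framing used here (ESC [ value m to start, ESC [ 0 m to reset) is the conventional ANSI form and is an assumption, as are the function name `highlight`, the lookup order between the two variable spellings, and the use of Python's re module in place of PCRE.

```python
import os
import re

def highlight(line, pattern, environ=None):
    """Wrap the first match of pattern in a terminal colour sequence."""
    env = os.environ if environ is None else environ
    # Either spelling may be set; fall back to the documented default.
    # Checking PCREGREP_COLOUR first is an assumption.
    colour = env.get("PCREGREP_COLOUR") or env.get("PCREGREP_COLOR") or "1;31"
    m = re.search(pattern, line)
    if m is None:
        return line
    start = "\x1b[" + colour + "m"   # assumed ANSI "set colour" framing
    reset = "\x1b[0m"                # assumed ANSI reset sequence
    return line[:m.start()] + start + m.group(0) + reset + line[m.end():]

print(repr(highlight("see Thursday", "Thursday", {})))
```

Because the value is pasted in unchanged, a nonsensical setting produces a nonsensical control sequence, which is why the text leaves validation to the user.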
113    
114           -D action, --devices=action
115                     If an input path is  not  a  regular  file  or  a  directory,
116                     "action"  specifies  how  it is to be processed. Valid values
117                     are "read" (the default) or "skip" (silently skip the  path).
118    
119           -d action, --directories=action
120                     If an input path is a directory, "action" specifies how it is
121                     to be processed.  Valid  values  are  "read"  (the  default),
122                     "recurse"  (equivalent to the -r option), or "skip" (silently
123                     skip the path). In the default case, directories are read  as
124                     if  they  were  ordinary files. In some operating systems the
125                     effect of reading a directory like this is an immediate  end-
126                     of-file.
127    
128           -e pattern, --regex=pattern,
129                     --regexp=pattern Specify a pattern to be matched. This option
130                     can be used multiple times in order to specify  several  pat-
131                     terns.  It  can  also be used as a way of specifying a single
132                     pattern that starts with a hyphen. When -e is used, no  argu-
133                     ment  pattern  is  taken from the command line; all arguments
134                     are treated as file names. There is an overall maximum of 100
135                     patterns. They are applied to each line in the order in which
136                     they are defined until one matches (or fails to match  if  -v
137                     is  used).  If  -f is used with -e, the command line patterns
138                     are matched first, followed by the patterns  from  the  file,
139                     independent  of  the  order in which these options are speci-
140                     fied. Note that multiple use of -e is not the same as a  sin-
141                     gle  pattern  with  alternatives.  For example, X|Y finds the
142                     first character in a line that is X or Y, whereas if the  two
143                     patterns  are  given  separately,  pcregrep  finds X if it is
144                     present, even if it follows Y in the line. It finds Y only if
145                     there  is  no  X in the line. This really matters only if you
146                     are using -o to show the portion of the line that matched.
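The difference between X|Y and two separate -e patterns can be modelled in Python. This is a sketch of the rule described above (patterns tried in order, first one that matches wins), not pcregrep's own code; `only_matching` is a hypothetical helper and Python's re module stands in for PCRE.

```python
import re

def only_matching(patterns, line):
    """Return the text matched by the first pattern, in order, that matches."""
    for pattern in patterns:
        m = re.search(pattern, line)
        if m:
            return m.group(0)
    return None

line = "a Y then an X"

# One pattern with alternatives: the leftmost X or Y in the line wins.
print(only_matching(["X|Y"], line))     # Y

# Two patterns: X is tried first over the whole line, so it wins
# even though it comes after Y.
print(only_matching(["X", "Y"], line))  # X
```

Both forms select the same lines; only the reported portion differs, which is why the distinction matters mainly with -o.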
147    
148         --exclude=pattern         --exclude=pattern
149                   When pcregrep is searching the files in a directory as a con-                   When pcregrep is searching the files in a directory as a con-
# Line 77  OPTIONS Line 153  OPTIONS
153                   --exclude, it is excluded. There is no short  form  for  this                   --exclude, it is excluded. There is no short  form  for  this
154                   option.                   option.
155    
156         -ffilename         -F, --fixed-strings
157                   Read  a  number  of patterns from the file, one per line, and                   Interpret  each pattern as a list of fixed strings, separated
158                   match all of them against each line of input. A line is  out-                   by newlines, instead of  as  a  regular  expression.  The  -w
159                   put  if  any  of  the patterns match it.  When -f is used, no                   (match  as  a  word) and -x (match whole line) options can be
160                     used with -F. They apply to each of the fixed strings. A line
161                     is selected if any of the fixed strings are found in it (sub-
162                     ject to -w or -x, if present).
163    
164           -f filename, --file=filename
165                     Read a number of patterns from the file, one  per  line,  and
166                     match  them against each line of input. A data line is output
167                     if any of the patterns match it. The filename can be given as
168                     "-" to refer to the standard input. When -f is used, patterns
169                     specified on the command line using -e may also  be  present;
170                     they are tested before the file's patterns. However, no other
171                   pattern is taken from the command  line;  all  arguments  are                   pattern is taken from the command  line;  all  arguments  are
172                   treated  as  file  names. There is a maximum of 100 patterns.                   treated  as  file  names.  There is an overall maximum of 100
173                   Trailing white space is removed, and blank lines are ignored.                   patterns. Trailing white space is removed from each line, and
174                   An  empty  file  contains  no  patterns and therefore matches                   blank  lines  are ignored. An empty file contains no patterns
175                   nothing.                   and therefore matches nothing.
176    
177           -H, --with-filename
178                     Force the inclusion of the filename at the  start  of  output
179                     lines  when searching a single file. By default, the filename
180                     is not shown in this case. For matching lines,  the  filename
181                     is  followed  by  a  colon  and a space; for context lines, a
182                     hyphen separator is used. If a line number is also being out-
183                     put, it follows the file name without a space.
184    
185           -h, --no-filename
186                     Suppress  the output filenames when searching multiple files.
187                     By default, filenames  are  shown  when  multiple  files  are
188                     searched.  For  matching lines, the filename is followed by a
189                     colon and a space; for context lines, a hyphen  separator  is
190                     used.  If  a line number is also being output, it follows the
191                     file name without a space.
192    
193         -h        Suppress printing of filenames when searching multiple files.         --help    Output a brief help message and exit.
194    
195         -i        Ignore upper/lower case distinctions during comparisons.         -i, --ignore-case
196                     Ignore upper/lower case distinctions during comparisons.
197    
198         --include=pattern         --include=pattern
199                   When pcregrep is searching the files in a directory as a con-                   When pcregrep is searching the files in a directory as a con-
200                   sequence of the -r  (recursive  search)  option,  only  files                   sequence  of  the  -r  (recursive  search) option, only those
201                   whose  names match the pattern are included. The pattern is a                   files whose names match the pattern are included. The pattern
202                   PCRE  regular  expression.  If  a  file  name  matches   both                   is  a  PCRE  regular  expression. If a file name matches both
203                   --include  and  --exclude,  it is excluded. There is no short                   --include and --exclude, it is excluded. There  is  no  short
204                   form for this option.                   form for this option.
205    
206         -L        Instead of printing lines from  the  files,  just  print  the         -L, --files-without-match
207                   names  of  the files that do not contain any lines that would                   Instead  of  outputting lines from the files, just output the
208                   have been printed. Each file name is printed once, on a sepa-                   names of the files that do not contain any lines  that  would
209                     have  been  output. Each file name is output once, on a sepa-
210                   rate line.                   rate line.
211    
212         -l        Instead  of  printing  lines  from  the files, just print the         -l, --files-with-matches
213                   names of the files containing  lines  that  would  have  been                   Instead of outputting lines from the files, just  output  the
214                   printed.  Each file name is printed once, on a separate line.                   names of the files containing lines that would have been out-
215                     put. Each file name is  output  once,  on  a  separate  line.
216                     Searching  stops  as  soon  as  a matching line is found in a
217                     file.
218    
219         --label=name         --label=name
220                   This option supplies a name to be used for the standard input                   This option supplies a name to be used for the standard input
221                   when  file  names are being printed. If not supplied, "(stan-                   when file names are being output. If not supplied, "(standard
222                   dard input)" is used. There is no short form for this option.                   input)" is used. There is no short form for this option.
223    
224           --locale=locale-name
225                     This option specifies a locale to be used for pattern  match-
226                     ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-
227                     ronment variables.  If  no  locale  is  specified,  the  PCRE
228                     library's  default (usually the "C" locale) is used. There is
229                     no short form for this option.
230    
231         -M        Allow  patterns to match more than one line. When this option         -M, --multiline
232                     Allow patterns to match more than one line. When this  option
233                   is given, patterns may usefully contain literal newline char-                   is given, patterns may usefully contain literal newline char-
234                   acters  and  internal  occurrences of ^ and $ characters. The                   acters and internal occurrences of ^ and  $  characters.  The
235                   output for any one match may consist of more than  one  line.                   output  for  any one match may consist of more than one line.
236                   When  this option is set, the PCRE library is called in "mul-                   When this option is set, the PCRE library is called in  "mul-
237                   tiline" mode.  There is a limit to the number of  lines  that                   tiline"  mode.   There is a limit to the number of lines that
238                   can  be matched, imposed by the way that pcregrep buffers the                   can be matched, imposed by the way that pcregrep buffers  the
239                   input file as it scans it. However, pcregrep ensures that  at                   input  file as it scans it. However, pcregrep ensures that at
240                   least 8K characters or the rest of the document (whichever is                   least 8K characters or the rest of the document (whichever is
241                   the shorter) are available for forward  matching,  and  simi-                   the  shorter)  are  available for forward matching, and simi-
242                   larly the previous 8K characters (or all the previous charac-                   larly the previous 8K characters (or all the previous charac-
243                   ters, if fewer than 8K) are guaranteed to  be  available  for                   ters,  if  fewer  than 8K) are guaranteed to be available for
244                   lookbehind assertions.                   lookbehind assertions.
245    
246         -n        Precede each line by its line number in the file.         -n, --line-number
247                     Precede each output line by its line number in the file, fol-
248                     lowed  by  a colon and a space for matching lines or a hyphen
249                     and a space for context lines. If the filename is also  being
250                     output, it precedes the line number.
251    
252           -o, --only-matching
253                     Show  only  the  part  of the line that matched a pattern. In
254                     this mode, no context is shown. That is, the -A, -B,  and  -C
255                     options are ignored.
256    
257         -q        Work quietly, that is, display nothing except error messages.         -q, --quiet
258                     Work quietly, that is, display nothing except error messages.
259                   The exit status indicates whether or  not  any  matches  were                   The exit status indicates whether or  not  any  matches  were
260                   found.                   found.
261    
262         -r        If  any given path is a directory, recursively scan the files         -r, --recursive
263                     If  any given path is a directory, recursively scan the files
264                   it contains, taking note of any --include and --exclude  set-                   it contains, taking note of any --include and --exclude  set-
265                   tings. Without -r a directory is scanned as a normal file.                   tings.  By  default, a directory is read as a normal file; in
266                     some operating systems this gives an  immediate  end-of-file.
267         -s        Suppress  error  messages  about  non-existent  or unreadable                   This  option  is  a  shorthand  for  setting the -d option to
268                   files. Such files are quietly skipped.  However,  the  return                   "recurse".
269    
270           -s, --no-messages
271                     Suppress error  messages  about  non-existent  or  unreadable
272                     files.  Such  files  are quietly skipped. However, the return
273                   code is still 2, even if matches were found in other files.                   code is still 2, even if matches were found in other files.
274    
275         -u        Operate  in UTF-8 mode. This option is available only if PCRE         -u, --utf-8
276                   has been compiled with UTF-8 support. Both  the  pattern  and                   Operate in UTF-8 mode. This option is available only if  PCRE
277                   each  subject line must be valid strings of UTF-8 characters.                   has  been compiled with UTF-8 support. Both patterns and sub-
278                     ject lines must be valid strings of UTF-8 characters.
279    
280         -V        Write the version numbers of pcregrep and  the  PCRE  library         -V, --version
281                     Write the version numbers of pcregrep and  the  PCRE  library
282                   that is being used to the standard error stream.                   that is being used to the standard error stream.
283    
284         -v        Invert  the  sense  of  the match, so that lines which do not         -v, --invert-match
285                   match the pattern are the ones that are found.                   Invert  the  sense  of  the match, so that lines which do not
286                     match any of the patterns are the ones that are found.
287    
288         -w        Force the pattern to match only whole words. This is  equiva-         -w, --word-regex, --word-regexp
289                     Force the patterns to match only whole words. This is equiva-
290                   lent to having \b at the start and end of the pattern.                   lent to having \b at the start and end of the pattern.
291    
292         -x        Force  the  pattern to be anchored (it must start matching at         -x, --line-regex, --line-regexp
293                   the beginning of the line) and in  addition,  require  it  to                   Force  the  patterns to be anchored (each must start matching
294                   match  the  entire line. This is equivalent to having ^ and $                   at the beginning of a line) and in addition, require them  to
295                     match  entire  lines.  This  is  equivalent to having ^ and $
296                   characters at the start and end of each alternative branch in                   characters at the start and end of each alternative branch in
297                   the regular expression.                   every pattern.
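The effect of -w and -x can be approximated by wrapping a pattern before compiling it. The sketch below uses Python's re module in place of PCRE, and the helpers `as_word` and `as_whole_line` are hypothetical. Wrapping the whole pattern in a non-capturing group is used here as a shortcut that has the same effect as anchoring every alternative branch.

```python
import re

def as_word(pattern):
    """Approximate -w: require word boundaries around the pattern."""
    return r"\b(?:" + pattern + r")\b"

def as_whole_line(pattern):
    """Approximate -x: the pattern must match the entire line."""
    return r"^(?:" + pattern + r")$"

# -w: "cat" as a word matches, "cat" inside a longer word does not.
print(bool(re.search(as_word("cat"), "a cat sat")))
print(bool(re.search(as_word("cat"), "concatenate")))

# Naively writing ^cat|dog$ anchors only one end of each branch,
# so it still matches "catalog". Grouping first avoids that.
print(bool(re.search(r"^cat|dog$", "catalog")))
print(bool(re.search(as_whole_line("cat|dog"), "catalog")))
print(bool(re.search(as_whole_line("cat|dog"), "dog")))
```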
298    
299    
300    ENVIRONMENT VARIABLES
301    
302  LONG OPTIONS         The  environment  variables  LC_ALL  and LC_CTYPE are examined, in that
303           order, for a locale. The first one that is set is  used.  This  can  be
304           overridden  by  the  --locale  option.  If  no  locale is set, the PCRE
305           library's default (usually the "C" locale) is used.
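The lookup order described above can be written out as a small Python function. This is a sketch: `resolve_locale` is a hypothetical name, and treating an empty value the same as an unset variable is an assumption the text does not settle.

```python
def resolve_locale(option_locale, environ):
    """Pick the locale name, or None to use the PCRE library default.

    option_locale is the value given to --locale (None if absent);
    environ is a mapping of environment variables.
    """
    # --locale overrides anything found in the environment.
    if option_locale:
        return option_locale
    # LC_ALL is examined before LC_CTYPE; the first one set is used.
    for name in ("LC_ALL", "LC_CTYPE"):
        value = environ.get(name)
        if value:  # assumption: an empty value counts as not set
            return value
    # Nothing set: the library default (usually the "C" locale) applies.
    return None

print(resolve_locale(None, {"LC_CTYPE": "fr_FR", "LC_ALL": "en_GB"}))
```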
306    
        Long  forms  of all the options are available, as in GNU grep. They are  
        shown in the following table:  
307    
308           -A   --after-context  OPTIONS COMPATIBILITY
309           -B   --before-context  
310           -C   --context         The majority of short and long forms of pcregrep's options are the same
311           -c   --count         as  in  the  GNU grep program. Any long option of the form --xxx-regexp
312                --exclude (no short form)         (GNU terminology) is also available as --xxx-regex (PCRE  terminology).
313           -f   --file         However,  the  --locale,  -M,  --multiline, -u, and --utf-8 options are
314           -h   --no-filename         specific to pcregrep.
               --help (no short form)  
          -i   --ignore-case  
               --include (no short form)  
          -L   --files-without-match  
          -l   --files-with-matches  
               --label (no short form)  
          -n   --line-number  
          -r   --recursive  
          -q   --quiet  
          -s   --no-messages  
          -u   --utf-8  
          -V   --version  
          -v   --invert-match  
          -x   --line-regex  
          -x   --line-regexp  
315    
316    
317  OPTIONS WITH DATA  OPTIONS WITH DATA
# Line 200  OPTIONS WITH DATA Line 324  OPTIONS WITH DATA
324           -f /some/file           -f /some/file
325    
326         If a long form option is used, the data may appear in the same  command         If a long form option is used, the data may appear in the same  command
327         line  item,  separated  by an = character, or it may appear in the next         line item, separated by an equals character, or (with one exception) it
328         command line item. For example:         may appear in the next command line item. For example:
329    
330           --file=/some/file           --file=/some/file
331           --file /some/file           --file /some/file
332    
333           Note, however, that if you want to supply a file name beginning with  ~
334           as  data  in  a  shell  command,  and have the shell expand ~ to a home
335           directory, you must separate the file name from the option, because the
336           shell  does not treat ~ specially unless it is at the start of an item.
337    
338           The exception to the above is the --colour  (or  --color)  option,  for
339           which  the  data is optional. If this option does have data, it must be
340           given in the first form, using an equals character. Otherwise  it  will
341           be assumed that it has no data.
342    
343    
344    MATCHING ERRORS
345    
346           It  is  possible  to supply a regular expression that takes a very long
347           time to fail to match certain lines.  Such  patterns  normally  involve
348           nested  indefinite repeats, for example: (a+)*\d when matched against a
349           line of a's with no final digit.  The  PCRE  matching  function  has  a
350           resource  limit that causes it to abort in these circumstances. If this
351           happens, pcregrep outputs an error message and the line that caused the
352           problem  to  the  standard error stream. If there are more than 20 such
353           errors, pcregrep gives up.
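To show why (a+)*\d is expensive on a line of a's, here is a toy backtracking matcher in Python with a step budget. It is an illustrative model only: it handles just this one pattern, anchored at the start of the subject, and the names `match_nested_repeat` and `MatchLimitExceeded` and the limit of 10,000 steps are invented for the sketch. PCRE's real resource limit works on its own internal counters.

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher exceeds its step budget."""

def match_nested_repeat(subject, limit=10_000):
    """Toy matcher for (a+)*\\d anchored at the start of subject.

    Returns (matched, steps); raises MatchLimitExceeded past `limit` steps.
    """
    steps = 0

    def after_group(pos):
        # Called at each point where zero or more (a+) groups have matched.
        nonlocal steps
        steps += 1
        if steps > limit:
            raise MatchLimitExceeded(f"gave up after {limit} steps")
        # Find how far the current run of a's extends.
        end = pos
        while end < len(subject) and subject[end] == "a":
            end += 1
        # Greedy: try the longest a+ first, then back off one character
        # at a time, recursing for the following group each time.
        for stop in range(end, pos, -1):
            if after_group(stop):
                return True
        # No further group: the next character must be a digit.
        return pos < len(subject) and subject[pos].isdigit()

    matched = after_group(0)
    return matched, steps

print(match_nested_repeat("aaaaa7"))  # succeeds on the first path tried
print(match_nested_repeat("aaaaa"))   # fails after trying every split
try:
    match_nested_repeat("a" * 20)
except MatchLimitExceeded as exc:
    print("error:", exc)
```

In this model every extra a doubles the number of steps needed to prove failure, so even a modest line exhausts the budget, which mirrors the situation in which pcregrep reports a matching error.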
354    
355    
356  DIAGNOSTICS  DIAGNOSTICS
357    
358         Exit status is 0 if any matches were found, 1 if no matches were found,         Exit status is 0 if any matches were found, 1 if no matches were found,
359         and 2 for syntax errors and non-existent or inaccessible files (even if         and 2 for syntax errors and non-existent or inaccessible files (even if
360         matches were found in other files). Using the  -s  option  to  suppress         matches were found in other files) or too many matching  errors.  Using
361         error messages about inaccessible files does not affect the return code.        the -s  option to suppress error messages about inaccessible files does
362           not affect the return code.
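The exit status rule can be summarised as a short Python function. This is a sketch of the rule as stated above, with a hypothetical name; it is not taken from pcregrep's source.

```python
def exit_status(any_match, any_error):
    """Combine the outcome of a whole run into one exit status.

    any_error covers syntax errors, missing or unreadable files, and
    too many matching errors. It takes priority over matches, and
    suppressing the messages with -s does not change it.
    """
    if any_error:
        return 2
    return 0 if any_match else 1

print(exit_status(any_match=True, any_error=True))
```

A script that only cares whether something matched therefore has to treat status 2 as "unknown" rather than "no match".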
363    
364    
365  AUTHOR  AUTHOR
# Line 221  AUTHOR Line 368  AUTHOR
368         University Computing Service         University Computing Service
369         Cambridge CB2 3QG, England.         Cambridge CB2 3QG, England.
370    
371  Last updated: 16 May 2005  Last updated: 23 January 2006
372  Copyright (c) 1997-2005 University of Cambridge.  Copyright (c) 1997-2006 University of Cambridge.

