Diff of /code/trunk/doc/pcregrep.txt

revision 92 by nigel, Sat Feb 24 21:41:34 2007 UTC revision 93 by nigel, Sat Feb 24 21:41:42 2007 UTC
# Line 14  DESCRIPTION Line 14  DESCRIPTION
14         pcregrep  searches  files  for  character  patterns, in the same way as         pcregrep  searches  files  for  character  patterns, in the same way as
15         other grep commands do, but it uses the PCRE regular expression library         other grep commands do, but it uses the PCRE regular expression library
16         to support patterns that are compatible with the regular expressions of         to support patterns that are compatible with the regular expressions of
17         Perl 5. See pcrepattern for a full description of syntax and  semantics         Perl 5. See pcrepattern(3) for a full description of syntax and  seman-
18         of the regular expressions that PCRE supports.         tics of the regular expressions that PCRE supports.
19    
20         Patterns,  whether  supplied on the command line or in a separate file,         Patterns,  whether  supplied on the command line or in a separate file,
21         are given without delimiters. For example:         are given without delimiters. For example:
# Line 245  OPTIONS Line 245  OPTIONS
245                   lookbehind assertions.                   lookbehind assertions.
246    
247         -N newline-type, --newline=newline-type         -N newline-type, --newline=newline-type
248                   The PCRE library supports three different character sequences                   The  PCRE  library  supports  four  different conventions for
249                   for indicating the ends of lines. They are the single-charac-                   indicating the ends of lines. They are  the  single-character
250                   ter sequences CR (carriage return) and LF (linefeed), and the                   sequences  CR  (carriage  return) and LF (linefeed), the two-
251                   two-character sequence CR, LF. When the library is  built,  a                   character sequence CRLF, and an "any"  convention,  in  which
252                   default  line-ending  sequence is specified. This is normally                   any  Unicode  line  ending sequence is assumed to end a line.
253                   the standard sequence for the operating system. Unless other-                   The Unicode sequences are the three just mentioned,  plus  VT
254                   wise specified by this option, pcregrep uses the default. The                   (vertical  tab,  U+000B),  FF  (formfeed,  U+000C), NEL (next
255                   possible values for this option are CR,  LF,  or  CRLF.  This                   line, U+0085), LS (line separator, U+2028), and PS (paragraph
256                   makes  it  possible  to  use pcregrep on files that have come                   separator, U+2029).
257                   from other environments without having to modify  their  line  
258                   endings.   If  the  data that is being scanned does not agree                   When  the  PCRE  library  is  built,  a  default  line-ending
259                   with the convention set by this option, pcregrep  may  behave                   sequence  is  specified.   This  is  normally  the   standard
260                   in strange ways.                   sequence for the operating system. Unless otherwise specified
261                     by this option, pcregrep uses  the  library's  default.   The
262                     possible  values  for  this  option are CR, LF, CRLF, or ANY.
263                     This makes it possible to use pcregrep  on  files  that  have
264                     come  from  other environments without having to modify their
265                     line endings. If the data that  is  being  scanned  does  not
266                     agree  with  the  convention set by this option, pcregrep may
267                     behave in strange ways.
268    
269         -n, --line-number         -n, --line-number
270                   Precede each output line by its line number in the file, fol-                   Precede each output line by its line number in the file, fol-
271                   lowed by a colon and a space for matching lines or  a  hyphen                   lowed  by  a colon and a space for matching lines or a hyphen
272                   and  a space for context lines. If the filename is also being                   and a space for context lines. If the filename is also  being
273                   output, it precedes the line number.                   output, it precedes the line number.
274    
275         -o, --only-matching         -o, --only-matching
276                   Show only the part of the line that  matched  a  pattern.  In                   Show  only  the  part  of the line that matched a pattern. In
277                   this  mode,  no context is shown. That is, the -A, -B, and -C                   this mode, no context is shown. That is, the -A, -B,  and  -C
278                   options are ignored.                   options are ignored.
279    
280         -q, --quiet         -q, --quiet
281                   Work quietly, that is, display nothing except error messages.                   Work quietly, that is, display nothing except error messages.
282                   The  exit  status  indicates  whether or not any matches were                   The exit status indicates whether or  not  any  matches  were
283                   found.                   found.
284    
285         -r, --recursive         -r, --recursive
286                   If any given path is a directory, recursively scan the  files                   If  any given path is a directory, recursively scan the files
287                   it  contains, taking note of any --include and --exclude set-                   it contains, taking note of any --include and --exclude  set-
288                   tings. By default, a directory is read as a normal  file;  in                   tings.  By  default, a directory is read as a normal file; in
289                   some  operating  systems this gives an immediate end-of-file.                   some operating systems this gives an  immediate  end-of-file.
290                   This option is a shorthand  for  setting  the  -d  option  to                   This  option  is  a  shorthand  for  setting the -d option to
291                   "recurse".                   "recurse".
292    
293         -s, --no-messages         -s, --no-messages
294                   Suppress  error  messages  about  non-existent  or unreadable                   Suppress error  messages  about  non-existent  or  unreadable
295                   files. Such files are quietly skipped.  However,  the  return                   files.  Such  files  are quietly skipped. However, the return
296                   code is still 2, even if matches were found in other files.                   code is still 2, even if matches were found in other files.
297    
298         -u, --utf-8         -u, --utf-8
299                   Operate  in UTF-8 mode. This option is available only if PCRE                   Operate in UTF-8 mode. This option is available only if  PCRE
300                   has been compiled with UTF-8 support. Both patterns and  sub-                   has  been compiled with UTF-8 support. Both patterns and sub-
301                   ject lines must be valid strings of UTF-8 characters.                   ject lines must be valid strings of UTF-8 characters.
302    
303         -V, --version         -V, --version
304                   Write  the  version  numbers of pcregrep and the PCRE library                   Write the version numbers of pcregrep and  the  PCRE  library
305                   that is being used to the standard error stream.                   that is being used to the standard error stream.
306    
307         -v, --invert-match         -v, --invert-match
308                   Invert the sense of the match, so that  lines  which  do  not                   Invert  the  sense  of  the match, so that lines which do not
309                   match any of the patterns are the ones that are found.                   match any of the patterns are the ones that are found.
310    
311         -w, --word-regex, --word-regexp         -w, --word-regex, --word-regexp
# Line 306  OPTIONS Line 313  OPTIONS
313                   lent to having \b at the start and end of the pattern.                   lent to having \b at the start and end of the pattern.
314    
315         -x, --line-regex, --line-regexp         -x, --line-regex, --line-regexp
316                   Force the patterns to be anchored (each must  start  matching                   Force  the  patterns to be anchored (each must start matching
317                   at  the beginning of a line) and in addition, require them to                   at the beginning of a line) and in addition, require them  to
318                   match entire lines. This is equivalent  to  having  ^  and  $                   match  entire  lines.  This  is  equivalent to having ^ and $
319                   characters at the start and end of each alternative branch in                   characters at the start and end of each alternative branch in
320                   every pattern.                   every pattern.
321    
322    
323  ENVIRONMENT VARIABLES  ENVIRONMENT VARIABLES
324    
325         The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that         The  environment  variables  LC_ALL  and LC_CTYPE are examined, in that
326         order,  for  a  locale.  The first one that is set is used. This can be         order, for a locale. The first one that is set is  used.  This  can  be
327         overridden by the --locale option.  If  no  locale  is  set,  the  PCRE         overridden  by  the  --locale  option.  If  no  locale is set, the PCRE
328         library's default (usually the "C" locale) is used.         library's default (usually the "C" locale) is used.
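
The lookup order described above can be sketched in Python. This is an illustration only, not pcregrep's own code: the function name choose_locale and its parameters are assumptions made for the example, and "set" is taken to mean that the variable is present in the environment.

```python
def choose_locale(env, cli_locale=None):
    """Return the locale name to use, or None for the PCRE library default.

    env is a mapping of environment variables; cli_locale stands in for
    the data given to the --locale option (None if the option is absent).
    """
    # A --locale value on the command line overrides the environment.
    if cli_locale is not None:
        return cli_locale
    # LC_ALL is examined before LC_CTYPE; the first one that is set wins.
    for name in ("LC_ALL", "LC_CTYPE"):
        if name in env:
            return env[name]
    # Nothing is set: fall back to the library default (usually "C").
    return None
```

Passing the environment as a plain mapping keeps the sketch easy to try with different combinations of variables.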
329    
330    
331  NEWLINES  NEWLINES
332    
333         The  -N (--newline) option allows pcregrep to scan files with different         The -N (--newline) option allows pcregrep to scan files with  different
334         newline conventions from the default.  However,  the  setting  of  this         newline  conventions  from  the  default.  However, the setting of this
335         option  does not affect the way in which pcregrep writes information to         option does not affect the way in which pcregrep writes information  to
336         the standard error and output streams. It uses the  string  "\n"  in  C         the  standard  error  and  output streams. It uses the string "\n" in C
337         printf()  calls  to  indicate newlines, relying on the C I/O library to         printf() calls to indicate newlines, relying on the C  I/O  library  to
338         convert this to an appropriate sequence if the  output  is  sent  to  a         convert  this  to  an  appropriate  sequence if the output is sent to a
339         file.         file.
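
To make the newline conventions accepted by -N concrete, the following Python sketch splits a string into lines under each convention. It is not how pcregrep or the PCRE library is implemented; the names NEWLINE_SEPARATORS, ANY_NEWLINE and split_lines are hypothetical, and the "any" pattern simply lists the Unicode line endings (CRLF, CR, LF, VT, FF, NEL, LS, PS).

```python
import re

# Fixed separators for the three single-sequence conventions.
NEWLINE_SEPARATORS = {"CR": "\r", "LF": "\n", "CRLF": "\r\n"}

# "ANY": try CRLF first so the pair counts as one line ending, then
# CR, LF, VT, FF, NEL, LS and PS as single characters.
ANY_NEWLINE = re.compile("\r\n|[\r\n\x0b\x0c\x85\u2028\u2029]")

def split_lines(text, convention):
    """Split text into lines according to the named convention."""
    if convention == "ANY":
        parts = ANY_NEWLINE.split(text)
    else:
        parts = text.split(NEWLINE_SEPARATORS[convention])
    # A trailing line ending does not start another (empty) line.
    if parts and parts[-1] == "":
        parts.pop()
    return parts
```

Splitting CRLF data with the LF convention leaves a stray CR at the end of each line, which is one way in which a mismatch between the data and the chosen convention can produce surprising results.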
340    
341    
342  OPTIONS COMPATIBILITY  OPTIONS COMPATIBILITY
343    
344         The majority of short and long forms of pcregrep's options are the same         The majority of short and long forms of pcregrep's options are the same
345         as in the GNU grep program. Any long option of  the  form  --xxx-regexp         as  in  the  GNU grep program. Any long option of the form --xxx-regexp
346         (GNU  terminology) is also available as --xxx-regex (PCRE terminology).         (GNU terminology) is also available as --xxx-regex (PCRE  terminology).
347         However, the --locale, -M, --multiline, -u,  and  --utf-8  options  are         However,  the  --locale,  -M,  --multiline, -u, and --utf-8 options are
348         specific to pcregrep.         specific to pcregrep.
349    
350    
351  OPTIONS WITH DATA  OPTIONS WITH DATA
352    
353         There are four different ways in which an option with data can be spec-         There are four different ways in which an option with data can be spec-
354         ified.  If a short form option is used, the  data  may  follow  immedi-         ified.   If  a  short  form option is used, the data may follow immedi-
355         ately, or in the next command line item. For example:         ately, or in the next command line item. For example:
356    
357           -f/some/file           -f/some/file
358           -f /some/file           -f /some/file
359    
360         If  a long form option is used, the data may appear in the same command         If a long form option is used, the data may appear in the same  command
361         line item, separated by an equals character, or (with one exception) it         line item, separated by an equals character, or (with one exception) it
362         may appear in the next command line item. For example:         may appear in the next command line item. For example:
363    
364           --file=/some/file           --file=/some/file
365           --file /some/file           --file /some/file
366    
367         Note,  however, that if you want to supply a file name beginning with ~         Note, however, that if you want to supply a file name beginning with  ~
368         as data in a shell command, and have the  shell  expand  ~  to  a  home         as  data  in  a  shell  command,  and have the shell expand ~ to a home
369         directory, you must separate the file name from the option, because the         directory, you must separate the file name from the option, because the
370         shell does not treat ~ specially unless it is at the start of an  item.         shell  does not treat ~ specially unless it is at the start of an item.
371    
372         The  exception  to  the  above is the --colour (or --color) option, for         The exception to the above is the --colour  (or  --color)  option,  for
373         which the data is optional. If this option does have data, it  must  be         which  the  data is optional. If this option does have data, it must be
374         given  in  the first form, using an equals character. Otherwise it will         given in the first form, using an equals character. Otherwise  it  will
375         be assumed that it has no data.         be assumed that it has no data.
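
The four forms, and the --colour exception, can be sketched as a small Python parser. This is a simplified illustration and not pcregrep's actual option handling: parse_args is a hypothetical name, and only -f/--file and --colour/--color are recognised.

```python
def parse_args(argv):
    """Parse a tiny subset of options; return (options, remaining_items)."""
    options, rest = {}, []
    i = 0

    def next_item():
        # Data given in the next command line item.
        nonlocal i
        i += 1
        if i >= len(argv):
            raise ValueError("option requires data")
        return argv[i]

    while i < len(argv):
        arg = argv[i]
        if arg.startswith("--file="):            # --file=/some/file
            options["file"] = arg[len("--file="):]
        elif arg == "--file":                    # --file /some/file
            options["file"] = next_item()
        elif arg.startswith("-f") and not arg.startswith("--"):
            # -f/some/file or -f /some/file
            options["file"] = arg[2:] if len(arg) > 2 else next_item()
        elif arg.startswith(("--colour=", "--color=")):
            options["colour"] = arg.split("=", 1)[1]
        elif arg in ("--colour", "--color"):
            # Data is optional, so the next item is never consumed.
            options["colour"] = None
        else:
            rest.append(arg)
        i += 1
    return options, rest
```

For example, parse_args(["--colour", "always"]) leaves "always" among the remaining items, because without an equals character the option is assumed to have no data.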
376    
377    
378  MATCHING ERRORS  MATCHING ERRORS
379    
380         It is possible to supply a regular expression that takes  a  very  long         It  is  possible  to supply a regular expression that takes a very long
381         time  to  fail  to  match certain lines. Such patterns normally involve         time to fail to match certain lines.  Such  patterns  normally  involve
382         nested indefinite repeats, for example: (a+)*\d when matched against  a         nested  indefinite repeats, for example: (a+)*\d when matched against a
383         line  of  a's  with  no  final  digit. The PCRE matching function has a         line of a's with no final digit.  The  PCRE  matching  function  has  a
384         resource limit that causes it to abort in these circumstances. If  this         resource  limit that causes it to abort in these circumstances. If this
385         happens, pcregrep outputs an error message and the line that caused the         happens, pcregrep outputs an error message and the line that caused the
386         problem to the standard error stream. If there are more  than  20  such         problem  to  the  standard error stream. If there are more than 20 such
387         errors, pcregrep gives up.         errors, pcregrep gives up.
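
The cost of such a pattern can be observed with a short Python experiment. Python's re module is a different backtracking engine from PCRE and has no comparable resource limit, so this only illustrates the growth in matching time and says nothing about pcregrep's own behaviour; the helper name time_failed_match is an assumption. The line lengths are kept small so that the script finishes quickly.

```python
import re
import time

# The nested indefinite repeat from the text above.
PATTERN = re.compile(r"(a+)*\d")

def time_failed_match(n):
    """Search a line of n a's with no final digit; return (matched, seconds)."""
    line = "a" * n
    start = time.perf_counter()
    matched = PATTERN.search(line) is not None
    return matched, time.perf_counter() - start

if __name__ == "__main__":
    # The time to fail typically grows rapidly as the line gets longer.
    for n in (8, 12, 16):
        matched, seconds = time_failed_match(n)
        print(f"{n} a's: matched={matched}, {seconds:.4f}s")
```

Each extra a gives the engine more ways to divide the run between the inner and outer repeats before it can conclude that no digit follows.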
388    
389    
390  DIAGNOSTICS  DIAGNOSTICS
391    
392         Exit status is 0 if any matches were found, 1 if no matches were found,         Exit status is 0 if any matches were found, 1 if no matches were found,
393         and 2 for syntax errors and non-existent or inaccessible files (even  if         and  2 for syntax errors and non-existent or inaccessible files (even if
394         matches  were  found in other files) or too many matching errors. Using         matches were found in other files) or too many matching  errors.  Using
395         the -s option to suppress error messages about inaccessible  files  does         the  -s  option to suppress error messages about inaccessible files does
396         not affect the return code.         not affect the return code.
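
The exit status rules can be summarised in a few lines of Python. This is a sketch of the logic as described above, not code from pcregrep; the function name exit_status and its two flags are hypothetical.

```python
def exit_status(any_match, had_error):
    """Return the exit status described in DIAGNOSTICS.

    had_error covers syntax errors, non-existent or inaccessible files,
    and too many matching errors. There is no parameter for -s because
    suppressing messages does not change the result.
    """
    if had_error:
        return 2  # errors take precedence, even if matches were found
    return 0 if any_match else 1
```

A calling script therefore needs to treat 2 as "something went wrong" rather than as "no match", even when matching lines were printed.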
397    
398    
399    SEE ALSO
400    
401           pcrepattern(3), pcretest(1).
402    
403    
404  AUTHOR  AUTHOR
405    
406         Philip Hazel         Philip Hazel
407         University Computing Service         University Computing Service
408         Cambridge CB2 3QG, England.         Cambridge CB2 3QH, England.
409    
410  Last updated: 06 June 2006  Last updated: 29 November 2006
411  Copyright (c) 1997-2006 University of Cambridge.  Copyright (c) 1997-2006 University of Cambridge.
