Contents of /code/trunk/doc/pcregrep.txt

Revision 691
Sun Sep 11 14:31:21 2011 UTC, by ph10
File MIME type: text/plain
File size: 35375 byte(s)
Final source and document tidies for 8.20-RC1.

PCREGREP(1)


NAME
pcregrep - a grep with Perl-compatible regular expressions.


SYNOPSIS
pcregrep [options] [long options] [pattern] [path1 path2 ...]


DESCRIPTION

pcregrep searches files for character patterns, in the same way as
other grep commands do, but it uses the PCRE regular expression library
to support patterns that are compatible with the regular expressions of
Perl 5. See pcrepattern(3) for a full description of syntax and seman-
tics of the regular expressions that PCRE supports.

Patterns, whether supplied on the command line or in a separate file,
are given without delimiters. For example:

    pcregrep Thursday /etc/motd

If you attempt to use delimiters (for example, by surrounding a pattern
with slashes, as is common in Perl scripts), they are interpreted as
part of the pattern. Quotes can of course be used to delimit patterns
on the command line because they are interpreted by the shell, and
indeed they are required if a pattern contains white space or shell
metacharacters.

The first argument that follows any option settings is treated as the
single pattern to be matched when neither -e nor -f is present. Con-
versely, when one or both of these options are used to specify pat-
terns, all arguments are treated as path names. At least one of -e, -f,
or an argument pattern must be provided.

If no files are specified, pcregrep reads the standard input. The stan-
dard input can also be referenced by a name consisting of a single
hyphen. For example:

    pcregrep some-pattern /file1 - /file3

By default, each line that matches a pattern is copied to the standard
output, and if there is more than one file, the file name is output at
the start of each line, followed by a colon. However, there are options
that can change how pcregrep behaves. In particular, the -M option
makes it possible to search for patterns that span line boundaries.
What defines a line boundary is controlled by the -N (--newline)
option.

The amount of memory used for buffering files that are being scanned is
controlled by a parameter that can be set by the --buffer-size option.
The default value for this parameter is specified when pcregrep is
built; unless it is changed at build time, it is 20K. A block of memory
three times this size is used (to allow for buffering "before" and
"after" lines). An error occurs if a line overflows the buffer.
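
The arithmetic in the preceding paragraph can be sketched in shell. This
is only an illustration: the variable names are invented for the example,
and 20K is the documented default rather than a value read from any
particular build.

```shell
# Documented default for the buffer parameter: 20K, where K means 1024.
buffer_param=$((20 * 1024))

# pcregrep allocates a block three times the size of the parameter.
total_bytes=$((buffer_param * 3))

echo "parameter: $buffer_param bytes, block: $total_bytes bytes"
```

If longer lines are expected, a larger parameter can be requested on the
command line, for example with --buffer-size=64K.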

Patterns are limited to 8K or BUFSIZ bytes, whichever is the greater.
BUFSIZ is defined in <stdio.h>. When there is more than one pattern
(specified by the use of -e and/or -f), each pattern is applied to each
line in the order in which they are defined, except that all the -e
patterns are tried before the -f patterns.

By default, as soon as one pattern matches (or fails to match when -v
is used), no further patterns are considered. However, if --colour (or
--color) is used to colour the matching substrings, or if
--only-matching, --file-offsets, or --line-offsets is used to output
only the part of the line that matched (either shown literally, or as
an offset), scanning resumes immediately following the match, so that
further matches on the same line can be found. If there are multiple
patterns, they are all tried on the remainder of the line, but patterns
that follow the one that matched are not tried on the earlier part of
the line.

This is the same behaviour as GNU grep, but it does mean that the order
in which multiple patterns are specified can affect the output when one
of the above options is used.

Patterns that can match an empty string are accepted, but empty string
matches are never recognized. An example is the pattern
"(super)?(man)?", in which all components are optional. This pattern
finds all occurrences of both "super" and "man"; the output differs
from matching with "super|man" when only the matching substrings are
being shown.
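
The difference between the two patterns can be tried with a small helper.
This is a sketch, assuming a pcregrep binary is on the PATH; the function
name show_matches and the sample input line are made up for the example.

```shell
# Hypothetical helper: print every matched substring of one sample line.
# Assumes a pcregrep binary is available on PATH.
show_matches() {
    printf 'superman\n' | pcregrep -o "$1"
}
```

Calling show_matches '(super)?(man)?' should print the single string
"superman", because both optional parts match in one go. Calling
show_matches 'super|man' should print "super" and "man" on separate
lines, because scanning resumes after the first match.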

If the LC_ALL or LC_CTYPE environment variable is set, pcregrep uses
the value to set a locale when calling the PCRE library. The --locale
option can be used to override this.


SUPPORT FOR COMPRESSED FILES

It is possible to compile pcregrep so that it uses libz or libbz2 to
read files whose names end in .gz or .bz2, respectively. You can find
out whether your binary has support for one or both of these file types
by running it with the --help option. If the appropriate support is not
present, files are treated as plain text. The standard input is always
so treated.


OPTIONS

The order in which some of the options appear can affect the output.
For example, both the -h and -l options affect the printing of file
names. Whichever comes later in the command line will be the one that
takes effect. Numerical values for options may be followed by K or M,
to signify multiplication by 1024 or 1024*1024 respectively.
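
The K and M suffix rule can be mirrored by a few lines of shell. This is
not pcregrep's own code; expand_size is a hypothetical helper written
only to show the multiplication.

```shell
# Hypothetical helper mirroring the K/M suffix rule described above.
expand_size() {
    case "$1" in
        *K) echo $(( ${1%K} * 1024 )) ;;          # K: multiply by 1024
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;   # M: multiply by 1024*1024
        *)  echo "$1" ;;                          # no suffix: unchanged
    esac
}

expand_size 20K   # prints 20480
expand_size 2M    # prints 2097152
```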

--
    This terminates the list of options. It is useful if the next
    item on the command line starts with a hyphen but is not an
    option. This allows for the processing of patterns and file-
    names that start with hyphens.

-A number, --after-context=number
    Output number lines of context after each matching line. If
    filenames and/or line numbers are being output, a hyphen sep-
    arator is used instead of a colon for the context lines. A
    line containing "--" is output between each group of lines,
    unless they are in fact contiguous in the input file. The
    value of number is expected to be relatively small. However,
    pcregrep guarantees to have up to 8K of following text avail-
    able for context output.

-B number, --before-context=number
    Output number lines of context before each matching line. If
    filenames and/or line numbers are being output, a hyphen sep-
    arator is used instead of a colon for the context lines. A
    line containing "--" is output between each group of lines,
    unless they are in fact contiguous in the input file. The
    value of number is expected to be relatively small. However,
    pcregrep guarantees to have up to 8K of preceding text avail-
    able for context output.

--buffer-size=number
    Set the parameter that controls how much memory is used for
    buffering files that are being scanned.

-C number, --context=number
    Output number lines of context both before and after each
    matching line. This is equivalent to setting both -A and -B
    to the same value.

-c, --count
    Do not output individual lines from the files that are being
    scanned; instead output the number of lines that would other-
    wise have been shown. If no lines are selected, the number
    zero is output. If several files are being scanned, a count
    is output for each of them. However, if the
    --files-with-matches option is also used, only those files
    whose counts are greater than zero are listed. When -c is
    used, the -A, -B, and -C options are ignored.

--colour, --color
    If this option is given without any data, it is equivalent to
    "--colour=auto". If data is required, it must be given in
    the same shell item, separated by an equals sign.

--colour=value, --color=value
    This option specifies under what circumstances the parts of a
    line that matched a pattern should be coloured in the output.
    By default, the output is not coloured. The value (which is
    optional, see above) may be "never", "always", or "auto". In
    the last case, colouring happens only if the standard out-
    put is connected to a terminal. More resources are used when
    colouring is enabled, because pcregrep has to search for all
    possible matches in a line, not just one, in order to colour
    them all.

    The colour that is used can be specified by setting the envi-
    ronment variable PCREGREP_COLOUR or PCREGREP_COLOR. The value
    of this variable should be a string of two numbers, separated
    by a semicolon. They are copied directly into the control
    string for setting colour on a terminal, so it is your
    responsibility to ensure that they make sense. If neither of
    the environment variables is set, the default is "1;31",
    which gives red.

-D action, --devices=action
    If an input path is not a regular file or a directory,
    "action" specifies how it is to be processed. Valid values
    are "read" (the default) or "skip" (silently skip the path).

-d action, --directories=action
    If an input path is a directory, "action" specifies how it is
    to be processed. Valid values are "read" (the default),
    "recurse" (equivalent to the -r option), or "skip" (silently
    skip the path). In the default case, directories are read as
    if they were ordinary files. In some operating systems the
    effect of reading a directory like this is an immediate end-
    of-file.

-e pattern, --regex=pattern, --regexp=pattern
    Specify a pattern to be matched. This option can be used mul-
    tiple times in order to specify several patterns. It can also
    be used as a way of specifying a single pattern that starts
    with a hyphen. When -e is used, no argument pattern is taken
    from the command line; all arguments are treated as file
    names. There is an overall maximum of 100 patterns. They are
    applied to each line in the order in which they are defined
    until one matches (or fails to match if -v is used). If -f is
    used with -e, the command line patterns are matched first,
    followed by the patterns from the file, independent of the
    order in which these options are specified. Note that multi-
    ple use of -e is not the same as a single pattern with alter-
    natives. For example, X|Y finds the first character in a line
    that is X or Y, whereas if the two patterns are given sepa-
    rately, pcregrep finds X if it is present, even if it follows
    Y in the line. It finds Y only if there is no X in the line.
    This really matters only if you are using -o to show the
    part(s) of the line that matched.

--exclude=pattern
    When pcregrep is searching the files in a directory as a con-
    sequence of the -r (recursive search) option, any regular
    files whose names match the pattern are excluded. Subdirecto-
    ries are not excluded by this option; they are searched
    recursively, subject to the --exclude-dir and --include-dir
    options. The pattern is a PCRE regular expression, and is
    matched against the final component of the file name (not the
    entire path). If a file name matches both --include and
    --exclude, it is excluded. There is no short form for this
    option.

--exclude-dir=pattern
    When pcregrep is searching the contents of a directory as a
    consequence of the -r (recursive search) option, any subdi-
    rectories whose names match the pattern are excluded. (Note
    that the --exclude option does not affect subdirectories.)
    The pattern is a PCRE regular expression, and is matched
    against the final component of the name (not the entire
    path). If a subdirectory name matches both --include-dir and
    --exclude-dir, it is excluded. There is no short form for
    this option.

-F, --fixed-strings
    Interpret each pattern as a list of fixed strings, separated
    by newlines, instead of as a regular expression. The -w
    (match as a word) and -x (match whole line) options can be
    used with -F. They apply to each of the fixed strings. A line
    is selected if any of the fixed strings are found in it (sub-
    ject to -w or -x, if present).

-f filename, --file=filename
    Read a number of patterns from the file, one per line, and
    match them against each line of input. A data line is output
    if any of the patterns match it. The filename can be given as
    "-" to refer to the standard input. When -f is used, patterns
    specified on the command line using -e may also be present;
    they are tested before the file's patterns. However, no other
    pattern is taken from the command line; all arguments are
    treated as file names. There is an overall maximum of 100
    patterns. Trailing white space is removed from each line, and
    blank lines are ignored. An empty file contains no patterns
    and therefore matches nothing. See also the comments about
    multiple patterns versus a single pattern with alternatives
    in the description of -e above.

--file-offsets
    Instead of showing lines or parts of lines that match, show
    each match as an offset from the start of the file and a
    length, separated by a comma. In this mode, no context is
    shown. That is, the -A, -B, and -C options are ignored. If
    there is more than one match in a line, each of them is shown
    separately. This option is mutually exclusive with
    --line-offsets and --only-matching.

-H, --with-filename
    Force the inclusion of the filename at the start of output
    lines when searching a single file. By default, the filename
    is not shown in this case. For matching lines, the filename
    is followed by a colon; for context lines, a hyphen separator
    is used. If a line number is also being output, it follows
    the file name.

-h, --no-filename
    Suppress the output of filenames when searching multiple
    files. By default, filenames are shown when multiple files
    are searched. For matching lines, the filename is followed by
    a colon; for context lines, a hyphen separator is used. If a
    line number is also being output, it follows the file name.

--help
    Output a help message, giving brief details of the command
    options and file type support, and then exit.

-i, --ignore-case
    Ignore upper/lower case distinctions during comparisons.

--include=pattern
    When pcregrep is searching the files in a directory as a con-
    sequence of the -r (recursive search) option, only those reg-
    ular files whose names match the pattern are included. Subdi-
    rectories are always included and searched recursively, sub-
    ject to the --include-dir and --exclude-dir options. The pat-
    tern is a PCRE regular expression, and is matched against the
    final component of the file name (not the entire path). If a
    file name matches both --include and --exclude, it is
    excluded. There is no short form for this option.

--include-dir=pattern
    When pcregrep is searching the contents of a directory as a
    consequence of the -r (recursive search) option, only those
    subdirectories whose names match the pattern are included.
    (Note that the --include option does not affect subdirecto-
    ries.) The pattern is a PCRE regular expression, and is
    matched against the final component of the name (not the
    entire path). If a subdirectory name matches both
    --include-dir and --exclude-dir, it is excluded. There is no
    short form for this option.

-L, --files-without-match
    Instead of outputting lines from the files, just output the
    names of the files that do not contain any lines that would
    have been output. Each file name is output once, on a sepa-
    rate line.

-l, --files-with-matches
    Instead of outputting lines from the files, just output the
    names of the files containing lines that would have been out-
    put. Each file name is output once, on a separate line.
    Searching normally stops as soon as a matching line is found
    in a file. However, if the -c (count) option is also used,
    matching continues in order to obtain the correct count, and
    those files that have at least one match are listed along
    with their counts. Using this option with -c is a way of sup-
    pressing the listing of files with no matches.

--label=name
    This option supplies a name to be used for the standard input
    when file names are being output. If not supplied, "(standard
    input)" is used. There is no short form for this option.

--line-buffered
    When this option is given, input is read and processed line
    by line, and the output is flushed after each write. By
    default, input is read in large chunks, unless pcregrep can
    determine that it is reading from a terminal (which is cur-
    rently possible only in Unix environments). Output to a ter-
    minal is normally automatically flushed by the operating sys-
    tem. This option can be useful when the input or output is
    attached to a pipe and you do not want pcregrep to buffer up
    large amounts of data. However, its use will affect perfor-
    mance, and the -M (multiline) option ceases to work.

--line-offsets
    Instead of showing lines or parts of lines that match, show
    each match as a line number, the offset from the start of the
    line, and a length. The line number is terminated by a colon
    (as usual; see the -n option), and the offset and length are
    separated by a comma. In this mode, no context is shown.
    That is, the -A, -B, and -C options are ignored. If there is
    more than one match in a line, each of them is shown sepa-
    rately. This option is mutually exclusive with --file-offsets
    and --only-matching.

--locale=locale-name
    This option specifies a locale to be used for pattern match-
    ing. It overrides the value in the LC_ALL or LC_CTYPE envi-
    ronment variables. If no locale is specified, the PCRE
    library's default (usually the "C" locale) is used. There is
    no short form for this option.

--match-limit=number
    Processing some regular expression patterns can require a
    very large amount of memory, leading in some cases to a pro-
    gram crash if not enough is available. Other patterns may
    take a very long time to search for all possible matching
    strings. The pcre_exec() function that is called by pcregrep
    to do the matching has two parameters that can limit the
    resources that it uses.

    The --match-limit option provides a means of limiting
    resource usage when processing patterns that are not going to
    match, but which have a very large number of possibilities in
    their search trees. The classic example is a pattern that
    uses nested unlimited repeats. Internally, PCRE uses a func-
    tion called match() which it calls repeatedly (sometimes
    recursively). The limit set by --match-limit is imposed on
    the number of times this function is called during a match,
    which has the effect of limiting the amount of backtracking
    that can take place.

    The --recursion-limit option is similar to --match-limit, but
    instead of limiting the total number of times that match() is
    called, it limits the depth of recursive calls, which in turn
    limits the amount of memory that can be used. The recursion
    depth is a smaller number than the total number of calls,
    because not all calls to match() are recursive. This limit is
    of use only if it is set smaller than --match-limit.

    There are no short forms for these options. The default set-
    tings are specified when the PCRE library is compiled; unless
    changed at build time, they are 10 million.

-M, --multiline
    Allow patterns to match more than one line. When this option
    is given, patterns may usefully contain literal newline char-
    acters and internal occurrences of ^ and $ characters. The
    output for a successful match may consist of more than one
    line, the last of which is the one in which the match ended.
    If the matched string ends with a newline sequence the output
    ends at the end of that line.

    When this option is set, the PCRE library is called in "mul-
    tiline" mode. There is a limit to the number of lines that
    can be matched, imposed by the way that pcregrep buffers the
    input file as it scans it. However, pcregrep ensures that at
    least 8K characters or the rest of the document (whichever is
    the shorter) are available for forward matching, and simi-
    larly the previous 8K characters (or all the previous charac-
    ters, if fewer than 8K) are guaranteed to be available for
    lookbehind assertions. This option does not work when input
    is read line by line (see --line-buffered).

-N newline-type, --newline=newline-type
    The PCRE library supports five different conventions for
    indicating the ends of lines. They are the single-character
    sequences CR (carriage return) and LF (linefeed), the two-
    character sequence CRLF, an "anycrlf" convention, which rec-
    ognizes any of the preceding three types, and an "any" con-
    vention, in which any Unicode line ending sequence is assumed
    to end a line. The Unicode sequences are the three just men-
    tioned, plus VT (vertical tab, U+000B), FF (form feed,
    U+000C), NEL (next line, U+0085), LS (line separator,
    U+2028), and PS (paragraph separator, U+2029).

    When the PCRE library is built, a default line-ending
    sequence is specified. This is normally the standard
    sequence for the operating system. Unless otherwise specified
    by this option, pcregrep uses the library's default. The
    possible values for this option are CR, LF, CRLF, ANYCRLF, or
    ANY. This makes it possible to use pcregrep on files that
    have come from other environments without having to modify
    their line endings. If the data that is being scanned does
    not agree with the convention set by this option, pcregrep
    may behave in strange ways.

-n, --line-number
    Precede each output line by its line number in the file, fol-
    lowed by a colon for matching lines or a hyphen for context
    lines. If the filename is also being output, it precedes the
    line number. This option is forced if --line-offsets is used.

--no-jit
    If the PCRE library is built with support for just-in-time
    compiling (which speeds up matching), pcregrep automatically
    makes use of this, unless it was explicitly disabled at build
    time. This option can be used to disable the use of JIT at
    run time. It is provided for testing and working round prob-
    lems. It should never be needed in normal use.

-o, --only-matching
    Show only the part of the line that matched a pattern instead
    of the whole line. In this mode, no context is shown. That
    is, the -A, -B, and -C options are ignored. If there is more
    than one match in a line, each of them is shown separately.
    If -o is combined with -v (invert the sense of the match to
    find non-matching lines), no output is generated, but the
    return code is set appropriately. If the matched portion of
    the line is empty, nothing is output unless the file name or
    line number are being printed, in which case they are shown
    on an otherwise empty line. This option is mutually exclusive
    with --file-offsets and --line-offsets.

-onumber, --only-matching=number
    Show only the part of the line that matched the capturing
    parentheses of the given number. Up to 32 capturing parenthe-
    ses are supported. Because these options can be given without
    an argument (see above), if an argument is present, it must
    be given in the same shell item, for example, -o3 or
    --only-matching=2. The comments given for the non-argument
    case above also apply to this case. If the specified
    capturing parentheses do not exist in the pattern, or were
    not set in the match, nothing is output unless the file name
    or line number are being printed.

-q, --quiet
    Work quietly, that is, display nothing except error messages.
    The exit status indicates whether or not any matches were
    found.

-r, --recursive
    If any given path is a directory, recursively scan the files
    it contains, taking note of any --include and --exclude set-
    tings. By default, a directory is read as a normal file; in
    some operating systems this gives an immediate end-of-file.
    This option is a shorthand for setting the -d option to
    "recurse".

--recursion-limit=number
    See --match-limit above.

-s, --no-messages
    Suppress error messages about non-existent or unreadable
    files. Such files are quietly skipped. However, the return
    code is still 2, even if matches were found in other files.

-u, --utf-8
    Operate in UTF-8 mode. This option is available only if PCRE
    has been compiled with UTF-8 support. Both patterns and sub-
    ject lines must be valid strings of UTF-8 characters.

-V, --version
    Write the version numbers of pcregrep and the PCRE library
    that is being used to the standard error stream.

-v, --invert-match
    Invert the sense of the match, so that lines which do not
    match any of the patterns are the ones that are found.

-w, --word-regex, --word-regexp
    Force the patterns to match only whole words. This is equiva-
    lent to having \b at the start and end of the pattern.

-x, --line-regex, --line-regexp
    Force the patterns to be anchored (each must start matching
    at the beginning of a line) and in addition, require them to
    match entire lines. This is equivalent to having ^ and $
    characters at the start and end of each alternative branch in
    every pattern.


ENVIRONMENT VARIABLES

The environment variables LC_ALL and LC_CTYPE are examined, in that
order, for a locale. The first one that is set is used. This can be
overridden by the --locale option. If no locale is set, the PCRE
library's default (usually the "C" locale) is used.


NEWLINES

The -N (--newline) option allows pcregrep to scan files with different
newline conventions from the default. However, the setting of this
option does not affect the way in which pcregrep writes information to
the standard error and output streams. It uses the string "\n" in C
printf() calls to indicate newlines, relying on the C I/O library to
convert this to an appropriate sequence if the output is sent to a
file.
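
The effect of -N on input scanning can be tried with a small helper. This
is a sketch, assuming a pcregrep binary is on the PATH; the function name
count_two and the two-line CRLF sample are invented for the example.

```shell
# Hypothetical helper: count the lines that match ^two$ in a small
# CRLF-terminated input, using the newline convention passed as $1.
# Assumes a pcregrep binary is available on PATH.
count_two() {
    printf 'one\r\ntwo\r\n' | pcregrep -c -N "$1" '^two$'
}
```

With count_two CRLF the line "two" should be counted once. With
count_two LF the carriage return stays part of the line, so the anchored
pattern is not expected to match and the count should be zero.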


OPTIONS COMPATIBILITY

Many of the short and long forms of pcregrep's options are the same as
in the GNU grep program (version 2.5.4). Any long option of the form
--xxx-regexp (GNU terminology) is also available as --xxx-regex (PCRE
terminology). However, the --file-offsets, --include-dir,
--line-offsets, --locale, --match-limit, -M, --multiline, -N,
--newline, --recursion-limit, -u, and --utf-8 options are specific to
pcregrep, as is the use of the --only-matching option with a capturing
parentheses number.

Although most of the common options work the same way, a few are dif-
ferent in pcregrep. For example, the --include option's argument is a
glob for GNU grep, but a regular expression for pcregrep. If both the
-c and -l options are given, GNU grep lists only file names, without
counts, but pcregrep gives the counts.


OPTIONS WITH DATA

There are four different ways in which an option with data can be spec-
ified. If a short form option is used, the data may follow immedi-
ately, or (with one exception) in the next command line item. For exam-
ple:

    -f/some/file
    -f /some/file

The exception is the -o option, which may appear with or without data.
Because of this, if data is present, it must follow immediately in the
same item, for example -o3.
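
The -o rule can be illustrated with a small helper that selects one
capturing group. This is a sketch, assuming a pcregrep binary is on the
PATH; the function name show_group and the sample line are invented for
the example.

```shell
# Hypothetical helper: print only capturing group number $1 from a
# sample "key=value" line. The number is attached directly to -o, as
# the rule above requires. Assumes pcregrep is available on PATH.
show_group() {
    printf 'key=value\n' | pcregrep "-o$1" '(\w+)=(\w+)'
}
```

Here show_group 1 should print "key" and show_group 2 should print
"value". Writing the number as a separate item (-o 2) would instead be
read as -o with no data, followed by another command line argument.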

If a long form option is used, the data may appear in the same command
line item, separated by an equals character, or (with two exceptions)
it may appear in the next command line item. For example:

    --file=/some/file
    --file /some/file

Note, however, that if you want to supply a file name beginning with ~
as data in a shell command, and have the shell expand ~ to a home
directory, you must separate the file name from the option, because the
shell does not treat ~ specially unless it is at the start of an item.

The exceptions to the above are the --colour (or --color) and
--only-matching options, for which the data is optional. If one of
these options does have data, it must be given in the first form, using
an equals character. Otherwise pcregrep will assume that it has no
data.


MATCHING ERRORS

It is possible to supply a regular expression that takes a very long
time to fail to match certain lines. Such patterns normally involve
nested indefinite repeats, for example: (a+)*\d when matched against a
line of a's with no final digit. The PCRE matching function has a
resource limit that causes it to abort in these circumstances. If this
happens, pcregrep outputs an error message and the line that caused the
problem to the standard error stream. If there are more than 20 such
errors, pcregrep gives up.

The --match-limit option of pcregrep can be used to set the overall
resource limit; there is a second option called --recursion-limit that
sets a limit on the amount of memory (usually stack) that is used (see
the discussion of these options above).
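
The pattern mentioned above can be tried with a deliberately low limit.
This is a sketch, assuming a pcregrep binary is on the PATH; the function
name try_slow_pattern and the 30-character sample line are invented, and
the exact diagnostic text depends on the build.

```shell
# Hypothetical helper: run the nested-repeat pattern against a line of
# a's with no final digit, using the match limit passed as $1.
# Assumes a pcregrep binary is available on PATH.
try_slow_pattern() {
    printf 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n' |
        pcregrep --match-limit="$1" '(a+)*\d'
}
```

With a small value such as try_slow_pattern 1000 the limit is expected
to be reached quickly, so the diagnostic should appear on the standard
error stream rather than after a long wait. Nothing is written to the
standard output, because the line contains no digit.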


DIAGNOSTICS

Exit status is 0 if any matches were found, 1 if no matches were found,
and 2 for syntax errors, overlong lines, non-existent or inaccessible
files (even if matches were found in other files) or too many matching
errors. Using the -s option to suppress error messages about inaccessi-
ble files does not affect the return code.
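
A script can branch on these three values. The following is a sketch;
describe_status is a hypothetical helper, and the wording of its
messages is made up for the example.

```shell
# Hypothetical helper: turn a pcregrep exit status into a short message.
describe_status() {
    case "$1" in
        0) echo "matches found" ;;
        1) echo "no matches found" ;;
        2) echo "error" ;;
        *) echo "unexpected status: $1" ;;
    esac
}
```

A typical use is to run pcregrep with -q and pass the result on, as in
pcregrep -q 'pattern' file; describe_status $?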


SEE ALSO

pcrepattern(3), pcretest(1).


AUTHOR

Philip Hazel
University Computing Service
Cambridge CB2 3QH, England.


REVISION

Last updated: 06 September 2011
Copyright (c) 1997-2011 University of Cambridge.
