Contents of /code/trunk/doc/html/pcregrep.html

Revision 691
Sun Sep 11 14:31:21 2011 UTC (20 months, 1 week ago) by ph10
File MIME type: text/html
File size: 32004 byte(s)
Final source and document tidies for 8.20-RC1.

1 nigel 63 <html>
2     <head>
3     <title>pcregrep specification</title>
4     </head>
5     <body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
6 nigel 75 <h1>pcregrep man page</h1>
7     <p>
8     Return to the <a href="index.html">PCRE index page</a>.
9     </p>
10 ph10 111 <p>
11 nigel 75 This page is part of the PCRE HTML documentation. It was generated automatically
12     from the original man page. If there is any nonsense in it, please consult the
13     man page, in case the conversion went wrong.
14 ph10 111 <br>
15 nigel 63 <ul>
16     <li><a name="TOC1" href="#SEC1">SYNOPSIS</a>
17     <li><a name="TOC2" href="#SEC2">DESCRIPTION</a>
18 ph10 286 <li><a name="TOC3" href="#SEC3">SUPPORT FOR COMPRESSED FILES</a>
19     <li><a name="TOC4" href="#SEC4">OPTIONS</a>
20     <li><a name="TOC5" href="#SEC5">ENVIRONMENT VARIABLES</a>
21     <li><a name="TOC6" href="#SEC6">NEWLINES</a>
22     <li><a name="TOC7" href="#SEC7">OPTIONS COMPATIBILITY</a>
23     <li><a name="TOC8" href="#SEC8">OPTIONS WITH DATA</a>
24     <li><a name="TOC9" href="#SEC9">MATCHING ERRORS</a>
25     <li><a name="TOC10" href="#SEC10">DIAGNOSTICS</a>
26     <li><a name="TOC11" href="#SEC11">SEE ALSO</a>
27     <li><a name="TOC12" href="#SEC12">AUTHOR</a>
28     <li><a name="TOC13" href="#SEC13">REVISION</a>
29 nigel 63 </ul>
30     <br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
31     <P>
32 nigel 87 <b>pcregrep [options] [long options] [pattern] [path1 path2 ...]</b>
33 nigel 63 </P>
34     <br><a name="SEC2" href="#TOC1">DESCRIPTION</a><br>
35     <P>
36     <b>pcregrep</b> searches files for character patterns, in the same way as other
37     grep commands do, but it uses the PCRE regular expression library to support
38     patterns that are compatible with the regular expressions of Perl 5. See
39 nigel 93 <a href="pcrepattern.html"><b>pcrepattern</b>(3)</a>
40     for a full description of syntax and semantics of the regular expressions
41     that PCRE supports.
42 nigel 63 </P>
43     <P>
44 nigel 87 Patterns, whether supplied on the command line or in a separate file, are given
45     without delimiters. For example:
46     <pre>
47     pcregrep Thursday /etc/motd
48     </pre>
49     If you attempt to use delimiters (for example, by surrounding a pattern with
50     slashes, as is common in Perl scripts), they are interpreted as part of the
51 ph10 286 pattern. Quotes can of course be used to delimit patterns on the command line
52     because they are interpreted by the shell, and indeed they are required if a
53     pattern contains white space or shell metacharacters.
54 nigel 63 </P>
55     <P>
56 nigel 87 The first argument that follows any option settings is treated as the single
57     pattern to be matched when neither <b>-e</b> nor <b>-f</b> is present.
58     Conversely, when one or both of these options are used to specify patterns, all
59     arguments are treated as path names. At least one of <b>-e</b>, <b>-f</b>, or an
60     argument pattern must be provided.
61     </P>
62     <P>
63 nigel 77 If no files are specified, <b>pcregrep</b> reads the standard input. The
64     standard input can also be referenced by a name consisting of a single hyphen.
65     For example:
66     <pre>
67     pcregrep some-pattern /file1 - /file3
68     </pre>
69 ph10 286 By default, each line that matches a pattern is copied to the standard
70 nigel 87 output, and if there is more than one file, the file name is output at the
71 ph10 286 start of each line, followed by a colon. However, there are options that can
72     change how <b>pcregrep</b> behaves. In particular, the <b>-M</b> option makes it
73     possible to search for patterns that span line boundaries. What defines a line
74     boundary is controlled by the <b>-N</b> (<b>--newline</b>) option.
75 nigel 63 </P>
76     <P>
77 ph10 654 The amount of memory used for buffering files that are being scanned is
78     controlled by a parameter that can be set by the <b>--buffer-size</b> option.
79     The default value for this parameter is specified when <b>pcregrep</b> is built,
80     and is 20K unless a different value was chosen at build time. A block of memory three times this size is
81     used (to allow for buffering "before" and "after" lines). An error occurs if a
82     line overflows the buffer.
83 nigel 63 </P>
84 nigel 87 <P>
85 ph10 654 Patterns are limited to 8K or BUFSIZ bytes, whichever is the greater. BUFSIZ is
86     defined in <b>&#60;stdio.h&#62;</b>. When there is more than one pattern (specified by
87     the use of <b>-e</b> and/or <b>-f</b>), each pattern is applied to each line in
88     the order in which they are defined, except that all the <b>-e</b> patterns are
89     tried before the <b>-f</b> patterns.
90     </P>
91     <P>
92 ph10 392 By default, as soon as one pattern matches (or fails to match when <b>-v</b> is
93     used), no further patterns are considered. However, if <b>--colour</b> (or
94     <b>--color</b>) is used to colour the matching substrings, or if
95     <b>--only-matching</b>, <b>--file-offsets</b>, or <b>--line-offsets</b> is used to
96     output only the part of the line that matched (either shown literally, or as an
97     offset), scanning resumes immediately following the match, so that further
98     matches on the same line can be found. If there are multiple patterns, they are
99     all tried on the remainder of the line, but patterns that follow the one that
100     matched are not tried on the earlier part of the line.
101 ph10 286 </P>
102     <P>
103 ph10 392 This is the same behaviour as GNU grep, but it does mean that the order in
104     which multiple patterns are specified can affect the output when one of the
105     above options is used.
106     </P>
107     <P>
108     Patterns that can match an empty string are accepted, but empty string
109 ph10 453 matches are never recognized. An example is the pattern "(super)?(man)?", in
110 ph10 392 which all components are optional. This pattern finds all occurrences of both
111     "super" and "man"; the output differs from matching with "super|man" when only
112     the matching substrings are being shown.
113     </P>
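<P>
The difference between the two patterns can be illustrated with a short Python
sketch. It uses Python's <b>re</b> module rather than PCRE, and the helper name
<b>nonempty_matches</b> is hypothetical; it only mimics the rule that empty
string matches are never recognized, and is not how <b>pcregrep</b> is
implemented.
</P>

```python
import re


def nonempty_matches(pattern, line):
    """Return every non-empty match of pattern in line, left to right."""
    return [m.group(0) for m in re.finditer(pattern, line) if m.group(0)]


line = "superman met a man"
# All components optional: "super" and "man" are found together when adjacent.
print(nonempty_matches(r"(super)?(man)?", line))
# Plain alternation: each word is always a separate match.
print(nonempty_matches(r"super|man", line))
```

<P>
With the optional-component pattern the adjacent words come out as one
substring, whereas the alternation reports them separately, which is why the
output differs when only matching substrings are shown.
</P>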
114     <P>
115 nigel 87 If the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variable is set,
116     <b>pcregrep</b> uses the value to set a locale when calling the PCRE library.
117     The <b>--locale</b> option can be used to override this.
118     </P>
119 ph10 286 <br><a name="SEC3" href="#TOC1">SUPPORT FOR COMPRESSED FILES</a><br>
120 nigel 63 <P>
121 ph10 286 It is possible to compile <b>pcregrep</b> so that it uses <b>libz</b> or
122     <b>libbz2</b> to read files whose names end in <b>.gz</b> or <b>.bz2</b>,
123     respectively. You can find out whether your binary has support for one or both
124     of these file types by running it with the <b>--help</b> option. If the
125     appropriate support is not present, files are treated as plain text. The
126     standard input is always so treated.
127     </P>
128     <br><a name="SEC4" href="#TOC1">OPTIONS</a><br>
129     <P>
130 ph10 461 The order in which some of the options appear can affect the output. For
131     example, both the <b>-h</b> and <b>-l</b> options affect the printing of file
132     names. Whichever comes later in the command line will be the one that takes
133 ph10 654 effect. Numerical values for options may be followed by K or M, to signify
134     multiplication by 1024 or 1024*1024 respectively.
135 ph10 429 </P>
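<P>
As a rough illustration of the K and M suffixes, the following Python sketch
converts an option value to a plain number. The function name
<b>parse_number</b> is hypothetical, and the sketch accepts only the upper case
suffixes named above.
</P>

```python
def parse_number(text):
    """Convert an option value such as "20K" or "1M" to an integer."""
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:]
    if suffix in multipliers:
        # Strip the suffix and scale the remaining digits.
        return int(text[:-1]) * multipliers[suffix]
    return int(text)


print(parse_number("20K"))
print(parse_number("1M"))
```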
136     <P>
137 nigel 77 <b>--</b>
138 ph10 654 This terminates the list of options. It is useful if the next item on the
139 nigel 87 command line starts with a hyphen but is not an option. This allows for the
140     processing of patterns and filenames that start with hyphens.
141 nigel 63 </P>
142     <P>
143 nigel 87 <b>-A</b> <i>number</i>, <b>--after-context=</b><i>number</i>
144     Output <i>number</i> lines of context after each matching line. If filenames
145     and/or line numbers are being output, a hyphen separator is used instead of a
146     colon for the context lines. A line containing "--" is output between each
147 nigel 77 group of lines, unless they are in fact contiguous in the input file. The value
148     of <i>number</i> is expected to be relatively small. However, <b>pcregrep</b>
149 nigel 87 guarantees to have up to 8K of following text available for context output.
150 nigel 77 </P>
151     <P>
152 nigel 87 <b>-B</b> <i>number</i>, <b>--before-context=</b><i>number</i>
153     Output <i>number</i> lines of context before each matching line. If filenames
154     and/or line numbers are being output, a hyphen separator is used instead of a
155     colon for the context lines. A line containing "--" is output between each
156 nigel 77 group of lines, unless they are in fact contiguous in the input file. The value
157     of <i>number</i> is expected to be relatively small. However, <b>pcregrep</b>
158 nigel 87 guarantees to have up to 8K of preceding text available for context output.
159 nigel 77 </P>
160     <P>
161 ph10 654 <b>--buffer-size=</b><i>number</i>
162     Set the parameter that controls how much memory is used for buffering files
163     that are being scanned.
164     </P>
165     <P>
166 nigel 87 <b>-C</b> <i>number</i>, <b>--context=</b><i>number</i>
167     Output <i>number</i> lines of context both before and after each matching line.
168 nigel 77 This is equivalent to setting both <b>-A</b> and <b>-B</b> to the same value.
169     </P>
170     <P>
171 nigel 87 <b>-c</b>, <b>--count</b>
172 ph10 429 Do not output individual lines from the files that are being scanned; instead
173     output the number of lines that would otherwise have been shown. If no lines
174     are selected, the number zero is output. If several files are being
175     scanned, a count is output for each of them. However, if the
176     <b>--files-with-matches</b> option is also used, only those files whose counts
177     are greater than zero are listed. When <b>-c</b> is used, the <b>-A</b>,
178     <b>-B</b>, and <b>-C</b> options are ignored.
179 nigel 63 </P>
180     <P>
181 nigel 87 <b>--colour</b>, <b>--color</b>
182     If this option is given without any data, it is equivalent to "--colour=auto".
183     If data is required, it must be given in the same shell item, separated by an
184     equals sign.
185     </P>
186     <P>
187     <b>--colour=</b><i>value</i>, <b>--color=</b><i>value</i>
188 ph10 392 This option specifies under what circumstances the parts of a line that matched
189     a pattern should be coloured in the output. By default, the output is not
190     coloured. The value (which is optional, see above) may be "never", "always", or
191     "auto". In the last case, colouring happens only if the standard output is
192     connected to a terminal. More resources are used when colouring is enabled,
193     because <b>pcregrep</b> has to search for all possible matches in a line, not
194     just one, in order to colour them all.
195 ph10 535 <br>
196     <br>
197 ph10 392 The colour that is used can be specified by setting the environment variable
198     PCREGREP_COLOUR or PCREGREP_COLOR. The value of this variable should be a
199     string of two numbers, separated by a semicolon. They are copied directly into
200     the control string for setting colour on a terminal, so it is your
201     responsibility to ensure that they make sense. If neither of the environment
202     variables is set, the default is "1;31", which gives red.
203     </P>
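<P>
The way the colour value ends up in a terminal control string can be sketched
in Python as follows. This is an illustration under stated assumptions, not the
<b>pcregrep</b> source: the function name <b>colourise</b> is hypothetical, the
order in which the two environment variables are consulted is assumed, and the
reset sequence used after the text is assumed as well.
</P>

```python
import os


def colourise(text, environ=os.environ):
    """Wrap text in an escape sequence built from the colour setting."""
    # Assumption: PCREGREP_COLOUR is consulted before PCREGREP_COLOR.
    colour = environ.get("PCREGREP_COLOUR") or environ.get("PCREGREP_COLOR") or "1;31"
    # The value is copied unchanged between "ESC[" and "m".
    # Assumption: "ESC[0m" is used here to switch colouring off again.
    return f"\x1b[{colour}m{text}\x1b[0m"


print(repr(colourise("Thursday", {})))
print(repr(colourise("Thursday", {"PCREGREP_COLOR": "1;32"})))
```

<P>
Because the value is not checked, a string that is not a valid pair of numbers
produces a control sequence that the terminal may ignore or misinterpret.
</P>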
204     <P>
205 nigel 87 <b>-D</b> <i>action</i>, <b>--devices=</b><i>action</i>
206     If an input path is not a regular file or a directory, "action" specifies how
207     it is to be processed. Valid values are "read" (the default) or "skip"
208     (silently skip the path).
209     </P>
210     <P>
211     <b>-d</b> <i>action</i>, <b>--directories=</b><i>action</i>
212     If an input path is a directory, "action" specifies how it is to be processed.
213     Valid values are "read" (the default), "recurse" (equivalent to the <b>-r</b>
214     option), or "skip" (silently skip the path). In the default case, directories
215     are read as if they were ordinary files. In some operating systems the effect
216     of reading a directory like this is an immediate end-of-file.
217     </P>
218     <P>
219 ph10 286 <b>-e</b> <i>pattern</i>, <b>--regex=</b><i>pattern</i>, <b>--regexp=</b><i>pattern</i>
220     Specify a pattern to be matched. This option can be used multiple times in
221     order to specify several patterns. It can also be used as a way of specifying a
222     single pattern that starts with a hyphen. When <b>-e</b> is used, no argument
223     pattern is taken from the command line; all arguments are treated as file
224     names. There is an overall maximum of 100 patterns. They are applied to each
225     line in the order in which they are defined until one matches (or fails to
226     match if <b>-v</b> is used). If <b>-f</b> is used with <b>-e</b>, the command line
227     patterns are matched first, followed by the patterns from the file, independent
228     of the order in which these options are specified. Note that multiple use of
229     <b>-e</b> is not the same as a single pattern with alternatives. For example,
230     X|Y finds the first character in a line that is X or Y, whereas if the two
231     patterns are given separately, <b>pcregrep</b> finds X if it is present, even if
232     it follows Y in the line. It finds Y only if there is no X in the line. This
233     really matters only if you are using <b>-o</b> to show the part(s) of the line
234     that matched.
235 nigel 87 </P>
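<P>
The X and Y example can be reproduced with a small Python sketch. It uses
Python's <b>re</b> module in place of PCRE, and the helper
<b>first_pattern_match</b> is a hypothetical name; it models only the rule that
patterns are tried in order and the first one that matches wins.
</P>

```python
import re


def first_pattern_match(patterns, line):
    """Try each pattern in order and return the text matched by the first hit."""
    for pattern in patterns:
        m = re.search(pattern, line)
        if m:
            return m.group(0)
    return None


line = "aYbX"
print(first_pattern_match(["X|Y"], line))     # one pattern with alternatives
print(first_pattern_match(["X", "Y"], line))  # two separate patterns
```

<P>
The single pattern reports whichever of X or Y comes first in the line; the
separate patterns report X whenever it is present, and Y only when X is absent.
</P>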
236     <P>
237 nigel 77 <b>--exclude</b>=<i>pattern</i>
238     When <b>pcregrep</b> is searching the files in a directory as a consequence of
239 ph10 345 the <b>-r</b> (recursive search) option, any regular files whose names match the
240     pattern are excluded. Subdirectories are not excluded by this option; they are
241 ph10 572 searched recursively, subject to the <b>--exclude-dir</b> and
242 ph10 345 <b>--include-dir</b> options. The pattern is a PCRE regular expression, and is
243     matched against the final component of the file name (not the entire path). If
244     a file name matches both <b>--include</b> and <b>--exclude</b>, it is excluded.
245     There is no short form for this option.
246 nigel 77 </P>
247     <P>
248 ph10 572 <b>--exclude-dir</b>=<i>pattern</i>
249 ph10 345 When <b>pcregrep</b> is searching the contents of a directory as a consequence
250     of the <b>-r</b> (recursive search) option, any subdirectories whose names match
251     the pattern are excluded. (Note that the <b>--exclude</b> option does not affect
252     subdirectories.) The pattern is a PCRE regular expression, and is matched
253     against the final component of the name (not the entire path). If a
254 ph10 572 subdirectory name matches both <b>--include-dir</b> and <b>--exclude-dir</b>, it
255 ph10 345 is excluded. There is no short form for this option.
256     </P>
257     <P>
258 nigel 87 <b>-F</b>, <b>--fixed-strings</b>
259     Interpret each pattern as a list of fixed strings, separated by newlines,
260     instead of as a regular expression. The <b>-w</b> (match as a word) and <b>-x</b>
261     (match whole line) options can be used with <b>-F</b>. They apply to each of the
262     fixed strings. A line is selected if any of the fixed strings are found in it
263     (subject to <b>-w</b> or <b>-x</b>, if present).
264 nigel 63 </P>
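<P>
One way to picture <b>-F</b> together with <b>-w</b> and <b>-x</b> is to build
an equivalent regular expression from the fixed strings. The Python sketch
below does this with the <b>re</b> module; the function name
<b>fixed_string_regex</b> is hypothetical, skipping empty strings is an
assumption, and <b>pcregrep</b> itself may implement fixed strings quite
differently.
</P>

```python
import re


def fixed_string_regex(pattern_text, word=False, whole_line=False):
    """Build a regex that matches any of the newline-separated fixed strings."""
    branches = []
    for fixed in pattern_text.split("\n"):
        if not fixed:
            continue  # Assumption: empty strings in the list are skipped.
        branch = re.escape(fixed)  # No character is treated as special.
        if word:
            branch = r"\b" + branch + r"\b"
        if whole_line:
            branch = "^" + branch + "$"
        branches.append(branch)
    return re.compile("|".join(branches))


plain = fixed_string_regex("a.b\nfoo")
print(bool(plain.search("xa.by")), bool(plain.search("xaXby")))
```

<P>
Applying the word or whole-line restriction to each branch separately reflects
the statement that these options apply to each of the fixed strings.
</P>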
265     <P>
266 nigel 87 <b>-f</b> <i>filename</i>, <b>--file=</b><i>filename</i>
267     Read a number of patterns from the file, one per line, and match them against
268     each line of input. A data line is output if any of the patterns match it. The
269     filename can be given as "-" to refer to the standard input. When <b>-f</b> is
270     used, patterns specified on the command line using <b>-e</b> may also be
271     present; they are tested before the file's patterns. However, no other pattern
272     is taken from the command line; all arguments are treated as file names. There
273     is an overall maximum of 100 patterns. Trailing white space is removed from
274     each line, and blank lines are ignored. An empty file contains no patterns and
275 ph10 286 therefore matches nothing. See also the comments about multiple patterns versus
276     a single pattern with alternatives in the description of <b>-e</b> above.
277 nigel 63 </P>
278     <P>
279 ph10 286 <b>--file-offsets</b>
280     Instead of showing lines or parts of lines that match, show each match as an
281     offset from the start of the file and a length, separated by a comma. In this
282     mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b>
283     options are ignored. If there is more than one match in a line, each of them is
284     shown separately. This option is mutually exclusive with <b>--line-offsets</b>
285     and <b>--only-matching</b>.
286     </P>
287     <P>
288 nigel 87 <b>-H</b>, <b>--with-filename</b>
289     Force the inclusion of the filename at the start of output lines when searching
290     a single file. By default, the filename is not shown in this case. For matching
291 ph10 392 lines, the filename is followed by a colon; for context lines, a hyphen
292     separator is used. If a line number is also being output, it follows the file
293     name.
294 nigel 87 </P>
295     <P>
296     <b>-h</b>, <b>--no-filename</b>
297     Suppress the output of filenames when searching multiple files. By default,
298     filenames are shown when multiple files are searched. For matching lines, the
299 ph10 392 filename is followed by a colon; for context lines, a hyphen separator is used.
300     If a line number is also being output, it follows the file name.
301 nigel 87 </P>
302     <P>
303     <b>--help</b>
304 ph10 286 Output a help message, giving brief details of the command options and file
305     type support, and then exit.
306 nigel 87 </P>
307     <P>
308     <b>-i</b>, <b>--ignore-case</b>
309 nigel 63 Ignore upper/lower case distinctions during comparisons.
310     </P>
311     <P>
312 nigel 77 <b>--include</b>=<i>pattern</i>
313     When <b>pcregrep</b> is searching the files in a directory as a consequence of
314 ph10 345 the <b>-r</b> (recursive search) option, only those regular files whose names
315     match the pattern are included. Subdirectories are always included and searched
316 ph10 572 recursively, subject to the <b>--include-dir</b> and <b>--exclude-dir</b>
317 ph10 345 options. The pattern is a PCRE regular expression, and is matched against the
318     final component of the file name (not the entire path). If a file name matches
319     both <b>--include</b> and <b>--exclude</b>, it is excluded. There is no short
320     form for this option.
321 nigel 77 </P>
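<P>
The interaction of <b>--include</b> and <b>--exclude</b> during a recursive
search can be sketched in Python. The function name <b>file_is_searched</b> is
hypothetical, Python's <b>re</b> module stands in for PCRE, and an unanchored
search of the name is assumed.
</P>

```python
import posixpath
import re


def file_is_searched(path, include=None, exclude=None):
    """Decide whether a regular file found during recursion is searched."""
    # Only the final component of the path is tested, not the whole path.
    name = posixpath.basename(path)
    if exclude is not None and re.search(exclude, name):
        return False  # Exclusion wins when both patterns match.
    if include is not None and not re.search(include, name):
        return False
    return True


print(file_is_searched("src/main.c", include=r"\.c$"))
print(file_is_searched("src/main.c", include=r"\.c$", exclude=r"^main"))
```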
322     <P>
323 ph10 572 <b>--include-dir</b>=<i>pattern</i>
324 ph10 345 When <b>pcregrep</b> is searching the contents of a directory as a consequence
325     of the <b>-r</b> (recursive search) option, only those subdirectories whose
326     names match the pattern are included. (Note that the <b>--include</b> option
327     does not affect subdirectories.) The pattern is a PCRE regular expression, and
328     is matched against the final component of the name (not the entire path). If a
329 ph10 572 subdirectory name matches both <b>--include-dir</b> and <b>--exclude-dir</b>, it
330 ph10 345 is excluded. There is no short form for this option.
331     </P>
332     <P>
333 nigel 87 <b>-L</b>, <b>--files-without-match</b>
334     Instead of outputting lines from the files, just output the names of the files
335     that do not contain any lines that would have been output. Each file name is
336     output once, on a separate line.
337 nigel 77 </P>
338     <P>
339 nigel 87 <b>-l</b>, <b>--files-with-matches</b>
340     Instead of outputting lines from the files, just output the names of the files
341     containing lines that would have been output. Each file name is output
342 ph10 429 once, on a separate line. Searching normally stops as soon as a matching line
343 ph10 461 is found in a file. However, if the <b>-c</b> (count) option is also used,
344     matching continues in order to obtain the correct count, and those files that
345     have at least one match are listed along with their counts. Using this option
346 ph10 429 with <b>-c</b> is a way of suppressing the listing of files with no matches.
347 nigel 63 </P>
348     <P>
349 nigel 77 <b>--label</b>=<i>name</i>
350     This option supplies a name to be used for the standard input when file names
351 nigel 87 are being output. If not supplied, "(standard input)" is used. There is no
352 nigel 77 short form for this option.
353     </P>
354     <P>
355 ph10 535 <b>--line-buffered</b>
356     When this option is given, input is read and processed line by line, and the
357     output is flushed after each write. By default, input is read in large chunks,
358     unless <b>pcregrep</b> can determine that it is reading from a terminal (which
359     is currently possible only in Unix environments). Output to a terminal is
360     normally automatically flushed by the operating system. This option can be
361     useful when the input or output is attached to a pipe and you do not want
362     <b>pcregrep</b> to buffer up large amounts of data. However, its use will affect
363     performance, and the <b>-M</b> (multiline) option ceases to work.
364     </P>
365     <P>
366 ph10 286 <b>--line-offsets</b>
367     Instead of showing lines or parts of lines that match, show each match as a
368     line number, the offset from the start of the line, and a length. The line
369     number is terminated by a colon (as usual; see the <b>-n</b> option), and the
370     offset and length are separated by a comma. In this mode, no context is shown.
371     That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b> options are ignored. If there is
372     more than one match in a line, each of them is shown separately. This option is
373     mutually exclusive with <b>--file-offsets</b> and <b>--only-matching</b>.
374     </P>
375     <P>
376 nigel 87 <b>--locale</b>=<i>locale-name</i>
377     This option specifies a locale to be used for pattern matching. It overrides
378     the value in the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variables. If no
379     locale is specified, the PCRE library's default (usually the "C" locale) is
380     used. There is no short form for this option.
381     </P>
382     <P>
383 ph10 579 <b>--match-limit</b>=<i>number</i>
384 ph10 567 Processing some regular expression patterns can require a very large amount of
385     memory, leading in some cases to a program crash if not enough is available.
386 ph10 579 Other patterns may take a very long time to search for all possible matching
387 ph10 567 strings. The <b>pcre_exec()</b> function that is called by <b>pcregrep</b> to do
388 ph10 579 the matching has two parameters that can limit the resources that it uses.
389 ph10 567 <br>
390     <br>
391     The <b>--match-limit</b> option provides a means of limiting resource usage
392     when processing patterns that are not going to match, but which have a very
393     large number of possibilities in their search trees. The classic example is a
394     pattern that uses nested unlimited repeats. Internally, PCRE uses a function
395     called <b>match()</b> which it calls repeatedly (sometimes recursively). The
396 ph10 583 limit set by <b>--match-limit</b> is imposed on the number of times this
397 ph10 567 function is called during a match, which has the effect of limiting the amount
398     of backtracking that can take place.
399     <br>
400     <br>
401     The <b>--recursion-limit</b> option is similar to <b>--match-limit</b>, but
402     instead of limiting the total number of times that <b>match()</b> is called, it
403     limits the depth of recursive calls, which in turn limits the amount of memory
404     that can be used. The recursion depth is a smaller number than the total number
405     of calls, because not all calls to <b>match()</b> are recursive. This limit is
406     of use only if it is set smaller than <b>--match-limit</b>.
407     <br>
408     <br>
409 ph10 579 There are no short forms for these options. The default settings are specified
410 ph10 567 when the PCRE library is compiled, and are 10 million unless changed at build time.
411     </P>
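<P>
The effect of the two limits can be illustrated with a toy backtracking matcher
written in Python. It is not PCRE's <b>match()</b> function: the name
<b>run_toy_matcher</b>, the exception class, and the hard-wired pattern
<b>(a+)+b</b> (tried only at the start of the subject) are all assumptions made
for the sake of the example. The sketch counts every call and records the
deepest recursion, and gives up once the call count passes the limit.
</P>

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher is called more often than the limit allows."""


def run_toy_matcher(subject, match_limit):
    """Try (a+)+b at the start of subject; return (matched, calls, max_depth)."""
    calls = 0
    max_depth = 0

    def match(pos, depth):
        nonlocal calls, max_depth
        calls += 1
        max_depth = max(max_depth, depth)
        if calls > match_limit:
            raise MatchLimitExceeded(f"more than {match_limit} calls")
        end = pos
        # Try every possible length for the next a+ group.
        while end < len(subject) and subject[end] == "a":
            end += 1
            if end < len(subject) and subject[end] == "b":
                return True
            if match(end, depth + 1):
                return True
        return False

    return match(0, 1), calls, max_depth


# Ten "a" characters with no "b": the match fails only after 2**10 calls.
print(run_toy_matcher("a" * 10 + "c", match_limit=1_000_000))
```

<P>
For a failing subject of ten "a" characters the number of calls is 1024 while
the recursion depth reaches only 11, which shows why the recursion depth is a
smaller number than the total number of calls. Each extra "a" doubles the call
count, so a limit on calls bounds the running time of such patterns.
</P>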
412     <P>
413 nigel 87 <b>-M</b>, <b>--multiline</b>
414 nigel 77 Allow patterns to match more than one line. When this option is given, patterns
415     may usefully contain literal newline characters and internal occurrences of ^
416 ph10 589 and $ characters. The output for a successful match may consist of more than
417     one line, the last of which is the one in which the match ended. If the matched
418     string ends with a newline sequence the output ends at the end of that line.
419     <br>
420     <br>
421     When this option is set, the PCRE library is called in "multiline" mode.
422 nigel 77 There is a limit to the number of lines that can be matched, imposed by the way
423     that <b>pcregrep</b> buffers the input file as it scans it. However,
424     <b>pcregrep</b> ensures that at least 8K characters or the rest of the document
425     (whichever is the shorter) are available for forward matching, and similarly
426     the previous 8K characters (or all the previous characters, if fewer than 8K)
427 ph10 535 are guaranteed to be available for lookbehind assertions. This option does not
428     work when input is read line by line (see <b>--line-buffered</b>).
429 nigel 77 </P>
430     <P>
431 ph10 567 <b>-N</b> <i>newline-type</i>, <b>--newline</b>=<i>newline-type</i>
432 ph10 150 The PCRE library supports five different conventions for indicating
433 nigel 91 the ends of lines. They are the single-character sequences CR (carriage return)
434 ph10 150 and LF (linefeed), the two-character sequence CRLF, an "anycrlf" convention,
435     which recognizes any of the preceding three types, and an "any" convention, in
436 nigel 93 which any Unicode line ending sequence is assumed to end a line. The Unicode
437     sequences are the three just mentioned, plus VT (vertical tab, U+000B), FF
438 ph10 654 (form feed, U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and
439 ph10 150 PS (paragraph separator, U+2029).
440 nigel 93 <br>
441     <br>
442     When the PCRE library is built, a default line-ending sequence is specified.
443     This is normally the standard sequence for the operating system. Unless
444     otherwise specified by this option, <b>pcregrep</b> uses the library's default.
445 ph10 150 The possible values for this option are CR, LF, CRLF, ANYCRLF, or ANY. This
446     makes it possible to use <b>pcregrep</b> on files that have come from other
447     environments without having to modify their line endings. If the data that is
448     being scanned does not agree with the convention set by this option,
449     <b>pcregrep</b> may behave in strange ways.
450 nigel 91 </P>
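<P>
The five conventions can be compared with a short Python sketch that splits a
string into lines under each of them. The table <b>NEWLINE_PATTERNS</b> and the
function <b>split_lines</b> are hypothetical names, and the sketch makes no
attempt to treat a trailing line ending specially.
</P>

```python
import re

# One regular expression per value accepted by -N / --newline.
NEWLINE_PATTERNS = {
    "CR": "\r",
    "LF": "\n",
    "CRLF": "\r\n",
    "ANYCRLF": "\r\n|\r|\n",
    # CRLF is tried first so that it counts as a single line ending.
    "ANY": "\r\n|[\r\n\x0b\x0c\x85\u2028\u2029]",
}


def split_lines(text, newline_type="LF"):
    """Split text into lines according to the chosen newline convention."""
    return re.split(NEWLINE_PATTERNS[newline_type], text)


data = "a\r\nb\nc\rd"
print(split_lines(data, "ANYCRLF"))
print(split_lines(data, "LF"))
```

<P>
Splitting the same data with the wrong convention leaves stray carriage returns
inside lines or merges lines together, which is the kind of strange behaviour
that a mismatched setting can cause.
</P>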
451     <P>
452 nigel 87 <b>-n</b>, <b>--line-number</b>
453     Precede each output line by its line number in the file, followed by a colon
454 ph10 392 for matching lines or a hyphen for context lines. If the filename is also being
455     output, it precedes the line number. This option is forced if
456     <b>--line-offsets</b> is used.
457 nigel 63 </P>
458     <P>
459 ph10 691 <b>--no-jit</b>
460     If the PCRE library is built with support for just-in-time compiling (which
461     speeds up matching), <b>pcregrep</b> automatically makes use of this, unless it
462     was explicitly disabled at build time. This option can be used to disable the
463     use of JIT at run time. It is provided for testing and working round problems.
464     It should never be needed in normal use.
465     </P>
466     <P>
467 nigel 87 <b>-o</b>, <b>--only-matching</b>
468 ph10 567 Show only the part of the line that matched a pattern instead of the whole
469     line. In this mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and
470     <b>-C</b> options are ignored. If there is more than one match in a line, each
471     of them is shown separately. If <b>-o</b> is combined with <b>-v</b> (invert the
472     sense of the match to find non-matching lines), no output is generated, but the
473     return code is set appropriately. If the matched portion of the line is empty,
474     nothing is output unless the file name or line number is being printed, in
475     which case they are shown on an otherwise empty line. This option is mutually
476     exclusive with <b>--file-offsets</b> and <b>--line-offsets</b>.
477 nigel 77 </P>
478     <P>
479 ph10 567 <b>-o</b><i>number</i>, <b>--only-matching</b>=<i>number</i>
480 ph10 579 Show only the part of the line that matched the capturing parentheses of the
481 ph10 567 given number. Up to 32 capturing parentheses are supported. Because these
482     options can be given without an argument (see above), if an argument is
483     present, it must be given in the same shell item, for example, -o3 or
484 ph10 579 --only-matching=2. The comments given for the non-argument case above also
485     apply to this case. If the specified capturing parentheses do not exist in the
486     pattern, or were not set in the match, nothing is output unless the file name
487 ph10 567 or line number are being printed.
488     </P>
489     <P>
490 nigel 87 <b>-q</b>, <b>--quiet</b>
491     Work quietly, that is, display nothing except error messages. The exit
492     status indicates whether or not any matches were found.
493     </P>
494     <P>
495     <b>-r</b>, <b>--recursive</b>
496 nigel 77 If any given path is a directory, recursively scan the files it contains,
497 nigel 87 taking note of any <b>--include</b> and <b>--exclude</b> settings. By default, a
498     directory is read as a normal file; in some operating systems this gives an
499     immediate end-of-file. This option is a shorthand for setting the <b>-d</b>
500     option to "recurse".
501 nigel 63 </P>
502     <P>
503 ph10 567 <b>--recursion-limit</b>=<i>number</i>
504     See <b>--match-limit</b> above.
505     </P>
506     <P>
507 nigel 87 <b>-s</b>, <b>--no-messages</b>
508 nigel 77 Suppress error messages about non-existent or unreadable files. Such files are
509     quietly skipped. However, the return code is still 2, even if matches were
510     found in other files.
511 nigel 63 </P>
512     <P>
513 nigel 87 <b>-u</b>, <b>--utf-8</b>
514 nigel 63 Operate in UTF-8 mode. This option is available only if PCRE has been compiled
515 nigel 87 with UTF-8 support. Both patterns and subject lines must be valid strings of
516     UTF-8 characters.
517 nigel 63 </P>
518     <P>
519 nigel 87 <b>-V</b>, <b>--version</b>
520 nigel 77 Write the version numbers of <b>pcregrep</b> and the PCRE library that is being
521     used to the standard error stream.
522     </P>
523     <P>
524 nigel 87 <b>-v</b>, <b>--invert-match</b>
525     Invert the sense of the match, so that lines which do <i>not</i> match any of
526     the patterns are the ones that are found.
527 nigel 63 </P>
<P>
<b>-w</b>, <b>--word-regex</b>, <b>--word-regexp</b>
Force the patterns to match only whole words. This is equivalent to having \b
at the start and end of the pattern.
</P>
<P>
<b>-x</b>, <b>--line-regex</b>, <b>--line-regexp</b>
Force the patterns to be anchored (each must start matching at the beginning of
a line) and in addition, require them to match entire lines. This is
equivalent to having ^ and $ characters at the start and end of each
alternative branch in every pattern.
</P>
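<P>
The effect of <b>-w</b> and <b>-x</b> can be approximated by wrapping a pattern
before it is compiled. The following is a minimal sketch, not a description of
how <b>pcregrep</b> is implemented: it uses Python's <b>re</b> module as a
stand-in for PCRE, and the helper names <b>as_word_pattern</b> and
<b>as_line_pattern</b> are invented for illustration. The non-capturing group
makes the added assertions apply to every alternative branch.
</P>

```python
import re


def as_word_pattern(pattern: str) -> str:
    # Hypothetical helper: group first, so \b guards every alternative.
    return r"\b(?:" + pattern + r")\b"


def as_line_pattern(pattern: str) -> str:
    # Hypothetical helper: anchoring the group anchors each branch.
    return r"^(?:" + pattern + r")$"


lines = ["cat", "concatenate", "a cat sat", "dog"]

word_hits = [s for s in lines if re.search(as_word_pattern("cat|dog"), s)]
line_hits = [s for s in lines if re.search(as_line_pattern("cat|dog"), s)]

print(word_hits)  # ['cat', 'a cat sat', 'dog']
print(line_hits)  # ['cat', 'dog']
```

<P>
Without the group, a pattern such as cat|dog would become \bcat|dog\b, which
guards only one end of each alternative.
</P>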
<br><a name="SEC5" href="#TOC1">ENVIRONMENT VARIABLES</a><br>
<P>
The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that
order, for a locale. The first one that is set is used. This can be overridden
by the <b>--locale</b> option. If no locale is set, the PCRE library's default
(usually the "C" locale) is used.
</P>
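<P>
The lookup order described above can be modeled as a small function. This is
an illustrative sketch only: the name <b>choose_locale</b> and its parameters
are hypothetical, the environment is passed in as a plain mapping, and treating
a variable that is set to an empty string as "set" is an assumption.
</P>

```python
def choose_locale(env, locale_option=None):
    """Return the locale name to use, or None for the library default."""
    # An explicit --locale value overrides the environment.
    if locale_option is not None:
        return locale_option
    # LC_ALL is examined before LC_CTYPE; the first one that is set wins.
    for name in ("LC_ALL", "LC_CTYPE"):
        if name in env:
            return env[name]
    # Nothing set: fall back to the PCRE library's default locale.
    return None


print(choose_locale({"LC_ALL": "en_GB", "LC_CTYPE": "fr_FR"}))  # en_GB
print(choose_locale({"LC_CTYPE": "fr_FR"}))                     # fr_FR
```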
<br><a name="SEC6" href="#TOC1">NEWLINES</a><br>
<P>
The <b>-N</b> (<b>--newline</b>) option allows <b>pcregrep</b> to scan files with
different newline conventions from the default. However, the setting of this
option does not affect the way in which <b>pcregrep</b> writes information to
the standard error and output streams. It uses the string "\n" in C
<b>printf()</b> calls to indicate newlines, relying on the C I/O library to
convert this to an appropriate sequence if the output is sent to a file.
</P>
<br><a name="SEC7" href="#TOC1">OPTIONS COMPATIBILITY</a><br>
<P>
Many of the short and long forms of <b>pcregrep</b>'s options are the same
as in the GNU <b>grep</b> program (version 2.5.4). Any long option of the form
<b>--xxx-regexp</b> (GNU terminology) is also available as <b>--xxx-regex</b>
(PCRE terminology). However, the <b>--file-offsets</b>, <b>--include-dir</b>,
<b>--line-offsets</b>, <b>--locale</b>, <b>--match-limit</b>, <b>-M</b>,
<b>--multiline</b>, <b>-N</b>, <b>--newline</b>, <b>--recursion-limit</b>,
<b>-u</b>, and <b>--utf-8</b> options are specific to <b>pcregrep</b>, as is the
use of the <b>--only-matching</b> option with a capturing parentheses number.
</P>
<P>
Although most of the common options work the same way, a few are different in
<b>pcregrep</b>. For example, the <b>--include</b> option's argument is a glob
for GNU <b>grep</b>, but a regular expression for <b>pcregrep</b>. If both the
<b>-c</b> and <b>-l</b> options are given, GNU <b>grep</b> lists only file names,
without counts, but <b>pcregrep</b> gives the counts.
</P>
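<P>
The difference between a glob and a regular expression for <b>--include</b> is
easy to get wrong when moving between the two programs. The sketch below
contrasts the two notations using Python's <b>fnmatch</b> and <b>re</b> modules
as stand-ins; the file names are made up, and it assumes the argument is tested
against a bare file name with an unanchored search.
</P>

```python
import fnmatch
import re

names = ["main.c", "main.cpp", "notes.txt"]

# Glob notation, as used for the GNU grep argument.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.c")]

# Regular-expression notation: escape the dot and anchor the end.
regex_hits = [n for n in names if re.search(r"\.c$", n)]

# A loosely written expression: the dot matches any character and
# nothing ties the match to the end of the name.
loose_hits = [n for n in names if re.search(r".c", n)]

print(glob_hits)   # ['main.c']
print(regex_hits)  # ['main.c']
print(loose_hits)  # ['main.c', 'main.cpp']
```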
<br><a name="SEC8" href="#TOC1">OPTIONS WITH DATA</a><br>
<P>
There are four different ways in which an option with data can be specified.
If a short form option is used, the data may follow immediately, or (with one
exception) in the next command line item. For example:
<pre>
  -f/some/file
  -f /some/file
</pre>
The exception is the <b>-o</b> option, which may appear with or without data.
Because of this, if data is present, it must follow immediately in the same
item, for example -o3.
</P>
<P>
If a long form option is used, the data may appear in the same command line
item, separated by an equals character, or (with two exceptions) it may appear
in the next command line item. For example:
<pre>
  --file=/some/file
  --file /some/file
</pre>
Note, however, that if you want to supply a file name beginning with ~ as data
in a shell command, and have the shell expand ~ to a home directory, you must
separate the file name from the option, because the shell does not treat ~
specially unless it is at the start of an item.
</P>
<P>
The exceptions to the above are the <b>--colour</b> (or <b>--color</b>) and
<b>--only-matching</b> options, for which the data is optional. If one of these
options does have data, it must be given in the first form, using an equals
character. Otherwise <b>pcregrep</b> will assume that it has no data.
</P>
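<P>
The four forms and the optional-data exceptions described above can be
summarised as a small parser. This is a simplified model rather than
<b>pcregrep</b>'s own code: the function name <b>parse_data_options</b> is
hypothetical, and it assumes that every option it sees is one that takes data,
so options without data and combined short options are not handled.
</P>

```python
OPTIONAL_DATA_SHORT = {"o"}
OPTIONAL_DATA_LONG = {"colour", "color", "only-matching"}


def parse_data_options(argv):
    """Split argv into (option, data) pairs and remaining items."""
    options, others = [], []
    items = iter(argv)
    for item in items:
        if item.startswith("--"):
            # Long form: data may follow an equals character.
            name, has_equals, data = item[2:].partition("=")
            optional = name in OPTIONAL_DATA_LONG
            has_data = bool(has_equals)
        elif item.startswith("-") and len(item) > 1:
            # Short form: data may follow immediately in the same item.
            name, data = item[1], item[2:]
            optional = name in OPTIONAL_DATA_SHORT
            has_data = bool(data)
        else:
            others.append(item)
            continue
        if not has_data:
            if optional:
                # Optional data is never taken from the next item.
                data = None
            else:
                data = next(items, None)
                if data is None:
                    raise ValueError("missing data for " + item)
        options.append((name, data))
    return options, others


print(parse_data_options(["-f/some/file", "--file", "/some/file"]))
print(parse_data_options(["-o3", "--colour", "always"]))
```

<P>
In the second call, "always" is not attached to <b>--colour</b>, because
optional data is recognised only when it is joined to the option with an equals
character.
</P>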
<br><a name="SEC9" href="#TOC1">MATCHING ERRORS</a><br>
<P>
It is possible to supply a regular expression that takes a very long time to
fail to match certain lines. Such patterns normally involve nested indefinite
repeats, for example: (a+)*\d when matched against a line of a's with no final
digit. The PCRE matching function has a resource limit that causes it to abort
in these circumstances. If this happens, <b>pcregrep</b> outputs an error
message and the line that caused the problem to the standard error stream. If
there are more than 20 such errors, <b>pcregrep</b> gives up.
</P>
<P>
The <b>--match-limit</b> option of <b>pcregrep</b> can be used to set the overall
resource limit; there is a second option called <b>--recursion-limit</b> that
sets a limit on the amount of memory (usually stack) that is used (see the
discussion of these options above).
</P>
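<P>
To see why a pattern such as (a+)*\d can be so expensive, and what a resource
limit does about it, consider the toy matcher below. It is not PCRE's
algorithm: it is a hand-written backtracking matcher for this one pattern,
tried at the start of the subject only, and its step counter, the limit values
and the names <b>naive_match</b> and <b>MatchLimitExceeded</b> are all
assumptions made for illustration.
</P>

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher uses more steps than allowed."""


def naive_match(subject, limit):
    # Tries (a+)*\d at position 0; returns (matched, steps_used).
    steps = 0

    def group_star(pos):
        nonlocal steps
        steps += 1
        if steps > limit:
            raise MatchLimitExceeded(steps)
        # Greedy a+: take the longest run, then give back one 'a' at a time.
        end = pos
        while end < len(subject) and subject[end] == "a":
            end += 1
        for stop in range(end, pos, -1):
            if group_star(stop):  # try another iteration of the group
                return True
        # No further iterations: the next character must be a digit.
        return pos < len(subject) and subject[pos] in "0123456789"

    return group_star(0), steps


print(naive_match("aaa1", 100))          # (True, 2)
print(naive_match("a" * 10, 1_000_000))  # (False, 1024)

try:
    naive_match("a" * 30, 10_000)
except MatchLimitExceeded:
    print("limit exceeded")
```

<P>
In this toy model each extra "a" in a failing subject doubles the number of
steps, which is why a limit that aborts the match is needed.
</P>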
<br><a name="SEC10" href="#TOC1">DIAGNOSTICS</a><br>
<P>
Exit status is 0 if any matches were found, 1 if no matches were found, and 2
for syntax errors, overlong lines, non-existent or inaccessible files (even if
matches were found in other files) or too many matching errors. Using the
<b>-s</b> option to suppress error messages about inaccessible files does not
affect the return code.
</P>
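<P>
The precedence between these three values can be written out as a short
function. This is only a restatement of the rule above in code; the name
<b>exit_status</b> and its two boolean parameters are hypothetical.
</P>

```python
def exit_status(any_match, any_error):
    """Model of the exit status rule: errors take precedence over matches."""
    # An error gives 2 even if matches were found in other files,
    # and regardless of whether -s hid the error message.
    if any_error:
        return 2
    return 0 if any_match else 1


print(exit_status(any_match=True, any_error=False))   # 0
print(exit_status(any_match=False, any_error=False))  # 1
print(exit_status(any_match=True, any_error=True))    # 2
```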
<br><a name="SEC11" href="#TOC1">SEE ALSO</a><br>
<P>
<b>pcrepattern</b>(3), <b>pcretest</b>(1).
</P>
<br><a name="SEC12" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
University Computing Service
<br>
Cambridge CB2 3QH, England.
<br>
</P>
<br><a name="SEC13" href="#TOC1">REVISION</a><br>
<P>
Last updated: 06 September 2011
<br>
Copyright &copy; 1997-2011 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>
