PCREGREP(1)                                                        PCREGREP(1)


NAME
       pcregrep - a grep with Perl-compatible regular expressions.


SYNOPSIS
       pcregrep [options] [long options] [pattern] [file1 file2 ...]


DESCRIPTION

       pcregrep searches files for character patterns, in the same way as
       other grep commands do, but it uses the PCRE regular expression library
       to support patterns that are compatible with the regular expressions of
       Perl 5. See pcrepattern for a full description of syntax and semantics
       of the regular expressions that PCRE supports.

       A pattern must be specified on the command line unless the -f option is
       used (see below).

       If no files are specified, pcregrep reads the standard input. The
       standard input can also be referenced by a name consisting of a single
       hyphen. For example:

         pcregrep some-pattern /file1 - /file3

       By default, each line that matches the pattern is copied to the
       standard output, and if there is more than one file, the file name is
       printed before each line of output. However, there are options that can
       change how pcregrep behaves. In particular, the -M option makes it
       possible to search for patterns that span line boundaries.

       Patterns are limited to 8K or BUFSIZ characters, whichever is the
       greater. BUFSIZ is defined in <stdio.h>.
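
       The default behaviour described above can be tried with a short shell
       session. This is only a sketch: it assumes that pcregrep is installed
       and on the PATH, and the temporary directory, file names and file
       contents are invented for the illustration.

```shell
# Sketch only: the directory, file names and contents are invented.
# Assumes pcregrep is on PATH; the script does nothing if it is not.
command -v pcregrep >/dev/null 2>&1 || exit 0

dir=$(mktemp -d)
printf 'red apple\ngreen pear\n' > "$dir/fruit.txt"
printf 'apple pie\nplum tart\n'  > "$dir/desserts.txt"

# One file: each matching line is copied to the standard output as it is.
one_file=$(pcregrep 'apple' "$dir/fruit.txt")
printf '%s\n' "$one_file"

# Two files plus the standard input (named "-"): each matching line is
# preceded by the name of the input it came from.
several=$(printf 'apple juice\n' |
          pcregrep 'apple' "$dir/fruit.txt" - "$dir/desserts.txt")
printf '%s\n' "$several"

rm -rf "$dir"
```

       In the second command the line read from the pipe is labelled
       "(standard input)", the default name described under --label.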


OPTIONS

       --        This terminates the list of options. It is useful if the next
                 item on the command line starts with a hyphen, but is not an
                 option.

       -A number Print number lines of context after each matching line. If
                 file names and/or line numbers are being printed, a hyphen
                 separator is used instead of a colon for the context lines. A
                 line containing "--" is printed between each group of lines,
                 unless they are in fact contiguous in the input file. The
                 value of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of following text
                 available for context printing.

       -B number Print number lines of context before each matching line. If
                 file names and/or line numbers are being printed, a hyphen
                 separator is used instead of a colon for the context lines. A
                 line containing "--" is printed between each group of lines,
                 unless they are in fact contiguous in the input file. The
                 value of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of preceding text
                 available for context printing.

       -C number Print number lines of context both before and after each
                 matching line. This is equivalent to setting both -A and -B
                 to the same value.

       -c        Do not print individual lines; instead just print a count of
                 the number of lines that would otherwise have been printed.
                 If several files are given, a count is printed for each of
                 them.

       --exclude=pattern
                 When pcregrep is searching the files in a directory as a
                 consequence of the -r (recursive search) option, any files
                 whose names match the pattern are excluded. The pattern is a
                 PCRE regular expression. If a file name matches both
                 --include and --exclude, it is excluded. There is no short
                 form for this option.

       -f filename
                 Read a number of patterns from the file, one per line, and
                 match all of them against each line of input. A line is
                 output if any of the patterns matches it. When -f is used, no
                 pattern is taken from the command line; all arguments are
                 treated as file names. There is a maximum of 100 patterns.
                 Trailing white space is removed, and blank lines are ignored.
                 An empty file contains no patterns and therefore matches
                 nothing.

       -h        Suppress printing of filenames when searching multiple files.

       -i        Ignore upper/lower case distinctions during comparisons.

       --include=pattern
                 When pcregrep is searching the files in a directory as a
                 consequence of the -r (recursive search) option, only files
                 whose names match the pattern are included. The pattern is a
                 PCRE regular expression. If a file name matches both
                 --include and --exclude, it is excluded. There is no short
                 form for this option.

       -L        Instead of printing lines from the files, just print the
                 names of the files that do not contain any lines that would
                 have been printed. Each file name is printed once, on a
                 separate line.

       -l        Instead of printing lines from the files, just print the
                 names of the files containing lines that would have been
                 printed. Each file name is printed once, on a separate line.

       --label=name
                 This option supplies a name to be used for the standard input
                 when file names are being printed. If not supplied,
                 "(standard input)" is used. There is no short form for this
                 option.

       -M        Allow patterns to match more than one line. When this option
                 is given, patterns may usefully contain literal newline
                 characters and internal occurrences of ^ and $ characters.
                 The output for any one match may consist of more than one
                 line. When this option is set, the PCRE library is called in
                 "multiline" mode. There is a limit to the number of lines
                 that can be matched, imposed by the way that pcregrep buffers
                 the input file as it scans it. However, pcregrep ensures that
                 at least 8K characters or the rest of the document (whichever
                 is the shorter) are available for forward matching, and
                 similarly the previous 8K characters (or all the previous
                 characters, if fewer than 8K) are guaranteed to be available
                 for lookbehind assertions.

       -n        Precede each line by its line number in the file.

       -q        Work quietly, that is, display nothing except error messages.
                 The exit status indicates whether or not any matches were
                 found.

       -r        If any given path is a directory, recursively scan the files
                 it contains, taking note of any --include and --exclude
                 settings. Without -r a directory is scanned as a normal file.

       -s        Suppress error messages about non-existent or unreadable
                 files. Such files are quietly skipped. However, the return
                 code is still 2, even if matches were found in other files.

       -u        Operate in UTF-8 mode. This option is available only if PCRE
                 has been compiled with UTF-8 support. Both the pattern and
                 each subject line must be valid strings of UTF-8 characters.

       -V        Write the version numbers of pcregrep and the PCRE library
                 that is being used to the standard error stream.

       -v        Invert the sense of the match, so that lines that do not
                 match the pattern are the ones that are found.

       -w        Force the pattern to match only whole words. This is
                 equivalent to having \b at the start and end of the pattern.

       -x        Force the pattern to be anchored (it must start matching at
                 the beginning of the line) and in addition, require it to
                 match the entire line. This is equivalent to having ^ and $
                 characters at the start and end of each alternative branch in
                 the regular expression.
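
       Several of the options above can be compared on one small input. The
       following is a sketch rather than a reference: it assumes that pcregrep
       is installed and on the PATH, and the file names and contents are
       invented. The comments give the output expected from the descriptions
       above.

```shell
# Sketch only: file names and contents are invented for illustration.
# Assumes pcregrep is on PATH; the script does nothing if it is not.
command -v pcregrep >/dev/null 2>&1 || exit 0

dir=$(mktemp -d)
printf 'cat\ncatalog\nthe cat sat\ndog\n' > "$dir/words.txt"

# -c: print a count of the matching lines instead of the lines.
count=$(pcregrep -c 'cat' "$dir/words.txt")            # 3

# -w: whole words only, as if the pattern were \bcat\b.
whole=$(pcregrep -w 'cat' "$dir/words.txt")            # cat, the cat sat

# -x: the pattern must match the entire line.
exact=$(pcregrep -x 'cat' "$dir/words.txt")            # cat

# -v with -n: number the lines that do not match.
inverted=$(pcregrep -v -n 'cat' "$dir/words.txt")      # 4:dog

# -n with -A 1: the context line uses a hyphen instead of a colon.
context=$(pcregrep -n -A 1 '^cat$' "$dir/words.txt")   # 1:cat, 2-catalog

# -M: the \n in the pattern lets one match span two input lines.
multi=$(printf 'begin\nend\nother\n' | pcregrep -M 'begin\nend')

# -f: one pattern per line; trailing white space is removed and blank
# lines are ignored, so this file supplies the patterns "dog" and "^the".
printf 'dog  \n\n^the\n' > "$dir/patterns.txt"
from_file=$(pcregrep -f "$dir/patterns.txt" "$dir/words.txt")

printf '%s\n' "$count" "$whole" "$exact" "$inverted" "$context" \
              "$multi" "$from_file"
rm -rf "$dir"
```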


LONG OPTIONS

       Long forms of all the options are available, as in GNU grep. They are
       shown in the following table:

         -A   --after-context
         -B   --before-context
         -C   --context
         -c   --count
              --exclude (no short form)
         -f   --file
         -h   --no-filename
              --help (no short form)
         -i   --ignore-case
              --include (no short form)
         -L   --files-without-match
         -l   --files-with-matches
              --label (no short form)
         -n   --line-number
         -r   --recursive
         -q   --quiet
         -s   --no-messages
         -u   --utf-8
         -V   --version
         -v   --invert-match
         -x   --line-regex
         -x   --line-regexp
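
       As a quick check of the table, a short form and its long form should
       produce the same output. The sketch below assumes that pcregrep is
       installed and on the PATH; the input file is invented for the
       illustration.

```shell
# Sketch only: the input file is invented for illustration.
# Assumes pcregrep is on PATH; the script does nothing if it is not.
command -v pcregrep >/dev/null 2>&1 || exit 0

dir=$(mktemp -d)
printf 'Alpha\nalphabet\nbeta\n' > "$dir/input.txt"

# -i -n and their long forms select and number the same lines.
short=$(pcregrep -i -n 'ALPHA' "$dir/input.txt")
long=$(pcregrep --ignore-case --line-number 'ALPHA' "$dir/input.txt")

# The table lists two spellings of the long form of -x.
x_short=$(pcregrep -x 'beta' "$dir/input.txt")
x_regex=$(pcregrep --line-regex 'beta' "$dir/input.txt")
x_regexp=$(pcregrep --line-regexp 'beta' "$dir/input.txt")

printf '%s\n' "$short" "$long" "$x_short" "$x_regex" "$x_regexp"
rm -rf "$dir"
```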


OPTIONS WITH DATA

       There are four different ways in which an option with data can be
       specified. If a short form option is used, the data may follow
       immediately, or in the next command line item. For example:

         -f/some/file
         -f /some/file

       If a long form option is used, the data may appear in the same command
       line item, separated by an = character, or it may appear in the next
       command line item. For example:

         --file=/some/file
         --file /some/file
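
       The four forms can be compared directly. The sketch below assumes that
       pcregrep is installed and on the PATH; the pattern file and input file
       are invented for the illustration. All four commands are expected to
       print the same single line.

```shell
# Sketch only: the pattern file and input file are invented.
# Assumes pcregrep is on PATH; the script does nothing if it is not.
command -v pcregrep >/dev/null 2>&1 || exit 0

dir=$(mktemp -d)
printf 'beta\n'        > "$dir/patterns.txt"
printf 'alpha\nbeta\n' > "$dir/input.txt"

# Short form: data attached to the option, or in the next item.
r1=$(pcregrep -f"$dir/patterns.txt" "$dir/input.txt")
r2=$(pcregrep -f "$dir/patterns.txt" "$dir/input.txt")

# Long form: data after an = character, or in the next item.
r3=$(pcregrep --file="$dir/patterns.txt" "$dir/input.txt")
r4=$(pcregrep --file "$dir/patterns.txt" "$dir/input.txt")

printf '%s\n' "$r1" "$r2" "$r3" "$r4"
rm -rf "$dir"
```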


DIAGNOSTICS

       Exit status is 0 if any matches were found, 1 if no matches were found,
       and 2 for syntax errors and non-existent or inaccessible files (even if
       matches were found in other files). Using the -s option to suppress
       error messages about inaccessible files does not affect the return code.
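
       A script can branch on these three values. The sketch below assumes
       that pcregrep is installed and on the PATH; the input file and the name
       of the missing file are invented for the illustration.

```shell
# Sketch only: the input file and the missing file name are invented.
# Assumes pcregrep is on PATH; the script does nothing if it is not.
command -v pcregrep >/dev/null 2>&1 || exit 0

dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/input.txt"

# A match was found: exit status 0.
found=0
pcregrep -q 'alpha' "$dir/input.txt" || found=$?

# No match was found: exit status 1.
not_found=0
pcregrep -q 'zeta' "$dir/input.txt" || not_found=$?

# One file does not exist: exit status 2, even though the other file
# matches and -s suppresses the error message.
missing=0
pcregrep -s 'alpha' "$dir/input.txt" "$dir/no-such-file" >/dev/null ||
    missing=$?

echo "$found $not_found $missing"    # 0 1 2
rm -rf "$dir"
```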


AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QG, England.

       Last updated: 16 May 2005
       Copyright (c) 1997-2005 University of Cambridge.