PCREGREP(1)                                                        PCREGREP(1)


NAME
       pcregrep - a grep with Perl-compatible regular expressions.

SYNOPSIS
       pcregrep [options] [long options] [pattern] [file1 file2 ...]


DESCRIPTION

       pcregrep searches files for character patterns, in the same way as
       other grep commands do, but it uses the PCRE regular expression library
       to support patterns that are compatible with the regular expressions of
       Perl 5. See pcrepattern for a full description of syntax and semantics
       of the regular expressions that PCRE supports.

       A pattern must be specified on the command line unless the -f option is
       used (see below).

       If no files are specified, pcregrep reads the standard input. The
       standard input can also be referenced by a name consisting of a single
       hyphen. For example:

         pcregrep some-pattern /file1 - /file3

       By default, each line that matches the pattern is copied to the
       standard output, and if there is more than one file, the file name is
       printed before each line of output. However, there are options that can
       change how pcregrep behaves. In particular, the -M option makes it
       possible to search for patterns that span line boundaries.

       Patterns are limited to 8K or BUFSIZ characters, whichever is the
       greater. BUFSIZ is defined in <stdio.h>.


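       The default behaviour described above can be sketched in Python. This
       is only an illustration, not pcregrep's implementation: Python's re
       module stands in for PCRE, and the function name grep_lines and its
       opener and stdin parameters are hypothetical names introduced for this
       sketch.

```python
import re
import sys
from typing import Callable, Iterator, Sequence, TextIO


def grep_lines(pattern: str,
               names: Sequence[str],
               opener: Callable[[str], TextIO] = open,
               stdin: TextIO = sys.stdin) -> Iterator[str]:
    """Yield output lines in the way the DESCRIPTION section describes."""
    regex = re.compile(pattern)       # assumption: re stands in for PCRE
    if not names:                     # no files given: read standard input
        names = ["-"]
    show_name = len(names) > 1        # file name printed only for several files
    for name in names:
        if name == "-":               # a single hyphen names standard input
            stream, label, owned = stdin, "(standard input)", False
        else:
            stream, label, owned = opener(name), name, True
        try:
            for line in stream:
                line = line.rstrip("\n")
                if regex.search(line):
                    yield f"{label}:{line}" if show_name else line
        finally:
            if owned:                 # never close the caller's stdin
                stream.close()
```

       Calling grep_lines("some-pattern", ["/file1", "-", "/file3"]) mirrors
       the command line shown in the example above.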
OPTIONS

       --        This terminates the list of options. It is useful if the next
                 item on the command line starts with a hyphen, but is not an
                 option.

       -A number Print number lines of context after each matching line. If
                 file names and/or line numbers are being printed, a hyphen
                 separator is used instead of a colon for the context lines. A
                 line containing "--" is printed between each group of lines,
                 unless they are in fact contiguous in the input file. The
                 value of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of following text
                 available for context printing.

       -B number Print number lines of context before each matching line. If
                 file names and/or line numbers are being printed, a hyphen
                 separator is used instead of a colon for the context lines. A
                 line containing "--" is printed between each group of lines,
                 unless they are in fact contiguous in the input file. The
                 value of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of preceding text
                 available for context printing.

       -C number Print number lines of context both before and after each
                 matching line. This is equivalent to setting both -A and -B
                 to the same value.

       -c        Do not print individual lines; instead just print a count of
                 the number of lines that would otherwise have been printed.
                 If several files are given, a count is printed for each of
                 them.

       --exclude=pattern
                 When pcregrep is searching the files in a directory as a
                 consequence of the -r (recursive search) option, any files
                 whose names match the pattern are excluded. The pattern is a
                 PCRE regular expression. If a file name matches both
                 --include and --exclude, it is excluded. There is no short
                 form for this option.

       -f filename
                 Read a number of patterns from the file, one per line, and
                 match all of them against each line of input. A line is
                 output if any of the patterns match it. When -f is used, no
                 pattern is taken from the command line; all arguments are
                 treated as file names. There is a maximum of 100 patterns.
                 Trailing white space is removed, and blank lines are ignored.
                 An empty file contains no patterns and therefore matches
                 nothing.

       -h        Suppress printing of filenames when searching multiple files.

       -i        Ignore upper/lower case distinctions during comparisons.

       --include=pattern
                 When pcregrep is searching the files in a directory as a
                 consequence of the -r (recursive search) option, only files
                 whose names match the pattern are included. The pattern is a
                 PCRE regular expression. If a file name matches both
                 --include and --exclude, it is excluded. There is no short
                 form for this option.

       -L        Instead of printing lines from the files, just print the
                 names of the files that do not contain any lines that would
                 have been printed. Each file name is printed once, on a
                 separate line.

       -l        Instead of printing lines from the files, just print the
                 names of the files containing lines that would have been
                 printed. Each file name is printed once, on a separate line.

       --label=name
                 This option supplies a name to be used for the standard input
                 when file names are being printed. If not supplied,
                 "(standard input)" is used. There is no short form for this
                 option.

       -M        Allow patterns to match more than one line. When this option
                 is given, patterns may usefully contain literal newline
                 characters and internal occurrences of ^ and $ characters.
                 The output for any one match may consist of more than one
                 line. When this option is set, the PCRE library is called in
                 "multiline" mode. There is a limit to the number of lines
                 that can be matched, imposed by the way that pcregrep buffers
                 the input file as it scans it. However, pcregrep ensures that
                 at least 8K characters or the rest of the document (whichever
                 is the shorter) are available for forward matching, and
                 similarly the previous 8K characters (or all the previous
                 characters, if fewer than 8K) are guaranteed to be available
                 for lookbehind assertions.

       -n        Precede each line by its line number in the file.

       -q        Work quietly, that is, display nothing except error messages.
                 The exit status indicates whether or not any matches were
                 found.

       -r        If any given path is a directory, recursively scan the files
                 it contains, taking note of any --include and --exclude
                 settings. Without -r a directory is scanned as a normal file.

       -s        Suppress error messages about non-existent or unreadable
                 files. Such files are quietly skipped. However, the return
                 code is still 2, even if matches were found in other files.

       -u        Operate in UTF-8 mode. This option is available only if PCRE
                 has been compiled with UTF-8 support. Both the pattern and
                 each subject line must be valid strings of UTF-8 characters.

       -V        Write the version numbers of pcregrep and the PCRE library
                 that is being used to the standard error stream.

       -v        Invert the sense of the match, so that lines which do not
                 match the pattern are the ones that are found.

       -w        Force the pattern to match only whole words. This is
                 equivalent to having \b at the start and end of the pattern.

       -x        Force the pattern to be anchored (it must start matching at
                 the beginning of the line) and in addition, require it to
                 match the entire line. This is equivalent to having ^ and $
                 characters at the start and end of each alternative branch in
                 the regular expression.


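       The equivalences stated for -w and -x can be illustrated with a short
       Python sketch. It uses Python's re module as a stand-in for PCRE, and
       the helper names whole_word and whole_line are hypothetical. Wrapping
       the pattern in a non-capturing group is an assumption made here so that
       the added assertions apply to every alternative branch; it is not a
       statement about how pcregrep builds the pattern internally.

```python
import re


def whole_word(pattern: str) -> str:
    """Add \\b at the start and end of the pattern, as described for -w."""
    # Assumption: group the pattern so \b applies to all alternatives.
    return rf"\b(?:{pattern})\b"


def whole_line(pattern: str) -> str:
    """Anchor every alternative branch with ^ and $, as described for -x."""
    # One ^(?:...)$ wrapper has the same effect as ^ and $ on each branch.
    return rf"^(?:{pattern})$"


# -w: "cat" inside "concatenate" is not a whole word.
print(bool(re.search(whole_word("cat"), "concatenate")))
print(bool(re.search(whole_word("cat"), "a cat sat")))

# -x: the naive "^ab|cd$" anchors each branch at one end only, so it still
# matches "abX"; the grouped form requires the whole line to match.
print(bool(re.search("^ab|cd$", "abX")))
print(bool(re.search(whole_line("ab|cd"), "abX")))
```

       The last two calls show why the text speaks of each alternative branch:
       placing ^ and $ only at the two ends of the pattern string leaves the
       individual branches partly unanchored.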
LONG OPTIONS

       Long forms of all the options are available, as in GNU grep. They are
       shown in the following table:

         -A   --after-context
         -B   --before-context
         -C   --context
         -c   --count
              --exclude (no short form)
         -f   --file
         -h   --no-filename
              --help (no short form)
         -i   --ignore-case
              --include (no short form)
         -L   --files-without-match
         -l   --files-with-matches
              --label (no short form)
         -n   --line-number
         -r   --recursive
         -q   --quiet
         -s   --no-messages
         -u   --utf-8
         -V   --version
         -v   --invert-match
         -x   --line-regex
         -x   --line-regexp


OPTIONS WITH DATA

       There are four different ways in which an option with data can be
       specified. If a short form option is used, the data may follow
       immediately, or in the next command line item. For example:

         -f/some/file
         -f /some/file

       If a long form option is used, the data may appear in the same command
       line item, separated by an = character, or it may appear in the next
       command line item. For example:

         --file=/some/file
         --file /some/file


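       As a rough illustration of the four forms, the following Python sketch
       reduces each of them to the same (option, data) pair. It is not
       pcregrep's own argument parser: the function name split_option_data and
       the two lookup tables are hypothetical, and only -f/--file is listed to
       keep the example short.

```python
from typing import Dict, List, Tuple

# Hypothetical tables for this sketch; only -f / --file is shown.
SHORT_WITH_DATA = {"-f": "file"}
LONG_WITH_DATA = {"--file": "file"}


def split_option_data(argv: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Return (options with their data, remaining command line items)."""
    options: Dict[str, str] = {}
    rest: List[str] = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--":                      # "--" terminates the option list
            rest.extend(argv[i + 1:])
            break
        name, eq, value = arg.partition("=")
        if name in LONG_WITH_DATA:
            key = LONG_WITH_DATA[name]
            if not eq:                       # --file /some/file
                i += 1
                if i >= len(argv):
                    raise ValueError(f"missing data for {name}")
                value = argv[i]
            options[key] = value             # --file=/some/file
        elif arg[:2] in SHORT_WITH_DATA:
            key = SHORT_WITH_DATA[arg[:2]]
            value = arg[2:]                  # -f/some/file
            if not value:                    # -f /some/file
                i += 1
                if i >= len(argv):
                    raise ValueError(f"missing data for {arg}")
                value = argv[i]
            options[key] = value
        else:
            rest.append(arg)
        i += 1
    return options, rest
```

       All four spellings shown above yield {"file": "/some/file"} under this
       sketch, which is the point of the section: the forms differ only in
       where the data sits on the command line.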
DIAGNOSTICS

       Exit status is 0 if any matches were found, 1 if no matches were found,
       and 2 for syntax errors and non-existent or inaccessible files (even if
       matches were found in other files). Using the -s option to suppress
       error messages about inaccessible files does not affect the return
       code.


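       A calling script usually needs to tell "no match" apart from "error".
       The Python sketch below shows one way to do that. The names
       describe_status and run_pcregrep are hypothetical, and run_pcregrep
       assumes that a pcregrep executable is available on the PATH.

```python
import subprocess
from typing import Sequence


def describe_status(code: int) -> str:
    """Map a pcregrep exit status to the meaning given in DIAGNOSTICS."""
    if code == 0:
        return "match"        # at least one match was found
    if code == 1:
        return "no match"     # no matches were found
    if code == 2:
        return "error"        # syntax error, or a missing/inaccessible file
    return "unexpected"       # not documented in this manual page


def run_pcregrep(args: Sequence[str]) -> str:
    """Run pcregrep (assumed to be on PATH) and classify its exit status."""
    result = subprocess.run(
        ["pcregrep", *args],
        stdout=subprocess.DEVNULL,   # only the status matters here
        stderr=subprocess.DEVNULL,
    )
    return describe_status(result.returncode)
```

       Because status 2 is returned even when other files did match, a caller
       that sees "error" cannot conclude that nothing matched.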
AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QG, England.

Last updated: 16 May 2005
Copyright (c) 1997-2005 University of Cambridge.