some of the mental habits of those departed days. At most
terrestrial men fancied there might be other men upon Mars,
perhaps inferior to themselves and ready to welcome a mis-
sionary enterprise. Yet across the gulf of space, minds that
are to our minds as ours are to those of the beasts that perish,
intellects vast and cool and unsympathetic, regarded this
earth with envious eyes, and slowly and surely drew their
plans against us. And early in the twentieth century came
the great disillusionment.
grep men demotext
returns:
some of the mental habits of those departed days. At most
terrestrial men fancied there might be other men upon Mars,
the great disillusionment.
The string
\        metacharacter quote
^        start of line
.        any character
$        end of line
|        infix operator (think: "or")
[ ]      character range: matches any one character within the range
\< \>    word boundaries (start and end)
( )      grouping
\w       alphanumeric characters
\W       non-alphanumeric characters
\b       word boundary
\B       non-word boundary
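As a rough illustration of the anchors and the "or" operator, the sketch below counts matches in a few made-up lines. The sample text and variable names are hypothetical, chosen only for this example.

```shell
# Hypothetical sample input, one entry per line.
lines='the start
at the end.
Mars
earth and Mars'

starts=$(printf '%s\n' "$lines" | grep -c '^the')         # lines beginning with "the"
ends=$(printf '%s\n' "$lines" | grep -c '\.$')            # lines ending in a literal period
either=$(printf '%s\n' "$lines" | grep -cE 'Mars|earth')  # lines containing either word
whole=$(printf '%s\n' "$lines" | grep -c '^Mars$')        # lines that are exactly "Mars"

echo "$starts $ends $either $whole"   # prints: 1 1 2 1
```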
To search for a metacharacter as a normal character just prefix it with
a backslash.
grep . demotext
returns 9 lines (since any character will match), but
grep "\." demotext
returns 4 lines. The quotes are needed since the backslash is a special
character for the shell (where it serves the same purpose).
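The counts above can be checked with a short script. This is only a sketch: it recreates the nine-line excerpt in a temporary file instead of using the real demotext, and it adds the standard -F option (treat the pattern as a fixed string), which the text has not introduced, as another way to match a literal dot.

```shell
# Recreate the excerpt in a scratch file (the path is arbitrary).
demo=$(mktemp)
cat > "$demo" <<'EOF'
some of the mental habits of those departed days. At most
terrestrial men fancied there might be other men upon Mars,
perhaps inferior to themselves and ready to welcome a mis-
sionary enterprise. Yet across the gulf of space, minds that
are to our minds as ours are to those of the beasts that perish,
intellects vast and cool and unsympathetic, regarded this
earth with envious eyes, and slowly and surely drew their
plans against us. And early in the twentieth century came
the great disillusionment.
EOF

any_char=$(grep -c . "$demo")        # unescaped dot: every non-empty line matches
literal_dot=$(grep -c '\.' "$demo")  # escaped dot: only lines containing a period
fixed_dot=$(grep -cF . "$demo")      # -F: the pattern is a plain string, no escaping needed

echo "$any_char $literal_dot $fixed_dot"   # prints: 9 4 4
rm -f "$demo"
```

Single quotes are used here instead of double quotes; both keep the shell from consuming the backslash before grep sees it.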
With the character range metacharacters, the match succeeds at that position
if the text contains any one of the characters listed between the brackets.
grep "in[tf]" demotext
returns:
perhaps inferior to themselves and ready to welcome a mis-
intellects vast and cool and unsympathetic, regarded this
This is the same as writing
egrep "int|inf" demotext
(more on the different flavors of grep later).
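The equivalence can be sketched on a few sample words (hypothetical input, not the demotext). The example uses grep -E, the standard spelling of egrep, and also shows a negated set with a leading ^ inside the brackets, which is standard but goes slightly beyond what the text covers.

```shell
# Hypothetical word list, one word per line.
words='inferior
intellects
minds
against'

class_hits=$(printf '%s\n' "$words" | grep 'in[tf]')     # "in" followed by t or f
alt_hits=$(printf '%s\n' "$words" | grep -E 'int|inf')   # same set, spelled as alternatives
other=$(printf '%s\n' "$words" | grep 'in[^tf]')         # "in" followed by anything except t or f

[ "$class_hits" = "$alt_hits" ] && echo "same lines"
echo "$other"   # prints minds, then against
```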
Grouping is handled by parentheses. Groups are referenced by \number. Thus
egrep "(.)(.)\2\1" demotext
returns all the lines that contain two characters immediately followed by the
same two characters in reverse order (such as "erre" in "terrestrial"):
terrestrial men fancied there might be other men upon Mars,
intellects vast and cool and unsympathetic, regarded this
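A small sketch of the same back-reference pattern on a hypothetical word list follows. It assumes GNU grep for the -E form, since back-references in extended patterns are an extension there; the basic-pattern spelling with escaped parentheses is the portable one.

```shell
# Hypothetical word list, one word per line.
words='terrestrial
intellects
noon
banana
letter'

# Extended syntax (assumes GNU grep for \1 and \2 under -E).
abba=$(printf '%s\n' "$words" | grep -E '(.)(.)\2\1')

# Basic syntax: parentheses are escaped to act as grouping.
abba_basic=$(printf '%s\n' "$words" | grep '\(.\)\(.\)\2\1')

echo "$abba"   # terrestrial, intellects, noon, letter (banana has no such sequence)
```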
?        match once at most
*        match zero or more times
+        match one or more times
{n}      match exactly n times
{n,m}    match between n and m times
{,m}     match at most m times
{n,}     match at least n times
egrep "\<[0-9]+\>"
would match any number as a word on its own.
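To see how repetition counts combine with word boundaries, here is a sketch on made-up input. It assumes GNU grep, where \< and \> are available; the sample lines and variable names are hypothetical.

```shell
# Hypothetical sample input, one entry per line.
sample='room 7
abc123
call 555 now
no digits here
year 1898'

whole=$(printf '%s\n' "$sample" | grep -cE '\<[0-9]+\>')      # any standalone number
four=$(printf '%s\n' "$sample" | grep -cE '\<[0-9]{4}\>')     # exactly four digits
short=$(printf '%s\n' "$sample" | grep -cE '\<[0-9]{1,3}\>')  # one to three digits

echo "$whole $four $short"   # prints: 3 1 2
```

The line abc123 is never counted, because the digits are glued to letters and so do not form a word on their own.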
-c       count lines that match
-i       ignore case
-l       show filenames that have matches
-n       prefix matched lines with a line number
-v       invert, show lines that don't match
--       turn off flag processing, needed for patterns with a leading dash
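The sketch below tries these options on a small scratch file. It assumes the standard flags -c, -i, -l, -n and -v, plus the conventional -- end-of-options marker; the file contents are invented for the example.

```shell
# Scratch file with hypothetical contents (the path is arbitrary).
log=$(mktemp)
printf '%s\n' 'Error: disk full' 'ok' 'error: retry' '-v was passed' > "$log"

count=$(grep -c 'error' "$log")      # 1: the match is case-sensitive
nocase=$(grep -ci 'error' "$log")    # 2: -i also matches "Error"
numbered=$(grep -n 'ok' "$log")      # 2:ok
inverted=$(grep -cv 'error' "$log")  # 3: lines that do not contain "error"
dash=$(grep -c -- '-v' "$log")       # 1: -- keeps "-v" from being read as a flag
files=$(grep -l 'ok' "$log")         # name of the file that has a match

echo "$count $nocase $numbered $inverted $dash"
rm -f "$log"
```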