Linux Shell Scripting Cookbook

The grep command is the magic Unix utility for searching text. It accepts regular expressions and can produce reports in various formats.

Search stdin for lines that match a pattern:

        $ echo -e "this is a word\nnext line" | grep word
        this is a word

Search a single file for lines that contain a given pattern:

        $ grep pattern filename
        this is the line containing pattern

Alternatively, this performs the same search:

        $ grep "pattern" filename
        this is the line containing pattern

Search multiple files for lines that match a pattern:

        $ grep "match_text" file1 file2 file3 ...

To highlight the matching pattern, use the --color option. Although GNU grep accepts options in any position, the convention is to place them first.

        $ grep --color=auto word filename
        this is the line containing word

The grep command uses basic regular expressions by default. These are a subset of the rules described earlier. The -E option makes grep use the extended regular expression syntax. The egrep command is a variant of grep that uses extended regular expressions by default (recent GNU grep releases mark egrep as obsolescent in favor of grep -E):

        $ grep -E "[a-z]+" filename

Or:

        $ egrep "[a-z]+" filename
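
The difference between the two syntaxes is easiest to see with the + character. The following is a minimal sketch, with sample strings invented for illustration: in the basic syntax an unescaped + is an ordinary character, while with -E it means "one or more of the preceding item".

```shell
# Sample input invented for illustration: one line holding a literal "+",
# one line holding a run of repeated letters.
input=$(printf 'a+b\naaab\n')

# Basic syntax (the default): "+" is a literal plus sign,
# so only the line "a+b" matches.
bre_result=$(printf '%s\n' "$input" | grep 'a+b')

# Extended syntax (-E): "+" means "one or more of the preceding item",
# so only the line "aaab" matches.
ere_result=$(printf '%s\n' "$input" | grep -E 'a+b')

echo "basic:    $bre_result"
echo "extended: $ere_result"
```

The same pattern therefore selects different lines depending on whether -E is present, which is worth remembering when moving a pattern between grep and egrep.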

The -o option will report only the matching characters, not the entire line:

        $ echo this is a line. | egrep -o "[a-z]+\."
        line.

The -v option inverts the match, printing every line except those that contain match_pattern:

        $ grep -v match_pattern file
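
A common use of an inverted match is dropping noise from a stream. The sketch below assumes a few made-up log lines; the DEBUG and INFO labels are illustrative only.

```shell
# Hypothetical log lines, invented for this example.
log=$(printf 'INFO start\nDEBUG loading\nINFO done\n')

# Keep every line that does NOT contain the string DEBUG.
filtered=$(printf '%s\n' "$log" | grep -v 'DEBUG')

printf '%s\n' "$filtered"
```

Only the two INFO lines remain in the output.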

The -c option will count the number of lines in which the pattern appears:

        $ grep -c "text" filename
        10

Note that -c counts the number of matching lines, not the number of matches. Consider this example:

        $ echo -e "1 2 3 4\nhello\n5 6" | egrep -c "[0-9]"
        2

Even though there are six matching items, grep reports 2, since there are only two matching lines. Multiple matches in a single line are counted only once.

To count the number of matching items in a file, use this trick:

        $ echo -e "1 2 3 4\nhello\n5 6" | egrep -o "[0-9]" | wc -l
        6
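
If this count is needed often, the pipeline can be wrapped in a small function. This is only a sketch: count_matches is a hypothetical helper name, not a standard command, and it reads its input from stdin.

```shell
# count_matches is a hypothetical helper, not a standard command.
# Usage: count_matches PATTERN < input
count_matches() {
    # -o prints each match on its own line; wc -l then counts those lines.
    # tr strips the padding that some wc implementations add.
    grep -E -o -- "$1" | wc -l | tr -d ' '
}

sample=$(printf '1 2 3 4\nhello\n5 6\n')

# Number of lines containing a digit versus number of digits found.
lines=$(printf '%s\n' "$sample" | grep -E -c '[0-9]')
items=$(printf '%s\n' "$sample" | count_matches '[0-9]')

echo "lines=$lines items=$items"
```

Placing -- before the pattern keeps grep from treating a pattern that starts with a dash as an option.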

The -n option prints the line number of each matching line:

        $ cat sample1.txt
        gnu is not unix
        linux is fun
        bash is art
        $ cat sample2.txt
        planetlinux
        $ grep linux -n sample1.txt
        2:linux is fun

The same option works when the input comes from stdin:

        $ cat sample1.txt | grep linux -n
        2:linux is fun

If multiple files are searched, grep prefixes each result with the filename, followed by the line number when -n is given:

        $ grep linux -n sample1.txt sample2.txt
        sample1.txt:2:linux is fun
        sample2.txt:1:planetlinux

The -b option prints the byte offset of each line in which a match occurs. Adding the -o option prints the byte offset at which the match itself begins:

        $ echo gnu is not unix | grep -b -o "not"
        7:not

Offsets are counted from 0, not from 1.
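
The zero-based offset can be cross-checked with a short sketch. It assumes plain single-byte text, so that byte offsets and character positions coincide; the variable names are illustrative.

```shell
text='gnu is not unix'

# -b -o prints "OFFSET:MATCH"; cut keeps only the offset field.
offset=$(printf '%s\n' "$text" | grep -b -o 'not' | cut -d: -f1)

# cut -c counts characters from 1, so add 1 to the zero-based offset
# to extract the three characters where the match was reported.
found=$(printf '%s\n' "$text" | cut -c"$((offset + 1))-$((offset + 3))")

echo "offset=$offset found=$found"
```

Extracting three characters starting just past the reported offset returns the matched word, which confirms that grep begins counting at 0.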

The -l option lists which files contain the pattern:

        $ grep -l linux sample1.txt sample2.txt
        sample1.txt
        sample2.txt

The inverse of the -l option is -L, which lists the files that contain no match.
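
The two options can be compared side by side. The sketch below recreates the sample files in a temporary directory and adds a third file, sample3.txt, which is hypothetical and exists only to give -L something to report.

```shell
# Work in a throwaway directory; sample1.txt and sample2.txt mirror
# the files used earlier, sample3.txt is invented for this example.
workdir=$(mktemp -d)
printf 'gnu is not unix\nlinux is fun\nbash is art\n' > "$workdir/sample1.txt"
printf 'planetlinux\n' > "$workdir/sample2.txt"
printf 'no match here\n' > "$workdir/sample3.txt"

# -l lists the files with at least one match, -L the files with none.
# The subshell keeps the cd from affecting the calling shell.
matching=$(cd "$workdir" && grep -l linux sample1.txt sample2.txt sample3.txt)
nonmatching=$(cd "$workdir" && grep -L linux sample1.txt sample2.txt sample3.txt)

echo "contain linux:"
printf '%s\n' "$matching"
echo "lack linux:"
printf '%s\n' "$nonmatching"

rm -rf "$workdir"
```

Every file given on the command line appears in exactly one of the two lists.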

Table of Contents for
Linux Shell Scripting Cookbook - Third Edition