The grep command is the standard Unix utility for searching text. It accepts regular expressions and can report matches in several formats.
- Search stdin for lines that match a pattern:
$ echo -e "this is a word\nnext line" | grep word
this is a word
- Search a single file for lines that contain a given pattern:
$ grep pattern filename
this is the line containing pattern
Alternatively, this performs the same search:
$ grep "pattern" filename
this is the line containing pattern
- Search multiple files for lines that match a pattern:
$ grep "match_text" file1 file2 file3 ...
- To highlight the matching text, use the --color option. Although the position of options on the command line does not matter, the convention is to place them first.
$ grep --color=auto word filename
this is the line containing word
- The grep command uses basic regular expressions by default. These are a subset of the rules described earlier. The -E option makes grep use the extended regular expression syntax. The egrep command is a variant of grep that uses extended regular expressions by default (recent GNU grep releases mark egrep as obsolescent in favor of grep -E):
$ grep -E "[a-z]+" filename
Or:
$ egrep "[a-z]+" filename
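The difference between the two syntaxes shows up with the + operator. The following is a minimal sketch, assuming GNU grep; the variable names and the two-line input are made up for illustration.

```shell
# Two test lines: one contains only letters, the other a literal plus sign.
input=$(printf 'abc\na+b\n')

# Basic syntax (the default): + is an ordinary character,
# so only the line containing the literal text "a+" matches.
bre_matches=$(printf '%s\n' "$input" | grep 'a+')

# Extended syntax (-E): + means "one or more of the preceding item",
# so every line containing an "a" matches.
ere_matches=$(printf '%s\n' "$input" | grep -E 'a+')

printf 'basic:\n%s\n' "$bre_matches"
printf 'extended:\n%s\n' "$ere_matches"
```

With the basic syntax only a+b is reported, while -E reports both lines.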
- The -o option will report only the matching characters, not the entire line:
$ echo this is a line. | egrep -o "[a-z]+\."
line.
- The -v option will print all lines, except those containing match_pattern:
$ grep -v match_pattern file
The -v option added to grep inverts the match results.
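As a rough illustration of inverted matching, the sketch below filters a three-line input. The input text and variable names are assumptions chosen for the example, and -E is used only to exclude two words at once.

```shell
# Three sample lines fed to grep on stdin.
input=$(printf 'gnu is not unix\nlinux is fun\nbash is art\n')

# Keep every line that does NOT contain "linux".
without_linux=$(printf '%s\n' "$input" | grep -v linux)

# Combine -v with -E to drop lines matching either alternative.
without_both=$(printf '%s\n' "$input" | grep -v -E 'linux|bash')

printf '%s\n' "$without_linux"
printf -- '--\n%s\n' "$without_both"
```

The first command keeps two of the three lines; the second keeps only the line that mentions neither word.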
- The -c option will count the number of lines in which the pattern appears:
$ grep -c "text" filename
10
Note that -c counts the number of matching lines, not the number of individual matches. Consider this example:
$ echo -e "1 2 3 4\nhello\n5 6" | egrep -c "[0-9]"
2
Even though there are six matching items, grep reports 2, since there are only two matching lines. Multiple matches in a single line are counted only once.
- To count the number of matching items in a file, use this trick:
$ echo -e "1 2 3 4\nhello\n5 6" | egrep -o "[0-9]" | wc -l
6
- The -n option will print the line number of each matching line:
$ cat sample1.txt
gnu is not unix
linux is fun
bash is art
$ cat sample2.txt
planetlinux
$ grep linux -n sample1.txt
2:linux is fun
Or:
$ cat sample1.txt | grep linux -n
If multiple files are searched, grep also prefixes each result with the filename:
$ grep linux -n sample1.txt sample2.txt
sample1.txt:2:linux is fun
sample2.txt:1:planetlinux
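The filename prefix can also be controlled explicitly. This is a sketch assuming GNU grep (BSD grep accepts the same -H and -h options) and that mktemp -d is available; it recreates the two sample files in a scratch directory so it can be run anywhere.

```shell
# Recreate the sample files in a throwaway directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'gnu is not unix\nlinux is fun\nbash is art\n' > sample1.txt
printf 'planetlinux\n' > sample2.txt

# One file: only the line number is prefixed.
one_file=$(grep -n linux sample1.txt)
# Several files: each result also carries the filename.
many_files=$(grep -n linux sample1.txt sample2.txt)
# -H forces the filename prefix even for a single file.
forced=$(grep -n -H linux sample1.txt)
# -h suppresses the filename prefix even for several files.
suppressed=$(grep -n -h linux sample1.txt sample2.txt)

printf '%s\n' "$one_file" "$many_files" "$forced" "$suppressed"

# Leave and remove the scratch directory.
cd / && rm -rf "$tmpdir"
```

Forcing the prefix with -H is handy in scripts where the number of files is not known in advance, since the output format then stays the same.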
- The -b option will print the byte offset of the start of each line in which a match occurs. Adding the -o option will print the byte offset at which the match itself begins:
$ echo gnu is not unix | grep -b -o "not"
7:not
Offsets are counted from 0, not from 1.
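The difference between -b alone and -b with -o is easier to see on input with more than one line. A minimal sketch, assuming GNU grep; the two-line input and the variable names are chosen for illustration.

```shell
# The first line is 15 bytes plus a newline, so the second line starts at byte 16.
input=$(printf 'gnu is not unix\nlinux is fun\n')

# -b alone: offset of the start of the matching line.
line_offset=$(printf '%s\n' "$input" | grep -b 'fun')

# -b with -o: offset of the matched text itself.
match_offset=$(printf '%s\n' "$input" | grep -b -o 'fun')

echo "$line_offset"    # 16:linux is fun
echo "$match_offset"   # 25:fun
```

The second offset is larger because "fun" begins 9 bytes into the line that starts at byte 16.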
- The -l option lists which files contain the pattern:
$ grep -l linux sample1.txt sample2.txt
sample1.txt
sample2.txt
The inverse of the -l option is -L, which lists the files that contain no match.
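The two options can be compared side by side. The sketch below assumes GNU grep and mktemp -d; sample3.txt is a hypothetical third file, added here only so that -L has something to report.

```shell
# Build the sample files in a throwaway directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'gnu is not unix\nlinux is fun\nbash is art\n' > sample1.txt
printf 'planetlinux\n' > sample2.txt
printf 'bash is art\n' > sample3.txt   # hypothetical file without the pattern

# -l: names of files with at least one matching line.
with_match=$(grep -l linux sample1.txt sample2.txt sample3.txt)
# -L: names of files with no matching line.
without_match=$(grep -L linux sample1.txt sample2.txt sample3.txt)

printf 'contain linux:\n%s\n' "$with_match"
printf 'do not contain linux:\n%s\n' "$without_match"

# Leave and remove the scratch directory.
cd / && rm -rf "$tmpdir"
```

Every file named on the command line appears in exactly one of the two lists.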