We have already discussed a few popular grep options that alter its default behavior: --ignore-case (-i), --invert-match (-v), and --word-regexp (-w). As a reminder, here's what they do:
- -i allows us to search case-insensitively
- -v only prints lines that are not matched, instead of matched lines
- -w only matches whole words: the match must be bounded on both sides by the start or end of the line, or by a character that is not a letter, digit, or underscore (such as a space or punctuation mark)
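To see the three options side by side, you can run them against a throwaway file. The following is a minimal sketch: the file contents and the use of mktemp are assumptions made for illustration and have nothing to do with grep-file.txt.

```shell
# Create a temporary sample file (hypothetical contents, for illustration only).
sample_file=$(mktemp)
printf '%s\n' 'Cool beans' 'a cool cat' 'cooler weather' 'nothing here' > "${sample_file}"

# -i: match 'cool' regardless of case.
ignore_case=$(grep -i 'cool' "${sample_file}")

# -v: print only the lines that do NOT contain 'cool' (case-sensitive).
inverted=$(grep -v 'cool' "${sample_file}")

# -w: match 'cool' only as a whole word, so 'cooler' is skipped.
whole_word=$(grep -w 'cool' "${sample_file}")

printf -- '-i:\n%s\n\n-v:\n%s\n\n-w:\n%s\n' "${ignore_case}" "${inverted}" "${whole_word}"
rm -f "${sample_file}"
```

With this sample, -i returns the first three lines, -v returns 'Cool beans' and 'nothing here' (neither contains the lowercase pattern), and -w returns only 'a cool cat'.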
There are three other options we'd like to share with you. The first new option, --only-matching (-o), prints only the part of each line that matches the search pattern, instead of the whole line. If your search pattern is a plain string without any regular expression characters, this will probably be a pretty boring option, as you can see in this example:
reader@ubuntu:~/scripts/chapter_10$ grep -o 'cool' grep-file.txt
cool
It does exactly what you would expect: it prints the word you were looking for. However, unless you just wanted to confirm that the word is present, this is probably not that interesting.
Now, if we do the same thing when using a more interesting search pattern (containing a regular expression), this option makes more sense:
reader@ubuntu:~/scripts/chapter_10$ grep -o 'f.r' grep-file.txt
for
far
In this (simplified!) example, you actually get new information: whichever words fell within your search pattern are now printed. While this might not seem impressive for such a short word in such a small file, imagine a more complex search pattern on a much larger file!
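As a slightly more complex case, the sketch below pulls everything that looks like an IPv4 address out of a few log lines. The log lines are hypothetical, and the pattern is deliberately loose (it does not check that each number is at most 255). It also uses -E for extended regular expressions, which is an addition to the options discussed here.

```shell
# Hypothetical log lines, used only for illustration.
sample_log=$(mktemp)
cat > "${sample_log}" <<'EOF'
Accepted connection from 192.168.1.10 on port 22
Rejected connection from 10.0.0.7 (forwarded for 172.16.5.4)
No address on this line
EOF

# -o prints each matched part on its own line; -E enables extended regular
# expressions so that {1,3} and (...) work without backslashes.
addresses=$(grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "${sample_log}")
echo "${addresses}"
rm -f "${sample_log}"
```

Note that the second line contains two matches, so it produces two lines of output: with -o, the output has one line per match, not one per matching input line.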
This brings up another point: grep is fast. GNU grep relies on efficient string-search algorithms such as Boyer-Moore, which lets it search even very large files (100 MB+) quickly.
The second extra option, --count (-c), does not return any lines. It does, however, return a single number: the number of lines for which the search pattern matched. A well-known example of when this comes in handy is when looking at log files for package installations:
reader@ubuntu:/var/log$ grep 'status installed' dpkg.log
2018-04-26 19:07:29 status installed base-passwd:amd64 3.5.44
2018-04-26 19:07:29 status installed base-files:amd64 10.1ubuntu2
2018-04-26 19:07:30 status installed dpkg:amd64 1.19.0.5ubuntu2
<SNIPPED>
2018-06-30 17:59:37 status installed linux-headers-4.15.0-23:all 4.15.0-23.25
2018-06-30 17:59:37 status installed iucode-tool:amd64 2.3.1-1
2018-06-30 17:59:37 status installed man-db:amd64 2.8.3-2
<SNIPPED>
2018-07-01 09:31:15 status installed distro-info-data:all 0.37ubuntu0.1
2018-07-01 09:31:17 status installed libcurl3-gnutls:amd64 7.58.0-2ubuntu3.1
2018-07-01 09:31:17 status installed libc-bin:amd64 2.27-3ubuntu1
In the regular grep here, we see log lines that show which package was installed on which date. But what if we just wanted to know how many packages were installed on a certain date? --count to the rescue!
reader@ubuntu:/var/log$ grep 'status installed' dpkg.log | grep '2018-08-26'
2018-08-26 11:16:16 status installed base-files:amd64 10.1ubuntu2.2
2018-08-26 11:16:16 status installed install-info:amd64 6.5.0.dfsg.1-2
2018-08-26 11:16:16 status installed plymouth-theme-ubuntu-text:amd64 0.9.3-1ubuntu7
<SNIPPED>
reader@ubuntu:/var/log$ grep 'status installed' dpkg.log | grep -c '2018-08-26'
40
We perform this grep operation in two stages. The first grep 'status installed' keeps only the lines related to successful installations, skipping intermediate steps such as unpacked and half-configured.
We use the alternative form of grep behind a pipe (which we will discuss further in Chapter 12, Using Pipes and Redirection in Scripts) to match another search pattern to the already-filtered data. This second grep '2018-08-26' filters on the date.
Now, without the -c option, we would see 40 lines. If we were curious about the packages, this might have been a good option, but otherwise, just the printed number is better than counting the lines by hand.
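The two stages can also be folded into a single grep by putting both conditions in one pattern. The following is a sketch under the assumption that the date always appears at the start of the line, as it does in the dpkg.log excerpts shown here; the sample lines themselves are made up for illustration.

```shell
# Hypothetical excerpt in the dpkg.log format shown above.
sample_log=$(mktemp)
cat > "${sample_log}" <<'EOF'
2018-08-26 11:16:15 status unpacked base-files:amd64 10.1ubuntu2.2
2018-08-26 11:16:16 status installed base-files:amd64 10.1ubuntu2.2
2018-08-26 11:16:16 status installed install-info:amd64 6.5.0.dfsg.1-2
2018-08-27 09:00:00 status installed man-db:amd64 2.8.3-2
EOF

# One pattern covers both conditions: the line starts with the date and
# later contains 'status installed'.
installed_count=$(grep -c '^2018-08-26 .*status installed' "${sample_log}")
echo "${installed_count}"
rm -f "${sample_log}"
```

For this sample, the count is 2. Keep in mind that -c counts matching lines, not individual matches: a line that contains the pattern three times still counts once.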
The final option, which is very interesting, especially for scripting, is the --quiet (-q) option. Imagine a situation where you want to know if a certain search pattern is present in a file. If you find the search pattern, you delete the file. If you do not find the search pattern, you'll add it to the file.
As you know, you can use a nice if-then-else construct to accomplish that. However, if you use a normal grep, you will see the text printed in the Terminal when you run your script.
This is not really that big an issue, but once your scripts get sufficiently large and complicated, a lot of output to the screen will make a script hard to use. For this, we have the --quiet option. Look at this example script to see how you would do this:
reader@ubuntu:~/scripts/chapter_10$ vim grep-then-else.sh
reader@ubuntu:~/scripts/chapter_10$ cat grep-then-else.sh
#!/bin/bash
#####################################
# Author: Sebastiaan Tammer
# Version: v1.0.0
# Date: 2018-10-16
# Description: Use grep exit status to make decisions about file manipulation.
# Usage: ./grep-then-else.sh
#####################################
FILE_NAME=/tmp/grep-then-else.txt
# Touch the file; creates it if it does not exist.
touch "${FILE_NAME}"
# Check the file for the keyword.
grep -q 'keyword' "${FILE_NAME}"
grep_rc=$?
# If the file contains the keyword, remove the file. Otherwise, write
# the keyword to the file.
if [[ ${grep_rc} -eq 0 ]]; then
    rm "${FILE_NAME}"
else
    echo 'keyword' >> "${FILE_NAME}"
fi
reader@ubuntu:~/scripts/chapter_10$ bash -x grep-then-else.sh
+ FILE_NAME=/tmp/grep-then-else.txt
+ touch /tmp/grep-then-else.txt
+ grep -q keyword /tmp/grep-then-else.txt
+ grep_rc=1
+ [[ 1 -eq 0 ]]
+ echo keyword
reader@ubuntu:~/scripts/chapter_10$ bash -x grep-then-else.sh
+ FILE_NAME=/tmp/grep-then-else.txt
+ touch /tmp/grep-then-else.txt
+ grep -q keyword /tmp/grep-then-else.txt
+ grep_rc=0
+ [[ 0 -eq 0 ]]
+ rm /tmp/grep-then-else.txt
As you can see, the trick is in the exit status. If grep finds one or more matches of the search pattern, it returns an exit status of 0. If grep does not find anything, the exit status is 1.
You can see this for yourself on the command line:
reader@ubuntu:/var/log$ grep -q 'glgjegeg' dpkg.log
reader@ubuntu:/var/log$ echo $?
1
reader@ubuntu:/var/log$ grep -q 'installed' dpkg.log
reader@ubuntu:/var/log$ echo $?
0
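There is a third value worth knowing about: grep returns an exit status of 2 when an error occurs, for example when the file does not exist. The sketch below captures all three cases. The file and variable names are hypothetical, and the `|| variable=$?` form is just one way to record a non-zero exit status; it is used here so the snippet also behaves when a script is set to abort on errors.

```shell
# Hypothetical sample file for illustration.
sample_file=$(mktemp)
echo 'status installed' > "${sample_file}"

# Each status starts at 0 and is overwritten only when grep exits non-zero.
match_rc=0
grep -q 'installed' "${sample_file}" || match_rc=$?       # stays 0: a line matched

no_match_rc=0
grep -q 'glgjegeg' "${sample_file}" || no_match_rc=$?     # becomes 1: no line matched

# Remove the file so that the next grep fails; -s suppresses the
# error message about the missing file.
rm -f "${sample_file}"
error_rc=0
grep -qs 'installed' "${sample_file}" || error_rc=$?      # becomes 2: an error occurred

echo "match=${match_rc} no_match=${no_match_rc} error=${error_rc}"
```

This prints match=0 no_match=1 error=2. In a script that only tests for 0, an error and a missing match are treated the same, which may or may not be what you want.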
In grep-then-else.sh, we suppress all output from grep. Still, we can achieve what we want: each run of the script alternates between the then and else branches, as our bash -x debug output clearly shows.
Without --quiet, the non-debug output of the script would be as follows:
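Because if itself acts on the exit status of a command, the intermediate variable is not strictly needed. The following is a sketch of a more compact variant, not a replacement for the script above: the toggle_keyword function, the demo_file variable, and the 'added'/'removed' messages are all hypothetical names introduced for this example.

```shell
# Temporary file for the demonstration (hypothetical, created with mktemp).
demo_file=$(mktemp)

# Remove the file if it contains the keyword; otherwise append the keyword.
toggle_keyword() {
    local file_name=$1
    touch "${file_name}"
    # grep -q runs directly as the condition; if acts on its exit status.
    if grep -q 'keyword' "${file_name}"; then
        rm "${file_name}"
        echo 'removed'
    else
        echo 'keyword' >> "${file_name}"
        echo 'added'
    fi
}

first_run=$(toggle_keyword "${demo_file}")
second_run=$(toggle_keyword "${demo_file}")
echo "${first_run} ${second_run}"
```

The first call finds an empty file and adds the keyword; the second call finds the keyword and removes the file. Storing the exit status in a variable, as the original script does, remains useful when you need the value more than once or want it to show up in bash -x output.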
reader@ubuntu:/tmp$ bash grep-then-else.sh
reader@ubuntu:/tmp$ bash grep-then-else.sh
keyword
reader@ubuntu:/tmp$ bash grep-then-else.sh
reader@ubuntu:/tmp$ bash grep-then-else.sh
keyword
That output doesn't really add anything to the script, does it? Even better, a lot of commands have a --quiet, -q, or equivalent option.
When you're scripting, always consider whether the output of a command is relevant. If it is not, and you can use the exit status, this almost always makes for a cleaner output experience.
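The same idea carries over to other commands. The sketch below shows two variations, both chosen purely as examples: cmp with its -s (silent) option, and a redirect to /dev/null for a command that has no quiet option of its own. The file and variable names are hypothetical.

```shell
# Two temporary files with identical contents (for illustration only).
first_file=$(mktemp)
second_file=$(mktemp)
echo 'same content' > "${first_file}"
echo 'same content' > "${second_file}"

# cmp -s prints nothing; its exit status is 0 when the files are identical.
if cmp -s "${first_file}" "${second_file}"; then
    files_identical='yes'
else
    files_identical='no'
fi

# For a command without a quiet option, redirecting its output to /dev/null
# has a similar effect. 'command -v' serves only as an example here.
if command -v grep > /dev/null 2>&1; then
    grep_available='yes'
else
    grep_available='no'
fi

echo "identical=${files_identical} grep_available=${grep_available}"
rm -f "${first_file}" "${second_file}"
```

In both cases the script decides based on the exit status alone, and nothing is printed unless you choose to print it.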