Learn Linux Shell Scripting - Fundamentals of Bash 4.4

We've now seen many examples of how to use regular expressions. While most of it is fairly intuitive, we have also seen that to match both uppercase and lowercase letters, we either have to pass the -i option to grep, or change the search pattern from [a-z] to [a-zA-Z]. For digits, we need to use [0-9].

Some find this fine to work with, but others might disagree. For them, there is an alternative notation: [[:pattern:]], where pattern is the name of a POSIX character class such as digit or upper.

The next example uses both this new double bracket notation, and the old single bracket one:

reader@ubuntu:~/scripts/chapter_10$ grep [[:digit:]] character-class.txt 
e2e
a2a
reader@ubuntu:~/scripts/chapter_10$ grep [0-9] character-class.txt 
e2e
a2a

As you can see, both patterns result in the same lines: those with a digit. The same can be done with uppercase characters:

reader@ubuntu:~/scripts/chapter_10$ grep [[:upper:]] grep-file.txt 
We can use this regular file for testing grep.
Regular expressions are pretty cool
Did you ever realise that in the UK they say colour,
but in the USA they use color (and realize)!
Also, New Zealand is pretty far away.
reader@ubuntu:~/scripts/chapter_10$ grep [A-Z] grep-file.txt 
We can use this regular file for testing grep.
Regular expressions are pretty cool
Did you ever realise that in the UK they say colour,
but in the USA they use color (and realize)!
Also, New Zealand is pretty far away.
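The commands above pass the pattern to grep unquoted. That works as long as no file in the current directory has a single-character name that the bracket expression matches, because Bash would otherwise expand the pattern as a filename glob before grep ever sees it. The following is a minimal sketch rather than part of the chapter's scripts: the sample text and variable names are made up for illustration. It shows the quoted form, and also that several classes can share one pair of outer brackets or be negated with ^:

```shell
# Illustrative sample lines (an assumption, not one of the chapter's files).
sample='e2e
a2a
abc
ABC'

# Quote the pattern so the shell cannot expand it as a filename glob.
digits=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# Two classes inside one bracket expression: a digit OR an uppercase letter.
digit_or_upper=$(printf '%s\n' "$sample" | grep '[[:digit:][:upper:]]')

# A ^ right after the outer bracket negates the set: lines that contain
# at least one character that is not a lowercase letter.
not_lower=$(printf '%s\n' "$sample" | grep '[^[:lower:]]')

echo "digit:          $(echo $digits)"
echo "digit or upper: $(echo $digit_or_upper)"
echo "not lowercase:  $(echo $not_lower)"
```

The outer brackets are the ordinary bracket expression we already know; [:digit:] is only the name of the class inside it, which is why [:digit:] on its own, with single brackets, does not do what you might expect.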

At the end of the day, it is a matter of preference which notation you use. There is one thing to be said for the double bracket notation, though: it is much closer to the named shorthand classes found in the regular expression implementations of other scripting/programming languages. For example, most of those implementations use \w (word) to match letters, digits, and the underscore, and \d (digit) to match digits. The uppercase variant of such a shorthand is its negation: \W matches every character that \w does not.
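To make the relationship with those shorthand classes concrete, here is a small sketch that compares \w with a bracket expression built from a POSIX class. It assumes GNU grep, which accepts \w as an extension; POSIX itself does not define it, and \d is not available in grep's default syntax at all. The sample string and variable names are illustrative only:

```shell
# Assumption: GNU grep (\w is an extension, not part of POSIX grep).
text='foo_bar 42!'

# -o prints every match on its own line; tr -d '\n' joins them again.
with_class=$(printf '%s\n' "$text" | grep -o '[[:alnum:]_]' | tr -d '\n')
with_shorthand=$(printf '%s\n' "$text" | grep -o '\w' | tr -d '\n')

echo "$with_class"       # foo_bar42
echo "$with_shorthand"   # foo_bar42
```

Both commands keep the letters, digits, and underscore and drop the space and the exclamation mark, so \w behaves like [[:alnum:]_] here.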

For your convenience, here is a table with the most common POSIX double-bracket character classes:

Notation	Description	Single bracket equivalent
`[[:alnum:]]`	Matches lowercase and uppercase letters or digits	[a-zA-Z0-9]
`[[:alpha:]]`	Matches lowercase and uppercase letters	[a-zA-Z]
`[[:digit:]]`	Matches digits	[0-9]
`[[:lower:]]`	Matches lowercase letters	[a-z]
`[[:upper:]]`	Matches uppercase letters	[A-Z]
`[[:blank:]]`	Matches spaces and tabs	[ \t] (a space or a tab character; note that grep does not expand \t inside brackets, so a literal tab is needed there)
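The table can be tried out with a short sketch like the one below. The function name describe_char is hypothetical and exists only for this illustration. It tests a single character against a few of the classes in turn and prints the name of the first one that matches:

```shell
# describe_char is a hypothetical helper written for this sketch.
# It prints the first class from the list that the given character belongs to.
describe_char() {
    for class in digit upper lower blank; do
        # Anchor with ^ and $ so the whole input must be one character of the class.
        if printf '%s' "$1" | grep -q "^[[:${class}:]]$"; then
            echo "$class"
            return
        fi
    done
    echo "other"
}

for ch in a Z 7 ' ' '!'; do
    printf '"%s" -> %s\n' "$ch" "$(describe_char "$ch")"
done
```

Because the class name is just text inside the brackets, it can be filled in from a variable, which is harder to do readably with ranges such as a-z.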

We prefer to use the double bracket notation, as it maps better to other regular expression implementations. Feel free to use either in your scripting! However, as always: make sure you choose one, and stick with it; not following a standard results in sloppy scripts that are confusing to readers. The rest of the examples in this book will use the double bracket notation.
