The AWK programming language interpreter works in many ways like sed; it accepts (by default) lines of input, performs pattern-matching and filtering on each, and applies actions based on those patterns.
AWK is especially well suited to data that takes the form of a regular pattern of words, or fields, which makes it ideal for filtering data in clearly delineated columns. By default, fields are separated by any whitespace (spaces or tabs), and records are separated by newlines, but both of these can be changed as part of the program.
The basic form of an AWK program is a set of patterns, each paired with an action. Every record (normally a line) is tested against each pattern, and the action is performed for each record that matches. We will take a look at some simple examples so you can get an idea of the kinds of patterns and actions that are possible.
We'll use the following groceries data file for our example AWK programs:
$ cat groceries
Item Quantity Price
Apples 5 0.50
Cereal 1 3.40
Soda 2 1.10
The best-known application of AWK for administrators and programmers is to extract a single column of data. We can do this with a print command, specifying the second column with $2:
$ awk '{ print $2 }' groceries
Quantity
5
1
2
We can also print multiple columns by separating them with commas:
$ awk '{ print $2, $3 }' groceries
Quantity Price
5 0.50
1 3.40
2 1.10
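The same column selection works on input that is not whitespace-separated, once the field separator is changed. The following is a minimal sketch, assuming a hypothetical comma-separated copy of the groceries data supplied inline rather than read from a file; the -F option sets the input field separator, and the variable name columns is only for illustration:

```shell
# Hypothetical comma-separated copy of the groceries data, supplied inline.
# -F, tells awk to split fields on commas instead of whitespace.
columns=$(printf '%s\n' 'Item,Quantity,Price' 'Apples,5,0.50' 'Cereal,1,3.40' 'Soda,2,1.10' |
    awk -F, '{ print $2, $3 }')
printf '%s\n' "$columns"
```

The comma between $2 and $3 in the print command still separates the output fields with a single space, so the result should look the same as the whitespace-separated example.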
If we want to exclude the first line with the headers so that we only retrieve the number data, we can do that as part of the AWK program too, by specifying a condition as the pattern preceding the action:
$ awk 'NR > 1 { print $2, $3 }' groceries
5 0.50
1 3.40
2 1.10
NR in the preceding command refers to the record number, which in this case is the same as the line number. The condition NR > 1 specifies that the second and third columns should be printed only if the record number is greater than 1, thereby skipping the headers.
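Conditions are not limited to the record number; they can also test the value of a field. As a sketch of this, assuming the same groceries data recreated inline, the following combines the header check with a comparison on the quantity column, printing only the items bought more than once. The variable name multiples is hypothetical:

```shell
# Recreate the sample groceries data inline instead of reading the file.
# NR > 1 skips the header; $2 > 1 keeps only records with a quantity above 1.
multiples=$(printf '%s\n' 'Item Quantity Price' 'Apples 5 0.50' 'Cereal 1 3.40' 'Soda 2 1.10' |
    awk 'NR > 1 && $2 > 1 { print $1 }')
printf '%s\n' "$multiples"
```

With this data, only the Apples and Soda records satisfy both conditions.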
We can even do some arithmetic, calculating the total price for each item by multiplying the unit price by the quantity, then formatting each result to two decimal places and prefixing it with a dollar sign using printf:
$ awk 'NR > 1 { printf "$%.2f\n", $2 * $3 }' groceries
$2.50
$3.40
$2.20
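The per-item totals can also be accumulated into a single figure. The following is a minimal sketch, again assuming the groceries data recreated inline; it uses a variable (here called total, a name chosen for illustration) and the special END pattern, whose action runs once after all records have been read:

```shell
# Recreate the sample groceries data inline instead of reading the file.
# Add each item's quantity * price to a running total, skipping the header,
# then print the sum once at the end of the input.
total=$(printf '%s\n' 'Item Quantity Price' 'Apples 5 0.50' 'Cereal 1 3.40' 'Soda 2 1.10' |
    awk 'NR > 1 { total += $2 * $3 } END { printf "$%.2f\n", total }')
printf '%s\n' "$total"
```

For this data, the result should be $8.10, the sum of the three per-item totals shown above.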
As with sed, these examples only scratch the surface; AWK is a small programming language, but it has many features beyond merely filtering by column number. Consult the documentation for your system's version of awk to get a better idea of what it can do. Note that different versions of AWK support different extensions to the POSIX standard; keep your AWK programs simple, if you can!