The awk command processes its input in the following order:
- First, it executes the commands in the BEGIN { commands } block.
- Next, awk reads one line from the file or stdin, and executes the commands block if the optional pattern is matched. It repeats this step until the end of file.
- When the end of the input stream is reached, it executes the END { commands } block.
The BEGIN block is executed before awk starts reading lines from the input stream. It is an optional block. Commands such as initializing variables and printing the header for an output table are common in the BEGIN block.
The END block is similar to the BEGIN block. It gets executed after awk has read all the lines from the input stream. It is commonly used to print results after all the lines have been analyzed.
The most important block is the main block, which holds the commands associated with the pattern. This block is also optional. If it is not provided, { print } gets executed to print each line that matches the pattern. This block gets executed for each line read by awk. It is like a while loop, with statements to execute inside the body of the loop.
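The following is a minimal sketch of how the three blocks typically cooperate. The input (three numbers, one per line) and the variable names total and report are made up for illustration; $0 refers to the current line:

```shell
# Hypothetical input: one number per line.
report=$(printf '10\n20\n30\n' | awk '
BEGIN { total = 0; print "Value" }   # runs once, before any line is read
      { total += $0; print $0 }      # runs once for every input line
END   { print "Total:", total }      # runs once, after the last line
')
echo "$report"
```

With this input, the header is printed first, then each value, and finally the line Total: 60.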
When a line is read, awk checks whether the pattern matches the line. The pattern can be a regular expression match, conditions, a range of lines, and so on. If the current line matches the pattern, awk executes the commands enclosed in { }.
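As a rough illustration of the pattern types just mentioned, the sketch below runs three different patterns over the same sample input. The five-word input and the shell variable names are assumptions chosen for the example; NR is awk's built-in count of lines read so far:

```shell
# Hypothetical sample input, one word per line.
input=$(printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n')

# Regular expression pattern: lines that start with "g".
regex_out=$(printf '%s\n' "$input" | awk '/^g/ { print }')

# Condition pattern: lines after the third one.
cond_out=$(printf '%s\n' "$input" | awk 'NR > 3 { print }')

# Range pattern: from the line matching beta through the line matching delta.
range_out=$(printf '%s\n' "$input" | awk '/beta/, /delta/ { print }')

echo "$regex_out"
echo "$cond_out"
echo "$range_out"
```

Here the regular expression selects gamma, the condition selects delta and epsilon, and the range selects beta, gamma, and delta.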
The pattern is optional. If it is not used, all lines are matched:
$ echo -e "line1\nline2" | awk 'BEGIN{ print "Start" } { print } \
END{ print "End" } '
Start
line1
line2
End
When print is used without an argument, awk prints the current line.
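The print command can accept arguments. Arguments separated by commas are printed with a space delimiter between them. awk has no explicit concatenation operator: values written next to each other without a comma are concatenated, and a double-quoted string placed between them acts as a literal separator.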
Consider this example:
$ echo | awk '{ var1="v1"; var2="v2"; var3="v3"; \
print var1,var2,var3 ; }'
The preceding command will display this:
v1 v2 v3
The echo command writes a single line into the standard output. Hence, the statements in the { } block of awk are executed once. If the input to awk contains multiple lines, the commands in awk will be executed multiple times.
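To see the per-line execution with more than one line, consider this small sketch. The three-line input and the variable names count and numbered are hypothetical; count starts out empty and is treated as 0 the first time it is incremented:

```shell
# Hypothetical three-line input: the { } block runs three times.
numbered=$(printf 'a\nb\nc\n' | awk '{ count++; print count, $0 }')
echo "$numbered"
```

Each input line comes back prefixed with the number of times the block has run so far: 1 a, 2 b, and 3 c.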
To concatenate values, write them next to each other without commas. In this example, the quoted "-" strings supply the separators:
$ echo | awk '{ var1="v1"; var2="v2"; var3="v3"; \
print var1 "-" var2 "-" var3 ; }'
v1-v2-v3
{ } is like a block in a loop, iterating through each line of a file.