The egrep command converts the text file into a stream of words, one per line. The \b[[:alpha:]]+\b pattern matches each run of alphabetic characters that is delimited by word boundaries. The -o option prints only the matched character sequences, each on its own line, so whitespace and punctuation never reach the output.
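As a quick check of the splitting step, the sketch below runs the same pattern over a made-up sentence. It uses grep -E, which behaves like egrep; the sample text and the variable names sample and words are assumptions for illustration only. Note that \b is a GNU grep extension, so other grep implementations may treat it differently.

```shell
# Hypothetical sample input; any text would do.
sample='Hello, world! Hello again.'

# -o prints each match on its own line, so the commas, spaces,
# and other punctuation are left behind.
words=$(printf '%s\n' "$sample" | grep -E -o "\b[[:alpha:]]+\b")
printf '%s\n' "$words"
```

With GNU grep this should print Hello, world, Hello, and again on four separate lines. Duplicates are kept at this stage; counting them is the job of the awk step.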
The awk command counts the occurrences of each word. It executes the statements in the { } block once for every input line, so we don't need to write an explicit loop. The count[$0]++ statement increments the count, where $0 is the current line and count is an associative array indexed by word. After all the lines are processed, the END{} block prints each word and its count.
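The counting step can be tried in isolation. The following is a minimal sketch, assuming a hand-written list of words in place of the egrep output; the variable name counts and the loop variable w are illustrative. Because awk does not guarantee any particular order for a for-in loop over an array, the result is piped through sort to make it predictable.

```shell
# Hypothetical input: one word per line, as the egrep step would produce.
counts=$(printf 'apple\npear\napple\napple\n' |
  awk '{ count[$0]++ }    # runs once per input line
       END { for (w in count) print w, count[w] }' | sort)
printf '%s\n' "$counts"
```

This prints "apple 3" followed by "pear 1".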
The body of this procedure can be modified using other tools we've looked at. We can merge capitalized and non-capitalized words into a single count with the tr command, and sort the output using the sort command, like this:
egrep -o "\b[[:alpha:]]+\b" "$filename" | tr 'A-Z' 'a-z' | \
awk '{ count[$0]++ }    # tally each word; $0 is the whole input line
END { printf("%-14s%s\n", "Word", "Count");
      # print every word seen, with its total
      for (ind in count) {
          printf("%-14s%d\n", ind, count[ind]);
      }
}' | sort
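One thing to watch in the pipeline above: sort treats the "Word Count" header as ordinary data, so depending on the locale's collation rules the header may end up among the words rather than on the first line. A possible variation, sketched below, prints the header before the pipeline runs so that only the data rows are sorted. The function name word_freq, the temporary file, and its contents are all hypothetical, and grep -E stands in for egrep.

```shell
# word_freq is a hypothetical helper name, not part of the original script.
word_freq() {
  # Print the header first so that sort never sees it.
  printf '%-14s%s\n' "Word" "Count"
  grep -E -o "\b[[:alpha:]]+\b" "$1" | tr 'A-Z' 'a-z' |
    awk '{ count[$0]++ }
         END { for (w in count) printf("%-14s%d\n", w, count[w]) }' | sort
}

# Try it on a throwaway file with made-up content.
tmpfile=$(mktemp)
printf 'The cat saw the other cat.\n' > "$tmpfile"
result=$(word_freq "$tmpfile")
rm -f "$tmpfile"
printf '%s\n' "$result"
```

For this sample the header is followed by four rows: cat 2, other 1, saw 1, and the 2. "The" and "the" are merged because tr lowercases the stream before awk counts it.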