Table of Contents for
sed & awk, 2nd Edition

Version ebook / Retour

Cover image for bash Cookbook, 2nd Edition sed & awk, 2nd Edition by Arnold Robbins Published by O'Reilly Media, Inc., 1997
  1. sed & awk, 2nd Edition
  2. Cover
  3. sed & awk, 2nd Edition
  4. A Note Regarding Supplemental Files
  5. Dedication
  6. Preface
  7. Scope of This Handbook
  8. Availability of sed and awk
  9. Obtaining Example Source Code
  10. Conventions Used in This Handbook
  11. About the Second Edition
  12. Acknowledgments from the First Edition
  13. Comments and Questions
  14. 1. Power Tools for Editing
  15. 1.1. May You Solve Interesting Problems
  16. 1.2. A Stream Editor
  17. 1.3. A Pattern-Matching Programming Language
  18. 1.4. Four Hurdles to Mastering sed and awk
  19. 2. Understanding Basic Operations
  20. 2.1. Awk, by Sed and Grep, out of Ed
  21. 2.2. Command-Line Syntax
  22. 2.3. Using sed
  23. 2.4. Using awk
  24. 2.5. Using sed and awk Together
  25. 3. Understanding Regular Expression Syntax
  26. 3.1. That’s an Expression
  27. 3.2. A Line-Up of Characters
  28. 3.3. I Never Metacharacter I Didn’t Like
  29. 4. Writing sed Scripts
  30. 4.1. Applying Commands in a Script
  31. 4.2. A Global Perspective on Addressing
  32. 4.3. Testing and Saving Output
  33. 4.4. Four Types of sed Scripts
  34. 4.5. Getting to the PromiSed Land
  35. 5. Basic sed Commands
  36. 5.1. About the Syntax of sed Commands
  37. 5.2. Comment
  38. 5.3. Substitution
  39. 5.4. Delete
  40. 5.5. Append, Insert, and Change
  41. 5.6. List
  42. 5.7. Transform
  43. 5.8. Print
  44. 5.9. Print Line Number
  45. 5.10. Next
  46. 5.11. Reading and Writing Files
  47. 5.12. Quit
  48. 6. Advanced sed Commands
  49. 6.1. Multiline Pattern Space
  50. 6.2. A Case for Study
  51. 6.3. Hold That Line
  52. 6.4. Advanced Flow Control Commands
  53. 6.5. To Join a Phrase
  54. 7. Writing Scripts for awk
  55. 7.1. Playing the Game
  56. 7.2. Hello, World
  57. 7.3. Awk’s Programming Model
  58. 7.4. Pattern Matching
  59. 7.5. Records and Fields
  60. 7.6. Expressions
  61. 7.7. System Variables
  62. 7.8. Relational and Boolean Operators
  63. 7.9. Formatted Printing
  64. 7.10. Passing Parameters Into a Script
  65. 7.11. Information Retrieval
  66. 8. Conditionals, Loops, and Arrays
  67. 8.1. Conditional Statements
  68. 8.2. Looping
  69. 8.3. Other Statements That Affect Flow Control
  70. 8.4. Arrays
  71. 8.5. An Acronym Processor
  72. 8.6. System Variables That Are Arrays
  73. 9. Functions
  74. 9.1. Arithmetic Functions
  75. 9.2. String Functions
  76. 9.3. Writing Your Own Functions
  77. 10. The Bottom Drawer
  78. 10.1. The getline Function
  79. 10.2. The close( ) Function
  80. 10.3. The system( ) Function
  81. 10.4. A Menu-Based Command Generator
  82. 10.5. Directing Output to Files and Pipes
  83. 10.6. Generating Columnar Reports
  84. 10.7. Debugging
  85. 10.8. Limitations
  86. 10.9. Invoking awk Using the #! Syntax
  87. 11. A Flock of awks
  88. 11.1. Original awk
  89. 11.2. Freely Available awks
  90. 11.3. Commercial awks
  91. 11.4. Epilogue
  92. 12. Full-Featured Applications
  93. 12.1. An Interactive Spelling Checker
  94. 12.2. Generating a Formatted Index
  95. 12.3. Spare Details of the masterindex Program
  96. 13. A Miscellany of Scripts
  97. 13.1. uutot.awk—Report UUCP Statistics
  98. 13.2. phonebill—Track Phone Usage
  99. 13.3. combine—Extract Multipart uuencoded Binaries
  100. 13.4. mailavg—Check Size of Mailboxes
  101. 13.5. adj—Adjust Lines for Text Files
  102. 13.6. readsource—Format Program Source Files for troff
  103. 13.7. gent—Get a termcap Entry
  104. 13.8. plpr—lpr Preprocessor
  105. 13.9. transpose—Perform a Matrix Transposition
  106. 13.10. m1—Simple Macro Processor
  107. A. Quick Reference for sed
  108. A.1. Command-Line Syntax
  109. A.2. Syntax of sed Commands
  110. A.3. Command Summary for sed
  111. B. Quick Reference for awk
  112. B.1. Command-Line Syntax
  113. B.2. Language Summary for awk
  114. B.3. Command Summary for awk
  115. C. Supplement for Chapter 12
  116. C.1. Full Listing of spellcheck.awk
  117. C.2. Listing of masterindex Shell Script
  118. C.3. Documentation for masterindex
  119. masterindex
  120. C.3.1. Background Details
  121. C.3.2. Coding Index Entries
  122. C.3.3. Output Format
  123. C.3.4. Compiling a Master Index
  124. Index
  125. About the Authors
  126. Colophon
  127. Copyright

Original awk

In each of the sections that follow, we’ll take a brief look at how the original awk differs from POSIX awk. Over the years, UNIX vendors have enhanced their versions of original awk; you may need to write small test programs to see exactly what features your old awk has or doesn’t have.

Escape Sequences

The original V7 awk only had “\t”, “\n”, “\"”, and, of course, “\\”. Most UNIX vendors have added some or all of “\b” and “\r” and “\f”.

Exponentiation

Exponentiation (using the ^, ^=, **, and **= operators) is not in old awk.

The C Conditional Expression

The three-argument conditional expression found in C, "expr1 ? expr2 : expr3" is not in old awk. You must resort to a plain old if-else statement.

Variables as Boolean Patterns

You cannot use the value of a variable as a Boolean pattern.

flag { print "..." }

You must instead use a comparison expression.

flag != 0 { print "..." }

Faking Dynamic Regular Expressions

The original awk made it difficult to use patterns dynamically because they had to be fixed when the script was interpreted. You can get around the problem of not being able to use a variable as a regular expression by importing a shell variable inside an awk program. The value of the shell variable will be interpreted by awk as a constant. Here’s an example:

$ cat awkro2
#!/bin/sh
# assign shell's $1 to awk search variable
search=$1
awk '$1 ~ /'"$search"'/' acronyms

The first line of the script makes the variable assignment before awk is invoked. To get the shell to expand the variable inside the awk procedure, we enclose it within single, then double, quotation marks.[1] Thus, awk never sees the shell variable and evaluates it as a constant string.

Here’s another version that makes use of the Bourne shell variable substitution feature. Using this feature gives us an easy way to specify a default value for the variable if, for instance, the user does not supply a command-line argument.

search=$1
awk '$1 ~ /'"${search:-.*}"'/' acronyms

The expression “${search:-.*}” tells the shell to use the value of search if it is defined; if not, use “.*” as the value. Here, “.*” is regular-expression syntax specifying any string of characters; therefore, all entries are printed if no entry is supplied on the command line. Because the whole thing is inside double quotes, the shell does not perform a wildcard expansion on “.*”.

Control Flow

In POSIX awk, if a program has just a BEGIN procedure, and nothing else, awk will exit after executing that procedure. The original awk is different; it will execute the BEGIN procedure and then go on to process input, even if there are no pattern-action statements. You can force awk to exit by supplying /dev/null on the command line as a data file argument, or by using exit.

In addition, the BEGIN and END procedures, if present, have to be at the beginning and end of program, respectively. Furthermore, you can only have one of each.

Field Separating

Field separating works the same in old awk as it does in modern awk, except that you can’t use regular expressions.

Arrays

There is no way in the original awk to delete an element from an array. The best thing you can do is assign the empty string to the unwanted array element, and then code your program to ignore array elements whose values are empty.

Along the same lines, in is not an operator in original awk; you cannot use if (item in array) to see if an item is present. Unfortunately, this forces you to loop through every item in an array to see if the index you want is present.

for (item in array) {
	if (item == searchkey) {
		process array[item]
		break
	}
}

The getline Function

The original V7 awk did not have getline. If your awk is really ancient, then getline may not work for you. Some vendors have the simplest form of getline, which reads the next record from the regular input stream, and sets $0, NF and NR (there is no FNR, see below). All of the other forms of getline are not available.

Functions

The original awk had only a limited number of built-in string functions. (See Table 11.1 and Table 11.3.)

Table 11.1. Original awk’s Built-In String Functions
Awk FunctionDescription
index(s,t)Returns position of substring t in string s or zero if not present.
length(s)

Returns length of string s or length of $0 if no string is supplied.

split(s,a,sep)

Parses string s into elements of array a using field separator sep; returns number of elements. If sep is not supplied, FS is used. Array splitting works the same way as field splitting.

sprintf(”fmt“,expr)Uses printf format specification for expr.
substr(s,p,n)

Returns substring of string s at beginning position p up to maximum length of n. If n isn’t supplied, the rest of the string from p is used.

Some built-in functions can be classified as arithmetic functions. Most of them take a numeric argument and return a numeric value. Table 11.2 summarizes these arithmetic functions.

Table 11.2. Original awk’s Built-In Arithmetic Functions
Awk FunctionDescription
exp(x)Returns e to the power x.
int(x)Returns truncated value of x.
log(x)Returns natural logarithm (base-e) of x.
sqrt(x)Returns square root of x.

One of the nicest facilities in awk, the ability to define your own functions, is also not available in original awk.

Built-In Variables

In original awk only the variables shown in Table 11.3 are built in.

Table 11.3. Original awk System Variables
VariableDescription
FILENAMECurrent filename
FSField separator (a blank)
NFNumber of fields in current record
NRNumber of the current record
OFMTOutput format for numbers (%.6g)
OFSOutput field separator (a blank)
ORSOutput record separator (a newline)
RSRecord separator (a newline)

OFMT does double duty, serving as the conversion format for the print statement, as well as for converting numbers to strings.



[1] Actually, this is the concatenation of single-quoted text with double-quoted text with more single-quoted text to produce one large quoted string. This trick was used ealier, in Chapter 6.