sed & awk, 2nd Edition by Arnold Robbins. Published by O'Reilly Media, Inc., 1997

Using sed

There are two ways to invoke sed: either you specify your editing instructions on the command line or you put them in a file and supply the name of the file.

Specifying Simple Instructions

You can specify simple editing commands on the command line.

sed [-e] 'instruction' file

The -e option is necessary only when you supply more than one instruction on the command line. It tells sed to interpret the next argument as an instruction. When there is a single instruction, sed is able to make that determination on its own. Let’s look at some examples.

The following example runs the s (substitute) command against the sample input file, list, to replace “MA” with “Massachusetts.”

$ sed 's/MA/Massachusetts/' list
John Daggett, 341 King Road, Plymouth Massachusetts
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury Massachusetts
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston Massachusetts

Three lines are affected by the instruction but all lines are displayed.

Enclosing the instruction in single quotes is not required in all cases but you should get in the habit of always doing it. The enclosing single quotes prevent the shell from interpreting special characters or spaces found in the editing instruction. (The shell uses spaces to determine individual arguments submitted to a program; characters that are special to the shell are expanded before the command is invoked.)

For instance, the first example could have been entered without the quotes, but in the next example they are required because the substitution command contains spaces:

$ sed 's/ MA/, Massachusetts/' list
John Daggett, 341 King Road, Plymouth, Massachusetts
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury, Massachusetts
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston, Massachusetts

In order to place a comma between the city and state, the instruction replaced the space before the two-letter abbreviation with a comma and a space.
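You can see what quoting does without running sed at all. The following sketch counts the arguments that the shell hands to a command, first with the instruction unquoted and then with it quoted. The function count_args is a hypothetical helper introduced only for this illustration.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell splits the instruction at each space.
unquoted_count=$(count_args s/ MA/, Massachusetts/)

# Quoted: the whole instruction arrives as a single argument.
quoted_count=$(count_args 's/ MA/, Massachusetts/')

echo "unquoted: $unquoted_count argument(s)"
echo "quoted:   $quoted_count argument(s)"
```

Unquoted, the instruction is broken into three words, so sed would take only the first word as its instruction and treat the remaining words as filenames.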

There are three ways to specify multiple instructions on the command line:

  1. Separate instructions with a semicolon.

    sed 's/ MA/, Massachusetts/; s/ PA/, Pennsylvania/' list
  2. Precede each instruction by -e.

    sed -e 's/ MA/, Massachusetts/' -e 's/ PA/, Pennsylvania/' list
  3. Use the multiline entry capability of the Bourne shell.[1] Press RETURN after entering a single quote and a secondary prompt (>) will be displayed for multiline input.

    $ sed ' 
    > s/ MA/, Massachusetts/
    > s/ PA/, Pennsylvania/
    > s/ CA/, California/' list 
    John Daggett, 341 King Road, Plymouth, Massachusetts
    Alice Ford, 22 East Broadway, Richmond VA
    Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
    Terry Kalkas, 402 Lans Road, Beaver Falls, Pennsylvania
    Eric Adams, 20 Post Road, Sudbury, Massachusetts
    Hubert Sims, 328A Brook Road, Roanoke VA
    Amy Wilde, 334 Bayshore Pkwy, Mountain View, California
    Sal Carpenter, 73 6th Street, Boston, Massachusetts

    This technique will not work in the C shell. Instead, end each instruction with a semicolon; you can then continue the command over multiple lines by ending each line with a backslash. (Or you could temporarily switch to the Bourne shell by entering sh and then typing the command.)

In the example above, changes were made to five lines and, of course, all lines were displayed. Remember that nothing has changed in the input file.
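As a quick check that the three forms are interchangeable, the following sketch runs the same two substitutions each way and compares the results. It assumes that list holds the eight addresses implied by the output shown earlier, and it builds that file in a scratch directory created with mktemp (an assumption about your system; mktemp is widely available but not guaranteed everywhere).

```shell
# Build the sample file in a scratch directory (contents assumed from the output above).
workdir=$(mktemp -d)
cat > "$workdir/list" <<'EOF'
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA
EOF
cp "$workdir/list" "$workdir/list.orig"

# 1. Instructions separated by a semicolon.
semicolon_out=$(sed 's/ MA/, Massachusetts/; s/ PA/, Pennsylvania/' "$workdir/list")

# 2. Each instruction preceded by -e.
dash_e_out=$(sed -e 's/ MA/, Massachusetts/' -e 's/ PA/, Pennsylvania/' "$workdir/list")

# 3. One quoted argument that spans several lines.
multiline_out=$(sed '
s/ MA/, Massachusetts/
s/ PA/, Pennsylvania/' "$workdir/list")

# Confirm that sed left the input file untouched.
if cmp -s "$workdir/list" "$workdir/list.orig"; then
    input_unchanged=yes
else
    input_unchanged=no
fi

echo "input unchanged: $input_unchanged"
rm -rf "$workdir"
```

If the three variables hold the same text, the choice among the forms is a matter of convenience rather than behavior.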

Command garbled

The syntax of a sed command can be detailed, and it’s easy to make a mistake or omit a required element. Notice what happens when incomplete syntax is entered:

$ sed -e 's/MA/Massachusetts' list
sed: command garbled: s/MA/Massachusetts

Sed will usually display any line that it cannot execute, but it does not tell you what is wrong with the command.[2] In this instance, the closing slash is missing from the substitute command; slashes delimit its search and replacement portions.

GNU sed is more helpful:

$ gsed -e 's/MA/Massachusetts' list
gsed: Unterminated `s' command
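Because the wording of the error message differs from one version of sed to another, a script that needs to detect a malformed instruction is better off testing the exit status than parsing the message. The following is a minimal sketch; the variable names are ours, and it assumes only that your sed returns a nonzero status when it rejects a script, as POSIX requires.

```shell
sample='Sal Carpenter, 73 6th Street, Boston MA'

# Malformed: the closing slash is missing. Discard the message, keep the status.
bad_status=0
printf '%s\n' "$sample" | sed -e 's/MA/Massachusetts' >/dev/null 2>&1 || bad_status=$?

# Well formed: the same instruction with its closing slash.
good_status=0
printf '%s\n' "$sample" | sed -e 's/MA/Massachusetts/' >/dev/null 2>&1 || good_status=$?

echo "malformed instruction exit status:   $bad_status"
echo "well-formed instruction exit status: $good_status"
```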

Script Files

It is not practical to enter longer editing scripts on the command line. That is why it is usually best to create a script file that contains the editing instructions. The editing script is simply a list of sed commands that are executed in the order in which they appear. This form, using the -f option, requires that you specify the name of the script file on the command line.

sed -f scriptfile file

All the editing commands that we want executed are placed in a file. We follow a convention of creating temporary script files named sedscr.

$ cat sedscr
s/ MA/, Massachusetts/
s/ PA/, Pennsylvania/
s/ CA/, California/
s/ VA/, Virginia/
s/ OK/, Oklahoma/

The following command reads all of the substitution commands in sedscr and applies them to each line in the input file list:

$ sed -f sedscr list
John Daggett, 341 King Road, Plymouth, Massachusetts
Alice Ford, 22 East Broadway, Richmond, Virginia
Orville Thomas, 11345 Oak Bridge Road, Tulsa, Oklahoma
Terry Kalkas, 402 Lans Road, Beaver Falls, Pennsylvania
Eric Adams, 20 Post Road, Sudbury, Massachusetts
Hubert Sims, 328A Brook Road, Roanoke, Virginia
Amy Wilde, 334 Bayshore Pkwy, Mountain View, California
Sal Carpenter, 73 6th Street, Boston, Massachusetts

Once again, the result is ephemeral, displayed on the screen. No change is made to the input file.

If a sed script can be used again, you should rename the script and save it. Scripts of proven value can be maintained in a personal or system-wide library.
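The whole sequence can be tried in a scratch directory, as in the following sketch. It assumes the same eight-line list implied by the output above and uses mktemp (an assumption about your system) so that nothing in your current directory is touched.

```shell
workdir=$(mktemp -d)

# The sample input file (contents assumed from the output shown above).
cat > "$workdir/list" <<'EOF'
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA
EOF

# The script file: one sed command per line, applied in this order.
cat > "$workdir/sedscr" <<'EOF'
s/ MA/, Massachusetts/
s/ PA/, Pennsylvania/
s/ CA/, California/
s/ VA/, Virginia/
s/ OK/, Oklahoma/
EOF

# Apply every command in sedscr to each line of list.
script_out=$(sed -f "$workdir/sedscr" "$workdir/list")
printf '%s\n' "$script_out"

rm -rf "$workdir"
```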

Saving output

Unless you are piping the output of sed to another program, you will want to capture the output in a file. This is done by specifying one of the shell’s I/O redirection symbols followed by the name of a file:

$ sed -f sedscr list > newlist

Do not redirect the output to the file you are editing or you will clobber it. (The shell truncates the file named after the “>” redirection operator before sed starts, so sed finds nothing left to read.) If you want the output file to replace the input file, you can do that as a separate step, using the mv command. But first make very sure your editing script has worked properly!

In Chapter 4, we will look at a shell script named runsed that automates the process of creating a temporary file and using mv to overwrite the original file.
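The following sketch shows both the hazard and the two-step replacement done by hand. It is not the runsed script from Chapter 4, only an illustration on a shortened two-line list in a scratch directory; the names list.copy and list.tmp are hypothetical, and mktemp is assumed to be available.

```shell
workdir=$(mktemp -d)
printf '%s\n' \
    'John Daggett, 341 King Road, Plymouth MA' \
    'Alice Ford, 22 East Broadway, Richmond VA' > "$workdir/list"
printf '%s\n' 's/ MA/, Massachusetts/' 's/ VA/, Virginia/' > "$workdir/sedscr"

# The hazard: redirecting onto the input empties it before sed can read it.
cp "$workdir/list" "$workdir/list.copy"
sed -f "$workdir/sedscr" "$workdir/list.copy" > "$workdir/list.copy"
clobbered_bytes=$(wc -c < "$workdir/list.copy" | tr -d ' ')

# The safe way: write to a temporary file, then replace the original with mv.
if sed -f "$workdir/sedscr" "$workdir/list" > "$workdir/list.tmp"; then
    # A clean exit only means sed ran; inspect list.tmp yourself before this step.
    mv "$workdir/list.tmp" "$workdir/list"
else
    rm -f "$workdir/list.tmp"
fi

edited=$(cat "$workdir/list")
echo "bytes left in clobbered copy: $clobbered_bytes"
printf '%s\n' "$edited"

rm -rf "$workdir"
```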

Suppressing automatic display of input lines

The default operation of sed is to output every input line. The -n option suppresses the automatic output. When you specify this option, each instruction intended to produce output must contain a print command, p. Look at the following example.

$ sed -n -e 's/MA/Massachusetts/p' list
John Daggett, 341 King Road, Plymouth Massachusetts
Eric Adams, 20 Post Road, Sudbury Massachusetts
Sal Carpenter, 73 6th Street, Boston Massachusetts

Compare this output to the first example in this section. Here, only the lines that were affected by the command were printed.
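Counting output lines makes the interaction between -n and p easy to see. The sketch below assumes the same eight-line list (rebuilt here in a scratch directory with mktemp, which is an assumption about your system) and runs the substitution three ways.

```shell
workdir=$(mktemp -d)
cat > "$workdir/list" <<'EOF'
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA
EOF

# Default behavior: every input line is output once.
default_lines=$(sed 's/MA/Massachusetts/' "$workdir/list" | wc -l | tr -d ' ')

# -n with the p flag: only lines where the substitution succeeded are output.
n_and_p_lines=$(sed -n 's/MA/Massachusetts/p' "$workdir/list" | wc -l | tr -d ' ')

# The p flag without -n: affected lines are output twice, all others once.
p_only_lines=$(sed 's/MA/Massachusetts/p' "$workdir/list" | wc -l | tr -d ' ')

echo "default: $default_lines, -n with p: $n_and_p_lines, p alone: $p_only_lines"
rm -rf "$workdir"
```

The third case is a common surprise: forgetting -n while using p duplicates each affected line instead of isolating it.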

Mixing options (POSIX)

You can build up a script by combining both the -e and -f options on the command line. The script is the combination of all the commands in the order given. This appears to be supported in UNIX versions of sed, but this feature is not clearly documented in the manpage. The POSIX standard explicitly mandates this behavior.
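To see that the order of the options matters, consider two commands where one instruction rewrites the result of the other. In this sketch, the one-line script file and its abbreviation “Mass.” are made up for the illustration, and the files live in a scratch directory created with mktemp (assumed to be available).

```shell
workdir=$(mktemp -d)
printf '%s\n' 'John Daggett, 341 King Road, Plymouth MA' > "$workdir/list"

# A one-line script file that shortens the state name.
printf '%s\n' 's/Massachusetts/Mass./' > "$workdir/sedscr"

# -e first, then -f: MA is expanded, then the expansion is shortened.
e_then_f=$(sed -e 's/MA$/Massachusetts/' -f "$workdir/sedscr" "$workdir/list")

# -f first, then -e: nothing to shorten yet, so only the expansion takes effect.
f_then_e=$(sed -f "$workdir/sedscr" -e 's/MA$/Massachusetts/' "$workdir/list")

printf '%s\n' "$e_then_f" "$f_then_e"
rm -rf "$workdir"
```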

Summary of options

Table 2.1 summarizes the sed command-line options.

Table 2.1. Command-Line Options for sed
Option    Description
-e        Editing instruction follows.
-f        Filename of script follows.
-n        Suppress automatic output of input lines.



[1] These days there are many shells that are compatible with the Bourne shell, and work as described here: ksh, bash, pdksh, and zsh, to name a few.

[2] Some vendors seem to have improved things. For instance, on SunOS 4.1.x, sed reports “sed: Ending delimiter missing on substitution: s/MA/Massachusetts”.