sed & awk, 2nd Edition, by Arnold Robbins. Published by O'Reilly Media, Inc., 1997.

phonebill—Track Phone Usage

Contributed by Nick Holloway

The problem is to calculate the cost of phone calls. In the United Kingdom, calls are charged by the number of “units” used during the call (there are no free local calls). How long a “unit” lasts depends on the charge band (linked to distance) and the charge rate (linked to time of day). You are charged a whole unit as soon as its time period begins.
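
The charging rule amounts to rounding the call length up to a whole number of units. Here is a minimal sketch, assuming a hypothetical shell function named units_for that takes the call length in seconds and the seconds-per-unit figure for the band/rate. Neither the name nor the shell wrapper is part of phonebill; only the rounding logic mirrors the listing.

```shell
# units_for SECONDS SECONDS_PER_UNIT
# Hypothetical helper (not part of phonebill): prints the number of
# whole units charged, counting any partial unit as a full one.
units_for() {
    awk -v seconds="$1" -v spu="$2" 'BEGIN {
        u = seconds / spu
        if (int(u) != u)        # a partial unit is charged in full
            u = int(u) + 1
        print u
    }'
}

units_for 3600 330    # one hour at 330 seconds per unit
units_for 335 18      # 5 min 35 s at 18 seconds per unit
```

With the figures from the sample data, the two calls print 11 and 19, matching the unit counts in the sample report.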

Each input line has four fields. The first field is the date (not used in the calculation, only echoed in the report). The second field is “band/rate” and is used to look up how long a unit lasts. The third field is the length of the call, given as “ss”, “mm:ss”, or “hh:mm:ss”. The fourth field is the name of the caller. We keep a stopwatch (old cheap digital), a book, and a pen. Come bill time, the log is fed through my awk script. The script deals only with the cost of the calls, not the standing charge.
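
The three accepted forms of the call length can all be reduced to seconds with one loop over the colon-separated parts. The following is a sketch only: to_seconds is a hypothetical wrapper introduced for illustration, although the split-and-accumulate loop is the same idea the listing uses.

```shell
# to_seconds LENGTH
# Hypothetical helper: converts "ss", "mm:ss", or "hh:mm:ss" to seconds.
to_seconds() {
    awk -v len="$1" 'BEGIN {
        n = split(len, t, ":")            # 1, 2, or 3 parts
        seconds = 0
        for (i = 1; i <= n; i++)          # each step shifts by a factor of 60
            seconds = seconds * 60 + t[i]
        print seconds
    }'
}

to_seconds 45         # prints 45
to_seconds 5:35       # prints 335
to_seconds 1:00:00    # prints 3600
```

Because each pass multiplies the running total by 60 before adding the next part, the same loop handles all three forms without testing how many parts there are.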

The aim was to minimize the information callers have to enter, while letting the program collect the call costs for each user into one report. It is also written so that if British Telecom changes its charges, the changes can be made easily at the top of the source (this has been done once already). If more charge bands or rates are added, the table can simply be expanded (the wonders of associative arrays). No real sanity checks are done on the input data. The usage is:

phonebill [ file ... ]

Here is a (short) sample of input and output.

Input:

29/05   b/p      5:35   Nick
29/05   L/c   1:00:00   Dale
01/06   L/c     30:50   Nick

Output:

Summary for Dale:
	29/05   L/c  1:00:00  11 units
Total: 11 units @ 5.06 pence per unit = $0.56
Summary for Nick:
	29/05   b/p     5:35  19 units
	01/06   L/c    30:50   6 units
Total: 25 units @ 5.06 pence per unit = $1.26

The listing for phonebill follows:

#!/bin/awk -f
#------------------------------------------------------------------
#   Awk script to take in phone usage - and calculate cost for each
#   person
#------------------------------------------------------------------
#   Author: N.Holloway (alfie@cs.warwick.ac.uk)
#   Date  : 27 January 1989
#   Place : University of Warwick
#------------------------------------------------------------------
#   Entries are made in the form
#	Date   Type/Rate   Length  Name
#
#   Format:
#	Date		: "dd/mm"		- one word
#	Type/Rate	: "bb/rr"  (e.g. L/c)
#	Length		: "hh:mm:ss", "mm:ss", "ss"
#	Name		: "Fred"		- one word (unique)
#------------------------------------------------------------------
#   Charge information kept in array 'c', indexed by "type/rate",
#   and the cost of a unit is kept in the variable 'pence_per_unit'
#   The info is stored in two arrays, both indexed by the name. The
#   first 'summary' has the lines that hold input data, and number 
#   of units, and 'units' has the cumulative total number of units
#   used by name.
#------------------------------------------------------------------

BEGIN \
    {	
	# --- Cost per unit
	pence_per_unit  = 4.40		# cost is 4.4 pence per unit
	pence_per_unit *= 1.15		# VAT is 15%

	# --- Table of seconds per unit for different bands/rates
	#     [ not applicable have 0 entered as value ]
	c ["L/c"] = 330 ;  c ["L/s"] = 85.0;  c ["L/p"] = 60.0;
	c ["a/c"] =  96 ;  c ["a/s"] = 34.3;  c ["a/p"] = 25.7;
	c ["b1/c"]= 60.0;  c ["b1/s"]= 30.0;  c ["b1/p"]= 22.5;
	c ["b/c"] = 45.0;  c ["b/s"] = 24.0;  c ["b/p"] = 18.0;
	c ["m/c"] = 12.0;  c ["m/s"] = 8.00;  c ["m/p"] = 8.00;
	c ["A/c"] = 9.00;  c ["A/s"] = 7.20;  c ["A/p"] = 0   ;
	c ["A2/c"]= 7.60;  c ["A2/s"]= 6.20;  c ["A2/p"]= 0   ;
	c ["B/c"] = 6.65;  c ["B/s"] = 5.45;  c ["B/p"] = 0   ;
	c ["C/c"] = 5.15;  c ["C/s"] = 4.35;  c ["C/p"] = 3.95;
	c ["D/c"] = 3.55;  c ["D/s"] = 2.90;  c ["D/p"] = 0   ;
	c ["E/c"] = 3.80;  c ["E/s"] = 3.05;  c ["E/p"] = 0   ;
	c ["F/c"] = 2.65;  c ["F/s"] = 2.25;  c ["F/p"] = 0   ;
	c ["G/c"] = 2.15;  c ["G/s"] = 2.15;  c ["G/p"] = 2.15;
    }

    {
	spu = c [ $2 ]				# look up charge band
	if ( spu == "" || spu == 0 ) {
	    summary [ $4 ] = summary [ $4 ] "\n\t" \
			    sprintf ( "%4s  %4s  %7s   ? units",\
	                          $1, $2, $3 ) \
			    " - Bad/Unknown Chargeband"
	} else {
	    n = split ( $3, t, ":" )  # calculate length in seconds
	    seconds = 0
	    for ( i = 1; i <= n; i++ )
		seconds = seconds*60 + t[i]
	    u = seconds / spu   # calculate number of units
	    if ( int( u ) == u )   # round up to next whole unit
		u = int( u )
	    else
		u = int( u ) + 1
	    units [ $4 ] += u   # store info to output at end
	    summary [ $4 ] = summary [ $4 ] "\n\t" \
			    sprintf ( "%4s  %4s  %7s %3d units",\
	                         $1, $2, $3, u )
	}
    }

END \
    {
	for ( i in units ) {		# for each person
	    printf ( "Summary for %s:", i ) # newline at start
                                            # of summary
	    print summary [ i ]			# print summary details
	    # calc cost
	    total = int ( units[i] * pence_per_unit + 0.5 )
	    printf ( \
		"Total: %d units @ %.2f pence per unit = $%d.%02d\n\n", \
			    units [i], pence_per_unit, total/100, \
                                               total%100 )
	}
    }

Program Notes for phonebill

This program is another example of generating a report that consolidates information from a simple record structure.

This program also follows the three-part structure. The BEGIN procedure defines variables that are used throughout the program, which makes the program easy to change, as phone companies are known to “upwardly revise” their rates. One of the variables is a large array named c in which each element is the number of seconds per unit, indexed by the “band/rate” string (for example, L/c).

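The main procedure reads each line of the user log. It uses the second field, which identifies the band/rate, to get a value from the array c. If no positive value is found, the call is recorded in the summary as a bad or unknown charge band. Otherwise, the length of the call in $3 is converted to seconds, divided by the seconds per unit, and rounded up to a whole number of units. That number is added to an array named units, indexed by the name of the caller ($4), so the value accumulates for each caller. A formatted line for each call is also appended to the caller's entry in the array summary.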
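
The lookup step can be tried on its own. The sketch below assumes a hypothetical function lookup_spu and copies only three entries from the table c. It tests membership with awk's in operator, a small variation on the listing, which compares the looked-up value against "" and 0 instead.

```shell
# lookup_spu BAND_RATE
# Hypothetical helper holding a three-entry excerpt of the table c.
lookup_spu() {
    awk -v key="$1" 'BEGIN {
        c["L/c"] = 330; c["b/p"] = 18.0; c["A/p"] = 0
        # "in" tests for the index without creating a new element
        if (!(key in c) || c[key] == 0)
            print "Bad/Unknown Chargeband"
        else
            print c[key]
    }'
}

lookup_spu L/c    # prints 330
lookup_spu A/p    # 0 means not applicable: prints Bad/Unknown Chargeband
lookup_spu Z/z    # not in the table: prints Bad/Unknown Chargeband
```

Either test rejects the same input. The difference is that referencing c[key] for an unknown key quietly adds an empty element to the array, which is harmless in phonebill because the table is never traversed.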

Finally, the END routine loops over the units array and, for each caller, prints the saved summary lines followed by the total number of units and the total cost of the calls.
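
The cost calculation in the END routine can be checked against the sample report. This is a sketch under stated assumptions: bill_total is a hypothetical wrapper, and the rate figures are the ones set in the BEGIN procedure of the listing.

```shell
# bill_total UNITS
# Hypothetical helper: prints the total line for a given unit count,
# using the per-unit cost and VAT rate from the listing.
bill_total() {
    awk -v units="$1" 'BEGIN {
        pence_per_unit = 4.40 * 1.15                 # 4.4 pence plus 15% VAT
        total = int(units * pence_per_unit + 0.5)    # round to nearest penny
        printf("Total: %d units @ %.2f pence per unit = $%d.%02d\n", \
               units, pence_per_unit, total / 100, total % 100)
    }'
}

bill_total 11
```

For Dale's 11 units this prints the same total line as the sample output: 11 units at 5.06 pence come to 55.66 pence, which rounds to 56. The %d conversion truncates total / 100 to the whole-number part, and total % 100 supplies the two-digit remainder.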