Chapter 10. Additional Features for Scripting

Many scripts are written as simple one-off scripts that are only used by their author, consisting of only a few lines—perhaps only a single loop, if that. But some scripts are heavy-duty scripts that will see a lot of use from a variety of users. Such scripts will often need to take advantage of features that allow for better sharing and reuse of code. These advanced scripting techniques can be useful for many kinds of scripts, and are often found in larger systems of scripts such as the /etc/init.d scripts on many Linux systems. You don’t have to be a system administrator to appreciate and use the tips and techniques described here. They will prove themselves on any large scripting effort.

10.1 “Daemon-izing” Your Script

Problem

Sometimes you want a script to run as a daemon, in the background and never ending. To do this properly you need to be able to detach your script from its controlling TTY—that is, from the terminal session used to start the daemon. Simply putting an ampersand on the command isn’t enough. If you start your daemon script on a remote system via an SSH (or similar) session, you’ll notice that when you log out, the SSH session doesn’t end and your window is hung until that script ends (which, being a daemon, it won’t).

Solution

Use the following to invoke your script, run it in the background, and still allow yourself to log out:

nohup mydaemonscript 0<&- 1>/dev/null 2>&1 &

or:

nohup mydaemonscript >>/var/log/myadmin.log 2>&1 <&-  &

Discussion

You need to close the controlling TTY (terminal), which is connected in three ways to your (or any) job: via standard input (STDIN), standard output (STDOUT), and standard error (STDERR). We can close STDOUT and STDERR by pointing them at another file—typically either a logfile, so that you can retrieve their output at a later time, or the file /dev/null to throw away all their output. We use the redirecting operator > to do this.

But what about STDIN? The cleanest way to deal with STDIN is to close the file descriptor. The bash syntax to do that is like a redirect, but with a dash for the file-name (0<&- or <&-).

We use the nohup command so that the script is run without being interrupted by a hangup signal when we log off.

In the first example, we use the file descriptor numbers (i.e., 0, 1, 2) explicitly in all three redirections. They are optional in the case of STDIN and STDOUT, so in our second example we don’t use them explicitly. We also put the input redirect at the end of the second command rather than at the beginning, since the order here is not important. (However, the order is important and the file descriptor number is necessary in redirecting STDERR.)
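You can try the same redirections on a short-lived stand-in command before committing them to a real daemon. This is just a sketch for experimenting; it uses a throwaway log file from mktemp rather than a real path like /var/log/myadmin.log:

```shell
#!/usr/bin/env bash
# Sketch: the daemon-style redirections applied to a short-lived
# stand-in command, so the result is easy to inspect.
logfile=$(mktemp)   # throwaway stand-in for /var/log/myadmin.log

# STDIN closed, STDOUT and STDERR appended to the logfile:
nohup sh -c 'echo "normal output"; echo "error output" >&2' \
    >>"$logfile" 2>&1 <&- &

wait $!          # a real daemon would be left running instead
cat "$logfile"   # both streams ended up in the log
```

Both lines land in the logfile because STDERR was pointed at STDOUT (2>&1) after STDOUT was pointed at the file.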

See Also

  • Chapters 2 and 3 for more on redirecting output and input

10.2 Reusing Code with Includes and Sourcing

Problem

There are a set of shell variable assignments that you would like to have common across a set of scripts that you are writing. You tried putting this configuration information in its own script, but when you run that script from within another script, the values don’t stick; your configuration is running in another shell, and when that shell exits, so do your values. Is there some way to run that configuration script within the current shell?

Solution

Use the bash shell’s source command or POSIX’s single period (.) to read in the contents of that configuration file. The lines of that file will be processed as if encountered in the current script.

Here’s an example of some configuration data:

$ cat myprefs.cfg
SCRATCH_DIR=/var/tmp
IMG_FMT=png
SND_FMT=ogg
$

It is just a simple script consisting of three assignments. Here’s another script, one that will use these values:

# use the user prefs
source $HOME/myprefs.cfg
cd ${SCRATCH_DIR:-/tmp}
echo You prefer $IMG_FMT image files
echo You prefer $SND_FMT sound files

Discussion

The script that is going to use the configuration file uses the source command to read in the file. It can also use a dot (.) in place of the word source. A dot is easy and quick to type, but hard to notice in a script or screenshot:

. $HOME/myprefs.cfg

You wouldn’t be the first person to look right past the dot and think that the script was just being executed.

Sourcing is both a powerful and a dangerous feature of bash scripting. It gives you a way to create a configuration file and then share that file among several scripts. With that mechanism, you can change your configuration by editing one file, not several scripts.

The contents of the configuration file are not limited to simple variable assignment, however. Any valid shell command is legal syntax, because when you source a file like this, it is simply getting its input from a different source; it is still the bash shell processing bash commands. Regardless of what shell commands are in that sourced file—for example, loops or invoking other commands—it is all legitimate shell input and will be run as if it were part of your script.

Here’s a modified configuration file:

$ cat myprefs.cfg
SCRATCH_DIR=/var/tmp
IMG_FMT=$(cat $HOME/myimage.pref)
if [ -e /media/mp3 ]
then
        SND_FMT=mp3
else
        SND_FMT=ogg
fi
echo config file loaded
$

This configuration file is hardly what one thinks of as a passive list of configured variables. It can run other commands (e.g., cat) and use if statements to vary its choices. It even ends by echoing a message. Be careful when you source something, as it’s a wide-open door into your script.
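To see why sourcing is needed at all, compare it with running the same file as a child process; only the sourced version leaves the variables set in the current shell. A minimal sketch, using a throwaway file in place of myprefs.cfg:

```shell
#!/usr/bin/env bash
# Sketch: sourcing vs. executing a configuration file.
cfg=$(mktemp)
echo 'IMG_FMT=png' > "$cfg"

bash "$cfg"                         # runs in a child shell,
echo "after executing: '$IMG_FMT'"  # so IMG_FMT is still empty here

source "$cfg"                       # runs in the current shell,
echo "after sourcing:  '$IMG_FMT'"  # so now IMG_FMT is png
```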

One of the best uses of sourcing scripts comes when you define bash functions (as we will show you in Recipe 10.3). These functions can then be shared as a common library of functions among all the scripts that source the script of function definitions.

10.3 Using Configuration Files in a Script

Problem

You want to use one or more external configuration files for one or more scripts.

Solution

You could write a lot of code to parse some special configuration file format. Do yourself a favor and don’t do that. Just make the config file a shell script and use the solution in Recipe 10.2.

Discussion

This is just a specific application of sourcing a file. However, it’s worth noting that you may need to give a little thought to how you can reduce all of your configuration needs to bash-legal syntax. In particular, you can make use of Boolean flags and optional variables (see Chapter 5 and Recipe 15.11):

# In config file
VERBOSE=0                # 0 or '' for off, 1 for on
SSH_USER='jbagadonutz@'  # Note trailing @, set to '' to use the current user

# In script
[ "$VERBOSE" ] && echo "Verbose msg from $0 goes to STDERR" >&2
[...]
ssh $SSH_USER$REMOTE_HOST [...]

Of course, depending on the user to get the configuration file correct can be chancy, so instead of requiring the user to read the comment and add the trailing @, we could do it in the script:

# If $SSH_USER is set and doesn't have a trailing @ add it:
[ -n "$SSH_USER" -a "$SSH_USER" = "${SSH_USER%@}" ] && SSH_USER="$SSH_USER@"

or just use:

ssh ${SSH_USER:+${SSH_USER}@}${REMOTE_HOST} [...]

to make that same substitution right in place. The bash variable operator :+ will do the following: if $SSH_USER has a value, it will return the value to the right of the :+ (in this case we specified the variable itself along with an extra @); otherwise, if unset or empty, it will return nothing.
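You can watch the :+ operator at work by toggling the variable (the hostname here is made up):

```shell
# Sketch of the :+ substitution; "somehost" is an invented hostname.
SSH_USER='jbagadonutz'
echo "ssh ${SSH_USER:+${SSH_USER}@}somehost"   # ssh jbagadonutz@somehost

SSH_USER=''
echo "ssh ${SSH_USER:+${SSH_USER}@}somehost"   # ssh somehost
```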

10.4 Defining Functions

Problem

There are several places in your shell script where you would like to give the user a usage message (a message describing the proper syntax for the command), but you don’t want to keep repeating the code for the same echo statement. Isn’t there a way to do this just once and have several references to it? If you could make the usage message its own script, then you could just invoke it anywhere in your original script—but that requires two scripts, not one. Besides, it seems odd to have the message for how to use one script be the output of a different script. Isn’t there a better way to do this?

Solution

You need a bash function. At the beginning of your script, put something like this:

function usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" ${0##*/} >&2
}

Then later in your script you can write code like this:

if [ $# -lt 1 ]
then
    usage
fi

Discussion

Functions may be defined in several ways ([ function ] name [()] { compound-command } [ redirections ]). We could write a function definition any of these ways:

function usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" ${0##*/} >&2
}

function usage {
    printf "usage: %s [ -a | -b ] file1 ... filen\n" ${0##*/} >&2
}

usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" ${0##*/} >&2
}

Either the reserved word function or the trailing literal () must be present. If function is used, the () are optional. We like using the word function because it is very clear and readable, and it is easy to grep for; e.g., grep ^'function' script will list the functions in your script file.

This function definition should go at the top of your shell script, or at least somewhere before you need to invoke the function. The definition is, in a sense, just another bash statement. But once it has been executed, the function is defined. If you invoke the function before it is defined you will get a “command not found” error. That’s why we always put our function definitions first, before any other commands in our scripts.

Our function does very little; it is just a printf statement. Because we have this one usage message embodied in a single function, though, if we ever add a new option we don’t need to modify several statements scattered throughout the script, just this one.

The only argument to printf beyond the format string is $0, the name by which the shell script was invoked, modified (with the ## operator) so that only the last part of any pathname is included. This is similar to using $(basename $0).
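The ## operator removes the longest match of the pattern (here, */) from the front of the value, so only the final pathname component remains. The same trick works on any variable, and the % operator trims from the end instead:

```shell
# Sketch: prefix and suffix removal on an invented path.
path=/usr/local/bin/mydaemonscript
echo "${path##*/}"   # mydaemonscript -- like $(basename "$path")
echo "${path%/*}"    # /usr/local/bin -- like $(dirname "$path")
```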

Since the usage message is an error message, we redirect the output of the printf statement to standard error. We could also have put that redirection on the outside of the function definition, so that all output from the function would be redirected. This would be convenient if we had multiple output statements, like this:

function usage ()
{
    printf "usage: %s [ -a | -b ] file1 ... filen\n" ${0##*/}
    printf "example: %s -b *.jpg \n" ${0##*/}
    printf "or else: %s -a myfile.txt yourfile.txt \n" ${0##*/}
} >&2

10.5 Using Functions: Parameters and Return Values

Problem

You want to use a function and you need to get some values into the function. How do you pass in parameters? How do you get values back?

Solution

You don’t put parentheses around the arguments like you might expect from some programming languages. Put any parameters for a bash function right after the function’s name, separated by whitespace, just as if you were invoking any shell script or command. Don’t forget to quote them if necessary!

# define the function:
function max ()
{ ... }
#
# call the function:
#
max   128   $SIM
max  $VAR  $CNT

You have two ways to get values back from a function. First, you can assign values to variables inside the body of your function, as in Example 10-1. Those variables will be global to the whole script unless they are explicitly declared local within the function.

Example 10-1. ch10/func_max.1
# cookbook filename: func_max.1

# define the function:
function max ()
{
    local HIDN
    if [ $1 -gt $2 ]
    then
        BIGR=$1
    else
        BIGR=$2
    fi
    HIDN=5
}

For example:

# call the function:
max 128 $SIM
# use the result:
echo $BIGR

The other way is to use echo or printf to send the output to standard output, as in Example 10-2.

Example 10-2. ch10/func_max.2
# cookbook filename: func_max.2

# define the function:
function max ()
{
    if [ $1 -gt $2 ]
    then
        echo $1
    else
        echo $2
    fi
}

Then you must invoke the function inside a $(), capturing the output and using the result, or it will be wasted on the screen. For example:

# call the function:
BIGR=$(max 128 $SIM)
# use the result
echo $BIGR

Discussion

Putting parameters on the invocation of the function is just like calling any shell script. The parameters are just the other words on the command line.

Within the function, the parameters are referred to as if they were command-line arguments by using $1, $2, etc. However, $0 is left alone. It remains the name by which the entire script was invoked. On return from the function, $1, $2, etc. are back to referring to the parameters with which the script was invoked.

Also of interest is the $FUNCNAME array. $FUNCNAME all by itself references the zeroth element of the array, which is the name of the currently executing function. In other words, $FUNCNAME is to a function as $0 is to a script, except without all the path information. The rest of the array elements are what amounts to a call stack, with “main” as the bottom or last element. This variable only exists while a function is executing.
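A two-level call makes the $FUNCNAME stack visible; this is just a sketch, and the function names are arbitrary:

```shell
#!/usr/bin/env bash
# Sketch: inspecting the $FUNCNAME call stack.
function inner ()
{
    echo "current function: ${FUNCNAME[0]}"   # inner
    echo "called from:      ${FUNCNAME[1]}"   # outer
}

function outer ()
{
    inner
}

outer
```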

We included the useless variable $HIDN just to show that it is local to the function definition. Even though we can assign it values inside the function, any such value would not be available elsewhere in the script. It is a variable whose value is local to that function; it comes into existence when the function is called, and is gone once the function returns.

Returning values by setting variables is more efficient, and can handle lots of data—many variables can be set—but the approach has its drawbacks. Notably, it requires that the function and the rest of the script agree on variable names for the information hand-off. This kind of coupling has maintenance issues. The other approach, using the output as the way to return values, does reduce this coupling, but is limited in its usefulness—it is limited in how much data it can return before your script has to spend lots of effort parsing the results of the function. So which to use? As with much of engineering, this, too, is a trade-off and you have to decide based on your specific needs.

10.6 Trapping Interrupts

Problem

You are writing a script that needs to be able to trap signals and respond accordingly.

Solution

Use the trap utility to set signal handlers. First, use trap -l (or kill -l) to list the signals you may trap. They vary from system to system:

# NetBSD
$ trap -l
 1) SIGHUP    2)   SIGINT      3) SIGQUIT    4) SIGILL
 5) SIGTRAP   6)   SIGABRT     7) SIGEMT     8) SIGFPE
 9) SIGKILL   10)  SIGBUS     11) SIGSEGV   12) SIGSYS
 13) SIGPIPE  14)  SIGALRM    15) SIGTERM   16) SIGURG
 17) SIGSTOP  18)  SIGTSTP    19) SIGCONT   20) SIGCHLD
 21) SIGTTIN  22)  SIGTTOU    23) SIGIO     24) SIGXCPU
 25) SIGXFSZ  26)  SIGVTALRM  27) SIGPROF   28) SIGWINCH
 29) SIGINFO  30)  SIGUSR1    31) SIGUSR2   32) SIGPWR
$
# Linux (re-wrapped to fit on the page)
$ trap -l
 1) SIGHUP        2) SIGINT        3) SIGQUIT       4) SIGILL
 5) SIGTRAP       6) SIGABRT       7) SIGBUS        8) SIGFPE
 9) SIGKILL      10) SIGUSR1      11) SIGSEGV      12) SIGUSR2
13) SIGPIPE      14) SIGALRM      15) SIGTERM      16) SIGSTKFLT
17) SIGCHLD      18) SIGCONT      19) SIGSTOP      20) SIGTSTP
21) SIGTTIN      22) SIGTTOU      23) SIGURG       24) SIGXCPU
25) SIGXFSZ      26) SIGVTALRM    27) SIGPROF      28) SIGWINCH
29) SIGIO        30) SIGPWR       31) SIGSYS       34) SIGRTMIN
35) SIGRTMIN+1   36) SIGRTMIN+2   37) SIGRTMIN+3   38) SIGRTMIN+4
39) SIGRTMIN+5   40) SIGRTMIN+6   41) SIGRTMIN+7   42) SIGRTMIN+8
43) SIGRTMIN+9   44) SIGRTMIN+10  45) SIGRTMIN+11  46) SIGRTMIN+12
47) SIGRTMIN+13  48) SIGRTMIN+14  49) SIGRTMIN+15  50) SIGRTMAX-14
51) SIGRTMAX-13  52) SIGRTMAX-12  53) SIGRTMAX-11  54) SIGRTMAX-10
55) SIGRTMAX-9   56) SIGRTMAX-8   57) SIGRTMAX-7   58) SIGRTMAX-6
59) SIGRTMAX-5   60) SIGRTMAX-4   61) SIGRTMAX-3   62) SIGRTMAX-2
63) SIGRTMAX-1   64) SIGRTMAX
$

Next, set your trap(s) and signal handlers. Note that the exit status of your script will be 128+N if the script was terminated by signal number N. Here is a simple case where we only care that we got a signal and don’t care what it was. If our trap had been trap '' ABRT EXIT HUP INT QUIT TERM, this script would be rather hard to kill because any of those signals would just be ignored:

$ cat hard_to_kill
#!/bin/bash
trap ' echo "You got me! $?" ' ABRT EXIT HUP INT QUIT TERM
trap ' echo "Later... $?"; exit ' USR1
sleep 120

$ ./hard_to_kill
^CYou got me! 130
You got me! 130

$ ./hard_to_kill &
[1] 26354

$ kill -USR1 %1
User defined signal 1
Later... 158
You got me! 0
[1]+ Done                      ./hard_to_kill

$ ./hard_to_kill &
[1] 28180

$ kill %1
You got me! 0
[1]+ Terminated                ./hard_to_kill

Example 10-3 is a more interesting example.

Example 10-3. ch10/hard_to_kill
#!/usr/bin/env bash
# cookbook filename: hard_to_kill

function trapped {
    if [ "$1" = "USR1" ]; then
        echo "Got me with a $1 trap!"
        exit
    else
        echo "Received $1 trap--neener, neener"
    fi
}

trap "trapped ABRT" ABRT
trap "trapped EXIT" EXIT
trap "trapped HUP"  HUP
trap "trapped INT"  INT
trap "trapped KILL" KILL   # This won't actually work
trap "trapped QUIT" QUIT
trap "trapped TERM" TERM
trap "trapped USR1" USR1   # This one is special

# Just hang out and do nothing, without introducing "third-party"
# trap behavior, such as if we used 'sleep'
while (( 1 )); do
    :  # : is a NOOP
done

Here we invoke this example then try to kill it:

$ ./hard_to_kill
^CReceived INT trap--neener, neener
^CReceived INT trap--neener, neener
^CReceived INT trap--neener, neener
^Z
[1]+ Stopped                  ./hard_to_kill

$ kill -TERM %1

[1]+ Stopped                  ./hard_to_kill
Received TERM trap--neener, neener

$ jobs
[1]+ Stopped                  ./hard_to_kill

$ bg
[1]+ ./hard_to_kill &

$ jobs
[1]+ Running                  ./hard_to_kill &

$ kill -TERM %1
Received TERM trap--neener, neener

$ kill -HUP %1
Received HUP trap--neener, neener

$ kill -USR1 %1
Got me with a USR1 trap!
Received EXIT trap--neener, neener

[1]+ Done                     ./hard_to_kill

Discussion

First, we should mention that you can’t actually trap -SIGKILL (-9). That signal kills processes dead immediately, so they have no chance to trap anything—so maybe our examples weren’t really so hard to kill after all. But remember that this signal does not allow the script or program to clean up or shut down gracefully at any time. That’s often a bad thing, so try to avoid using kill -KILL unless you have no other choice.

Usage for trap is as follows:

trap [-lp] [arg] [signal [signal]]

The first nonoption argument to trap is the code to execute when the given signal is received. As shown in the previous examples, that code can be self-contained, or it can be a call to a function. For most nontrivial uses a call to one or more error handling functions is probably best, since that lends itself well to cleanup and graceful termination features. If this argument is the null string, the given signal or signals will be ignored. If the argument is - or missing, but one or more signals are listed, they will be reset to the shell defaults. -l lists the signal names, as shown in the Solution section, while -p will print any current traps and their handlers.
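One of the most common handlers along those lines is an EXIT trap that removes temporary files no matter how the script finishes. A minimal sketch:

```shell
#!/usr/bin/env bash
# Sketch: guaranteed temp-file cleanup using an EXIT trap.
scratch=$(mktemp) || exit 1

trap 'rm -f "$scratch"' EXIT   # fires whenever this shell exits
trap 'exit 1' HUP INT TERM     # route interrupts through the EXIT trap

echo "working data" > "$scratch"
# ... real work here; no explicit rm is needed anywhere below ...
```

Trapping HUP, INT, and TERM to exit is what makes the EXIT trap fire on interrupts too; an untrapped fatal signal would kill the shell before any cleanup ran.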

When using more than one trap handler, we recommend you take the extra time to alphabetize the signal names because that makes them easier to read and find later on.

As noted previously, the exit status of your script will be 128+N if the script was terminated by signal number N.

There are three pseudosignals for various special purposes. The DEBUG signal is similar to EXIT but is used before every command for debugging purposes. The RETURN signal is triggered when execution resumes after a function or source (.) call. And the ERR signal is triggered after a simple command fails. Consult the Bash Reference Manual for more specific details and caveats, especially dealing with functions using the declare builtin or the set -o functrace option.

Tip

There are some POSIX differences that affect trap. As noted in the Bash Reference Manual, “Starting bash with the --posix command-line option or executing 'set -o posix' while Bash is running will cause Bash to conform more closely to the POSIX standard by changing the behavior to match that specified by POSIX in areas where the Bash default differs.” In particular, this will cause kill and trap to display signal names without the leading SIG and the output of kill -l will be different. Also, trap will handle its argument somewhat more strictly; in particular, it will require a leading - in order to reset the trap to the shell default. In other words, it requires trap -USR1, not just trap USR1. We recommend that you always include the - even when not necessary, because it makes your intent clearer in the code.

10.7 Redefining Commands with alias

Problem

You’d like to slightly alter the definition of a command, perhaps so that you always use a particular option on the command (e.g., always using -a on the ls command or -i on the rm command).

Solution

Use the alias feature of bash for interactive shells (only). The alias command is smart enough not to go into an endless loop when you say something like:

alias ls='ls -a'

In fact, just type alias with no other arguments and you can see a list of aliases that are already defined for you in your bash session. Some installations may already have several available for you.

Discussion

The alias mechanism is a straightforward text substitution. It occurs very early in the command-line processing, so other substitutions will occur after the alias. For example, if you want to define the single letter “h” to be the command that lists your home directory, you can do it like this:

alias h='ls $HOME'

or like this:

alias h='ls ~'

The use of single quotes is significant in the first instance, meaning that the variable $HOME will not be evaluated when the definition of the alias is made. Only when you run the command will the (string) substitution be made, and only then will the $HOME variable be evaluated. That way if you change the definition of $HOME the alias will move with it, so to speak.

If, instead, you used double quotes, then the substitution of the variable’s value would be made right away and the alias would be defined with the value of $HOME substituted. You can see this by typing alias with no arguments so that bash lists all the alias definitions. You would see something like this:

...
alias h='ls /home/youracct'
...

If you don’t like what your alias does and want to get rid of it, just use unalias and the name of the alias that you no longer want. For example:

\unalias h

will remove the definition we made earlier. If you get really messed up, you can use unalias -a to remove all the alias definitions in your current shell session. Why did we prefix the previous command with a backslash? The backslash prefix disables alias expansion for any command, so it is standard security best practice to use \unalias just in case some bad actor has aliased unalias, perhaps to “:”, to make it ineffective:

$ alias unalias=':'

$ alias unalias
alias unalias=':'

$ unalias unalias

$ alias unalias
alias unalias=':'

$ \unalias unalias

$ alias unalias
bash: alias: unalias: not found

Aliases do not allow arguments. For example, you cannot do this:

alias mkcd='mkdir $1 && cd $1'

The difference between $1 and $HOME is that $HOME is defined (one way or another) when the alias itself is defined, while you’d expect $1 to be passed in at runtime. Sorry, that doesn’t work. Use a function instead.
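A function does what that alias cannot, because functions take arguments in the normal way (mkcd is a name we made up for illustration):

```shell
# A function accepts arguments where an alias cannot.
function mkcd ()
{
    mkdir -p "$1" && cd "$1"
}
```

Now mkcd projects/new creates the directory (and any missing parents, thanks to -p) and changes into it.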

10.8 Avoiding Aliases and Functions

Problem

You’ve written an alias or function to override a real command, but now you want to execute the real command.

Solution

Use the bash shell’s builtin command to ignore shell functions and aliases and run an actual builtin command.

Use the command command to ignore shell functions and aliases and run an actual external command.

If you only want to avoid alias expansion, but still allow function definitions to be considered, then prefix the command with \ to just prevent alias expansion.

Use the type command (also with -a) to figure out what you’ve got.

Here are some examples:

$ alias echo='echo ~~~'

$ echo test
~~~ test

$ \echo test
test

$ builtin echo test
test

$ type echo
echo is aliased to `echo ~~~'

$ unalias echo

$ type echo
echo is a shell builtin

$ type -a echo
echo is a shell builtin
echo is /bin/echo

$ echo test
test

Here is a function definition that we will discuss:

function cd ()
{
    if [[ $1 = "..." ]]
    then
        builtin cd ../..
    else
        builtin cd "$1"
    fi
}

Discussion

The alias command is smart enough not to go into an endless loop when you say something like alias ls='ls -a' or alias echo='echo ~~~', so in our first example we need to do nothing special on the righthand side of our alias definition to refer to the actual echo command.

When we have echo defined as an alias, the type command will not only tell us that this is an alias, but show us the alias definition. Similarly, with function definitions, we would be shown the actual body of the function. type -a some_command will show us all of the places (aliases, builtins, functions, and external) that contain some_command (as long as we are not also using -p).

In our last example, the function overrides the definition of cd so that we can add a simple shortcut. We want our function to understand that cd ... means to go up two directories; i.e., cd ../.. (see Recipe 16.15). All other arguments will be treated as normal. Our function simply looks for a match with ... and substitutes the real meaning. But how, within (or without) the function, do we invoke the underlying cd command so as to actually change directories? The builtin command tells bash to assume that the command that follows is a shell builtin command and not to use any alias or function definition. We use it within the function, but it can be used at any time to refer, unambiguously, to the actual command, avoiding any function name that might be overriding it.

If your function name is that of an executable, like ls, and not a builtin command, then you can override any alias and/or function definition by just referring to the full path to the executable, such as /bin/ls rather than just ls as the command. If you don’t know its full path, just prefix the command with the keyword command and bash will ignore any alias and function definitions with that name and use the actual command. Please note, however, that the $PATH variable will still be used to determine the location of the command. If you are running the wrong ls because your $PATH has some unexpected values, adding command will not help in that situation.
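Here is a quick sketch of that lookup order in action; a function shadows ls until we sidestep it:

```shell
# Sketch: a function shadowing an external command.
function ls ()
{
    echo "not really ls"
}

ls           # runs the function
command ls   # skips the function; finds the real ls via $PATH
/bin/ls      # skips everything; runs exactly this executable
```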

10.9 Counting Elapsed Time

Problem

You want to display how long a script, or an operation in a script, takes.

Solution

Use the time builtin or the bash variable $SECONDS.

Discussion

time reports the time used by a process or pipeline in a variety of ways:

$ time sleep 4

real	0m4.029s
user	0m0.000s
sys	0m0.000s

$ time sha256sum /bin/* &> /dev/null

real	0m1.252s
user	0m0.072s
sys	0m0.028s

You can use time for commands or functions inside a script, but you can’t time the entire script from inside itself. You can certainly add time to a calling script or cron job, but be aware if you add it to cron that there will always be output, so you will always get a cron email about the run.

If that seems like overkill or you just want to know how long the entire script took, you can use $SECONDS. According to the Bash Reference Manual:

[$SECONDS] expands to the number of seconds since the shell was started. Assignment to this variable resets the count to the value assigned, and the expanded value becomes the value assigned plus the number of seconds since the assignment.

Examples:

$ cat seconds
started="$SECONDS"
sleep 4
echo "Run-time = $(($SECONDS - $started)) seconds..."

$ bash seconds
Run-time = 4 seconds...

$ time bash seconds
Run-time = 4 seconds...

real	0m4.003s
user	0m0.000s
sys	0m0.000s

10.10 Writing Wrappers

Problem

You have a series of related commands or tools that you often need to use in an ad hoc manner, and you want to collect them in one place to make them easier to use and remember.

Solution

Write a shell script “wrapper” using case..esac blocks as needed.

Discussion

There are two basic ways to handle needs like this. One is to write a lot of tiny shell scripts, or perhaps aliases, to handle all the needs. This is the approach taken by BusyBox where a large number of tools are really just symlinks to a single binary. The other is like the majority of revision control tools, where you call a single binary like a “prefix,” then add the action or command. Both approaches have merit, but we tend to prefer the second one because you only have to remember the single prefix command.
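This isn’t the book’s Example 10-4, but here is a toy wrapper in the same single-prefix style; the action names and behaviors are invented for illustration:

```shell
#!/usr/bin/env bash
# Toy "prefix command" wrapper; the actions are made up.
function tool ()
{
    local action=$1
    shift
    case "$action" in
        greet )   # Greet the remaining arguments
            echo "hello $*"
            ;;
        shout )   # Uppercase the remaining arguments
            echo "$*" | tr '[:lower:]' '[:upper:]'
            ;;
        * )       # Anything else (including -h) prints usage
            echo "usage: tool greet|shout args..."
            ;;
    esac
}

tool greet world    # hello world
tool shout quietly  # QUIETLY
```

The comments next to each case pattern double as inline documentation, which is the trick the real tool exploits to generate its usage message.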

There is an excellent discussion of this concept and a more complicated implementation in Signal v. Noise; we encourage you to read about it. Our implementation is a bit simpler, and has some handy tricks. Some of our basic design considerations are as follows:

  • Simple to read and understand

  • Simple to add to

  • Built-in, inline help that’s easy to write

  • Easy to use and remember

We wrote the second edition of this book in Asciidoc, and there is a lot of markup to remember, so here’s an excerpt from a tool we wrote to help us (Example 10-4). This tool can get input from the command line or it can read from and write to the Linux clipboard.

The numbered notes below are the callouts for Example 10-4:

  1. Sanity-check required variables and locations.

  2. Set a more readable name for recursion.

  3. Set a more readable name for the command or action we’re going to take.

  4. Remove that argument from the list so we don’t reuse or include it in the input or output later.

  5. If the xsel command is available and executable, and we passed no other arguments, then set up the input and output to be from and to the clipboard. That turns this script into an application-generic macro tool! No matter what editor you are using, if you have a GUI and read from and write to the clipboard, if you switch to a terminal session you can copy text, process it, and paste it easily, which is a really handy thing to be able to do!

  6. Each block in the case..esac is both the code and the documentation. The number of # characters determines the section, so the code can be in whatever order makes sense, but the help/usage can vary from that.

  7. Take the input text and make a recursive call to get an ID out of that, then output the boilerplate markup.

  8. Note that inside the here-document the indentation must be tabs.

  9. Sometimes the boilerplate markup doesn’t include any input text.

  10. Sometimes the operation is very simple, like just remembering how many equals signs are needed.

  11. Sometimes the operation is a bit more complicated, with embedded newlines and expanded escape characters.

  12. Actions can do anything you can think of and figure out how to automate!

  13. If you don’t provide any arguments, or provide incorrect arguments, even including ones like -h or --help, you get a generated usage message.

  14. We wrap the blocks in a () subshell to get the output in the right order and send it all into the more command. The two egrep commands display our case..esac section lines, as in callout 6, which are both code and documentation, grouped by the count of # characters (one or two).

Tip

Use pbcopy and pbpaste instead of xsel on a Mac.

Example usage:

$ ad
Usage:
    rec|recipe )         # Create the tags for a new recipe
    table )              # Create the tags for a new table
    h1 )                 # Inside chapter heading 1 (really Asciidoc h3)
    h2 )                 # Inside chapter heading 2 (really Asciidoc h4)
    h3 )                 # Inside chapter heading 3 (really Asciidoc h5)
    bul|bullet )         # Bullet list (.. = level 2, + = multiline element)
    nul|number|order* )  # Num./ordered list (## = level 2, + = multiline element)
    term )               # Terms

    cleanup )            ## Clean up all the xHTML/XML/PDF cruft
$

To use ad to create the tags for a new recipe, like this one, you would type out the title, select it, open or flip to a terminal window, type ad rec, flip back to your editor, and paste it in. It’s much easier than it sounds and much faster to do than to describe. The beauty of this kind of script is that it works for all kinds of problems, it’s usually easy to extend, and the usage reminders all but write themselves. We’ve used scripts following this pattern to:

  • Write the second edition of this book

  • Wrap up various SSH commands to do common chores on groups of servers

  • Collect various Debian package system tools, prior to the advent of apt

  • Automate various “cleanup” tasks like trimming whitespace, sorting, and performing various simple text manipulations like stripping out rich-text formatting

  • Automate grep commands to search various specific file types and locations for notes and archived documentation