Chapter 2. Bash Primer

Bash should be thought of as a programming language whose default operation is to launch other programs. Here is a brief look at some of the features that make bash a powerful programming language, especially for scripting.

Output

As with any programming language, bash has the ability to output information to the screen. Output can be achieved using the echo command.

$ echo "Hello World"

Hello World

You may also use the printf command, which allows for some additional formatting. Unlike echo, printf does not add a trailing newline, so include \n in the format string.

$ printf "Hello World\n"

Hello World
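
As a rough illustration of that additional formatting, the sketch below uses two common printf format specifiers. The names and numbers are made up for the example; %s inserts a string, %d inserts an integer, and a number after the % sets a minimum field width (a leading - left-justifies the value).

```shell
# %-8s : string, left-justified in a field 8 characters wide
# %5d  : integer, right-justified in a field 5 characters wide
printf "%-8s|%5d|\n" "alice" 42         # prints: alice   |   42|

# $( ) captures a command's output; here it stores the formatted text
row=$(printf "%-8s|%5d|" "bob" 7)
echo "$row"                             # prints: bob     |    7|
```

The vertical bars are only there to make the field widths visible.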

Variables

Bash variable names begin with an alphabetic character or underscore, followed by any number of alphanumeric characters or underscores. Variables hold strings unless declared otherwise. To assign a value to a variable, write the name, an equals sign, and the value, with no spaces around the equals sign:

MYVAR=textforavalue

To retrieve the value of that variable, for example to print out the value using the echo command, you use the $ in front of the variable name, like this:

echo $MYVAR

If you want to assign a series of words to the variable, that is, to preserve any whitespace, use quotation marks around the value, as in:

MYVAR='here is a longer set of words'
OTHRV="either double or single quotes will work"

The use of double quotes will allow other substitutions to occur inside the string. For example:

firstvar=beginning
secondvr="this is just the $firstvar"
echo $secondvr

This will result in the output: this is just the beginning

There are a variety of substitutions that can occur when retrieving the value of a variable; we will show those as we use them in the scripts to follow.

Warning

Remember that by using double quotes (") any substitutions that begin with the $ will still be made, whereas inside single quotes (') no substitutions of any sort are made.
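
A minimal side-by-side sketch of the two quoting styles, using a made-up variable called name:

```shell
name=world                  # hypothetical variable for this example
double="hello $name"        # double quotes: $name is substituted
single='hello $name'        # single quotes: taken literally

echo "$double"              # prints: hello world
echo "$single"              # prints: hello $name
```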

You can also store the output of a shell command using $( ) as follows:

CMDOUT=$(pwd)

That will execute the command pwd in a sub-shell and, rather than printing the result to stdout, store the output of the command in the variable CMDOUT. You can also pipe together multiple commands within the $( ).
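
Here is a small sketch of a pipeline inside $( ). The variable names UPPER and MSG are chosen just for this example, and tr is a standard utility used here to convert lowercase letters to uppercase.

```shell
# Pipe two commands together inside $( ) and keep the final output
UPPER=$(echo "hello world" | tr 'a-z' 'A-Z')
echo "$UPPER"               # prints: HELLO WORLD

# Command output can also be embedded in a longer double-quoted string
MSG="working in: $(pwd)"
echo "$MSG"
```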

Positional Parameters

It is common when using command-line tools to pass data into the commands using arguments, or parameters. Parameters are separated by whitespace and are accessed inside of bash using a special set of identifiers. In a bash script, the first parameter passed into the script can be accessed using $1, the second using $2, and so on. $0 is a special parameter that holds the name of the script, and $# holds the total number of parameters. Take the following script:

Example 2-1. echoparams.sh

#!/bin/bash -
#
# Rapid Cybersecurity Ops
# echoparams.sh
#
# Description:
# Demonstrates accessing parameters in bash
#
# Usage:
# ./echoparams.sh <param 1> <param 2> <param 3>
#

echo $#
echo $0
echo $1
echo $2
echo $3

This script first prints out the number of parameters ($#), then the name of the script ($0), and then the first three parameters. Here is the output:

$ ./echoparams.sh bash is fun

3
./echoparams.sh
bash
is
fun
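
To experiment with positional parameters without creating a script file, one option is the set builtin: set -- followed by some words replaces the current shell's positional parameters. The sketch below uses made-up arguments and also shows that quoting turns several words into a single parameter.

```shell
# Simulate running a script with two arguments; the quotes make
# "is fun" a single parameter even though it contains a space.
set -- bash "is fun"

echo $#                     # prints: 2
echo "$1"                   # prints: bash
echo "$2"                   # prints: is fun
```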

Input

User input is received in bash using the read command. The read command reads a line from standard input (by default, whatever the user types at the keyboard) and stores it in a specified variable. The script below reads user input into the MYVAR variable and then prints it to the screen.

read MYVAR
echo "$MYVAR"
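
The read command can also fill more than one variable: the line is split on whitespace and the last variable receives whatever is left over. The sketch below is only an illustration; it supplies the input with a here-string (<<<) instead of the keyboard so that it runs without waiting for a user, and the names FIRST and REST are made up. The -r option keeps read from treating backslashes specially.

```shell
# Feed read from a here-string (<<<) instead of the keyboard
read -r FIRST REST <<< "alpha beta gamma"

echo "$FIRST"               # prints: alpha
echo "$REST"                # prints: beta gamma
```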

Conditionals

Bash has a rich variety of conditionals. Many, but not all, begin with the keyword if.

Any command or program that you invoke in bash may do some output but it will also always return a success or fail value. In the shell this value can be found in the $? variable immediately after a command has run. A return value of 0 is considered “success” or “true”; any non-zero value is considered “error” or “false”. The simplest form of the if statement uses this fact. It takes the form:

if cmd
then
   other cmds
fi

For example, the script below attempts to change directories to /tmp. If that command is successful (returns 0) the body of the if statement will execute.

if  cd /tmp
then
    echo "here is what is in /tmp:"
    ls -l
fi

Bash can even handle a pipeline of commands in a similar fashion:

if ls | grep pdf
then
    echo found one or more pdf files here
fi

With a pipeline, it is the success/failure of the last command in the pipeline that determines if the “true” branch is taken. Here is an example where that fact matters:

ls | grep pdf | wc

This series of commands will be “true” even if no pdf string is found by the grep command. That is because the wc command (a word count of the input) will print:

0       0       0

That output indicates 0 lines, 0 words, and 0 bytes when no output comes from the grep command. That is still a successful (or true) result, not an error or failure. It counted as many lines as it was given, even if it was given zero lines to count.
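
The difference can be seen in a small sketch. It assumes the shell's default settings and uses a made-up two-line listing in place of real ls output, so that the result does not depend on the current directory.

```shell
# Sample "directory listing" with no pdf in any name (made-up data)
LISTING=$(printf 'report.txt\nnotes.md\n')

# grep is last in the pipeline: it finds nothing, so the test is false
if echo "$LISTING" | grep pdf
then
    GREP_LAST=true
else
    GREP_LAST=false
fi

# wc is last in the pipeline: it succeeds even with nothing to count
if echo "$LISTING" | grep pdf | wc -l > /dev/null
then
    WC_LAST=true
else
    WC_LAST=false
fi

echo "grep last: $GREP_LAST, wc last: $WC_LAST"
# prints: grep last: false, wc last: true
```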

A more typical form of if used for comparisons makes use of the compound command [[ or the shell built-in commands [ or test. Use these to test file attributes or to make comparisons of value.

To test if a file exists on the file system:

if [[ -e $FILENAME ]]
then
    echo $FILENAME exists
fi

Table 2-1 lists additional tests that can be done on files using if comparisons.

Table 2-1. File Test Operators
File Test Operator	Use
-d	Test if a directory exists
-e	Test if a file exists
-r	Test if a file exists and is readable
-w	Test if a file exists and is writable
-x	Test if a file exists and is executable
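
The following sketch tries a few of these operators. It is an illustration only: it assumes the mktemp command is available to create a scratch directory, and the names WORKDIR, sample.txt, and missing.txt are made up.

```shell
# Scratch directory and file for the demonstration (removed at the end)
WORKDIR=$(mktemp -d)
touch "$WORKDIR/sample.txt"

if [[ -d $WORKDIR ]]
then
    DIRCHECK="directory"
fi

if [[ -r $WORKDIR/sample.txt ]]
then
    READCHECK="readable"
fi

if [[ -e $WORKDIR/missing.txt ]]
then
    MISSCHECK="exists"
else
    MISSCHECK="absent"
fi

echo "$DIRCHECK $READCHECK $MISSCHECK"   # prints: directory readable absent
rm -r "$WORKDIR"
```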

To test if the variable $VAL is less than the variable $MIN:

if [[ $VAL -lt $MIN ]]
then
    echo "value is too small"
fi

Table 2-2 lists additional numeric tests that can be done using if comparisons.

Table 2-2. Numeric Test Operators
Numeric Test Operator	Use
-eq	Test for equality between numbers
-gt	Test if one number is greater than another
-lt	Test if one number is less than another

Warning

Be cautious of using the < symbol. Take the following code:

if [[ $VAL < $OTHR ]]

This operator is less-than, but in this context it uses lexical (alphabetical) ordering. That means that 12 is less than 2, since they sort alphabetically in that order (just as a < b, so 1 < 2, but also 12 < 2anything).

If you want to do numerical comparisons with the less-than sign, use the double parentheses construct. It assumes that the variables are all numerical and will evaluate them as such. Empty or unset variables are evaluated as 0. Inside the parentheses you don’t need the $ operator to retrieve a value, except for positional parameters like $1 and $2 (so as not to confuse them with the constants 1 and 2). For example:

if (( VAL < 12 ))
then
    echo "value $VAL is too small"
fi
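
To see the two kinds of comparison disagree, here is a short sketch that runs both on the same pair of values. The result strings are only labels for this example.

```shell
VAL=12
OTHR=2

# [[ < ]] compares as strings: "12" sorts before "2"
if [[ $VAL < $OTHR ]]
then
    LEXICAL="12 is less than 2"
else
    LEXICAL="12 is not less than 2"
fi

# (( < )) compares as numbers
if (( VAL < OTHR ))
then
    NUMERIC="12 is less than 2"
else
    NUMERIC="12 is not less than 2"
fi

echo "lexical: $LEXICAL"    # prints: lexical: 12 is less than 2
echo "numeric: $NUMERIC"    # prints: numeric: 12 is not less than 2
```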

In bash you can even make branching decisions without an explicit if/then construct. Commands are typically separated by a newline - that is, they appear one per line. You can get the same effect by separating them with a semicolon. If you write cd $DIR ; ls then bash will perform the cd and then the ls.

Two commands can also be separated by either && or || symbols. If you write cd $DIR && ls then the ls command will run only if the cd command succeeds. Similarly if you write cd $DIR || echo cd failed the message will be printed only if the cd fails.

You can use the [[ syntax to make various tests, even without an explicit if.

[[ -d $DIR ]] && ls "$DIR"

means the same as if you had written

if [[ -d $DIR ]]
then
  ls "$DIR"
fi

Warning

When using && or || you will need to group multiple statements if you want more than one action within the “then” clause. For example:

[[ -d $DIR ]] || echo "error: no such directory: $DIR" ; exit

will always exit, whether or not $DIR is a directory.

What you probably want is this:

[[ -d $DIR ]] || { echo "error: no such directory: $DIR" ; exit ; }

where the braces will group both statements together.
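
A sketch of the same grouping that can be tried at a prompt without closing your shell: it wraps the test in a function (functions are described later in this chapter) and uses return 1 in place of exit. The function name require_dir is hypothetical.

```shell
# Hypothetical helper: report an error and stop unless $1 is a directory
function require_dir ()
{
    [[ -d $1 ]] || { echo "error: no such directory: $1" ; return 1 ; }
    echo "ok: $1"
}

require_dir /                          # prints: ok: /
require_dir /no/such/dir || echo "the check failed"
```

The second call prints the error message from inside the braces and then, because the function returned a non-zero status, the message after the ||.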

Looping

Looping with a while statement is similar to the if construct in that it can take a single command or a pipeline of commands for the decision of true or false. It can also make use of the brackets or parentheses as in the if examples, above.

In some languages braces ( { } ) are used to group the statements that make up the body of the while loop. In others, like Python, indentation indicates which statements are the loop body. In bash, however, the statements are grouped between two keywords: do and done.

Here is a simple while loop:

i=0
while (( i < 1000 ))
do
    echo $i
    let i++
done

The loop above will execute while the variable i is less than 1000. Each time the body of the loop executes it will print the value of i to the screen. It then uses the let command to execute i++ as an arithmetic expression, thus incrementing i by 1 each time.
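
The same pattern can accumulate a result. This sketch, with a made-up variable named total, adds up the numbers 1 through 10:

```shell
i=1
total=0
while (( i <= 10 ))
do
    (( total += i ))        # add the current value of i to the running total
    let i++
done
echo $total                 # prints: 55
```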

Here is a more complicated while loop that executes commands as part of its condition.

while ls | grep -q pdf
do
    echo -n 'there is a file with pdf in its name here: '
    pwd
    cd ..
done

A for loop is also available in bash - in three variations.

Simple numerical looping can be done using the double parentheses construct. It looks much like the for loop in C or Java, but with double parentheses and with do and done instead of braces:

for ((i=0; i < 100; i++))
do
    echo $i
done

Another useful form of the for loop is used to iterate through all the parameters that are passed to a shell script (or function within the script), that is, $1, $2, $3 and so on. Note that ARG in args.sh can be replaced with any variable name of your choice.

for ARG
do
    echo here is an argument: $ARG
done

Here is the output of args.sh when three parameters are passed in.

$ ./args.sh bash is fun

here is an argument: bash
here is an argument: is
here is an argument: fun

Finally, for an arbitrary list of values, use a similar form of the for statement simply naming each of the values you want for each iteration of the loop. That list can be explicitly written out, like this:

for VAL in 20 3 dog peach 7 vanilla
do
    echo $VAL
done

The values used in the for loop can also be generated by calling other programs or using other shell features:

for VAL in $(ls | grep pdf) {0..5}
do
    echo $VAL
done

Here the variable VAL will take, in turn, the value for each of the filenames that ls piped into grep finds with the letters pdf in its filename (e.g. “doc.pdf” or “notapdfile.txt”) and then each of the numbers 0 through 5. It may not be that sensible to have the variable VAL be a filename sometimes and a single digit another time, but this shows you that it can be done.
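
Since the output of ls depends on where you run it, the sketch below substitutes printf with three made-up filenames so the result is predictable. The variable joined is also just for illustration; it collects each value of VAL in brackets.

```shell
joined=""
# printf stands in for ls here so the "filenames" are predictable
for VAL in $(printf 'doc.pdf\nnotes.txt\nnotapdfile.txt\n' | grep pdf) {0..2}
do
    joined="${joined}[${VAL}]"
done
echo "$joined"              # prints: [doc.pdf][notapdfile.txt][0][1][2]
```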

Functions

Define a function with syntax like this:

function myfun ()
{
  # body of the function goes here
}

Not all that syntax is necessary - you can use either "function" or "()" - you don’t need both. We recommend, and will be using, both - mostly for readability.

There are a few important considerations to keep in mind with bash functions:

Unless declared with the local builtin command inside the function, variables are global in scope. A for loop which sets and increments i could be messing with the value of i used elsewhere in your code.
The braces are the most commonly used grouping for the function body, but any of the shell’s compound command syntax is allowed - though why, e.g., would you want the function to run in a sub-shell?
Redirecting I/O on the braces does so for all the statements inside the function. Examples of this will be seen in upcoming chapters.
No parameters are declared in the function definition. Whatever and however many arguments are supplied on the invocation of the function are passed to it.
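
The first consideration, variable scope, is the one most likely to cause surprises. Here is a minimal sketch with two hypothetical functions that differ only in whether the loop variable i is declared local:

```shell
i=100                       # a global variable used by the "main" script

function count_global ()
{
    for ((i=0; i < 3; i++))
    do
        :                   # loop body does nothing; only i matters here
    done
}

function count_local ()
{
    local i                 # this i exists only inside the function
    for ((i=0; i < 3; i++))
    do
        :
    done
}

count_local
AFTER_LOCAL=$i              # still 100: the global i was not touched
count_global
AFTER_GLOBAL=$i             # now 3: the function overwrote the global i
echo "$AFTER_LOCAL $AFTER_GLOBAL"       # prints: 100 3
```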

The function is called (invoked) just like any command is called in the shell. Having defined myfun as a function you can call it like this:

myfun 2 /arb "14 years"

which calls the function myfun supplying it with 3 arguments.

Function Arguments

Inside the function definition, arguments are referred to in the same way as parameters to the shell script — as $1, $2, etc. Realize that this means that they “hide” the parameters originally passed to the script. If you want access to the script’s first parameter, you need to store $1 into a variable before you call the function (or pass it as a parameter to the function).

Other variables are set accordingly, too. $# gives the number of arguments passed to the function, whereas normally it gives the number of arguments passed to the script itself. The one exception to this is $0 - it doesn’t change in the function. It retains its value as the name of the script (and not of the function).
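
A short sketch of this behavior follows. The function showargs is hypothetical, and set -- is used only to pretend that the script itself was given one parameter.

```shell
function showargs ()
{
    echo "count=$# first=$1 third=$3"
}

set -- script-param                      # pretend the script got one parameter
RESULT=$(showargs 2 /arb "14 years")     # inside the function, $1 is 2
echo "$RESULT"                           # prints: count=3 first=2 third=14 years
echo "$1"                                # prints: script-param
```

Once the function returns, $1 again refers to the script's own parameter.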

Returning Values

Functions, like commands, should return a status: 0 if all goes well and a non-zero value if some error has occurred. To return other kinds of values (pathnames or computed values, for example) you can either set a variable to hold the value, since variables are global unless declared local within the function, or send the result to stdout, that is, print the answer. Just don’t try to do both.

Warning

If you print the answer you’ll typically use that output as part of a pipeline of commands (e.g., myfunc args | next step | etc ) or you’ll capture the output like this: RESVAL=$( myfunc args ). In both cases the function will be run in a sub-shell and not in the current shell. Thus changes to any global variables will only be effective in that sub-shell and not in the main shell instance. They are effectively lost.
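
The lost update is easy to demonstrate. In this sketch the hypothetical function bump does both things at once, changing a global variable and printing a result, which is exactly the combination to avoid:

```shell
COUNTER=0

function bump ()
{
    COUNTER=$(( COUNTER + 1 ))   # change a global variable...
    echo "bumped"                # ...and also print a result
}

bump > /dev/null            # runs in the current shell: COUNTER becomes 1
OUT=$(bump)                 # runs in a sub-shell: the increment is lost
echo "$OUT $COUNTER"        # prints: bumped 1
```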

Pattern Matching in bash

When you need to name a lot of files on a command line, you don’t need to type each and every name. Bash provides pattern matching (sometimes called “wildcarding”) to allow you to specify a set of files with a pattern. The easiest one is simply an asterisk * (or “star”) which will match any number of any characters. When used by itself, therefore, it matches all files in the current directory. The asterisk can be used in conjunction with other characters. For example *.txt matches all the files in the current directory which end with the four characters .txt. The pattern /usr/bin/g* will match all the files in /usr/bin that begin with the letter g.

Another special character in pattern matching is ? the question mark, which matches a single character. For example, source.? will match source.c or source.o but not source.py or source.cpp.

The last of the three special pattern matching characters are [ ], the square brackets. A match can be made with any one of the characters listed inside the square brackets, so the pattern x[abc]y matches any or all of the files named xay, xby, or xcy, assuming they exist. You can specify a range within the square brackets, like [0-9] for all digits. If the first character within the brackets is either a ! or a ^ then the pattern means anything other than the remaining characters in the brackets. For example, [aeiou] would match a vowel whereas [^aeiou] would match any character except the vowels (including digits and punctuation characters).

Similar to ranges, you can specify character classes within square brackets. Table 2-3 lists the character classes and their descriptions.

Table 2-3. Pattern Matching Character Classes
Character Class	Description
`[:alnum:]`	Alphanumeric
`[:alpha:]`	Alphabetic
`[:ascii:]`	ASCII
`[:blank:]`	Space and Tab
`[:cntrl:]`	Control Characters
`[:digit:]`	Number
`[:graph:]`	Anything Other Than Control Characters and Space
`[:lower:]`	Lowercase
`[:print:]`	Anything Other Than Control Characters
`[:punct:]`	Punctuation
`[:space:]`	Whitespace Including Line Breaks
`[:upper:]`	Uppercase
`[:word:]`	Letters, Numbers, and Underscore
`[:xdigit:]`	Hexadecimal

Character classes are specified like this: [:cntrl:] within square brackets (so you have two sets of []). For example, this pattern: *[[:punct:]]jpg will match any filename that has any number of any characters followed by a punctuation character followed by the letters jpg. So it would match files named wow!jpg or some,jpg or photo.jpg but not a file named this.is.myjpg since there is no punctuation character right before the jpg.
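
The patterns described so far can be tried safely in a scratch directory. The sketch below is illustrative only: it assumes the mktemp command is available, and the filenames and variable names are made up.

```shell
# Scratch directory with a few empty files (removed at the end)
WORKDIR=$(mktemp -d)
touch "$WORKDIR/xay" "$WORKDIR/xby" "$WORKDIR/xzy" \
      "$WORKDIR/source.c" "$WORKDIR/source.py" \
      "$WORKDIR/photo.jpg" "$WORKDIR/myjpg"

# Each pattern is expanded inside the scratch directory; the $( )
# sub-shell keeps the cd from affecting the rest of the script.
BRACKETS=$(cd "$WORKDIR" && echo x[abc]y)
QUESTION=$(cd "$WORKDIR" && echo source.?)
CLASS=$(cd "$WORKDIR" && echo *[[:punct:]]jpg)

echo "$BRACKETS"            # prints: xay xby
echo "$QUESTION"            # prints: source.c
echo "$CLASS"               # prints: photo.jpg
rm -r "$WORKDIR"
```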

There are more complex aspects of pattern matching if you turn on the shell option extglob (like this: shopt -s extglob) so that you can repeat patterns or negate patterns. We won’t need these in our example scripts but we encourage you to learn about them (e.g., via the bash man page).

There are a few things to keep in mind when using shell pattern matching:

Patterns aren’t regular expressions (discussed later); don’t confuse the two.
Patterns are matched against files in the file system; if the pattern begins with a pathname (e.g., /usr/lib ) then the matching will be done against files in that directory.
If no pattern is matched, the shell will use the special pattern matching characters as literal characters of the filename; for example, if your script says echo data > /tmp/*.out but there is no file in /tmp that ends in .out then the shell will create a file called *.out in the /tmp directory. Remove it like this: rm /tmp/\*.out by using the backslash to tell the shell not to pattern match with the asterisk.
No pattern matching occurs inside of quotes (either double or single quotes), so if your script says echo data > "/tmp/*.out" it will create a file called /tmp/*.out (which we recommend you avoid doing).

Note

The dot, or period, is just an ordinary character and has no special meaning in shell pattern matching - unlike in regular expressions which will be discussed later.

Writing Your First Script - Detecting Operating System Type

Now that we have gone over the fundamentals of the command line and bash you are ready to write your first script. The bash shell is available on a variety of platforms including Linux, Windows, macOS, and Git Bash. As you write more complex scripts in the future it is imperative that you know what operating system you are interacting with as each one has a slightly different set of commands available. The osdetect.sh script helps you in making that determination.

The general idea of the script is that it will look for a command that is unique to a particular operating system. The limitation is that on any given system an administrator may have created and added a command with that name, so this is not foolproof.

Example 2-2. osdetect.sh

#!/bin/bash -
#
# Rapid Cybersecurity Ops
# osdetect.sh
#
# Description:
# Distinguish between MS-Windows/Linux/macOS
#
# Usage:
# Output will be one of: Linux MSWin macOS
#

if type -t wevtutil &> /dev/null           
then
    OS=MSWin
elif type -t scutil &> /dev/null           
then
    OS=macOS
else
    OS=Linux
fi
echo $OS

In the first test, we use the type built-in in bash to tell us what kind of a command (alias, keyword, function, built-in, or file) its arguments are. The -t option tells it to print nothing if the command isn’t found. The command returns as “false” in that case. We redirect all the output (both stdout and stderr) to /dev/null thereby throwing it away, as we only want to know if the wevtutil command was found.
In the second test, we again use the type built-in, but this time we are looking for the scutil command, which is available on macOS systems.
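
To get a feel for type -t before relying on it, here is a small sketch that can be run on any system with bash. The command name no_such_command_xyz is made up and is assumed not to exist.

```shell
KIND_ECHO=$(type -t echo)   # echo is built into bash
KIND_IF=$(type -t if)       # if is a shell keyword
echo "$KIND_ECHO $KIND_IF"  # prints: builtin keyword

# A made-up name that should not exist as a command on any system
if type -t no_such_command_xyz &> /dev/null
then
    FOUND=yes
else
    FOUND=no
fi
echo "$FOUND"               # prints: no
```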

Summary

The bash shell can be seen as a programming language, one with variables and if/then/else statements, loops, and functions. It has its own syntax, similar in many ways to other programming languages, but just different enough to catch you if you’re not careful.

It has its strengths - like easily invoking other programs or connecting sequences of other programs - and it has its weaknesses: it doesn’t have floating point arithmetic or much support (though some) for complex data structures.

In the chapters ahead we will describe and use many bash features and OS commands in the context of cybersecurity operations. We will further explore some of the features we have touched on here, and other more advanced or obscure features. Keep your eyes out for those features, and practice using them in your own scripting.

Exercises

Experiment with the uname command, seeing what it prints on the various operating systems. Re-write the osdetect.sh script to use the uname command, possibly with one of its options. Caution: not all options are available on every operating system.
Modify the osdetect.sh script to use a function. Put the if/then/else logic inside the function and then call it from the script. Don’t have the function itself do any output. Make the output come from the main part of the script.
Set the permissions on the osdetect.sh script to be executable (see man chmod) so that you can run the script without using bash as the first word on the command line. How do you now invoke the script?
Write a script called argcnt.sh that tells how many arguments are supplied to the script.
1. Modify your script to have it also echo each argument one per line.
2. Modify your script further to label each argument like this:
```
$ bash argcnt.sh this is a "real live" test
there are 5 arguments
arg1: this
arg2: is
arg3: a
arg4: real live
arg5: test
$
```
Modify argcnt.sh so it only lists the even arguments.
