"sh" — The Unix Shell Scripting Language
A summary of practical techniques by Mike McCarthy, 2002
Scripts are text files that contain a set of instructions that can be
executed by the shell. While scripts can be written in various
languages (such as perl, awk, php, etc.), this guide explains most of
the common features of the Unix Bourne shell scripting language.
All comments last for a single line and are preceded by the '#' character.
The first line of a shell script must contain the special comment
#!/bin/sh which tells the kernel that this is a shell script whose
interpreter resides in the bin directory. If this were a perl script,
the first line comment would be #!/bin/perl. All other comments
simply begin with the '#' followed by plain text.
# this is a comment
Unlike C/C++ and other languages, the shell script
instruction set is not compiled into object code. It remains a plain
text file which, when executed, can invoke various Unix software
tools. Each tool (or command) is itself a miniature program which
generally "does one thing well". To make a shell script executable,
the chmod command must be used to set its executable privilege.
For example, chmod +x myscript, or chmod go+x myscript.
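As a minimal sketch of the steps above, the following writes a tiny script, sets its executable privilege, and runs it. The temporary directory and the use of 'mktemp -d' are assumptions made only to keep the example self-contained; 'myscript' is the illustrative name already used above.

```shell
#!/bin/sh
# Work in a throwaway directory (assumption: mktemp -d is available,
# which is true on most modern systems).
dir=`mktemp -d`
script="$dir/myscript"

# Write a two-line script: the interpreter comment, then one command.
cat > "$script" <<'EOF'
#!/bin/sh
echo "hello from myscript"
EOF

# A newly created file is not executable until chmod sets the privilege.
chmod +x "$script"

# The -x test reports whether the file may be executed.
if [ -x "$script" ] ; then
    is_executable=yes
else
    is_executable=no
fi
echo "executable: $is_executable"

# Run it by path, as you would from the command line.
"$script" || echo "could not run $script"

rm -r "$dir"
```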
Shell programming language
While a script could be nothing more
than a simple list of Unix commands executed one-by-one, the shell
programming language offers a rich set of instructions to enable
flow control, variable assignments, and conditional tests. The
language can only interpret character strings, not binary values such
as integers, floats, etc. If a numeric value is needed, the 'expr'
command can be used to temporarily treat a given character or string as a
numeric value (similar to casting in C/C++).
Variables are symbols that can store string values.
They can be used anywhere in the script and do not need to be
declared, although they are sometimes presented in a variable list
with initially assigned values. As the name 'variable' suggests,
the value of a variable can be assigned, and reassigned. Here are
some examples...
|LIST=""||The symbol 'LIST' is assigned a null string|
|EXIST=true||The symbol 'EXIST' is assigned the string value "true"|
|PROCESS="NO"||The symbol 'PROCESS' is assigned the string value "NO"|
|ERROR1="command not found"||The string "command not found" is assigned to 'ERROR1'|
The value contained in a variable can be accessed by preceding the
variable's name with a '$' character. For example, to print the value
stored in 'ERROR1', the statement...
- echo $ERROR1
...will send "command not found" to stdout. Note: the echo command
automatically appends a new-line character at the end of the string.
In cases where you do not want the new-line character, use printf like this...
- printf "File already exists, overwrite? [y/n]: "
Use the '$' character anytime you need to output the value of a
variable, but do not use it when assigning a value.
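The assignment and access rules above can be sketched in a few lines. The variable names GREETING and EMPTY are illustrative assumptions, not names used elsewhere in this guide.

```shell
#!/bin/sh
# Assign without a '$'; no spaces are allowed around the '=' sign.
GREETING="hello world"
EMPTY=

# Read the value back by prefixing the name with '$'.
echo $GREETING

# Reassign the same variable, reusing its previous value.
GREETING="$GREETING again"
echo $GREETING

# printf adds a new-line only when asked to with \n.
printf "value: "
printf "%s\n" "$GREETING"
```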
To control the flow of a program, the 'test' command can be
used to check the value of a variable or condition. There are two
ways to use 'test': 1) by using the word "test", or 2) by using
its more readable alternate form ' [ '. If the square bracket method is
used, the test condition must be followed by a closing ' ] '
bracket, and each bracket must be surrounded by whitespace. Anytime a
numeric operator (such as -gt) is used in a test, the values being
tested are temporarily interpreted as numbers. If no numeric operator
is present the test will be a string comparison. For example...
- if [ $VAL -gt 0 ]
...tests a numeric value, while...
- if [ $MYVAR = "OK" ]
...tests a string.
The conditional operators are...
(Note the leading hyphen where present)
- ' = '
"is equal to" for strings -- the whitespace on either side
of the '=' sign is required. The whitespace characters
used here differentiate the condition from an assignment like
STR="Exit" which contains no spaces around the '=' sign.
- ' -eq '
"is equal to" for numeric values, the numeric counterpart of
the ' = ' sign described above.
- ' -lt '
"is less than", like '<' in C/C++.
- ' -gt '
"is greater than", like '>' in C/C++.
- ' -le '
"is less than or equal to", like '<=' in C/C++.
- ' -ge '
"is greater than or equal to", like '>=' in C/C++.
- ' -ne '
"is not equal to" for numeric values, like ' != ' in C/C++. ( ' != '
can also be used to compare strings, as long as whitespace is inserted
after the first operand and before the second operand as with the
' = ' sign described above).
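The difference between a string comparison and a numeric comparison can be seen in a small sketch. The values 007 and 7 and the result variable names are assumptions chosen for illustration.

```shell
#!/bin/sh
A=007
B=7

# '=' compares the characters, so "007" and "7" are different strings.
if [ "$A" = "$B" ] ; then
    STRING_RESULT=same
else
    STRING_RESULT=different
fi

# '-eq' compares the numeric values, so 007 and 7 are equal.
if [ "$A" -eq "$B" ] ; then
    NUMERIC_RESULT=same
else
    NUMERIC_RESULT=different
fi

# '-lt' orders by numeric value: 9 is less than 10.
if [ 9 -lt 10 ] ; then
    ORDER=less
else
    ORDER=not_less
fi

echo "string: $STRING_RESULT, numeric: $NUMERIC_RESULT, order: $ORDER"
```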
All shell variables are strings. To perform a
math operation on a string variable, its value must be
converted to a number with the 'expr' command substitution
function. For example...
|echo $count||will print the string '10' (given that 'count' was assigned 10)|
|count=`expr $count + 2`||(uses 'back tick' quote marks, see below)|
|echo $count||will print the string '12'|
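A slightly longer sketch of 'expr' arithmetic is a counting loop. The variable names count, total, and product are illustrative assumptions. Note that the '*' operator has to be escaped so the shell does not expand it as a filename wildcard.

```shell
#!/bin/sh
count=0
total=0

# Add the numbers 1 through 5, incrementing 'count' with expr each pass.
while [ $count -lt 5 ] ; do
    count=`expr $count + 1`
    total=`expr $total + $count`
done
echo "count=$count total=$total"

# Multiplication: the backslash keeps the shell from expanding '*'.
product=`expr 6 \* 7`
echo "product=$product"
```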
The value of a variable can contain the output of
a command using the backward single quote character (located to
the left of the number '1' on standard keyboards). When a command
is enclosed within these backward quotes
(often called "back ticks"), its output can be trapped and
assigned to a variable. For example, to set a string variable
to the current date...
- date=`date`
The value of 'date' will now be the string output by
the Unix 'date' command. Therefore...
- echo $date
...will display the same string as would result by typing 'date'
at the command line, but here the string is trapped in a variable.
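Back ticks can capture the output of any command, including a pipeline. The following is a small sketch; the variable names NOW and UPPER and the use of 'tr' are assumptions for illustration.

```shell
#!/bin/sh
# Capture the output of a single command.
NOW=`date`
echo "The date is: $NOW"

# Capture the output of a pipeline: tr translates lower case to upper case.
UPPER=`echo hello | tr 'a-z' 'A-Z'`
echo $UPPER
```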
Various control mechanisms can be used to control
the flow of a shell program. As in most languages, there are
methods for testing, looping and iterating. The syntax of each
control mechanism is shown in the examples below...
if [ $VAR = "OK" ]
then
    statement(s)
elif [ $VAR = "NOT_OK" ]
then
    statement(s)
fi
(Note: to end an 'if' block, close it with 'fi' -- "if" spelled backward.)
case $1 in
    -p ) statement(s) ;; (each case is terminated by a double semicolon)
    -c ) statement(s) ;;
    * ) statement(s) ;; (the 'default' case)
esac (Note: to end a 'case' block, close it with 'esac' -- "case" spelled backward)
while test $var != "done" ; do
    statement(s)
done
for element in $LIST
do
    echo $element (example statement)
done
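The control mechanisms can be combined. The sketch below loops over a list with 'for' and classifies each element with 'case'. The meanings given to -p and -c here ("print" and "count") are assumptions made purely for illustration.

```shell
#!/bin/sh
LIST="-p -c -x"
SUMMARY=""

for option in $LIST
do
    case $option in
        -p ) word=print ;;
        -c ) word=count ;;
        * )  word=unknown ;;   # anything not matched above
    esac
    # Append the result to a growing comma-terminated string.
    SUMMARY="$SUMMARY$word,"
done

echo $SUMMARY
```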
Statements are normally placed one-per-line but can be
included on the same line as long as there is a semi-colon
terminator. For example, the syntax for an 'if' statement might
be written like this...
if test $cond = "true" ; then
...instead of like this...
if test $cond = "true"
then (either method works, it's a matter of choice).
Similarly, when using a while loop, the structure might be written like this...
while [ $num -gt 0 ] ; do
...or like this...
while [ $num -gt 0 ]
do (either method works, it's a matter of choice).
Remember that 'test' can be expressed with the word "test" or
with the '[' ...']' bracket characters.
Since shell programming works mainly with
strings, there are specific uses of quotation marks. String
values must be enclosed in quotes if they contain any
whitespace, like this...
echo "this sentence contains white space"
Strings that do not contain any whitespace do not need the quotes,
but it is often a good idea to use them anyway.
The example case
block above tests for the specific cases -p and -c. Since those
symbols are strings they might also be written "-p" and "-c". It
is sometimes useful to prefix or append an arbitrary character to
a string when testing for a null string, like this...
if test "x$word" = "x" ; then
...In this example, if the value of 'word' is null, the left-hand
string will be equal to the single character 'x', since 'word' was
prefixed (concatenated) with a single 'x'.
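A short sketch of both quoting points follows: the 'x' prefix test on a null string, and the effect of quotes on embedded whitespace. The variable names state, phrase, unquoted, and quoted are assumptions for illustration.

```shell
#!/bin/sh
word=""

# With the 'x' prefix, both sides of the test are never empty,
# so the comparison is well formed even when 'word' is null.
if test "x$word" = "x" ; then
    state=empty
else
    state=set
fi
echo "word is $state"

# A value containing two consecutive spaces.
phrase="two  words"

# Unquoted: the shell splits the value into tokens and echo
# rejoins them with single spaces.
unquoted=`echo $phrase`

# Quoted: the value is passed as one string, spacing intact.
quoted=`echo "$phrase"`

echo "unquoted: [$unquoted]"
echo "quoted:   [$quoted]"
```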
When strings do not contain any whitespace, the statement...
echo okay
...would have the same result as...
echo "okay"
Input / Output
The commands 'echo' and 'read' are used to send
strings to stdout, and get strings from stdin respectively. When
using the 'read' command, do not prefix the variable with a '$'.
If a variable 'val' will be set by the user, do this...
read val
...not this...
read $val
While there are other I/O commands, 'echo' and 'read' are the
standard methods, similar to 'cout <<' and 'cin >>' in C++ or
'printf' and 'scanf' in C.
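The following sketch shows 'read' filling one variable and then several. So that it can run without a keyboard, the input is supplied by a here-document (the '<<EOF' construct), which is an assumption of this example; in an interactive script the user would type the line instead. The names first and rest are also illustrative.

```shell
#!/bin/sh
# Read one line into a single variable (no '$' on the name).
read val <<EOF
blue
EOF
echo "You entered: $val"

# With several names, the first word goes to the first variable
# and the remainder of the line goes to the last one.
read first rest <<EOF
one two three
EOF
echo "first=$first rest=$rest"
```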
In addition to programmer-defined variables,
a script can use a set of variables automatically inherited from
the parent process, along with special variables that represent
certain values unique to this (child) process.
|$?||contains the exit value returned by the last executed command|
|$$||contains the process ID number of the shell|
|$#||contains the number of command line arguments sent to the script|
|$*||contains the current argument list as individual tokens. The construct "$*" glues the argument list into a single string.|
|$1, $2, $3, ...etc.||contain the individual argument strings sent from the command line.|
Here is an example of how the $1, $2, ... variables work...
If a script is called like this:
myscript these are the args
then within the script, the special variables $1, $2, $3, and $4 will contain the following values:
$1 - "these"
$2 - "are"
$3 - "the"
$4 - "args"
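The same values can be inspected without calling a script from the command line. In the sketch below, 'set --' is used to load the argument list by hand, which is an assumption made to keep the example self-contained; the variable names are illustrative.

```shell
#!/bin/sh
# Simulate the command line: myscript these are the args
set -- these are the args

count=$#        # number of arguments
first=$1        # first argument
fourth=$4       # fourth argument
joined="$*"     # the whole list glued into a single string

echo "count=$count first=$first fourth=$fourth"
echo "joined=$joined"
echo "this shell's process ID is $$"
```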
The only numeric values available for command line arguments are 0 - 9. $0 contains the name of the script, while $1, $2, ... contain the argument strings. If more than nine arguments are entered at the command line and you need to use all of them, the values must be shifted to the left with the keyword 'shift' which reassigns $1 to what formerly was $2, and reassigns $2 to what formerly was $3, ... etc.
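As a sketch of reaching an argument beyond the ninth, the loop below shifts through eleven arguments and remembers the tenth. The argument list is loaded with 'set --' and the names position and tenth are assumptions for illustration.

```shell
#!/bin/sh
# Eleven arguments, two more than $1 ... $9 can address directly.
set -- a b c d e f g h i j k

position=0
tenth=""

while [ $# -gt 0 ] ; do
    position=`expr $position + 1`
    # After nine shifts, the tenth argument has moved into $1.
    if [ $position -eq 10 ] ; then
        tenth=$1
    fi
    shift
done

remaining=$#
echo "saw $position arguments, the tenth was '$tenth'"
```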
Here is an example...
if the script body contained...
while [ $# -gt 0 ] ; do
    echo $1
    shift
done
...and the command line entered was:
myscript one two three
the output would be...
one
two
three
The '$?' variable is useful for testing the exit value of the last-executed command. The 'exit value' of a Unix process is similar to the 'return value' of a C/C++ function or program. The value '0' (zero) generally means "success" and '1' generally means failure, although other values can be used. Exit values can be used within scripts, or simply from the shell.
For example, if you enter...
date
...at the command line, the Unix date command will execute normally.
If you then type...
echo $?
...an exit value of 0 will display. Since scripts can call other scripts, as well as standard Unix commands, it is often useful to examine exit values for success or failure.
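Inside a script, exit values can be saved and tested. The sketch below is one way to do it; the directory path /no/such/directory is assumed not to exist, and the variable names are illustrative.

```shell
#!/bin/sh
# A test that succeeds: the root directory always exists.
test -d /
root_status=$?

# A test that fails: capture its exit value on the failure branch.
test -d /no/such/directory || missing_status=$?

# Values other than 0 and 1 can be used; a subshell exits with 3 here.
( exit 3 ) || custom_status=$?

echo "root=$root_status missing=$missing_status custom=$custom_status"

# Act on a saved exit value later in the script.
if [ $missing_status -ne 0 ] ; then
    message="directory not found"
fi
echo $message
```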
Creating arrays of strings by concatenation
Just as command line arguments are an array of strings that can be used within a script, other arrays of strings can be created and used as well. This is done by concatenation of whitespace-separated tokens.
FIELDS="Name Address Phone"
for element in $FIELDS ; do
    printf "$element: "
    read value (the user's entry for this field)
    RECORD="$RECORD $value"
done
echo $RECORD
...would interactively collect one data record, concatenate it into a single string, and then display it.
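Elements of such a list can also be counted and picked out by position. One way, sketched below, is to load the list into the argument variables with 'set --'. This idiom and the names LIST, item, count, and second are assumptions of the example, and it works only when no element contains whitespace.

```shell
#!/bin/sh
# Build the list one token at a time by concatenation.
LIST=""
for item in apple banana cherry ; do
    LIST="$LIST $item"
done

# Load the tokens into $1, $2, ... so they can be counted and indexed.
set -- $LIST
count=$#
second=$2

echo "the list has $count elements, the second is $second"
```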