The Unix Shell: Writing Shell Scripts¶
Shell commands constitute a programming language, and command-line programs known as shell scripts can be written to perform complex tasks.
This section provides only a brief overview. Shell scripts have many traps and pitfalls for the unwary, and for complex tasks we generally prefer languages such as Python or R, which have more consistent syntax. However, shell scripts are used extensively in domains such as the preprocessing of genomics data, so shell scripting is a useful tool to know about.
Pipe error hack¶
In [ ]:
cleanup () {
:
}
trap "cleanup" SIGPIPE
Warn on uninitialized variable¶
You will typically want to set this. I will leave it unset for this session to show what can happen if you don’t.
set -u
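As a rough sketch of what this option changes, the snippet below compares the same expansion with and without it. The name DEMO_UNSET_VAR is a hypothetical variable assumed not to exist, and everything runs in subshells so the current session's settings are not touched.

```shell
# DEMO_UNSET_VAR is a hypothetical name, assumed to be unset.
unset DEMO_UNSET_VAR

# Without -u, an unset variable silently expands to an empty string.
WITHOUT_U=$( set +u; echo "[${DEMO_UNSET_VAR}]" )
echo "Without -u: ${WITHOUT_U}"

# With -u, the same expansion is an error and the subshell fails.
if ( set -u; : "${DEMO_UNSET_VAR}" ) 2>/dev/null; then
    WITH_U="ok"
else
    WITH_U="error"
fi
echo "With -u: ${WITH_U}"

# Supplying a default with ${VAR:-default} is still allowed under -u.
WITH_DEFAULT=$( set -u; echo "[${DEMO_UNSET_VAR:-fallback}]" )
echo "With default: ${WITH_DEFAULT}"
```

This should print `[]`, `error` and `[fallback]` respectively.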
Assigning variables¶
We assign variables using = and recall them by using $. It is customary to spell shell variable names in ALL_CAPS.
In [ ]:
NAME='Joe'
echo "Hello $NAME"
echo "Hello ${NAME}"
Single and double quotes¶
The main difference between single quotes ('') and double quotes ("") is that variable expansion only occurs inside double quotes. For plain text, they are equivalent.
In [ ]:
echo '${NAME}'
In [ ]:
echo "${NAME}"
Use of curly braces¶
Use of curly braces unambiguously specifies the variable of interest. I suggest you always use them as a defensive programming technique.
In [ ]:
echo "Hello ${NAME}l"
$NAMEl is not defined, and so expands to an empty string!
In [ ]:
echo "Hello $NAMEl"
One of the quirks of shell scripts is already present - there cannot be spaces before or after the = in an assignment.
In [ ]:
NAME2= 'Joe'
echo "Hello ${NAME2}"
The previous instruction assigns the empty string to NAME2 (only for the duration of that command), then tries to execute Joe as a command.
In [ ]:
NAME3 ='Joe'
echo "Hello ${NAME3}"
The previous instruction runs the command NAME3 with =Joe as its argument.
Assigning commands to variables¶
In [ ]:
pwd
In [ ]:
CUR_DIR=$(pwd)
# Quote the expansion so that paths containing spaces stay in one piece
dirname "${CUR_DIR}"
basename "${CUR_DIR}"
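The same $(...) syntax captures the output of any command or pipeline, and substitutions can be nested. The following is a small sketch; the path /tmp/my project/data is a made-up example and does not need to exist, since dirname and basename only manipulate the string.

```shell
# Capture the output of a pipeline in a variable.
UPPER=$(echo "hello" | tr 'a-z' 'A-Z')
echo "Upper case: ${UPPER}"

# Substitutions can be nested: take the name of the parent directory.
# The path is hypothetical and is never accessed on disk.
PARENT_NAME=$(basename "$(dirname "/tmp/my project/data")")
echo "Parent directory: ${PARENT_NAME}"
```

This should print `HELLO` and `my project`. The quotes around each substitution keep the space in the path from splitting it into two arguments.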
Working with numbers¶
Careful: Note the use of double parentheses to trigger evaluation of a mathematical expression.
In [5]:
NUM=0
((NUM++))
echo $NUM
1
In [ ]:
NUM=$((1+2+3+4))
echo ${NUM}
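One thing to keep in mind is that bash arithmetic works on integers only, so division drops the fractional part. A minimal sketch, with A and B as arbitrary example values:

```shell
A=7
B=2

# Integer division: the fractional part is discarded.
QUOTIENT=$((A / B))
# The remainder is available through the modulo operator.
REMAINDER=$((A % B))
# Exponentiation uses **.
POWER=$((A ** B))

echo "${A} / ${B} = ${QUOTIENT} remainder ${REMAINDER}"
echo "${A} ** ${B} = ${POWER}"
```

This should report a quotient of 3 with remainder 1, and a power of 49. Inside the double parentheses, variable names can be written without the leading $.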
Branching¶
Using if to check for file existence¶
Note that the test condition is written in square brackets, with a space after [ and before ].
In [ ]:
if [ -f hello.txt ]; then
cat hello.txt
else
echo "No such file"
fi
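The branch can be extended with elif to tell files and directories apart. The sketch below wraps the check in a helper function, describe_path, which is a name made up for this example, and uses a temporary directory so it does not depend on any existing files.

```shell
# describe_path is a hypothetical helper used only for illustration.
describe_path () {
    if [ -f "$1" ]; then
        echo "regular file"
    elif [ -d "$1" ]; then
        echo "directory"
    else
        echo "missing"
    fi
}

# Create a scratch directory and file so the example is self-contained.
SCRATCH_DIR=$(mktemp -d)
touch "${SCRATCH_DIR}/hello.txt"

describe_path "${SCRATCH_DIR}/hello.txt"
describe_path "${SCRATCH_DIR}"
describe_path "${SCRATCH_DIR}/no_such_file.txt"

# Clean up the scratch directory.
rm -r "${SCRATCH_DIR}"
```

The three calls should print `regular file`, `directory` and `missing` in that order.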
Downloading remote files¶
In [31]:
man wget | head -n 20
WGET(1) GNU Wget WGET(1)
NAME
Wget - The non-interactive network downloader.
SYNOPSIS
wget [option]... [URL]...
DESCRIPTION
GNU Wget is a free utility for non-interactive download of files from
the Web. It supports HTTP, HTTPS, and FTP protocols, as well as
retrieval through HTTP proxies.
Wget is non-interactive, meaning that it can work in the background,
while the user is not logged on. This allows you to start a retrieval
and disconnect from the system, letting Wget finish the work. By
contrast, most of the Web browsers require constant user's presence,
which can be a great hindrance when transferring a lot of data.
Wget can follow links in HTML, XHTML, and CSS pages, to create local
Some useful wget flags¶
- -O redirects to named path
- -O- redirects to standard output (so it can be piped)
- -q suppresses messages
In [ ]:
if [ ! -f "data/forbes.csv" ]; then
wget https://vincentarelbundock.github.io/Rdatasets/csv/HSAUR/Forbes2000.csv \
-O data/forbes.csv
fi
In [ ]:
wget -qO- https://vincentarelbundock.github.io/Rdatasets/doc/HSAUR/Forbes2000.html \
| html2text | head -n 27 | tail -n 17
Conditional evaluation with test¶
The [ -f hello.txt ] syntax is equivalent to test -f hello.txt, where test is a shell command with a large range of operators and flags that you can view in the man page.
TEST(1) BSD General Commands Manual TEST(1)
NAME
test, [ -- condition evaluation utility
SYNOPSIS
test expression
[ expression ]
DESCRIPTION
The test utility evaluates the expression and, if it evaluates to true,
returns a zero (true) exit status; otherwise it returns 1 (false). If
there is no expression, test also returns 1 (false).
All operators and flags are separate arguments to the test utility.
The following primaries are used to construct expression:
-b file True if file exists and is a block special file.
-c file True if file exists and is a character special file.
-d file True if file exists and is a directory.
-e file True if file exists (regardless of type).
-f file True if file exists and is a regular file.
-g file True if file exists and its set group ID flag is set.
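Since test reports its result only through the exit status, you can inspect that status directly with $?. This is a small sketch; the path /no/such/directory is assumed not to exist on your system.

```shell
# Both spellings run the same check and return 0 (true) for an existing directory.
test -d /
STATUS_TEST=$?
[ -d / ]
STATUS_BRACKET=$?

# For a path that does not exist, the exit status is 1 (false).
# /no/such/directory is a hypothetical path assumed to be absent.
STATUS_MISSING=0
[ -d /no/such/directory ] || STATUS_MISSING=$?

echo "test -d /           -> ${STATUS_TEST}"
echo "[ -d / ]            -> ${STATUS_BRACKET}"
echo "[ -d /no/such/... ] -> ${STATUS_MISSING}"
```

The first two statuses should be 0 and the last should be 1. An if statement simply branches on this exit status.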
Looping¶
For loop¶
In [ ]:
# Let the shell expand the glob directly; parsing ls output breaks on names with spaces
for FILE in *ipynb; do
    echo "${FILE}"
done
Traditional C-style for loop¶
We make use of the double parentheses to evaluate the counter arithmetic.
In [2]:
for ((i=0; i<5; i++)); do
echo $i
done
0
1
2
3
4
Double square brackets provide enhanced test functionality¶
- You can use && instead of -a
- You can use || instead of -o
- You can use =~ to match regular expression patterns
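To make the first two points concrete, here is a sketch that writes the same range check with double and with single brackets. VALUE is just an example number.

```shell
VALUE=7

# Double brackets: combine conditions with &&.
if [[ ${VALUE} -gt 0 && ${VALUE} -lt 10 ]]; then
    DOUBLE_RESULT="in range"
else
    DOUBLE_RESULT="out of range"
fi

# Single brackets: the same check written with -a.
if [ "${VALUE}" -gt 0 -a "${VALUE}" -lt 10 ]; then
    SINGLE_RESULT="in range"
else
    SINGLE_RESULT="out of range"
fi

# || takes the place of -o in the same way.
if [[ ${VALUE} -lt 0 || ${VALUE} -gt 100 ]]; then
    EXTREME="yes"
else
    EXTREME="no"
fi

echo "Double brackets: ${DOUBLE_RESULT}"
echo "Single brackets: ${SINGLE_RESULT}"
echo "Extreme value: ${EXTREME}"
```

Both range checks should report `in range`, and the last check should report `no`.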
Example of regular expression matching in test condition¶
Note that the string should be quoted and the regular expression should not be quoted.
In [30]:
for FILE in *; do
    if [[ "${FILE}" =~ ^Bash.*sh ]]; then
        echo "${FILE}"
    fi
done
Bash_Exercise_1_Solutions.sh
Bash_Exercise_2_Solutoins.sh
While loop¶
In [ ]:
COUNTER=10
while [ $COUNTER -gt 0 ]; do
echo $COUNTER
COUNTER=$(($COUNTER - 1))
done
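Careful: Note that < and > are redirection operators, so they cannot be used for comparison inside single square brackets. For example, [ $COUNTER > 0 ] writes to a file named 0 and is always true, which leads to an infinite loop. To compare integers, use -lt for less than, -gt for greater than, -eq for equality and -ne for inequality. The == and != operators compare strings, which also works in the loop below because the counter reaches exactly 0.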
In [ ]:
COUNTER=10
while [ $COUNTER != 0 ]; do
echo $COUNTER
COUNTER=$(($COUNTER - 1))
done
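The difference between numeric and string comparison matters as soon as two values are numerically equal but spelled differently. A minimal sketch, using 05 and 5 as example values:

```shell
A="05"
B="5"

# -eq compares the values as integers, so 05 and 5 are equal.
if [ "${A}" -eq "${B}" ]; then
    NUMERIC_EQUAL="yes"
else
    NUMERIC_EQUAL="no"
fi

# == compares the values as strings, so "05" and "5" differ.
if [ "${A}" == "${B}" ]; then
    STRING_EQUAL="yes"
else
    STRING_EQUAL="no"
fi

echo "Equal as numbers: ${NUMERIC_EQUAL}"
echo "Equal as strings: ${STRING_EQUAL}"
```

This should print `yes` for the numeric comparison and `no` for the string comparison.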
Brace expansion¶
Brace expansions create lists of strings, typically used in loops.
In [1]:
for NUM in {000..005}; do
echo mkdir EXPT-${NUM}
done
mkdir EXPT-000
mkdir EXPT-001
mkdir EXPT-002
mkdir EXPT-003
mkdir EXPT-004
mkdir EXPT-005
But brace expansions can be used outside loops as well.
In [38]:
echo foo.{c,cpp,h,hp}
foo.c foo.cpp foo.h foo.hp
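Brace expansions can also be combined with each other and with surrounding text. One pitfall worth seeing once: brace expansion happens before variable expansion, so a range with a variable endpoint is not expanded. The names below are arbitrary examples.

```shell
# Two brace lists side by side give every combination.
PAIRS=$(echo {a,b}{1,2})
echo "${PAIRS}"

# A range can sit in the middle of a longer word.
FILES=$(echo img_{1..3}.png)
echo "${FILES}"

# A variable inside a range is NOT expanded as a sequence,
# because braces are processed before $N is substituted.
N=3
LITERAL=$(echo {1..$N})
echo "${LITERAL}"
```

The last line should print the literal text `{1..3}` rather than `1 2 3`. When the endpoint is a variable, a C-style for loop is one way around this.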
Shell script¶
From now on, we will write the shell script using an editor for convenience.
A shell script is traditionally given the extension .sh. There are a few things to note:
- To make the script standalone, you need to add #!/path/to/shell in the first line. Otherwise you need to call the script with bash /path/to/script instead of just /path/to/script.
- To make the script executable, change the file permissions to executable with chmod +x /path/to/script
- Shell arguments are similar to function arguments - i.e. $1, $2, $@ etc. Another useful variable is $# which gives the number of command line arguments.
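The three points above can be sketched together in a few lines. This is only an illustration: the file name pattern demo_args.XXXXXX is made up, the script is created in the current directory and removed afterwards, and the shebang assumes bash lives at /bin/bash.

```shell
# Create a throwaway script file in the current directory (hypothetical name).
SCRIPT=$(mktemp ./demo_args.XXXXXX)

# The quoted 'EOF' keeps $# and $1 literal so they end up in the script.
cat <<'EOF' > "${SCRIPT}"
#!/bin/bash
# Report how many arguments were given and what the first one is.
echo "Number of arguments: $#"
echo "First argument: ${1:-none}"
EOF

# Make it executable, then call it directly with and without arguments.
chmod +x "${SCRIPT}"
WITH_ARGS=$("${SCRIPT}" alpha beta)
NO_ARGS=$("${SCRIPT}")

echo "${WITH_ARGS}"
echo "${NO_ARGS}"

# Remove the throwaway script.
rm "${SCRIPT}"
```

With two arguments the script should report a count of 2 and `alpha` as the first argument; with none it should report 0 and `none`.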
In [ ]:
which bash
In [40]:
cat scripts/my_first_script.sh
In [ ]:
chmod +x scripts/my_first_script.sh
In [ ]:
scripts/my_first_script.sh
Passing arguments to shell scripts¶
In [52]:
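# Quote 'EOF' so that $# and $ARG are written to the script literally
# instead of being expanded by the current shell
cat <<'EOF' > my_second_script.sh
#!/bin/bash
# Print the number of arguments, then each argument on its own line
echo $#
for ARG in "$@"; do
    echo "$ARG"
done
EOF
In [53]:
chmod +x my_second_script.sh
In [54]:
./my_second_script.sh 1 2 3
3
1
2
3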
In [51]:
which bash
/bin/bash