The Unix Shell: Working with Text

Text editors

Usually we create text using a text editor. Standard text editors are vi and emacs, although you can also use an alternative from this list. I personally use Atom.

vi

Regardless of which text editor you choose as your primary tool, it is essential to have at least some experience using vi, because it is available on ALL Unix systems and may be the only editor available on a remote server. Work through this tutorial.

emacs

Most Unix systems will also have emacs installed, so it is worth trying out the built-in emacs tutorial as well. Start Emacs (emacs) and type C-h t, that is, Ctrl-h followed by t, to access the tutorial.

Note that people who love vi often hate emacs and vice versa ;-)

Text streams

The input and output of most Unix shell programs consist of plain text streams. Text output from one program can be piped into another program, or redirected to other streams. The standard streams are stdin (0, standard input), stdout (1, standard output) and stderr (2, standard error). By default, a program reads its input from stdin and writes its output to stdout. We can also stream to and from a file.
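
As a minimal sketch of streaming from a file, the < operator connects a file to a program's stdin. The file name numbers.txt and the scratch directory are hypothetical, created only for this illustration.

```shell
# Work in a scratch directory so that no existing files are touched
tmpdir=$(mktemp -d)

# Write three lines to a (hypothetical) file
printf '3\n1\n2\n' > "$tmpdir/numbers.txt"

# sort reads the file through stdin (file descriptor 0)
# instead of being given a file name as an argument
sorted=$(sort -n < "$tmpdir/numbers.txt")
echo "$sorted"
```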

Pipes and redirection

Creating a text file from command line

Sometimes using a text editor is overkill. For simple file creation, we can just use redirection.

A single > will create a new file or overwrite an existing one.

In [1]:
echo "1 Hello, bash" > hello.txt
In [2]:
ls *txt
hello.txt       stderr.txt      stdout.txt

Appending

In [3]:
echo "2 Hello, again" >> hello.txt
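
To make the difference between > and >> concrete, here is a small sketch that uses both on the same file. The file name demo.txt is hypothetical and lives in a scratch directory.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/demo.txt"    # hypothetical file used only for this illustration

echo "first" > "$f"     # > creates the file
echo "second" >> "$f"   # >> appends a second line
echo "third" > "$f"     # > overwrites: both earlier lines are lost

cat "$f"
```

Only the last line survives, so it is worth double-checking which operator you typed before redirecting into a file you care about.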

Special non-printing characters

In [4]:
echo -e "3 Hello\n4 again" >> hello.txt
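
Whether echo accepts -e and interprets backslash escapes varies between shells, so printf is often suggested as a more predictable alternative. The sketch below assumes nothing beyond a POSIX shell, and the variable names are only for illustration.

```shell
# printf interprets \n in its format string without needing a flag
out=$(printf '3 Hello\n4 again\n')
echo "$out"

# \t inserts a tab character between the two fields
tabbed=$(printf 'a\tb\n')
echo "$tabbed"
```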

From file to stdout

In [5]:
cat hello.txt
1 Hello, bash
2 Hello, again
3 Hello
4 again

Here docs

We can also create multi-line text streams with here docs. A here doc is started with << followed by an arbitrary delimiter string, and it ends at the first line that contains only that delimiter.

In [22]:
cat > test1.txt << EOF
One, two buckle my shoe
Three, four lock the door
EOF
In [20]:
cat test1.txt
One, two buckle my shoe
Three, four lock the door
In [23]:
tr aeiou xxxxx << SOME_DELIMITER
One, two buckle my shoe
Three, four lock the door
SOME_DELIMITER
Onx, twx bxcklx my shxx
Thrxx, fxxr lxck thx dxxr
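
One detail worth knowing about here docs is that the shell expands variables inside them unless the delimiter is quoted. The sketch below illustrates this with a hypothetical variable called name.

```shell
name="bash"   # hypothetical variable used only for this illustration

# Unquoted delimiter: $name is expanded by the shell
expanded=$(cat << EOF
Hello, $name
EOF
)

# Quoted delimiter: the text is passed through literally
literal=$(cat << 'EOF'
Hello, $name
EOF
)

echo "$expanded"
echo "$literal"
```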

Pipe to the cut program to extract character columns 2 to 5

In [6]:
cat hello.txt | cut -c 2-5
 Hel
 Hel
 Hel
 aga
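
Besides character positions, cut can also select delimiter-separated fields. As a small sketch, the following feeds in two lines like those in hello.txt and picks the second space-separated field.

```shell
# -d sets the field delimiter (a single space here)
# -f selects which fields to keep (here: the 2nd field)
second=$(printf '1 Hello, bash\n2 Hello, again\n' | cut -d ' ' -f 2)
echo "$second"
```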

Building a chain of pipes

wc -lc reports the number of lines and bytes (for plain English text, the byte count usually equals the character count).

Note that the count is 5 per line and not 4, because each line that cut writes ends with a newline character, which wc counts as well.

In [7]:
cat hello.txt | cut -c 2-5 | wc -lc
       4      20
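
The effect of the trailing newline on the byte count can be checked directly. In this sketch, tr -d ' ' only strips the padding that some versions of wc put in front of the number.

```shell
# Four characters and no trailing newline: 4 bytes
without=$(printf 'abcd' | wc -c | tr -d ' ')

# The same four characters plus a newline: 5 bytes
with=$(printf 'abcd\n' | wc -c | tr -d ' ')

echo "$without $with"
```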

Capturing error messages

The redirection operator > is shorthand for 1>, that is, it redirects stdout. We can also use 2> to redirect stderr. In bash, &> redirects both stdout and stderr, which is useful if, for example, you want to send all output to the same log file for later inspection.

In [8]:
mkdir foo/bar/baz > 'stdout.txt' | cat
mkdir: foo/bar: No such file or directory

As there is nothing from stdout, the file is empty.

In [9]:
cat 'stdout.txt'

We need to use 2> to capture the output from stderr

In [10]:
mkdir foo/bar/baz 2> 'stderr.txt' | cat
In [11]:
cat 'stderr.txt'
mkdir: foo/bar: No such file or directory
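
To send both streams to the same file, one option that also works outside bash is > file 2>&1. The sketch below uses a hypothetical log file in a scratch directory and a command group that writes one line to each stream.

```shell
tmpdir=$(mktemp -d)
log="$tmpdir/all.log"   # hypothetical log file name

# The group writes one line to stdout and one line to stderr.
# > sends stdout to the log; 2>&1 then points stderr at the same place.
{ echo "to stdout"; echo "to stderr" >&2; } > "$log" 2>&1

cat "$log"
```

The order of the two redirections matters: writing 2>&1 before > "$log" would leave stderr attached to the terminal.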

Example - getting the 2nd and 3rd lines of hello.txt

In [12]:
cat hello.txt | head -n 3 | tail -n 2
2 Hello, again
3 Hello
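
The same lines can also be selected with a single sed command instead of a head and tail chain. This is only an alternative sketch; it recreates the contents of hello.txt in a scratch directory so that it can be run on its own.

```shell
tmpdir=$(mktemp -d)
printf '1 Hello, bash\n2 Hello, again\n3 Hello\n4 again\n' > "$tmpdir/hello.txt"

# -n suppresses sed's default printing of every line;
# 2,3p prints only lines 2 through 3
lines=$(sed -n '2,3p' "$tmpdir/hello.txt")
echo "$lines"
```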

Character substitution with tr (transliteration)

Switch case.

In [20]:
echo "This is Duke" | tr a-zA-Z A-Za-z
tHIS IS dUKE
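
tr also understands named character classes, which can be easier to read than explicit ranges. As a small sketch, this converts everything to upper case rather than swapping case.

```shell
# [:lower:] and [:upper:] are POSIX character classes;
# the quotes stop the shell from treating the brackets as a glob
upper=$(echo "This is Duke" | tr '[:lower:]' '[:upper:]')
echo "$upper"
```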

Find reverse complement of DNA string.

In [15]:
echo 'GATTACA' | tr ACTG TGAC | rev
TGTAATC

Caesar cipher encoding and decoding

In [21]:
echo "This is Duke" | tr a-zA-Z c-zabC-ZAB
Vjku ku Fwmg
In [22]:
echo "Vjku ku Fwmg" | tr c-zabC-ZAB a-zA-Z
This is Duke
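
A shift of 13 is a special case of the Caesar cipher, commonly called ROT13: because the alphabet has 26 letters, the same tr command both encodes and decodes. The sketch below applies it twice to get back the original text.

```shell
# Shift every letter by 13 places, wrapping around the alphabet
encoded=$(echo "This is Duke" | tr 'A-Za-z' 'N-ZA-Mn-za-m')
echo "$encoded"

# Applying the same mapping again restores the input
decoded=$(echo "$encoded" | tr 'A-Za-z' 'N-ZA-Mn-za-m')
echo "$decoded"
```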

Clean up

In [1]:
rm *txt