The Unix Shell: Working with Text

Text streams

Most Unix shell programs read and write plain text streams. The text output of one program can be piped into another program, or redirected to a file or another stream. The three standard streams are stdin (standard input, file descriptor 0), stdout (standard output, 1) and stderr (standard error, 2). By default, a program reads its input from stdin and writes its output to stdout. We can also stream to and from a file.
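As a minimal sketch of streaming to and from a file, the commands below feed a file into stdin with < and send stdout to a file with >. The file names streams_demo.txt and streams_sorted.txt are made up for this illustration.

```shell
# Hypothetical scratch file, used only for this illustration
printf 'b\na\nc\n' > streams_demo.txt

# sort reads stdin (fed from the file with <) and writes to stdout
sort < streams_demo.txt

# The same stdout can be sent to a file instead of the terminal
sort < streams_demo.txt > streams_sorted.txt
cat streams_sorted.txt
```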

Pipes and redirection

Creating a text file from the command line

Sometimes using a text editor is overkill. For simple file creation, we can just use redirection.

A single > creates a new file or overwrites an existing one.

In [1]:
echo "1 Hello, bash" > hello.txt
In [2]:
ls *txt
hello.txt       stderr.txt      stdout.txt      vietnam.txt

Appending

A double >> appends to the end of an existing file instead of overwriting it.

In [3]:
echo "2 Hello, again" >> hello.txt

Special non-printing characters

With the -e flag, bash's echo interprets backslash escapes such as \n (newline).

In [4]:
echo -e "3 Hello\n4 again" >> hello.txt
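The -e flag is not part of POSIX echo, and some shells print it literally instead of treating it as an option. A sketch of a more portable alternative uses printf, which always interprets escapes in its format string. The file name printf_demo.txt is hypothetical and kept separate from hello.txt so the examples that follow are unaffected.

```shell
# printf_demo.txt is a hypothetical scratch file, separate from hello.txt
# Unlike echo, printf adds no trailing newline, so we write \n explicitly
printf '3 Hello\n4 again\n' > printf_demo.txt
cat printf_demo.txt
```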

From file to stdout

In [5]:
cat hello.txt
1 Hello, bash
2 Hello, again
3 Hello
4 again

Pipe the output to the cut program to extract character columns 2 through 5

In [6]:
cat hello.txt | cut -c 2-5
 Hel
 Hel
 Hel
 aga

Building a chain of pipes

wc -lc reports the number of lines and bytes (a byte usually corresponds to one character in English text).

Note that the count is 5 per line and not 4 because each line that cut outputs still ends with a newline character, which wc counts as well.

Extract columns 2-5 and then count the number of lines and characters.

In [7]:
cat hello.txt | cut -c 2-5 | wc -lc
       4      20
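To check where the fifth byte comes from, the sketch below counts the bytes of a single cut line with and without its trailing newline. The variable names are chosen here for illustration only.

```shell
# Count bytes of one line as cut emits it (4 characters plus the newline)
with_newline=$(echo "1 Hello, bash" | cut -c 2-5 | wc -c)

# Strip the newline with tr -d before counting
without_newline=$(echo "1 Hello, bash" | cut -c 2-5 | tr -d '\n' | wc -c)

# Left unquoted on purpose so any padding that wc adds is dropped
echo with newline: $with_newline, without: $without_newline
```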

Extract lines 2-3

In [8]:
cat hello.txt | head -n 3 | tail -n 2
2 Hello, again
3 Hello
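The head and tail chain is one way to select a range of lines. As an alternative sketch, sed can print a line range directly. The four lines of hello.txt are recreated in a variable here so the snippet is self-contained, and the variable names are made up for this example.

```shell
# Recreate the four lines of hello.txt so the sketch stands on its own
lines=$(printf '1 Hello, bash\n2 Hello, again\n3 Hello\n4 again\n')

# -n suppresses sed's default printing; '2,3p' prints only lines 2 through 3
middle=$(printf '%s\n' "$lines" | sed -n '2,3p')
printf '%s\n' "$middle"
```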

Capturing error messages

The redirection operator > is shorthand for 1>, that is, it redirects stdout. We can also use 2> to redirect stderr. In bash, &> redirects both stdout and stderr, which is useful if, for example, you want to send all output to the same log file for later inspection.

In [9]:
mkdir foo/bar/baz > 'stdout.txt'
mkdir: foo/bar: No such file or directory

As nothing was written to stdout, the file is empty.

In [10]:
cat 'stdout.txt'

We need to use 2> to capture the output from stderr.

In [11]:
mkdir foo/bar/baz 2> 'stderr.txt'
cat 'stderr.txt'
mkdir: foo/bar: No such file or directory
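Sending both streams to one log file can be sketched as follows. The helper emit and the file name all_output.log are hypothetical, introduced only for this illustration. The example uses the portable spelling > file 2>&1, which has the same effect as the bash shorthand &> file.

```shell
# emit is a hypothetical helper that writes one line to each stream
emit() { echo "to stdout"; echo "to stderr" >&2; }

# Send stdout to the file, then point stderr (2) at wherever stdout (1) now goes
emit > all_output.log 2>&1

# In bash, the shorthand  emit &> all_output.log  does the same thing
cat all_output.log
```

The order matters: 2>&1 must come after > all_output.log, otherwise stderr is pointed at the terminal before stdout is moved to the file.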

Character substitution with tr (transliteration)

Switch case.

In [12]:
echo "gattaca" | tr a-z A-Z
GATTACA

Find the reverse complement of a DNA string.

In [13]:
echo 'GATTACA' | tr ACTG TGAC | rev
TGTAATC
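The same pipeline can be wrapped in a small shell function so that it reads from stdin like any other filter. This is only a sketch: revcomp is a hypothetical name, and handling lowercase input is an added assumption rather than something the example above does.

```shell
# revcomp is a hypothetical helper: complement each base, then reverse the line
revcomp() {
    tr ACGTacgt TGCAtgca | rev
}

echo 'GATTACA' | revcomp
echo 'gattaca' | revcomp
```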