The Unix Shell: Exercises with Solutions

Exercise 1

List all directories in the parent directory whose names contain the string ‘PM’.

In [1]:
ls -d ../*PM*
../Wk1_Day2_PM  ../Wk3_Day1_PM  ../Wk3_Day3_PM  ../Wk4_Day2_PM
../Wk1_Day3_PM  ../Wk3_Day2_PM  ../Wk3_Day4_PM  ../Wk4_Day3_PM
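
A plain glob such as ../*PM* matches any entry whose name contains PM, regular files included. The sketch below shows two ways to restrict the match to directories. It runs in a scratch directory created with mktemp, and the entry names (Wk1_Day2_PM, Wk1_Day2_AM, notes_PM.txt, work) are made up for illustration. The -maxdepth option is not part of POSIX, but both GNU and BSD find support it.

```shell
# Throwaway parent directory with hypothetical entries (illustration only)
parent=$(mktemp -d)
mkdir "$parent/Wk1_Day2_PM" "$parent/Wk1_Day2_AM" "$parent/work"
touch "$parent/notes_PM.txt"     # a regular file whose name also contains PM

# A trailing slash makes the glob match directories only
dirs_only=$(cd "$parent/work" && ls -d ../*PM*/)
echo "$dirs_only"

# find expresses the same filter with explicit tests
found=$(cd "$parent/work" && find .. -maxdepth 1 -type d -name '*PM*')
echo "$found"

rm -rf "$parent"
```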

Exercise 2

  • Create the folder foo/bar/baz
In [2]:
mkdir -p foo/bar/baz
  • Create a file containing ‘Hello world’ named hello.txt in the foo/bar directory
In [3]:
echo "Hello world" > foo/bar/hello.txt
  • Delete the foo folder and everything in it, including subdirectories.
In [4]:
rm -rf foo

Exercise 3

In [5]:
wget -nc https://molb7621.github.io/workshop/_downloads/sample.fa
wget -nc https://molb7621.github.io/workshop/_downloads/SP1.fq
--2017-07-26 10:52:06--  https://molb7621.github.io/workshop/_downloads/sample.fa
Resolving molb7621.github.io... 151.101.57.147, 2a04:4e42:e::403
Connecting to molb7621.github.io|151.101.57.147|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 118 [application/octet-stream]
Saving to: ‘sample.fa’

sample.fa           100%[===================>]     118  --.-KB/s    in 0s

2017-07-26 10:52:06 (7.50 MB/s) - ‘sample.fa’ saved [118/118]

--2017-07-26 10:52:06--  https://molb7621.github.io/workshop/_downloads/SP1.fq
Resolving molb7621.github.io... 151.101.57.147, 2a04:4e42:e::403
Connecting to molb7621.github.io|151.101.57.147|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22471 (22K) [application/octet-stream]
Saving to: ‘SP1.fq’

SP1.fq              100%[===================>]  21.94K  --.-KB/s    in 0.03s

2017-07-26 10:52:06 (666 KB/s) - ‘SP1.fq’ saved [22471/22471]

  • Show only lines 5-8 of SP1.fq
In [6]:
head -n 8 SP1.fq | tail -n 4
@cluster_8:UMI_CTTTGA
TATCCTTGCAATACTCTCCGAACGGGAGAGC
+
1/04.72,(003,-2-22+00-12./.-.4-
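
The same line range can also be selected with sed, which avoids the arithmetic of pairing head with tail. This is a minimal sketch, not the workshop's solution. It uses a hypothetical 12-line stand-in file instead of SP1.fq so that it runs without a download.

```shell
# Hypothetical 12-line stand-in for SP1.fq (line1 ... line12)
tmpfile=$(mktemp)
printf 'line%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$tmpfile"

# head/tail: keep the first 8 lines, then the last 4 of those
via_head_tail=$(head -n 8 "$tmpfile" | tail -n 4)

# sed: print lines 5 through 8, then quit so the rest of the file is not read
via_sed=$(sed -n '5,8p;8q' "$tmpfile")

echo "$via_sed"
rm -f "$tmpfile"
```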
  • Create an MD5 checksum file MD5SUM for the FASTA and FASTQ files
In [7]:
md5sum sample.fa SP1.fq > MD5SUM
  • Create a tar gzipped archive called examples.tar.gz that contains these two files
In [8]:
tar -czvf examples.tar.gz SP1.fq sample.fa
a SP1.fq
a sample.fa
  • Delete the FASTA and FASTQ files
In [9]:
rm SP1.fq sample.fa
  • Recover the original files from examples.tar.gz
In [10]:
tar -xzvf examples.tar.gz
x ./._SP1.fq
x SP1.fq
x ./._sample.fa
x sample.fa
  • Confirm that the MD5 checksums are correct for the recovered files
In [11]:
md5sum -c MD5SUM
sample.fa: OK
SP1.fq: OK
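
It can help to see what the check reports when a file does not match its recorded checksum. The sketch below assumes the GNU coreutils md5sum and uses a hypothetical one-line file, demo.fa, in a scratch directory. The file is altered after its checksum is recorded, so the second verification fails and md5sum -c exits with a non-zero status.

```shell
workdir=$(mktemp -d)
printf 'ACGT\n' > "$workdir/demo.fa"     # hypothetical tiny file

# Record the checksum from inside the directory, so the relative
# file name stored in MD5SUM resolves when it is checked later
(cd "$workdir" && md5sum demo.fa > MD5SUM)
before=$(cd "$workdir" && md5sum -c MD5SUM)

# Change one base and verify again; the warning on stderr is discarded
printf 'ACGA\n' > "$workdir/demo.fa"
after=$(cd "$workdir" && md5sum -c MD5SUM 2>/dev/null) || true

echo "$before"
echo "$after"
rm -rf "$workdir"
```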

Exercise 4

Among the files in the current directory with a .fa or .fq extension, find any that contain the string GATCGTACGTACGTA, and report the line number on which it occurs.

In [22]:
grep -n GATCGTACGTACGTA *.f[aq]
sample.fa:6:CATCGATCGTACGTACGTAG
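
A small sketch of how the pieces of this command interact, using hypothetical files (one.fa, two.fq, notes.txt) in a scratch directory. The glob *.f[aq], with a literal dot, selects only names ending in .fa or .fq, so notes.txt is never searched even though it contains the string. With more than one file on the command line, grep -n prefixes each hit with the file name and line number, and -l prints only the names of matching files.

```shell
demo=$(mktemp -d)
printf '>seq1\nAAAA\nCATCGATCGTACGTACGTAG\n' > "$demo/one.fa"
printf '@read1\nTTTT\n+\n!!!!\n' > "$demo/two.fq"
printf 'GATCGTACGTACGTA\n' > "$demo/notes.txt"     # wrong extension, not searched

# File name, line number, and matching line
hits=$(cd "$demo" && grep -n GATCGTACGTACGTA *.f[aq])
echo "$hits"

# Names of matching files only
names=$(cd "$demo" && grep -l GATCGTACGTACGTA *.f[aq])
echo "$names"

rm -rf "$demo"
```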

Exercise 5

  • Write a shell script that will report the number of lines in each file within the current directory.
In [45]:
cat > count_lines.sh << 'EOF'
#!/bin/bash

for FILE in *
do
    wc -l "${FILE}"
done
EOF
  • Make the file executable
In [46]:
chmod +x count_lines.sh
  • Run the file
In [47]:
cat count_lines.sh
#!/bin/bash

for FILE in *
do
    wc -l "${FILE}"
done
In [48]:
./count_lines.sh
       2 MD5SUM
       3 MD5_CHECKSUM
    1000 SP1.fq
    2191 The_Unix_Shell_01___File_and_Directory_Management.ipynb
     361 The_Unix_Shell_02___Working_with_Text.ipynb
     733 The_Unix_Shell_03___Finding_Stuff.ipynb
     883 The_Unix_Shell_04___Regular_Expresssions.ipynb
     940 The_Unix_Shell_05___Shell_Scripts.ipynb
     444 The_Unix_Shell___Exercises.ipynb
       1 a.txt
       1 b.txt
       1 c.txt
       6 count_lines.sh
wc: data: read: Is a directory
      39 examples.tar.gz
       1 goodbye.md5
       4 goodbye.txt
       4 hell.txt
       1 hello.md5
       4 hello.txt
       7 sample.fa
wc: scripts: read: Is a directory
       1 stderr.txt
       0 stdout.txt
       1 test.md5
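
The run above prints an error for each directory, because wc cannot read one. A hedged variant of the loop skips anything that is not a regular file. It is sketched here in a scratch directory with hypothetical names (two_lines.txt, my file.txt, data). Looping over the glob * rather than over the output of ls also keeps file names that contain spaces in one piece.

```shell
demo=$(mktemp -d)
printf 'a\nb\n' > "$demo/two_lines.txt"
printf 'x\n' > "$demo/my file.txt"     # name containing a space
mkdir "$demo/data"                      # a directory, to be skipped

report=$(
    cd "$demo" || exit 1
    for FILE in *
    do
        # Skip anything that is not a regular file (directories, for example)
        [ -f "$FILE" ] || continue
        # Reading from stdin makes wc print the count without the file name
        printf '%d %s\n' "$(wc -l < "$FILE")" "$FILE"
    done
)
echo "$report"
rm -rf "$demo"
```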