{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Unix Shell: Exercises with Solutions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise 1**\n", "\n", "List all directories in the **parent** directory whose names contain the string 'PM'." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "../Wk1_Day2_PM\t../Wk3_Day1_PM\t../Wk3_Day3_PM\t../Wk4_Day2_PM\n", "../Wk1_Day3_PM\t../Wk3_Day2_PM\t../Wk3_Day4_PM\t../Wk4_Day3_PM\n" ] } ], "source": [ "ls -d ../*PM*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise 2**\n", "\n", "- Create the folder `foo/bar/baz`" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "mkdir -p foo/bar/baz" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Create a file named `hello.txt` containing 'Hello world' in the `foo/bar` directory" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "echo \"Hello world\" > foo/bar/hello.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Delete the `foo` folder and everything in it, including subdirectories." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "rm -rf foo" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise 3**\n", "\n", "- Download an example FASTA file from https://molb7621.github.io/workshop/_downloads/sample.fa\n", "- Download the example FASTQ file from https://molb7621.github.io/workshop/_downloads/SP1.fq" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2017-07-26 10:52:06-- https://molb7621.github.io/workshop/_downloads/sample.fa\n", "Resolving molb7621.github.io... 
151.101.57.147, 2a04:4e42:e::403\n", "Connecting to molb7621.github.io|151.101.57.147|:443... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 118 [application/octet-stream]\n", "Saving to: ‘sample.fa’\n", "\n", "sample.fa 100%[===================>] 118 --.-KB/s in 0s \n", "\n", "2017-07-26 10:52:06 (7.50 MB/s) - ‘sample.fa’ saved [118/118]\n", "\n", "--2017-07-26 10:52:06-- https://molb7621.github.io/workshop/_downloads/SP1.fq\n", "Resolving molb7621.github.io... 151.101.57.147, 2a04:4e42:e::403\n", "Connecting to molb7621.github.io|151.101.57.147|:443... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 22471 (22K) [application/octet-stream]\n", "Saving to: ‘SP1.fq’\n", "\n", "SP1.fq 100%[===================>] 21.94K --.-KB/s in 0.03s \n", "\n", "2017-07-26 10:52:06 (666 KB/s) - ‘SP1.fq’ saved [22471/22471]\n", "\n" ] } ], "source": [ "wget -nc https://molb7621.github.io/workshop/_downloads/sample.fa\n", "wget -nc https://molb7621.github.io/workshop/_downloads/SP1.fq" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Show only lines 5-8 of `SP1.fq`" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "@cluster_8:UMI_CTTTGA\n", "TATCCTTGCAATACTCTCCGAACGGGAGAGC\n", "+\n", "1/04.72,(003,-2-22+00-12./.-.4-\n" ] } ], "source": [ "head -n 8 SP1.fq | tail -n 4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Create an MD5 checksum file `MD5SUM` for the FASTA and FASTQ files" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "md5sum sample.fa SP1.fq > MD5SUM" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Create a gzipped tar archive called `examples.tar.gz` that contains these two files" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a SP1.fq\n", "a 
sample.fa\n" ] } ], "source": [ "tar -czvf examples.tar.gz SP1.fq sample.fa" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Delete the FASTA and FASTQ files" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "rm SP1.fq sample.fa" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Recover the original files from `examples.tar.gz`" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x ./._SP1.fq\n", "x SP1.fq\n", "x ./._sample.fa\n", "x sample.fa\n" ] } ], "source": [ "tar -xzvf examples.tar.gz" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Confirm that the MD5 checksums are correct for the recovered files" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sample.fa: OK\n", "SP1.fq: OK\n" ] } ], "source": [ "md5sum -c MD5SUM" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise 4**\n", "\n", "Among the files with the `.fa` or `.fq` extension in the current directory, find any that contain the string `GATCGTACGTACGTA`, and report the line number on which it occurs." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sample.fa:6:CATCGATCGTACGTACGTAG\r\n" ] } ], "source": [ "grep -n GATCGTACGTACGTA *.f[aq]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise 5**\n", "\n", "- Write a shell script that will report the number of lines in each file within the current directory. 
" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": true }, "outputs": [], "source": [ "cat > count_lines.sh << 'EOF'\n", "#!/bin/bash\n", "\n", "for FILE in $(ls)\n", "do \n", " wc -l \"${FILE}\"\n", "done\n", "EOF" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Make the file executable" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": true }, "outputs": [], "source": [ "chmod +x count_lines.sh" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Run the file" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#!/bin/bash\n", "\n", "for FILE in $(ls)\n", "do \n", " wc -l \"${FILE}\"\n", "done\n" ] } ], "source": [ "cat count_lines.sh" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 2 MD5SUM\n", " 3 MD5_CHECKSUM\n", " 1000 SP1.fq\n", " 2191 The_Unix_Shell_01___File_and_Directory_Management.ipynb\n", " 361 The_Unix_Shell_02___Working_with_Text.ipynb\n", " 733 The_Unix_Shell_03___Finding_Stuff.ipynb\n", " 883 The_Unix_Shell_04___Regular_Expresssions.ipynb\n", " 940 The_Unix_Shell_05___Shell_Scripts.ipynb\n", " 444 The_Unix_Shell___Exercises.ipynb\n", " 1 a.txt\n", " 1 b.txt\n", " 1 c.txt\n", " 6 count_lines.sh\n", "wc: data: read: Is a directory\n", " 39 examples.tar.gz\n", " 1 goodbye.md5\n", " 4 goodbye.txt\n", " 4 hell.txt\n", " 1 hello.md5\n", " 4 hello.txt\n", " 7 sample.fa\n", "wc: scripts: read: Is a directory\n", " 1 stderr.txt\n", " 0 stdout.txt\n", " 1 test.md5\n" ] } ], "source": [ "./count_lines.sh" ] } ], "metadata": { "kernelspec": { "display_name": "Bash", "language": "bash", "name": "bash" }, "language_info": { "codemirror_mode": "shell", "file_extension": ".sh", "mimetype": "text/x-sh", "name": "bash" } }, "nbformat": 4, "nbformat_minor": 2 }