{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Unix Shell: Working with Text" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Text streams\n", "\n", "The input and output of most Unix shell programs consist of plain text streams. Text output from one program can be piped into another program or redirected to other streams. The standard streams are `stdin` (standard input, file descriptor 0), `stdout` (standard output, file descriptor 1) and `stderr` (standard error, file descriptor 2). By default, a program reads its input from `stdin` and writes its output to `stdout`. We can also stream to and from a file." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pipes and redirection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating a text file from the command line\n", "\n", "Sometimes using a text editor is overkill. For simple file creation, we can just use redirection.\n", "\n", "A single `>` will create a new file or overwrite an existing one." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "echo \"1 Hello, bash\" > hello.txt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ls *txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Appending" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "echo \"2 Hello, again\" >> hello.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Special non-printing characters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "echo -e \"3 Hello\\n4 again\" >> hello.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### From file to `stdout`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cat hello.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pipe to the `cut` program to extract columns 2, 3, 4 and 5" ] }, 
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cat hello.txt | cut -c 2-5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building a chain of pipes\n", "\n", "`wc -lc` reports the number of lines and bytes (the byte count usually equals the character count for English text).\n", "\n", "Note that the count is 5 per line and not 4, because each line that `cut` outputs ends with a newline character." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Extract columns 2-5 and then count the number of lines and characters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cat hello.txt | cut -c 2-5 | wc -lc " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Extract lines 2-3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cat hello.txt | head -n 3 | tail -n 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Capturing error messages\n", "\n", "The redirection operator `>` is shorthand for `1>`, that is, it redirects `stdout`. We can also use `2>` to redirect `stderr`. In bash, `&>` redirects both `stdout` and `stderr`, which is useful if, for example, you want to send all output to the same log file for later inspection." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mkdir foo/bar/baz > 'stdout.txt'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### As `mkdir` writes nothing to `stdout`, the file is empty" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cat 'stdout.txt'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### We need to use `2>` to capture the output from `stderr`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mkdir foo/bar/baz 2> 'stderr.txt' \n", "cat 'stderr.txt'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Character substitution with `tr` (transliteration)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Convert to upper case." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "echo \"gattaca\" | tr a-z A-Z" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Find the reverse complement of a DNA string." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "echo 'GATTACA' | tr ACTG TGAC | rev" ] } ], "metadata": { "kernelspec": { "display_name": "Bash", "language": "bash", "name": "bash" }, "language_info": { "codemirror_mode": "shell", "file_extension": ".sh", "mimetype": "text/x-sh", "name": "bash" } }, "nbformat": 4, "nbformat_minor": 2 }