{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Working with Loops \n",
    "\n",
    "## Shell Variables\n",
    "Assign the variables in this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "source bioinf_intro_config.sh\n",
    "mkdir -p $TRIMMED $STAR_OUT"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A Brief journey into `for` loops\n",
    "`for` loops take our use of the `$FASTQ` variable to the next level! It is analogous to how you would teach a child to set the table: \"FOR each place at the table, put a plate . . .,\n",
    "At the shell you phrase it like this:\n",
    "\n",
    "    for PERSON in Alice Bob Carol Dave Eve\n",
    "    do\n",
    "    put plate at PERSON's place\n",
    "    put napkin at PERSON's place\n",
    "    put fork at PERSON's place\n",
    "    put spoon at PERSON's place\n",
    "    put knife at PERSON's place\n",
    "    done\n",
    "\n",
    "Here is a real example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for FASTQ in A B C D E F\n",
    "    do\n",
    "       echo \"______${FASTQ}________\"\n",
    "    done"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `for` loop in Bash is conceptually the same as in any other programming language, although the syntax may be different.  The `do` and `done` are essential - `do` needs to be before the \"loop body\" (what is going to be repeated) and `done` needs to be after it.\n",
    "\n",
    "So let's try something almost useful:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for FASTQ in 27_MA_P_S38_L002_R1\n",
    "    do\n",
    "        echo \"RUNNING FASTQ: ${FASTQ}\"\n",
    "    done"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Now for the real thing . . .\n",
    "### Let's run the pipeline in a loop:\n",
    "Notice that we are now assigning to the `$FASTQ` variable in the `for` statement"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for FASTQ in 27_MA_P_S38_L002_R1\n",
    "    do\n",
    "        echo \"---------------- TRIMMING: $FASTQ ----------------\"\n",
    "        fastq-mcf \\\n",
    "            $MYINFO/neb_e7600_adapters.fasta \\\n",
    "            $RAW_FASTQS/${FASTQ}_001.fastq.gz \\\n",
    "            -q 20 -x 0.5 \\\n",
    "            -o $TRIMMED/${FASTQ}_001.trim.fastq.gz\n",
    "        \n",
    "        echo \"---------------- MAPPING: $FASTQ ----------------\"\n",
    "        STAR \\\n",
    "            --runMode alignReads \\\n",
    "            --twopassMode None \\\n",
    "            --genomeDir $GENOME_DIR \\\n",
    "            --readFilesIn $TRIMMED/${FASTQ}_001.trim.fastq.gz \\\n",
    "            --readFilesCommand gunzip -c \\\n",
    "            --outFileNamePrefix ${STAR_OUT}/${FASTQ}_ \\\n",
    "            --quantMode GeneCounts \\\n",
    "            --outSAMtype None\n",
    "    done"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### And let's check the result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ls ${STAR_OUT}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "head ${STAR_OUT}/27_MA_P_S38_L002_R1_ReadsPerGene.out.tab"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Bash",
   "language": "bash",
   "name": "bash"
  },
  "language_info": {
   "codemirror_mode": "shell",
   "file_extension": ".sh",
   "mimetype": "text/x-sh",
   "name": "bash"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}