Working with Loops

Let’s kick it up another notch. We have lots of FASTQs, so let’s run our analysis on more than one!

Shell Variables

Source the configuration script to assign the shell variables used in this notebook, then create the output directories.

In [1]:
source bioinf_intro_config.sh
mkdir -p "$TRIMMED" "$STAR_OUT"
In [2]:
for FASTQ in 27_MA_P_S38_L002_R1 27_MA_P_S38_L001_R1
    do
        echo "RUNNING FASTQ: ${FASTQ}"
    done
RUNNING FASTQ: 27_MA_P_S38_L002_R1
RUNNING FASTQ: 27_MA_P_S38_L001_R1
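
With lots of FASTQs, typing every name into the `for` line gets tedious. One option is to let the shell build the list from the files themselves. This is a minimal sketch, assuming the raw files all follow the `<name>_001.fastq.gz` pattern used in this notebook. The temporary directory and the empty placeholder files are made up so the snippet runs on its own; they stand in for `$RAW_FASTQS`.

```shell
# Hypothetical stand-in for $RAW_FASTQS so the sketch is self-contained.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/27_MA_P_S38_L001_R1_001.fastq.gz" \
      "$DEMO_DIR/27_MA_P_S38_L002_R1_001.fastq.gz"

SAMPLES=""
for FILE in "$DEMO_DIR"/*_001.fastq.gz
    do
        # Strip the directory and the _001.fastq.gz suffix to get the name
        # that the loops in this notebook call FASTQ.
        FASTQ=$(basename "$FILE" _001.fastq.gz)
        echo "RUNNING FASTQ: ${FASTQ}"
        SAMPLES="${SAMPLES}${FASTQ} "
    done

# Remove the placeholder files again.
rm -rf "$DEMO_DIR"
```

The glob is expanded in sorted order, so the samples are visited alphabetically rather than in the order they were typed.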

Now let’s run both samples through the pipeline:

In [3]:
for FASTQ in 27_MA_P_S38_L002_R1 27_MA_P_S38_L001_R1
    do
        # Trim adapters and low-quality bases from the raw reads
        echo "---------------- TRIMMING: $FASTQ ----------------"
        fastq-mcf \
            "$MYINFO/neb_e7600_adapters.fasta" \
            "$RAW_FASTQS/${FASTQ}_001.fastq.gz" \
            -q 20 -x 0.5 \
            -o "$TRIMMED/${FASTQ}_001.trim.fastq.gz"

        # Map the trimmed reads and count reads per gene
        echo "---------------- MAPPING: $FASTQ ----------------"
        STAR \
            --runMode alignReads \
            --twopassMode None \
            --genomeDir "$GENOME_DIR" \
            --readFilesIn "$TRIMMED/${FASTQ}_001.trim.fastq.gz" \
            --readFilesCommand gunzip -c \
            --outFileNamePrefix "${STAR_OUT}/${FASTQ}_" \
            --outSAMtype None \
            --quantMode GeneCounts
    done
---------------- TRIMMING: 27_MA_P_S38_L002_R1 ----------------
Command Line: /Users/cliburn/work/scratch/bioinf_intro/myinfo/neb_e7600_adapters.fasta /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz -q 20 -x 0.5 -o /Users/cliburn/work/scratch/bioinf_intro/trimmed_fastqs/27_MA_P_S38_L002_R1_001.trim.fastq.gz
Scale used: 2.2
gunzip: can't stat: /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz (/data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz.gz): No such file or directory
Phred: 64
No records in file /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz
---------------- MAPPING: 27_MA_P_S38_L002_R1 ----------------
STAR: Bad Option: --runMode.
Usage:  STAR cmd [options] [-find] file1 ... filen [find expression]

Use     STAR -help
and     STAR -xhelp
to get a list of valid cmds and options.

Use     STAR H=help
to get a list of valid archive header formats.

Use     STAR diffopts=help
to get a list of valid diff options.
---------------- TRIMMING: 27_MA_P_S38_L001_R1 ----------------
Command Line: /Users/cliburn/work/scratch/bioinf_intro/myinfo/neb_e7600_adapters.fasta /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz -q 20 -x 0.5 -o /Users/cliburn/work/scratch/bioinf_intro/trimmed_fastqs/27_MA_P_S38_L001_R1_001.trim.fastq.gz
Scale used: 2.2
gunzip: can't stat: /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz (/data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz.gz): No such file or directory
Phred: 64
No records in file /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz
---------------- MAPPING: 27_MA_P_S38_L001_R1 ----------------
STAR: Bad Option: --runMode.
Usage:  STAR cmd [options] [-find] file1 ... filen [find expression]

Use     STAR -help
and     STAR -xhelp
to get a list of valid cmds and options.

Use     STAR H=help
to get a list of valid archive header formats.

Use     STAR diffopts=help
to get a list of valid diff options.
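
In the output above, the loop kept going even though the input FASTQ could not be found. A loop can check for its input first and skip a sample when the file is missing. This is a minimal sketch of that check, not part of the original pipeline. The temporary directory is a made-up stand-in for `$RAW_FASTQS` in which only one of the two files exists, and the `echo` marks where the trimming and mapping commands would go.

```shell
# Hypothetical stand-in for $RAW_FASTQS: only one of the two inputs exists.
DEMO_RAW=$(mktemp -d)
touch "$DEMO_RAW/27_MA_P_S38_L001_R1_001.fastq.gz"

PROCESSED=""
SKIPPED=""
for FASTQ in 27_MA_P_S38_L002_R1 27_MA_P_S38_L001_R1
    do
        INPUT="$DEMO_RAW/${FASTQ}_001.fastq.gz"
        if [ ! -f "$INPUT" ]
        then
            # Report the problem on stderr and move on to the next sample.
            echo "SKIPPING ${FASTQ}: cannot find ${INPUT}" >&2
            SKIPPED="${SKIPPED}${FASTQ} "
            continue
        fi
        # The trimming and mapping commands would go here.
        echo "WOULD PROCESS: ${FASTQ}"
        PROCESSED="${PROCESSED}${FASTQ} "
    done

# Remove the placeholder files again.
rm -rf "$DEMO_RAW"
```

The usage text printed for `STAR` mentions archive header formats, which suggests that a different program with the same name was picked up instead of the aligner. Running `command -v STAR` shows which executable the shell would actually run.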

And let’s check the result:

In [4]:
ls ${STAR_OUT}
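
Reading through an `ls` listing by eye gets harder as the number of samples grows. With `--quantMode GeneCounts`, STAR writes a `ReadsPerGene.out.tab` file under the given output prefix, so a loop can check for one per sample. This is a rough sketch under that assumption. The temporary directory is a made-up stand-in for `$STAR_OUT`, holding an empty placeholder for one finished sample.

```shell
# Hypothetical stand-in for $STAR_OUT with a counts file for one sample only.
DEMO_OUT=$(mktemp -d)
touch "$DEMO_OUT/27_MA_P_S38_L001_R1_ReadsPerGene.out.tab"

MISSING=0
for FASTQ in 27_MA_P_S38_L002_R1 27_MA_P_S38_L001_R1
    do
        COUNTS="$DEMO_OUT/${FASTQ}_ReadsPerGene.out.tab"
        if [ -f "$COUNTS" ]
        then
            # ${COUNTS##*/} drops the directory part and leaves the file name.
            echo "FOUND:   ${COUNTS##*/}"
        else
            echo "MISSING: ${COUNTS##*/}"
            MISSING=$((MISSING + 1))
        fi
    done
echo "Samples without a counts file: ${MISSING}"

# Remove the placeholder files again.
rm -rf "$DEMO_OUT"
```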