Working with Loops
Let’s kick it up another notch: we have lots of FASTQs, so let’s run our analysis on more than one!
Shell Variables
Assign the variables in this notebook.
In [1]:
source bioinf_intro_config.sh
mkdir -p "$TRIMMED" "$STAR_OUT"
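The commands in this notebook depend on variables set by the config script, so it can be worth confirming they are set before going further. Here is a minimal sketch of such a check; `require_vars` and the `DEMO_` variables are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
require_vars() {
    # Usage: require_vars NAME...
    # Returns 1 and reports each named variable that is unset or empty.
    local name value status=0
    for name in "$@"; do
        eval "value=\${$name:-}"    # look up the variable called $name
        if [ -z "$value" ]; then
            echo "UNSET: $name" >&2
            status=1
        fi
    done
    return "$status"
}

# Demo with made-up values (hypothetical stand-ins for the real config)
DEMO_TRIMMED=/tmp/demo/trimmed_fastqs
DEMO_STAR_OUT=/tmp/demo/star_out
require_vars DEMO_TRIMMED DEMO_STAR_OUT && echo "OK: variables are set"
```

In this notebook the call would presumably be `require_vars TRIMMED STAR_OUT MYINFO RAW_FASTQS GENOME_DIR`, since those are the variables used in the cells that follow.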
In [2]:
for FASTQ in 27_MA_P_S38_L002_R1 27_MA_P_S38_L001_R1
do
    echo "RUNNING FASTQ: ${FASTQ}"
done
RUNNING FASTQ: 27_MA_P_S38_L002_R1
RUNNING FASTQ: 27_MA_P_S38_L001_R1
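Typing every sample name by hand gets tedious once there are more than a few FASTQs. One alternative is to build the list from the files themselves. The sketch below assumes every raw file ends in `_001.fastq.gz`, as the two above do; `list_samples` and `DEMO_DIR` are hypothetical names, and the empty files only stand in for real data.

```shell
#!/usr/bin/env bash
# DEMO_DIR is a hypothetical stand-in for $RAW_FASTQS, filled with empty placeholder files
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/27_MA_P_S38_L001_R1_001.fastq.gz" \
      "$DEMO_DIR/27_MA_P_S38_L002_R1_001.fastq.gz"

list_samples() {
    # Print one sample name per line for every *_001.fastq.gz file in directory $1
    local dir=$1 path
    for path in "$dir"/*_001.fastq.gz; do
        [ -e "$path" ] || continue        # the glob matched nothing
        basename "$path" _001.fastq.gz    # strip the directory and the suffix
    done
}

# Word splitting on the output is fine here because sample names contain no spaces
for FASTQ in $(list_samples "$DEMO_DIR"); do
    echo "RUNNING FASTQ: ${FASTQ}"
done
```

With the real data, `list_samples "$RAW_FASTQS"` would take the place of the hard-coded names in the `for` line.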
Now let’s run both samples through the pipeline:
In [3]:
for FASTQ in 27_MA_P_S38_L002_R1 27_MA_P_S38_L001_R1
do
    # Step 1: trim adapters from the raw reads
    echo "---------------- TRIMMING: $FASTQ ----------------"
    fastq-mcf \
        "$MYINFO/neb_e7600_adapters.fasta" \
        "$RAW_FASTQS/${FASTQ}_001.fastq.gz" \
        -q 20 -x 0.5 \
        -o "$TRIMMED/${FASTQ}_001.trim.fastq.gz"

    # Step 2: map the trimmed reads and count reads per gene
    echo "---------------- MAPPING: $FASTQ ----------------"
    STAR \
        --runMode alignReads \
        --twopassMode None \
        --genomeDir "$GENOME_DIR" \
        --readFilesIn "$TRIMMED/${FASTQ}_001.trim.fastq.gz" \
        --readFilesCommand gunzip -c \
        --outFileNamePrefix "${STAR_OUT}/${FASTQ}_" \
        --quantMode GeneCounts \
        --outSAMtype None
done
---------------- TRIMMING: 27_MA_P_S38_L002_R1 ----------------
Command Line: /Users/cliburn/work/scratch/bioinf_intro/myinfo/neb_e7600_adapters.fasta /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz -q 20 -x 0.5 -o /Users/cliburn/work/scratch/bioinf_intro/trimmed_fastqs/27_MA_P_S38_L002_R1_001.trim.fastq.gz
Scale used: 2.2
gunzip: can't stat: /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz (/data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz.gz): No such file or directory
Phred: 64
No records in file /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L002_R1_001.fastq.gz
---------------- MAPPING: 27_MA_P_S38_L002_R1 ----------------
STAR: Bad Option: --runMode.
Usage: STAR cmd [options] [-find] file1 ... filen [find expression]
Use STAR -help
and STAR -xhelp
to get a list of valid cmds and options.
Use STAR H=help
to get a list of valid archive header formats.
Use STAR diffopts=help
to get a list of valid diff options.
---------------- TRIMMING: 27_MA_P_S38_L001_R1 ----------------
Command Line: /Users/cliburn/work/scratch/bioinf_intro/myinfo/neb_e7600_adapters.fasta /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz -q 20 -x 0.5 -o /Users/cliburn/work/scratch/bioinf_intro/trimmed_fastqs/27_MA_P_S38_L001_R1_001.trim.fastq.gz
Scale used: 2.2
gunzip: can't stat: /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz (/data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz.gz): No such file or directory
Phred: 64
No records in file /data/hts2018_pilot/Granek_4837_180427A5/27_MA_P_S38_L001_R1_001.fastq.gz
---------------- MAPPING: 27_MA_P_S38_L001_R1 ----------------
STAR: Bad Option: --runMode.
Usage: STAR cmd [options] [-find] file1 ... filen [find expression]
Use STAR -help
and STAR -xhelp
to get a list of valid cmds and options.
Use STAR H=help
to get a list of valid archive header formats.
Use STAR diffopts=help
to get a list of valid diff options.
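Both runs above failed: the trimming step could not find its input file, and the `STAR` found on this machine rejected `--runMode`, which suggests a different program with the same name was picked up from the PATH. A small check before each iteration can catch both problems early. This is only a sketch; `check_inputs` is a hypothetical helper, and the demo uses `ls` and a temporary file in place of the real command and FASTQ.

```shell
#!/usr/bin/env bash
check_inputs() {
    # Usage: check_inputs COMMAND FILE...
    # Returns 0 only if COMMAND is on the PATH and every FILE is readable.
    local cmd=$1 status=0 f
    shift
    if ! command -v "$cmd" >/dev/null 2>&1; then
        echo "MISSING COMMAND: $cmd" >&2
        status=1
    fi
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "MISSING FILE: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Demo: a temporary file stands in for a real FASTQ
demo_file=$(mktemp)
check_inputs ls "$demo_file" && echo "OK: inputs found"
check_inputs ls "$demo_file.does_not_exist" || echo "SKIP: input missing"
rm -f "$demo_file"
```

Inside the loop, something like `check_inputs STAR "$RAW_FASTQS/${FASTQ}_001.fastq.gz" || continue` would skip a sample whose input is missing. Note that `command -v` only confirms that some program with that name exists, so it would not have caught the wrong `STAR` here; running `command -v STAR` and `STAR --version` by hand is one way to see which program is actually being called.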
And let’s check the result:
In [4]:
ls ${STAR_OUT}
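A plain `ls` shows whatever is in the directory, but it does not tell us whether every sample produced counts. With `--quantMode GeneCounts` and the prefix used above, STAR should write a file named `<sample>_ReadsPerGene.out.tab` for each sample, so we can loop over the samples and look for it. The sketch below uses a hypothetical helper, `report_counts`, and a temporary directory with one placeholder file in place of `$STAR_OUT`.

```shell
#!/usr/bin/env bash
report_counts() {
    # Usage: report_counts OUT_DIR SAMPLE...
    # Print FOUND or MISSING for each sample's non-empty gene count file.
    local dir=$1 sample
    shift
    for sample in "$@"; do
        if [ -s "$dir/${sample}_ReadsPerGene.out.tab" ]; then
            echo "FOUND: $sample"
        else
            echo "MISSING: $sample"
        fi
    done
}

# Demo: DEMO_OUT is a hypothetical stand-in for $STAR_OUT; the file content is a placeholder
DEMO_OUT=$(mktemp -d)
echo "placeholder" > "$DEMO_OUT/27_MA_P_S38_L001_R1_ReadsPerGene.out.tab"
report_counts "$DEMO_OUT" 27_MA_P_S38_L001_R1 27_MA_P_S38_L002_R1
```

In the demo only the first sample has a file, so it is reported as FOUND and the second as MISSING. Against the real output the call would be `report_counts "$STAR_OUT" 27_MA_P_S38_L002_R1 27_MA_P_S38_L001_R1`.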