# Bash Functions

## Load Variables and Make Directories

In [None]:
source bioinf_intro_config.sh
mkdir -p "$STAR_OUT"

## Trim and Map Reads

In [None]:
TrimAndMap() {
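 # $1 is the path to the input FASTQ; strip the directory and suffix for naming outputs
 local FASTQ=$1
 local FASTQ_BASE
 FASTQ_BASE="$(basename "${FASTQ}" '_001.fastq.gz')"
 echo "$FASTQ"
 echo "$FASTQ_BASE"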

 # make a pipe for trimmed fastq
 local CUR_PIPE
 CUR_PIPE="$(mktemp --dry-run)_${FASTQ_BASE}_pipe.fq"
 mkfifo "$CUR_PIPE"

 # Run fastq-mcf in the background (&), writing trimmed reads into the pipe
 fastq-mcf \
 "$ADAPTERS" \
 "$FASTQ" \
 -o "$CUR_PIPE" \
 -q 20 -x 0.5 &
 
 # Run STAR in the foreground, reading trimmed reads from the pipe
 STAR \
 --runMode alignReads \
 --twopassMode None \
 --genomeDir "$GENOME_DIR" \
 --outSAMtype None \
 --quantMode GeneCounts \
 --outFileNamePrefix "${STAR_OUT}/${FASTQ_BASE}_" \
 --alignIntronMax 5000 \
 --outSJfilterIntronMaxVsReadN 500 1000 2000 \
 --readFilesIn "$CUR_PIPE"
 
 # Destroy pipe
 rm -f "$CUR_PIPE"
}

export -f TrimAndMap

## Call the function

In [None]:
for FASTQ in "$RAW_FASTQS"/35_MA_P_S39_L00[1-2]_R1_001.fastq.gz
 do
 TrimAndMap "$FASTQ"
 done

In [None]:
ls -ltr "$STAR_OUT"

## Notes
1. We are using a named pipe here, but bash functions do not require one; it's just a bell-and-whistle.
    1. `fastq-mcf` is outputting the trimmed fastq *into* the named pipe, and `STAR` is reading the trimmed fastq *from* the named pipe. `fastq-mcf` runs in the background (the trailing `&`) so that both programs can run at the same time.
    2. Because we are channeling the output of `fastq-mcf` to `STAR` through a named pipe, we are not telling `fastq-mcf` to gzip its output. In this case gzipping doesn't buy us anything. It doesn't save disk space, because data passed through the named pipe never touches the disk, and for the same reason it doesn't save time writing to or reading from the disk. If we gzipped, we would pay the computational cost of compressing and decompressing without any benefit.

2. `$1` refers to the first argument passed to the function
3. We *could* call the `TrimAndMap` function directly on a single file; we don't have to call it within a loop.
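
The points above can be tried out without `fastq-mcf` or `STAR` installed. The following is a minimal sketch, not part of the pipeline: `UppercaseViaPipe`, `DEMO_FILE`, and `DEMO_OUT` are hypothetical names made up for this illustration, with `cat` standing in for the program that writes into the pipe and `tr` for the program that reads from it.

```bash
# Toy stand-in for TrimAndMap: a writer and a reader connected by a named pipe.
# All names here are hypothetical and exist only for this demonstration.
UppercaseViaPipe() {
    local input_file=$1              # $1 is the first argument passed to the function
    local pipe_dir pipe writer_pid
    pipe_dir=$(mktemp -d)            # private temp directory to hold the pipe
    pipe="${pipe_dir}/demo_pipe"
    mkfifo "$pipe"

    # Writer runs in the background (the role fastq-mcf plays above);
    # it blocks until a reader opens the other end of the pipe
    cat "$input_file" > "$pipe" &
    writer_pid=$!

    # Reader runs in the foreground (the role STAR plays above)
    tr 'a-z' 'A-Z' < "$pipe"

    wait "$writer_pid"               # make sure the background writer has finished
    rm -rf "$pipe_dir"               # destroy the pipe and its directory
}

# Call the function directly on one file; no loop is needed
DEMO_FILE=$(mktemp)
printf 'acgt\nttga\n' > "$DEMO_FILE"
DEMO_OUT=$(UppercaseViaPipe "$DEMO_FILE")
echo "$DEMO_OUT"
rm -f "$DEMO_FILE"
```

The data flows from `cat` to `tr` without an intermediate file on disk, which is the same reason compression is skipped in `TrimAndMap`. Creating the pipe inside a directory made by `mktemp -d` is a design choice for this sketch; it avoids relying on a file name that was generated but never actually reserved.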