Working with Paired-Reads
If we had paired-end read data, we would need to do some of the steps a little differently.
Running fastq-mcf on Paired Data
It only takes two minor changes to run fastq-mcf on paired data: we need to tell it to also load the read 2 file, and what to call the trimmed output from that file.
- neb_e7600_adapters.fasta
- 27_MA_P_S38_L002_R1_001.fastq.gz
- 27_MA_P_S38_L002_R2_001.fastq.gz : NEW for paired-data
- -q 20
- -x 0.5
- -o 27_MA_P_S38_L002_R1_001.trim.fastq.gz
- -o 27_MA_P_S38_L002_R2_001.trim.fastq.gz : NEW for paired-data
Like this:
fastq-mcf $MYINFO/neb_e7600_adapters.fasta \
$RAW_FASTQS/27_MA_P_S38_L002_R1_001.fastq.gz \
$RAW_FASTQS/27_MA_P_S38_L002_R2_001.fastq.gz \
-q 20 -x 0.5 \
-o $TRIMMED/27_MA_P_S38_L002_R1_001.trim.fastq.gz \
-o $TRIMMED/27_MA_P_S38_L002_R2_001.trim.fastq.gz
Note: Since we are now including the reverse reads, we
expect to see contamination with both adapters.
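When there are many samples, the read 2 file name and the trimmed output names can be derived from the read 1 file name instead of being typed by hand. The following is a minimal sketch, assuming every file follows the `_R1_` / `_R2_` naming pattern used above; `r2_for` and `trimmed_for` are hypothetical helper names, not part of fastq-mcf.

```shell
# Hypothetical helper: derive the R2 file name from an R1 file name.
# Assumes the name contains "_R1_"; fails with an error otherwise.
r2_for() {
    case "$1" in
        *_R1_*) printf '%s_R2_%s\n' "${1%_R1_*}" "${1##*_R1_}" ;;
        *) echo "no _R1_ in file name: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: name the trimmed output for a raw FASTQ file
# by swapping the ".fastq.gz" suffix for ".trim.fastq.gz".
trimmed_for() {
    printf '%s.trim.fastq.gz\n' "${1%.fastq.gz}"
}

r1=27_MA_P_S38_L002_R1_001.fastq.gz
r2=$(r2_for "$r1")

echo "$r2"            # 27_MA_P_S38_L002_R2_001.fastq.gz
trimmed_for "$r1"     # 27_MA_P_S38_L002_R1_001.trim.fastq.gz
trimmed_for "$r2"     # 27_MA_P_S38_L002_R2_001.trim.fastq.gz
```

The four names produced this way are the two input files and the two `-o` arguments in the fastq-mcf command above.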
Running STAR on Paired Data
As with fastq-mcf, running STAR on paired data only requires minor
changes: adding the R2 FASTQ file to the arguments for --readFilesIn
and removing the “R1” from the --outFileNamePrefix, since the output
will combine R1 and R2, like this:
STAR \
--runMode alignReads \
--twopassMode None \
--genomeDir $GENOME_DIR \
--readFilesIn $TRIMMED/27_MA_P_S38_L002_R1_001.trim.fastq.gz \
$TRIMMED/27_MA_P_S38_L002_R2_001.trim.fastq.gz \
--readFilesCommand gunzip -c \
--outFileNamePrefix ${STAR_OUT}/27_MA_P_S38_L002_ \
--quantMode GeneCounts \
--outSAMtype BAM Unsorted \
--outSAMunmapped Within
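The shared output prefix can also be computed from the trimmed R1 file name rather than edited by hand. This is a small sketch under the same `_R1_` naming assumption; `star_prefix` is a hypothetical helper name, not a STAR option, and the `/data/trimmed` directory is only an example path.

```shell
# Hypothetical helper: build the --outFileNamePrefix base for a read pair
# by dropping the directory and everything from "_R1_" onward.
star_prefix() {
    name=${1##*/}    # strip any leading directory
    case "$name" in
        *_R1_*) printf '%s_\n' "${name%_R1_*}" ;;
        *) echo "no _R1_ in file name: $name" >&2; return 1 ;;
    esac
}

star_prefix /data/trimmed/27_MA_P_S38_L002_R1_001.trim.fastq.gz
# prints: 27_MA_P_S38_L002_
```

The result can be passed as `--outFileNamePrefix "${STAR_OUT}/$(star_prefix "$r1")"`, where `r1` holds the trimmed R1 path. STAR appends its own file names to this prefix, for example `Aligned.out.bam` for unsorted BAM output.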