STAR: Pipeline

Synopsis

This notebook outlines the steps for pre-processing RNA-Seq reads with the STAR pipeline: building a genome index, then aligning a paired-end sample and counting reads per gene.

Set program names

In [1]:

mystar=/opt/NGS/STAR/STAR-2.5.2b/bin/Linux_x86_64_static/STAR
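Because every later cell depends on this path, it can help to confirm that the binary is actually present and executable before going further. The following is a minimal sketch; `check_program` is a hypothetical helper written for this notebook, not something shipped with STAR.

```bash
# Hypothetical helper (not part of STAR): report whether a program path
# points at an executable file.
check_program() {
    local prog="$1"
    if [ -x "$prog" ]; then
        echo "OK: $prog"
    else
        echo "MISSING: $prog" >&2
        return 1
    fi
}

mystar=/opt/NGS/STAR/STAR-2.5.2b/bin/Linux_x86_64_static/STAR
check_program "$mystar" || echo "Adjust mystar before continuing."
```

If the check reports MISSING, the later STAR cells will fail with "No such file or directory" until `mystar` points at a valid installation.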

Set reference and annotation files

In [2]:

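myrefdir="/data1/workspace/tmp/STAR/reference"
mystarindex="/data1/workspace/tmp/STAR/index"
# Join directory and file name with an explicit "/"; without it the paths
# would become ".../referencegenome.fasta" and ".../referencegenome.gtf".
myfasta="$myrefdir/genome.fasta"
mygtf="$myrefdir/genome.gtf"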
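A quick way to catch a mistyped directory or a missing path separator is to test that the reference files are readable before handing them to STAR. This is only a sketch: `missing_files` is a hypothetical helper, and the paths are joined here with an explicit `/`.

```bash
# Hypothetical helper: print each argument that is not a readable file.
missing_files() {
    local f
    for f in "$@"; do
        [ -r "$f" ] || echo "$f"
    done
}

myrefdir="/data1/workspace/tmp/STAR/reference"
myfasta="$myrefdir/genome.fasta"
mygtf="$myrefdir/genome.gtf"

# Prints nothing when both files are present and readable.
missing_files "$myfasta" "$mygtf"
```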

Create STAR Index

The first step is to build the STAR index. This example uses up to 16 threads (--runThreadN 16). Indexing a large genome requires at least 32 GB of RAM.
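The thread count of 16 assumes a machine with at least that many cores. One possible way to avoid oversubscribing a smaller machine is to cap the requested count at the number of online processors. In this sketch, `pick_threads` is a hypothetical helper, and `getconf _NPROCESSORS_ONLN` is assumed to be available (it is on typical Linux systems); the helper falls back to 1 otherwise.

```bash
# Hypothetical helper: echo the smaller of the requested thread count
# and the number of online processors reported by getconf.
pick_threads() {
    local want="$1" have
    have=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$have" -lt "$want" ]; then
        echo "$have"
    else
        echo "$want"
    fi
}

mythreads=$(pick_threads 16)
echo "Using $mythreads thread(s)"
```

The resulting `mythreads` value could then be passed as `--runThreadN "$mythreads"`. The STAR manual for the 2.5 series also states that the `--genomeDir` directory must exist before the run, so running `mkdir -p "$mystarindex"` beforehand is a cheap safeguard.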

In [3]:

time "$mystar" \
    --runMode genomeGenerate \
    --genomeDir "$mystarindex" \
    --sjdbGTFfile "$mygtf" \
    --genomeFastaFiles "$myfasta" \
    --runThreadN 16

bash: /opt/NGS/STAR/STAR-2.5.2b/bin/Linux_x86_64_static/STAR: No such file or directory

real    0m0.002s
user    0m0.000s
sys     0m0.001s

Process a paired-end read sample: From reads to counts

The following cells process a single paired-end sample, from raw reads to per-gene counts.

Set the file names for the sample:

In [5]:

mysample="SAMPLE-12345"
# Forward (R1) and reverse (R2) mates of the paired-end sample.
R1="${mysample}-R1.fastq"
R2="${mysample}-R2.fastq"
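Paired-end alignment expects the two FASTQ files to contain the same number of reads in the same order. A rough sanity check, assuming uncompressed FASTQ with exactly four lines per record, is to compare the read counts of both mates. `fastq_reads` is a hypothetical helper introduced only for this illustration.

```bash
# Hypothetical helper: number of reads in an uncompressed FASTQ file,
# assuming the standard four lines per record.
fastq_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

mysample="SAMPLE-12345"
R1="${mysample}-R1.fastq"
R2="${mysample}-R2.fastq"

if [ -r "$R1" ] && [ -r "$R2" ]; then
    if [ "$(fastq_reads "$R1")" -eq "$(fastq_reads "$R2")" ]; then
        echo "Read counts match"
    else
        echo "Read counts differ between $R1 and $R2"
    fi
else
    echo "FASTQ files not found for $mysample"
fi
```

Equal counts do not prove the mates are in the same order, but a mismatch is a clear sign that one of the files is truncated or mislabelled.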

Now align the read pair (R1, R2) against the reference index and count reads per gene according to the annotations in the GTF file.

In [6]:

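# --genomeFastaFiles is not passed here: the genome is already in the index,
# and at the mapping stage STAR treats that option as extra sequences to add.
time "$mystar" \
    --twopassMode Basic \
    --quantMode GeneCounts \
    --genomeDir "$mystarindex" \
    --sjdbGTFfile "$mygtf" \
    --runThreadN 16 \
    --readFilesIn "$R1" "$R2" \
    --outFileNamePrefix "$mysample"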

bash: /opt/NGS/STAR/STAR-2.5.2b/bin/Linux_x86_64_static/STAR: No such file or directory

real    0m0.001s
user    0m0.000s
sys     0m0.001s
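With `--quantMode GeneCounts`, a successful run writes a tab-separated file named after the output prefix followed by `ReadsPerGene.out.tab`. Its first four rows are summary counters (N_unmapped, N_multimapping, N_noFeature, N_ambiguous), and each remaining row holds a gene ID followed by counts for unstranded, forward-stranded and reverse-stranded protocols. The sketch below extracts the unstranded column; `gene_counts` is a hypothetical helper, and which column is appropriate depends on the library preparation.

```bash
# Hypothetical helper: print gene ID and unstranded count (column 2),
# skipping the N_* summary rows at the top of the file.
gene_counts() {
    awk -F'\t' '$1 !~ /^N_/ { print $1 "\t" $2 }' "$1"
}

mysample="SAMPLE-12345"
counts="${mysample}ReadsPerGene.out.tab"

if [ -r "$counts" ]; then
    gene_counts "$counts" | head
else
    echo "No counts file yet: $counts"
fi
```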

For additional information, see the STAR manual:

http://labshare.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STAR.posix/doc/STARmanual.pdf
