Reference Genome and Annotation¶
- NCBI Pseudomonas syringae Genome Page
- FASTA: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/805/GCF_000007805.1_ASM780v1/GCF_000007805.1_ASM780v1_genomic.fna.gz
- GFF: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/805/GCF_000007805.1_ASM780v1/GCF_000007805.1_ASM780v1_genomic.gff.gz
The following can be used to download the reference genome sequence and annotation for our strain of Pseudomonas syringae:
In [1]:
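ACCESSION="GCF_000007805.1_ASM780v1"
PREFIX="${ACCESSION}_genomic"
GFF="${PREFIX}.gff"   # annotation
FNA="${PREFIX}.fna"   # genome sequence, as named on the NCBI server
FA="${PREFIX}.fa"
GENOME_DIR=XXXXXXXXX
FIRST_PART="ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/805"
# Make sure the destination exists, then fetch the gzipped files over rsync
mkdir -p "${GENOME_DIR}"
for CUR in "$GFF" "$FNA" ; do
    rsync "rsync://${FIRST_PART}/${ACCESSION}/${CUR}.gz" "${GENOME_DIR}/"
done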
Warning Notice!
You are accessing a U.S. Government information system which includes this
computer, network, and all attached devices. This system is for
Government-authorized use only. Unauthorized use of this system may result in
disciplinary action and civil and criminal penalties. System users have no
expectation of privacy regarding any communications or data processed by this
system. At any time, the government may monitor, record, or seize any
communication or data transiting or stored on this information system.
-------------------------------------------------------------------------------
Welcome to the NCBI rsync server.
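The files fetched above arrive gzip-compressed. A minimal sketch of how they might be checked and unpacked follows. The helper names `count_fasta_records` and `decompress_keep` and the toy file are illustrative assumptions, not part of the course materials. The toy file stands in for the real download so the sketch runs without network access.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Count sequence records in a gzipped FASTA by counting header lines ('>').
count_fasta_records() {
    gzip -dc "$1" | grep -c '^>' || true
}

# Decompress FILE.gz to FILE in the same directory, keeping the .gz original.
decompress_keep() {
    gzip -dc "$1" > "${1%.gz}"
}

# Toy stand-in for the downloaded genome (hypothetical content).
DEMO_DIR=$(mktemp -d)
printf '>chr\nACGT\n>plasmidA\nGGCC\n' | gzip -c > "${DEMO_DIR}/toy_genomic.fna.gz"

gzip -t "${DEMO_DIR}/toy_genomic.fna.gz"            # exits non-zero if the archive is corrupt
count_fasta_records "${DEMO_DIR}/toy_genomic.fna.gz"
decompress_keep "${DEMO_DIR}/toy_genomic.fna.gz"
ls "${DEMO_DIR}"
```

For the real data, the same calls would take `"${GENOME_DIR}/${FNA}.gz"` and `"${GENOME_DIR}/${GFF}.gz"` as arguments (the record count only makes sense for the FASTA file).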
Adapter¶
The universal adapter and the 3’ common portion of the indexed adapter
that we used for this year’s project are the same as for the 2015 data,
so you can use the same adapter file. We have provided an adapter file
with these sequences in
/home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/info/neb_adapters.fasta
Data Overview¶
Raw Data¶
We have 4 datasets this year, each in its own subdirectory of /data/HTS_2017_data/raw_data/:
- HTS_2017_pilot: One MiSeq run generated from a pool of 6 pilot samples
- HTS_2017_miseq_1: One MiSeq run generated from a pool of all 48 samples (8 groups x 6 samples per group)
- HTS_2017_miseq_2: Second MiSeq run from the same pool of 48 samples used in HTS_2017_miseq_1
- HTS_2017_nextseq: One NextSeq run on a pool of 42 samples
Manifest¶
A manifest for the FASTQ files is available at:
/home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/info/fastq_manifest.csv
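One quick way to get oriented in a manifest like this is to list its columns and count its rows. The sketch below assumes a plain comma-separated file with a single header line. The helper names and the toy columns (`sample`, `lane`, `fastq`) are made up for illustration, since the real column names are not listed here.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Print the column names of a CSV, one per line.
# ASSUMPTION: the header has no quoted fields containing commas.
csv_columns() {
    head -n 1 "$1" | tr ',' '\n'
}

# Count data rows, i.e. every line after the header.
csv_data_rows() {
    tail -n +2 "$1" | wc -l | tr -d ' '
}

# Toy manifest with hypothetical columns so the sketch runs anywhere.
DEMO_MANIFEST=$(mktemp)
printf 'sample,lane,fastq\n1,L001,demo_1_L001.fastq.gz\n1,L002,demo_1_L002.fastq.gz\n' > "$DEMO_MANIFEST"

csv_columns "$DEMO_MANIFEST"
csv_data_rows "$DEMO_MANIFEST"
```

To inspect the real manifest, pass the fastq_manifest.csv path given above in place of `"$DEMO_MANIFEST"`.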
Notes¶
- NextSeq run does not include samples 31-36
- NextSeq has 4 contiguous lanes, so each sample has four FASTQs, one per lane (L001-L004)
- NextSeq data is 75bp single-end reads, so it is not directly comparable to the MiSeq data, which is 50bp single-end
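Because each NextSeq sample is split across four lane files, it can be convenient to join them into one file per sample before further processing. The following is a rough sketch of one way to do that, not necessarily how the course counts were produced. The function name `merge_lanes`, the file-name pattern `<SAMPLE>_L00<n>.fastq.gz`, and the output name are assumptions. The real file names should be taken from the manifest.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Concatenate the four per-lane FASTQs of one sample into a single gzipped FASTQ.
# Usage: merge_lanes SAMPLE IN_DIR OUT_DIR
# ASSUMPTION: inputs are named <SAMPLE>_L001.fastq.gz ... <SAMPLE>_L004.fastq.gz
# (hypothetical pattern).
merge_lanes() {
    local sample=$1 in_dir=$2 out_dir=$3
    local lane files=()
    for lane in L001 L002 L003 L004; do
        files+=("${in_dir}/${sample}_${lane}.fastq.gz")
    done
    # gzip files can be joined byte-for-byte; the result is still a valid gzip stream.
    cat "${files[@]}" > "${out_dir}/${sample}_all_lanes.fastq.gz"
}

# Toy data: one 4-line FASTQ record per lane for a sample called "demo".
IN_DIR=$(mktemp -d)
OUT_DIR=$(mktemp -d)
for lane in L001 L002 L003 L004; do
    printf '@read_%s\nACGT\n+\nIIII\n' "$lane" | gzip -c > "${IN_DIR}/demo_${lane}.fastq.gz"
done

merge_lanes demo "$IN_DIR" "$OUT_DIR"

# Each FASTQ record is 4 lines, so reads = lines / 4.
echo $(( $(gzip -dc "${OUT_DIR}/demo_all_lanes.fastq.gz" | wc -l) / 4 ))
```

With `set -e`, a missing lane file makes `cat` fail and stops the script, which is usually preferable to silently merging fewer than four lanes.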
Count Data¶
/home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/counts/:
- HTS_2017_pilot: Counts from the HTS_2017_pilot MiSeq run
- HTS_2017_miseq_1: Counts from the HTS_2017_miseq_1 MiSeq run
- HTS_2017_both_miseq: Counts generated from the concatenation of the HTS_2017_miseq_1 and HTS_2017_miseq_2 runs for each sample
- HTS_2017_nextseq: Counts generated from the concatenation of all four NextSeq lanes for each sample
Metadata¶
/home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/info/full_metadata.csv