Reference Genome and Annotation

The following commands download the reference genome sequence and annotation for our strain of Pseudomonas syringae from NCBI:

In [1]:
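ACCESSION="GCF_000007805.1_ASM780v1"
PREFIX="${ACCESSION}_genomic"
GFF="${PREFIX}.gff"
FNA="${PREFIX}.fna"
FA="${PREFIX}.fa"
# Placeholder: set this to the directory where the genome files should be stored
GENOME_DIR=XXXXXXXXX

# Download the gzipped annotation (GFF) and genome sequence (FNA) over rsync
FIRST_PART="ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/007/805"
for CUR in "$GFF" "$FNA" ; do
    rsync "rsync://${FIRST_PART}/${ACCESSION}/${CUR}.gz" "${GENOME_DIR}"
done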

Warning Notice!

You are accessing a U.S. Government information system which includes this
computer, network, and all attached devices. This system is for
Government-authorized use only. Unauthorized use of this system may result in
disciplinary action and civil and criminal penalties. System users have no
expectation of privacy regarding any communications or data processed by this
system. At any time, the government may monitor, record, or seize any
communication or data transiting or stored on this information system.

-------------------------------------------------------------------------------

Welcome to the NCBI rsync server.
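
The FIRST_PART value in the download cell follows the NCBI genomes directory layout, in which the nine digits of the assembly accession are split into three groups of three. A minimal sketch, assuming that layout holds for other accessions too, of deriving the path rather than hard-coding it. The helper name accession_to_path is hypothetical and not part of any NCBI tool.

```shell
# Hypothetical helper: build the NCBI directory for an assembly accession
# such as GCF_000007805.1_ASM780v1 (layout assumed from the path used above).
accession_to_path() {
    acc="$1"
    kind="${acc%%_*}"      # text before the first underscore, e.g. "GCF"
    rest="${acc#*_}"       # text after the first underscore
    digits="${rest%%.*}"   # nine-digit identifier before the version suffix
    a=$(printf '%s' "$digits" | cut -c1-3)
    b=$(printf '%s' "$digits" | cut -c4-6)
    c=$(printf '%s' "$digits" | cut -c7-9)
    printf '%s\n' "ftp.ncbi.nlm.nih.gov/genomes/all/${kind}/${a}/${b}/${c}"
}

ACCESSION="GCF_000007805.1_ASM780v1"
FIRST_PART=$(accession_to_path "$ACCESSION")
# Print the rsync URL for the annotation file without contacting the server
echo "rsync://${FIRST_PART}/${ACCESSION}/${ACCESSION}_genomic.gff.gz"
```

Printing the URL first is a cheap way to check the path before running rsync against the server.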


Adapter

The universal adapter and the 3’ common portion of the indexed adapter that we used for this year’s project are the same as for the 2015 data, so you can use the same adapter file. We have provided an adapter file with these sequences in /home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/info/neb_adapters.fasta

Data Overview

Raw Data

We have 4 datasets this year. Each is a subdirectory of /data/HTS_2017_data/raw_data/:

  * HTS_2017_pilot: One MiSeq run generated from a pool of 6 pilot samples
  * HTS_2017_miseq_1: One MiSeq run generated from a pool of all 48 samples (8 groups x 6 samples per group)
  * HTS_2017_miseq_2: A second MiSeq run from the same pool of 48 samples used in HTS_2017_miseq_1
  * HTS_2017_nextseq: One NextSeq run on a pool of 42 samples

Manifest

A manifest for the FASTQ files is available at: /home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/info/fastq_manifest.csv

Notes

  1. The NextSeq run does not include samples 31-36
  2. The NextSeq run has 4 contiguous lanes, so each sample has four FASTQ files, one per lane (L001-L004)
  3. The NextSeq data are 75bp single-end reads, so they are not directly comparable to the MiSeq data, which are 50bp single-end reads
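
Because gzip streams can be concatenated directly, the four lane files for a sample can be merged with cat and no decompression. A minimal sketch, assuming Illumina-style file names of the form SAMPLE_L001_R1_001.fastq.gz (check the manifest for the actual names). The helper merge_lanes and the output name are hypothetical.

```shell
# Hypothetical helper: merge the four NextSeq lane files of one sample.
# Usage: merge_lanes SAMPLE_PREFIX IN_DIR OUT_DIR
# The file name pattern below is an assumption, not taken from the manifest.
merge_lanes() {
    sample="$1"; in_dir="$2"; out_dir="$3"
    mkdir -p "$out_dir"
    # Concatenated gzip members form a valid gzip file
    cat "${in_dir}/${sample}_L001_R1_001.fastq.gz" \
        "${in_dir}/${sample}_L002_R1_001.fastq.gz" \
        "${in_dir}/${sample}_L003_R1_001.fastq.gz" \
        "${in_dir}/${sample}_L004_R1_001.fastq.gz" \
        > "${out_dir}/${sample}_R1_merged.fastq.gz"
}
```

Listing the lanes explicitly, rather than using a glob, makes cat report an error if one of the four files is missing.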

Count Data

Pre-generated count data from all samples and sequencing runs are available for anyone who wants to compare groups or runs. The count data are in subdirectories of /home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/counts/:

  * HTS_2017_pilot: Counts from the HTS_2017_pilot MiSeq run
  * HTS_2017_miseq_1: Counts from the HTS_2017_miseq_1 MiSeq run
  * HTS_2017_both_miseq: Counts generated from the concatenation of the HTS_2017_miseq_1 and HTS_2017_miseq_2 runs for each sample
  * HTS_2017_nextseq: Counts generated from the concatenation of all four NextSeq lanes for each sample

Metadata

A metadata table describing each sample is at
/home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/info/full_metadata.csv
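
Before loading the table elsewhere, it can help to check its columns and row count. A minimal sketch, assuming a plain comma-separated file with a single header row, no quoted commas, and a trailing newline. The helper csv_summary is hypothetical.

```shell
# Hypothetical helper: list the column names and count the data rows of a CSV.
# Assumes one header line and no commas inside quoted fields.
csv_summary() {
    csv="$1"
    echo "Columns:"
    head -n 1 "$csv" | tr ',' '\n'
    # Data rows = total lines minus the header line
    rows=$(( $(wc -l < "$csv") - 1 ))
    echo "Rows: ${rows}"
}

# Example call with the path given above:
# csv_summary /home/jovyan/work/2017-HTS-materials/Data_Info_and_Results/2017_HTS/info/full_metadata.csv
```

The same helper can be pointed at fastq_manifest.csv.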