Determining what is needed to upload environmental sequencing data to NCBI can be challenging. Here is a short guide to help you through the process. 

Uploading Illumina MiSeq Environmental Data to NCBI

- Start at the Submission portal: https://submit.ncbi.nlm.nih.gov/ 

 

- What do you want to submit? "rRNA 16S" (or whichever option matches your samples)

 

- Where do you want to submit? SRA 

 

- Provide a project name and description

 

- Choose the sample type: "Metagenome or environmental sample" for environmental samples.

  • more information can be found here

 

- Prepare the sample metadata, with one row per sample. Download the Excel template here.

- Descriptions of all possible variables can be found here.


- Required metadata

  • sample_name sample ID

  • collection_date (YYYY-MM-DD) Must not include a time stamp. To drop the time in Excel, use this formula: =TEXT(DATEVALUE(TEXT(E3,"yyyy-mm-dd")),"yyyy-mm-dd")

  • isolation_source Describes the physical, environmental and/or local geographical source of the biological sample from which the sample was derived. e.g. "A mix of two samples: one from near bottom of the lake, and one from the surface, both collected at the deepest part of the lake."

  • geo_loc_name (Enter the country, e.g. "USA" for the United States, followed by a colon and more specifics, e.g. USA:Colorado:Pear Reservoir)

  • lat_lon (You can enter "not collected" if the coordinates are not readily available.)

  • organism (BioSample treats environmental samples as metagenomes, and the metagenome type is specified in this column, e.g. lakes are "freshwater metagenome". A list of current suggestions can be found here.) (Interested in more information? The response from the BioSample help email can be found here.)
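Before pasting the sample metadata into the template, it can help to check the rows for the problems described above. The following is a minimal sketch, not an official NCBI validator: the function name `check_sample_rows` is hypothetical, the column names are assumed to match the template headers (including `geo_loc_name` with a single underscore), and the rules cover only what this guide mentions (required fields present, unique sample names, a date without a time stamp, and a colon after the country).

```python
import re
from datetime import datetime

# Required columns as listed in this guide (assumed to match the template headers).
REQUIRED = ["sample_name", "collection_date", "isolation_source",
            "geo_loc_name", "lat_lon", "organism"]


def is_plain_date(text):
    """True if text is a real calendar date written exactly as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def check_sample_rows(rows):
    """Return a list of human-readable problems; an empty list means none found.

    `rows` is a list of dicts, one per sample, keyed by column name.
    """
    problems = []
    seen = set()
    for i, row in enumerate(rows, start=1):
        # Every required column needs a non-empty value.
        for field in REQUIRED:
            if not str(row.get(field) or "").strip():
                problems.append(f"row {i}: missing {field}")

        # Sample names must be unique across the sheet.
        name = str(row.get("sample_name") or "").strip()
        if name and name in seen:
            problems.append(f"row {i}: duplicate sample_name {name!r}")
        seen.add(name)

        # The date must not carry a time stamp.
        date = str(row.get("collection_date") or "").strip()
        if date and not is_plain_date(date):
            problems.append(f"row {i}: collection_date {date!r} is not YYYY-MM-DD")

        # Country first, then a colon and the more specific location.
        geo = str(row.get("geo_loc_name") or "").strip()
        if geo and ":" not in geo:
            problems.append(f"row {i}: geo_loc_name {geo!r} has no colon after the country")
    return problems
```

The rows can come from `csv.DictReader` run over a tab-delimited export of the spreadsheet. Passing this check does not guarantee that the submission portal will accept the sheet.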

 

- Provide sequence metadata

  • sample_name sample ID (needs to match the sample metadata above)

  • library_ID same as sample ID (likely)

  • instrument_model Illumina MiSeq

  • library_strategy AMPLICON (for 16S amplicon libraries)

  • library_source METAGENOMIC

  • library_selection PCR

  • library_layout paired

  • platform ILLUMINA

  • filename forward reads (fastq.gz)

  • filename2 reverse reads (fastq.gz)
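With dozens of samples, typing the sequence metadata by hand is error-prone, so the table can be generated from the file names instead. The sketch below is one possible approach under stated assumptions: the function names are hypothetical, the files are assumed to be named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz` (adjust the suffixes to your own naming scheme), and `library_strategy` is left as an argument so you can pass whichever value you chose in the list above.

```python
import csv
import io

# Column order follows the list in this guide.
COLUMNS = ["sample_name", "library_ID", "library_strategy", "library_source",
           "library_selection", "library_layout", "platform",
           "instrument_model", "filename", "filename2"]

# Assumed naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
SUFFIXES = (("_R1.fastq.gz", "filename"), ("_R2.fastq.gz", "filename2"))


def build_sra_rows(filenames, library_strategy):
    """Pair forward/reverse files by sample and return one dict per sample."""
    pairs = {}
    for name in sorted(filenames):
        for suffix, column in SUFFIXES:
            if name.endswith(suffix):
                sample = name[: -len(suffix)]
                pairs.setdefault(sample, {})[column] = name

    rows = []
    for sample, files in sorted(pairs.items()):
        if set(files) != {"filename", "filename2"}:
            raise ValueError(f"{sample}: missing forward or reverse read file")
        rows.append({
            "sample_name": sample,   # must match the sample metadata sheet
            "library_ID": sample,    # same as the sample ID
            "library_strategy": library_strategy,
            "library_source": "METAGENOMIC",
            "library_selection": "PCR",
            "library_layout": "paired",
            "platform": "ILLUMINA",
            "instrument_model": "Illumina MiSeq",
            "filename": files["filename"],
            "filename2": files["filename2"],
        })
    return rows


def rows_to_tsv(rows):
    """Render the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A file list from `os.listdir` on the folder holding the reads can be passed straight to `build_sra_rows`, and the text from `rows_to_tsv` can be pasted into the sequence metadata sheet. Raising an error on an unpaired sample is a deliberate choice here, since a paired layout needs both files.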

 

- Upload your fastq files for each sample. When I did this step, it took about 15 minutes to upload 68 files of around 3 MB each.
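Since the upload takes a while, a quick local check of each file beforehand can save a repeat attempt. This is a rough sketch rather than a full FASTQ validator: the function name `check_fastq_gz` is hypothetical, and it only confirms that the file exists, opens as gzip, and that its first record has the usual four-line FASTQ shape.

```python
import gzip
from pathlib import Path


def check_fastq_gz(path):
    """Return None if the file looks like gzipped FASTQ, else a short message."""
    path = Path(path)
    if not path.is_file():
        return f"{path}: file not found"

    # Read only the first record (4 lines); this does not scan the whole file.
    try:
        with gzip.open(path, "rt") as handle:
            record = [handle.readline().rstrip("\n") for _ in range(4)]
    except (OSError, EOFError, UnicodeDecodeError) as err:
        return f"{path}: not readable as gzip text ({err})"

    header, sequence, separator, quality = record
    if not header.startswith("@") or not separator.startswith("+"):
        return f"{path}: first record does not look like FASTQ"
    if not sequence or len(sequence) != len(quality):
        return f"{path}: sequence and quality lengths differ in first record"
    return None
```

Calling it on every name in the `filename` and `filename2` columns and printing any message that is not `None` gives a short to-do list before the upload starts.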

© 2017 by Kim Vincent 
