Seq. Project name:
Teredinibacter turnerae T7902
( Project ID: 403887 )
Product:
Standard Draft
Proposal Name:
Marine microbial communities from multiple species of wood-boring bivalves (shipworm isolates)
(Proposal ID: 920)
Project PI:
User Program:
CSP
Program Year:
2011
Scientific Program:
Microbial
Genome Portal:
Release Date:
2012-05-10
Organism | |
---|---|
Genus/species/strain/isolate: | Teredinibacter / Teredinibacter turnerae / T7902 / |
GOLD ID: | Gp0012860 |
Contacts | |
---|---|
JGI: | IMG [email protected] |
Request DNA: | Daniel L Distel <[email protected]> |
General Information | |
---|---|
QD/SAG JGI ISOLATE QC AND ASSEMBLY REPORT - 4094038 Teredinibacter turnerae T7902
1) RAW DATA:
LibraryName NumReads ReadType FileName
ICZC 29983482 2x150 /house/groupdirs/pi/project/4094038/ill_dir/ICZC.2108.1.1763.GATCAG.fastq
QC Dir:/house/groupdirs/pi/project/4094038/ill.qd
2) READ FILTERING STATS:
Pairs of matching reads were removed from the dataset.
Total input reads: 29983482 (100%)
Artifact reads removed: 215614 (0.7%)
Trimmed reads removed: 1487426 (5.0%)
Total reads removed: 1703040 (5.7%)
Total reads remaining: 28280442 (94.3%)
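
The filtering totals above can be re-derived from the three reported counts. The following is a minimal sketch for checking the arithmetic, not part of the JGI pipeline; the variable names and the `pct` helper are illustrative assumptions.

```python
# Counts copied from the READ FILTERING STATS section of this report.
total_input = 29_983_482
artifact_removed = 215_614
trimmed_removed = 1_487_426


def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Removed reads are the sum of both filters; the remainder is what was assembled.
total_removed = artifact_removed + trimmed_removed
remaining = total_input - total_removed

print(f"Artifact reads removed: {artifact_removed} ({pct(artifact_removed, total_input)}%)")
print(f"Trimmed reads removed:  {trimmed_removed} ({pct(trimmed_removed, total_input)}%)")
print(f"Total reads removed:    {total_removed} ({pct(total_removed, total_input)}%)")
print(f"Total reads remaining:  {remaining} ({pct(remaining, total_input)}%)")
```

The computed totals agree with the reported values (1703040 removed, 28280442 remaining).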
3) ASSEMBLY STATS:
b) Velvet assembly:
Assembly stats of the Velvet assembly created by the Velvet
optimizer. The input reads have been artifact filtered (see workflow step 2).
Avg GC Content: 50.38 +/- 6.53%
Largest Contig: 312.1 KB
Main genome scaffold total: 262
Main genome contig total: 262
Main genome scaffold sequence total: 5.3 MB
Main genome contig sequence total: 5.3 MB (-> 0.0% gap)
Main genome scaffold N/L50: 17/94.6 KB
Main genome contig N/L50: 17/94.6 KB
Number of scaffolds > 50 KB: 38
% main genome in scaffolds > 50 KB: 81.3%
Minimum     Number      Number    Total        Total        Scaffold
Scaffold    of          of        Scaffold     Contig       Contig
Length      Scaffolds   Contigs   Length       Length       Coverage
--------    ---------   -------   ----------   ----------   --------
All         262         262       5,269,991    5,269,991    100.00%
1 kb        100         100       5,234,133    5,234,133    100.00%
2.5 kb      86          86        5,210,510    5,210,510    100.00%
5 kb        81          81        5,192,244    5,192,244    100.00%
10 kb       68          68        5,097,975    5,097,975    100.00%
25 kb       55          55        4,883,860    4,883,860    100.00%
50 kb       38          38        4,286,070    4,286,070    100.00%
100 kb      12          12        2,211,687    2,211,687    100.00%
250 kb      1           1         312,063      312,063      100.00%
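
The "N/L50" lines in this report give a count followed by a length (for the Velvet assembly, 17 scaffolds and 94.6 KB). A minimal sketch of how such a pair is commonly computed is shown here. The function name `n_l50` and the toy lengths are hypothetical, since the report does not list individual scaffold lengths, and the pipeline's own implementation may differ in detail.

```python
from typing import List, Tuple


def n_l50(lengths: List[int]) -> Tuple[int, int]:
    """Return (count, length): the smallest number of sequences whose summed
    length reaches half of the assembly, and the length of the last one added."""
    ordered = sorted(lengths, reverse=True)  # longest sequence first
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly, for illustration only.
toy_lengths = [400, 300, 200, 100, 50, 50]
count, length = n_l50(toy_lengths)
print(f"N/L50: {count}/{length}")
```

A larger length at a smaller count indicates a more contiguous assembly.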
c) Allpaths + Velvet simulated read pairs:
Assembly stats of the ALLPATHS assembly. The input contains simulated
1-3 kb read pairs created from the Velvet assembly and reads that
have been artifact filtered (see workflow step 4).
Avg GC Content: 48.92 +/- 4.09%
Largest Contig: 353.6 KB
Main genome scaffold total: 79
Main genome contig total: 79
Main genome scaffold sequence total: 5.4 MB
Main genome contig sequence total: 5.4 MB (-> 0.0% gap)
Main genome scaffold N/L50: 11/176.4 KB
Main genome contig N/L50: 11/176.4 KB
Number of scaffolds > 50 KB: 33
% main genome in scaffolds > 50 KB: 86.5%
Minimum     Number      Number    Total        Total        Scaffold
Scaffold    of          of        Scaffold     Contig       Contig
Length      Scaffolds   Contigs   Length       Length       Coverage
--------    ---------   -------   ----------   ----------   --------
All         79          79        5,383,522    5,383,520    100.00%
1 kb        79          79        5,383,522    5,383,520    100.00%
2.5 kb      68          68        5,367,721    5,367,719    100.00%
5 kb        65          65        5,356,022    5,356,020    100.00%
10 kb       57          57        5,293,243    5,293,241    100.00%
25 kb       47          47        5,133,006    5,133,004    100.00%
50 kb       33          33        4,658,932    4,658,930    100.00%
100 kb      16          16        3,422,996    3,422,996    100.00%
250 kb      4           4         1,248,888    1,248,888    100.00%
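
The "% main genome in scaffolds > 50 KB" values reported for both assemblies follow from the "All" and "50 kb" rows of the two tables. The sketch below recomputes them as a consistency check; the dictionary layout and the helper name `pct_over_50kb` are illustrative assumptions.

```python
# Total scaffold lengths copied from the "All" and "50 kb" rows of the
# Velvet and ALLPATHS tables in this report.
assemblies = {
    "Velvet": {"scaffolds": 262, "all_bp": 5_269_991, "over_50kb_bp": 4_286_070},
    "ALLPATHS": {"scaffolds": 79, "all_bp": 5_383_522, "over_50kb_bp": 4_658_932},
}


def pct_over_50kb(row: dict) -> float:
    """Percentage of the assembly contained in scaffolds longer than 50 kb."""
    return round(100.0 * row["over_50kb_bp"] / row["all_bp"], 1)


for name, row in assemblies.items():
    print(f"{name}: {row['scaffolds']} scaffolds, "
          f"{pct_over_50kb(row)}% of sequence in scaffolds > 50 KB")
```

The results match the reported 81.3% (Velvet) and 86.5% (ALLPATHS).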
4) KEY PIPELINE CMDS:
a) Velvet assembly step for creating simulated read pairs:
Velvet version:
Velvet optimizer version: 2.1.7
Velvet optimizer params: --v --s 51 --e 71 --i 4 --t 1 --f "-shortPaired -fastq $FASTQ" --o "-ins_length 250 -min_contig_lgth 500"
b) Simulated read pairing creation step:
Wgsim version: 0.3.0
Wgsim params: -e 0 -1 76 -2 76 -r 0 -R 0 -X 0
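
With the parameters above, wgsim is asked for 76 bp reads on both ends (`-1 76 -2 76`) with the base error rate (`-e`), mutation rate (`-r`), indel fraction (`-R`) and indel extension probability (`-X`) all set to zero, so the simulated reads are exact copies of the contig sequence. The sketch below illustrates the idea of error-free, inward-facing pair sampling. It is not wgsim's algorithm. The function names, the uniform 1-3 kb fragment size range, and the random test contig are assumptions, as the report does not give the distance settings used.

```python
import random
from typing import Iterator, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pairs(contig: str, n_pairs: int, read_len: int = 76,
                   min_outer: int = 1000, max_outer: int = 3000,
                   seed: int = 0) -> Iterator[Tuple[str, str]]:
    """Yield error-free, inward-facing read pairs sampled from one contig.

    min_outer and max_outer bound the fragment (outer) size; both are
    assumed values chosen to mimic the 1-3 kb range named in the report.
    """
    rng = random.Random(seed)
    if len(contig) < min_outer:
        return  # contig too short to hold even the smallest fragment
    for _ in range(n_pairs):
        outer = rng.randint(min_outer, min(max_outer, len(contig)))
        start = rng.randint(0, len(contig) - outer)
        fragment = contig[start:start + outer]
        read1 = fragment[:read_len]                        # forward strand
        read2 = reverse_complement(fragment[-read_len:])   # reverse strand
        yield read1, read2


# Hypothetical 5 kb contig of random bases, for illustration only.
_rng = random.Random(1)
contig = "".join(_rng.choice("ACGT") for _ in range(5000))
pairs = list(simulate_pairs(contig, n_pairs=5))
print(len(pairs), "pairs; read length", len(pairs[0][0]))
```

Because no errors are introduced, every read (or its reverse complement) can be found verbatim in the source contig.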
c) ALLPATHS assembly step:
ALLPATHS version: r40295
Contents of in_libs.csv:
library_name, project_name, organism_name, type, paired, frag_size, frag_stddev, insert_size, insert_stddev, read_orientation, genomic_start, genomic_end
STD_1,project,assembly,fragment,1,200,35,,,inward,0,0
SIMREADS,project,assembly,jumping,1,,,3000,300,inward,0,0
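
The `in_libs.csv` contents above describe two libraries: the real Illumina fragment library (`STD_1`) and the simulated pairs (`SIMREADS`), declared as a jumping library. A short sketch of reading this file with Python's standard `csv` module is given here as a way to inspect the settings. The embedded string is copied from the report; the variable names are illustrative.

```python
import csv
import io

# Contents of in_libs.csv as listed in this report.
IN_LIBS = """\
library_name, project_name, organism_name, type, paired, frag_size, frag_stddev, insert_size, insert_stddev, read_orientation, genomic_start, genomic_end
STD_1,project,assembly,fragment,1,200,35,,,inward,0,0
SIMREADS,project,assembly,jumping,1,,,3000,300,inward,0,0
"""

# skipinitialspace drops the blank that follows each comma in the header row.
reader = csv.DictReader(io.StringIO(IN_LIBS), skipinitialspace=True)
libraries = list(reader)

for lib in libraries:
    # Fragment libraries use the frag_* columns; jumping libraries use insert_*.
    if lib["type"] == "fragment":
        size, stddev = lib["frag_size"], lib["frag_stddev"]
    else:
        size, stddev = lib["insert_size"], lib["insert_stddev"]
    print(f"{lib['library_name']}: {lib['type']}, "
          f"{size} +/- {stddev} bp, {lib['read_orientation']}")
```

The unused size columns are empty for each library, which is why the sketch selects the column pair by library type.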
5) WORKFLOW STEPS:
1. Removed Illumina artifacts (synthetic oligos used in the laboratory).
2. Created Velvet assembly of the artifact-filtered data (using the Velvet optimizer).
3. Created simulated 1-3 kb read pairs using Velvet contigs from step 2.
4. Created ALLPATHS assembly using Velvet simulated read pairs (step 3) and the artifact-filtered data.
6) RELEASE DATE:
Wed Feb 15 14:45:51 PST 2012 By Andrew Tritt- [email protected]
Wed Feb 15 14:53:37 PST 2012 By Kecia Duffy- [email protected]
Wed Feb 15 16:39:04 PST 2012 By Kecia Duffy- [email protected]
7) AUTHORS:
For additional information, please contact:
Kecia Duffy - [email protected]
Stephan Trong - [email protected]
James Han - [email protected]
This file was automatically generated by the single cell pipeline software (version 1.1.8).
Funding | |
---|---|
The work conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. |
Groups |
---|
## | Name | Type |
---|---|---|
1 | Gammaproteobacteria | |
2 | Host Associated | |
3 | Proteobacteria | |