Daphnia genome assemblies, 2020 assessment, in-progress results

Genome assembly information

Daphnia pulex
=========================================

Dpulex19/daphplx_2019ml
---------------------------------------
assembled size: 190 Mb (186 Mb - gaps)
493 total scaffolds, longest 7.6 Mb

NCBI Report
# Assembly name: PA42 4.1
# Organism name: Daphnia pulex (common water flea)
# Taxid: 6669
# BioSample: SAMEA4059181
# BioProject: PRJEB14656
# Submitter: INDIANA UNIVERSITY BLOOMINGTON
# Date: 2019-09-30
# Assembly type: haploid
# Release type: major
# Assembly level: Scaffold
# Genome representation: full
# WGS project: FLTH02
# Genome coverage: 80x
# RefSeq category: Representative Genome
# GenBank assembly accession: GCA_900092285.2
#
## Assembly-Units:
## GenBank Unit Accession  RefSeq Unit Accession  Assembly-Unit name
## GCA_900092284.2                                Primary Assembly
---------------------------------------

Dpulex16/daphplx_2016ml
---------------------------------------
assembled size: 156 Mb (143 Mb - gaps)
1823 scaffolds total, longest 1.7 Mb
Paper published 2017
# GenBank assembly accession: GCA_900092285.1 replaced by GCA_900092285.2
---------------------------------------

Dpulex06/daphplx_2006jgi
---------------------------------------
assembled size: 220 Mb
Sanger sequenced, paired end reads, WGS assembly by JGI
Population sampled differs from daphplx_2016,19ml
Paper published 2010, JGI release 1b, dpulex_jgi060905

NCBI Report
# Assembly name: V1.0
# Organism name: Daphnia pulex (common water flea)
# Taxid: 6669
# BioSample: SAMN02744063
# BioProject: PRJNA12756
# Submitter: US DOE Joint Genome Institute (JGI-PGF)
# Date: 2011-02-04
# Assembly type: haploid
# Release type: major
# Assembly level: Scaffold
# Genome representation: full
# WGS project: ACJG01
# Assembly method: JAZZ v. 1.0 (4/11/2006)
# Genome coverage: 8.7x
# Sequencing technology: Sanger
# GenBank assembly accession: GCA_000187875.1
---------------------------------------

Daphnia magna
=========================================

Daphnia magna assemblies, daphmag_genoasms1512.txt
  dmag10asm    : 454 gdna reads, 2010 Newbler version
  dmag14bgi2   : illumina gdna assembly, 2014 BGI v2
  nwbdmag24g7d : 454 gdna reads as 2010 with 2013 Newbler version

Sizes
  dmag10asm/nwb24orig_asm  n= 40356, totlen=131 Mb (108-gaps), n50=64 , largest 3.1 Mb
  nwbdmag24g7d_asm         n= 41303, totlen=134 Mb (117-gaps), n50=68 , largest 2.8 Mb
  dmag14bgiv2_asm          n=326016, totlen=230 Mb (210-gaps), n50=180 , largest 6.2 Mb

Dmagna gene x chrasm mapping, Gene set is dmag7finall9b (Dapma7bEVm000000)
               nSingle   nDup  nTotal
  dmag10asm      25729    434   26163
  dmag24nwb7d    25856    602   26458
  dmag14bgi2     25312   3115   28427

#UPD dmag20 new asm set, in progress: dmag20ug6maca25v, dmag20sk4maca20ok (chr level)
# .. maybe add brief parag. for other asm trials?

daphmag15nwblg2r
---------------------------------------
This assembly uses the published DNA and scaffolds of daphmag_2010nwb, and
linkage map data from these two sources:

  Dukić, M., Berner, D., Roesti, M. et al. A high-density genetic map reveals
  variation in recombination rate across the genome of Daphnia magna.
  BMC Genet 17, 137 (2016). https://doi.org/10.1186/s12863-016-0445-7

  Routtu et al. An SNP-based second-generation genetic map of Daphnia magna
  and its application to QTL analysis of phenotypic traits.
  BMC Genomics 2014, 15:1033. http://www.biomedcentral.com/1471-2164/15/1033

Chromosome scaffolds are placed in the 10 linkage groups identified with this
data, using http://catchenlab.life.illinois.edu/chromonomer/ software.

Linkage groups and Mbase size are
  DC01 13.5, DC02 16.1, DC03 9.4, DC04 9.1, DC05 8.5,
  DC06 9.3, DC07 9.5, DC08 9.4, DC09 7.2, DC10 9.9

This table dmag15nwblg2r_chrs.agp maps scaffolds onto DC chromosomes.
Some 20 Mb of unplaced scaffolds are included as well.
These sequences are in dmag15nwblg2r_all.fa.gz, with dmag15nwblg2r_all.facount
tabulating chromosome + unplaced scaffold sizes.

daphmag_2010nwb
---------------------------------------
assembled size: 134 Mb, est. size 227 Mb with unassembled reads
41303 scaffolds total, largest 2.8 Mb
Published to NCBI GenBank

stats for nwbdmag24g7d:
  Roche Solexa gDNA paired reads, Newbler 2013 assembler
  read sizes ~700bp, pair insert sizes 500 to 20,000bp
  13x coverage peakDepth, 3405 scaffolds, N50=545719,
  estimatedGenomeSize = 227.0 MB (Newbler software est.)
  64% of reads assembled
  Unassembled 36% reads:
    Repeats    = 3332814, 21.22%;  << meaningful, missing chr parts
    Singletons = 1736006, 11.05%;  << missing chromosome parts
    Partial    =  562419,  3.58%

Unassembled gDNA reads mapped to chrasm and trasm:
  transcript set: dmagset7finloc9b.gff exons
  nx=176534; exonw=303.7; sw=53630391; =~ 46% of chrasm

NCBI Report
# Assembly Statistics Report
# Assembly name: daphmag2.4
# Organism name: Daphnia magna (crustaceans)
# Infraspecific name: strain=Xinb3
# Sex: pooled male and female
# Taxid: 35525
# BioSample: SAMN03703141
# BioProject: PRJNA298946
# Submitter: Indiana University
# Date: 2016-04-26
# Assembly type: haploid
# Release type: major
# Assembly level: Scaffold
# Genome representation: full
# WGS project: LRGB01
# Assembly method: Newbler v. 2.3 091027_1459
# Expected final version: no
# Genome coverage: 13.0x
# Sequencing technology: 454
# GenBank assembly accession: GCA_001632505.1
#
## Assembly-Units:
## GenBank Unit Accession  RefSeq Unit Accession  Assembly-Unit name
## GCA_001632515.1                                Primary Assembly
---------------------------------------

daphmag_2012bgi
---------------------------------------
assembled size: 230 Mb
Illumina gDNA paired reads
read sizes ~100bp, pair insert sizes ??
326016 scaffolds total, largest 6.2 Mb
Unpublished draft assembly
---------------------------------------

Dmagna19/daphmag_2019sk
---------------------------------------
assembled size: 123 Mb
10 assembled chromosomes, 8 Mb to 16 Mb, 110 Mb total
4193 scaffolds total
Population sampled differs from daphmag_2010, 2012
Paper published 2019

NCBI Report
# Assembly name: ASM399081v1
# Organism name: Daphnia magna (crustaceans)
# Infraspecific name: strain=SK
# Sex: pooled male and female
# Taxid: 35525
# BioSample: SAMN10036042
# BioProject: PRJNA490418
# Submitter: Sungkyunkwan University
# Date: 2019-01-07
# Assembly type: haploid
# Release type: major
# Assembly level: Chromosome
# Genome representation: full
# WGS project: QYSF01
# Assembly method: Platanus v. 1.2.4; Chromonomer v. 1.08
# Expected final version: no
# Genome coverage: 109.0x
# Sequencing technology: Illumina HiSeq
# RefSeq category: Representative Genome
# GenBank assembly accession: GCA_003990815.1
# RefSeq assembly accession: GCF_003990815.1
# RefSeq assembly and GenBank assemblies identical: no
#
## Assembly-Units:
## GenBank Unit Accession  RefSeq Unit Accession  Assembly-Unit name
## GCA_003990845.1         GCF_003990845.1        Primary Assembly
##                         GCF_011266035.1        non-nuclear
---------------------------------------

Daphnia carinata
=========================================

Dcari20/daphcari_2020cn
---------------------------------------
assembled size: 132 Mb
10 assembled chromosomes, 6 Mb to 16 Mb, 97 Mb total
454 scaffolds total
Paper published 2020

NCBI Report
# Assembly name: ASM1316709v1
# Organism name: Daphnia carinata (crustaceans)
# Infraspecific name: strain=WSL
# Sex: female
# Taxid: 120202
# BioSample: SAMN13175665
# BioProject: PRJNA587065
# Submitter: Huazhong Agricultural University
# Date: 2020-05-26
# Assembly type: haploid
# Release type: major
# Assembly level: Chromosome
# Genome representation: full
# WGS project: WJBH01
# Assembly method: wtbdg v. Mar-2017; Canu v. 1.50; FALCON v. 0.30
# Expected final version: yes
# Genome coverage: 262.0x
# Sequencing technology: Illumina HiSeq 2500; PacBio Sequel; HiSeq X Ten
# GenBank assembly accession: GCA_013167095.1
#
## Assembly-Units:
## GenBank Unit Accession  RefSeq Unit Accession  Assembly-Unit name
## GCA_013167105.1                                Primary Assembly
---------------------------------------

Drosophila pseudoobscura
=========================================

Dropse20
--------
assembled size: 160 Mb

NCBI Report
# Assembly name: UCI_Dpse_MV25
# Organism name: Drosophila pseudoobscura (flies)
# Infraspecific name: strain=MV2-25
# Sex: female
# Taxid: 7237
# BioSample: SAMN13616452
# BioProject: PRJNA596268
# Submitter: University of California, Irvine
# Date: 2020-03-03
# Assembly type: haploid
# Release type: major
# Assembly level: Chromosome
# Genome representation: full
# WGS project: WVEN01
# Assembly method: Canu v. 1.7
# Expected final version: yes
# Genome coverage: 280.0x
# Sequencing technology: PacBio Sequel
# GenBank assembly accession: GCA_009870125.2
# RefSeq assembly accession: GCF_009870125.1
# RefSeq assembly and GenBank assemblies identical: no
---------------------------------------
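
Note on reading the NCBI Report blocks: every report above uses "# key: value" header lines, with "##" lines reserved for the Assembly-Units table. The Python sketch below is one possible way to collect those header lines into a dictionary so that assemblies can be compared by field. It is not part of the original assessment. The function name parse_ncbi_report is hypothetical, and the sample text is a shortened copy of the Daphnia carinata report.

```python
def parse_ncbi_report(text):
    """Collect '# key: value' header lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip non-comment lines and the '##' Assembly-Units table rows.
        if not line.startswith("#") or line.startswith("##"):
            continue
        # Split on the first colon only, so values such as
        # 'Canu v. 1.50; FALCON v. 0.30' are kept whole.
        key, sep, value = line.lstrip("#").partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Shortened copy of the Daphnia carinata report from this file.
report = """\
# Assembly name: ASM1316709v1
# Organism name: Daphnia carinata (crustaceans)
# Assembly level: Chromosome
# Genome coverage: 262.0x
# GenBank assembly accession: GCA_013167095.1
#
## Assembly-Units:
## GCA_013167105.1 Primary Assembly
"""

info = parse_ncbi_report(report)
print(info["Assembly name"], info["Assembly level"], info["GenBank assembly accession"])
```

Lines without a colon, such as "# Assembly Statistics Report", and bare "#" separators are skipped rather than treated as errors.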