
Daphnia pulex: Rich in tandem gene duplications

One aspect of genome biology that is difficult to model is clusters of duplicate genes. The close, near-identical coding exons can confuse most methods that use alignment, including BLAST, BLAT, GeneWise and similar gene mappers that align a protein to find genes.

Daphnia pulex's genome appears to have from 50% more to twice as many gene duplications as the duplicate-rich C. elegans genome. See this table of Selected Daphnia duplicate genes with homologs to other organisms.

Gene duplication is widespread in C. elegans, more so than in Drosophila or yeast [1], with about 6,000 genes identified as paralogous. Around 400 clusters of tandem genes have been found in this worm. In the exon-duplicate analysis of Daphnia duplications summarized here, about 20% of 27,000 gene predictions in Daphnia appear to be near-duplicates, compared to 10% of 20,000 C. elegans genes analyzed with the same methods, and 6% of 14,500 Drosophila melanogaster genes.
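As a quick check on scale, the quoted percentages can be turned into approximate gene counts. This is only arithmetic on the rounded figures given above, not output from the analysis itself; the dictionary layout and function name are illustrative.

```python
# Rounded figures quoted in the text: (total gene predictions, percent near-duplicates).
genomes = {
    "Daphnia pulex": (27_000, 20),
    "C. elegans": (20_000, 10),
    "Drosophila melanogaster": (14_500, 6),
}

def near_duplicate_count(total_genes, percent):
    """Approximate number of near-duplicate genes implied by a percentage."""
    return total_genes * percent // 100

counts = {name: near_duplicate_count(n, pct) for name, (n, pct) in genomes.items()}

for name, count in counts.items():
    print(f"{name}: about {count} near-duplicate genes")
```

Because the inputs are rounded ("about 20%"), the resulting counts should be read as rough magnitudes only.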

[Image: Daphnia hemoglobin gene cluster]

A stunning example is Daphnia's well-known hemoglobin (Hb) genes, which are called into play as Daphnia rapidly turn blood-red when the oxygen in their ponds is depleted. Four Hb genes had been described prior to whole-genome analysis [2]. We now find a tandem cluster of 8 hemoglobin genes on scaffold_4. Together with another tandem pair on scaffold 13, this provides the organism with 10 near-identical Hb genes.

With over 40 genes for cytochrome P450, and large numbers of duplicates for other common gene families, this organism may be ahead of other eukaryotes, barring polyploid genomes, in gene duplications. Duplication clusters (of N >= 3) found in this tandem exon analysis number 168 in Drosophila melanogaster, 338 in Drosophila mojavensis, 680 in C. elegans, and 919 in Daphnia pulex.
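To make the "clusters of N >= 3" count concrete, here is a minimal sketch of one way such clusters could be tallied. It is not the method used for the numbers above. It assumes that N counts genes, that each gene already carries a duplicate-family label, and that a cluster is a run of adjacent genes on one scaffold sharing that label. The function name and the toy coordinates are hypothetical.

```python
from itertools import groupby

def count_tandem_clusters(genes, min_size=3):
    """Count runs of >= min_size adjacent genes sharing a scaffold and family.

    genes: iterable of (scaffold, start, family) tuples. The family label is
    assumed to come from some prior duplicate analysis.
    """
    # Sort by scaffold, then position, so neighbours in the list are
    # neighbours on the scaffold.
    ordered = sorted(genes, key=lambda g: (g[0], g[1]))
    clusters = 0
    # groupby splits the ordered list into consecutive runs with equal keys.
    for _key, run in groupby(ordered, key=lambda g: (g[0], g[2])):
        if len(list(run)) >= min_size:
            clusters += 1
    return clusters

# Toy data loosely modelled on the hemoglobin example: 8 adjacent Hb genes on
# one scaffold and a pair on another. Coordinates are made up.
toy_genes = [("scaffold_4", 1000 * i, "Hb") for i in range(1, 9)]
toy_genes += [("scaffold_13", 5000, "Hb"), ("scaffold_13", 7000, "Hb"),
              ("scaffold_13", 9000, "other")]

print(count_tandem_clusters(toy_genes))
```

With these toy inputs only the 8-gene run qualifies; the tandem pair falls below the N >= 3 threshold.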

tRNA clusters : With 3800 computed tRNAs, this genome has 10 times more than any other eukaryote genome. These overabundant tRNAs often appear in clusters of 1 to 4 types over a span of 1 kb, with several copies of the same types over 10 kb. These clusters also appear to be repeated 4-5 times within the same 100 kb scaffold region. They are often associated with particular stress ESTs (the same ones map to the same tRNA cluster locations), such as those from bacterial infection or titanium nanoparticle exposure, and show some genome tiling expression. Associates have also located 5 copies of RNAseP genes that use tRNAs, where 2 are uncommon among other genomes. Further analysis of this pattern of overabundant tRNAs may give important clues to Daphnia's general gene duplication mechanisms and functionality. Are they caused by transposons or some other mechanism? Are they functional, perhaps as part of rapid adaptive responses to environmental stresses? Find tRNA data here.
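The clustering described here can be illustrated with a simple distance-based grouping of tRNA start positions on one scaffold. This is a sketch under assumptions, not the procedure behind the tRNA data: the function name is hypothetical, the 1 kb gap threshold simply echoes the span mentioned above, and the positions are invented.

```python
def cluster_positions(positions, max_gap=1000):
    """Group positions (bp) so that neighbours within max_gap share a cluster."""
    clusters = []
    for pos in sorted(positions):
        # Extend the current cluster if this position is close enough to the
        # last one added; otherwise start a new cluster.
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters

# Invented tRNA start coordinates on a single scaffold.
toy_positions = [100, 400, 900, 12000, 12500, 50000]
toy_clusters = cluster_positions(toy_positions)
print(toy_clusters)
```

Raising `max_gap` to 10 kb or 100 kb would give the coarser groupings the text refers to, at the cost of merging unrelated neighbours.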

Protein duplicates : Protein duplicate analysis is the simplest and most common way to identify gene paralogs. This study of Daphnia has shown that gene-finding methods often fail to model tandem duplicate genes accurately, so predicted proteins are subject to error for this genome. With that caveat, this analysis of predicted proteins also shows a larger number of duplicates than in related genomes, in particular for nearby duplicate genes.

Tandemgenes, or 'tandy', software has been developed to address the problem of gene prediction for tandem duplicate genes. Tandem duplicates can be nearly identical (95+% identity) and very close (within intron distances of each other), and they make for very interesting biology. This is described in more detail here. Supporting data for Daphnia gene duplication is here.
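The two criteria named above, high identity and close spacing, can be written as a simple filter. The sketch below is not the tandy software; it only illustrates the idea. The 95% identity cutoff comes from the text, while the 5000 bp value for "intron distance" is an assumed placeholder, and the function name and example pairs are hypothetical.

```python
def is_tandem_duplicate(identity_pct, gap_bp, min_identity=95.0, max_gap_bp=5000):
    """Flag a gene pair as a likely tandem duplicate.

    identity_pct: percent identity between the two genes' coding sequences.
    gap_bp: distance in bp between the end of one gene and the start of the next.
    max_gap_bp: assumed stand-in for "intron distance"; the text gives no number.
    """
    return identity_pct >= min_identity and 0 <= gap_bp <= max_gap_bp

# Invented example pairs: (percent identity, gap in bp).
toy_pairs = [(98.5, 1200), (96.0, 40000), (80.0, 500)]
flags = [is_tandem_duplicate(identity, gap) for identity, gap in toy_pairs]
print(flags)
```

Only the first pair passes both tests; the second is too far apart and the third too divergent under these assumed thresholds.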


1. Woollard, A., 2005. Gene duplications and genetic redundancy in C. elegans (June 25, 2005). WormBook, ed. The C. elegans Research Community, doi/10.1895/wormbook.1.2.1

2. Gorr, T. A., J. D. Cahn, H. Yamagata, and H. F. Bunn, 2004. Hypoxia-induced Synthesis of Hemoglobin in the Crustacean Daphnia magna Is Hypoxia-inducible Factor-dependent. J. Biol. Chemistry, 279(34):36038-36047, doi/10.1074/jbc.M403981200

Don Gilbert, June 2007