Tile array signal averages at Gene and Exon-Intron boundaries
Tile array scores averaged at base positions around
gene and exon boundaries are shown here
for Daphnia Nimblegen data.
Percent GC content is shown (black) with expected spikes at the boundaries (0 on x-axis):
ATG at coding start, AG intron->exon, and GT exon->intron,
and an increased GC% in exon regions.
For gene and exon starts, bases 0 to 75 lie inside the feature (-75 .. -1 are non-genic),
while for exon and gene ends, bases -75 to 0 lie inside the feature (1 .. 75 are non-genic).
Absolute tile scores for cDNA (red),
gDNA (blue)
and cDNA - gDNA diff (green)
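
The positional averaging described above can be sketched roughly as follows. This is a minimal illustration, not the code used to produce the plots: the data layout (each tile reduced to a single position and score), the strand handling, and the names average_by_offset, cdna_tiles, gdna_tiles and starts are all assumptions made for the example. The diff is taken here as the difference of the two averaged profiles.

```python
from collections import defaultdict

def average_by_offset(tiles, boundaries, flank=75):
    """Average tile scores at each offset (-flank..flank) from a set of boundaries.

    tiles:      list of (position, score) pairs on one scaffold (assumed layout).
    boundaries: list of (position, strand) pairs, strand +1 or -1 (assumed);
                offsets are flipped on the minus strand so that positive
                offsets always point in the direction of transcription.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for bpos, strand in boundaries:
        for tpos, score in tiles:
            offset = (tpos - bpos) * strand
            if -flank <= offset <= flank:
                sums[offset] += score
                counts[offset] += 1
    # Mean score per offset, only for offsets that had at least one tile.
    return {off: sums[off] / counts[off] for off in sorted(counts)}

# Toy data (made up): two gene starts on the forward strand, three tiles each.
starts = [(100, 1), (200, 1)]
cdna_tiles = [(100, 1.0), (101, 3.0), (200, 5.0)]
gdna_tiles = [(100, 0.5), (101, 1.0), (200, 0.5)]

cdna_profile = average_by_offset(cdna_tiles, starts, flank=1)
gdna_profile = average_by_offset(gdna_tiles, starts, flank=1)
# cDNA - gDNA difference at offsets present in both profiles.
diff_profile = {off: cdna_profile[off] - gdna_profile[off]
                for off in cdna_profile if off in gdna_profile}
```

The nested loop is quadratic and only suitable for small examples; for tens of thousands of tiles a sorted index or interval lookup would be the more practical choice.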
Statistical power of tile scores at boundary regions (t-test)
Correlation of GC and tile scores at boundary regions
Statistical power of tile scores at boundary regions (p-value)
Notes
- Absolute cDNA signal clearly tracks gene/exon boundaries.
gDNA signal is much weaker, but it also tracks these boundaries.
- The statistical power of these scores to discriminate boundaries shows
that gDNA is weak, without statistical significance in the sample used.
Removing the gDNA signal from cDNA nevertheless reduces the power to discriminate,
as the two have the same trends at boundaries. The t-statistic and p-value
comparing scores at position 0 with those at each other position are plotted
for each score [t.test(score_0,score_x)].
- Puzzle: This Daphnia Nimblegen gDNA signal
shows the same gene-boundary signals as cDNA, if weaker and less precise.
But it shouldn't, if gDNA is not a transcription signal. What is the gDNA
signal measuring?
- Is gDNA score tracking GC content?
For the GC x tile score correlations, all are low but significantly > 0.
The intron region shows the strongest GC correlation for both gDNA and cDNA.
gDNA score has a significantly greater correlation with GC content than cDNA does.
The exon region has a lower GC correlation for both.
This suggests that intronic gDNA signal and GC content are related in some way,
through a sequence effect that also occurs in the cDNA signal,
but the relation is probably indirect.
Possible causes include repetitive regions or other effects.
- Data currently include
Daphnia pulex sc17 (gnomon, 400 genes, 2000 exons; 47,000 Nimblegen tiles)
and Daphnia sc1 (gnomon, 800 genes, 4000 exons; 130,000 tiles).
The same effects are seen in both scaffolds, with more precision when combined.