Identification and analysis of functional elements

This paper is about visualizing and analyzing the functional landscape of the human genome.

The Encyclopedia of DNA Elements (ENCODE) Project [9] aims to provide a
more biologically informative representation of the human genome by using
high-throughput methods to identify and catalogue the functional elements
encoded.


35 groups involved
30 Mb of sequence (about 1% of the genome) analysed as 44 regions
         15 Mb known, 14 regions;
         15 Mb unclassified, 30 regions

Salient findings

The human genome is extensively transcribed
Many non-coding transcripts identified (miRNA?)
Many previously unrecognized transcription start sites identified
Regulatory regions are symmetrically distributed around transcription start sites
Replication timing is correlated with chromatin structure(?!)
About 5% of bases are under evolutionary constraint (negative selection) in mammals
Many functional elements appear evolutionarily unconstrained, potentially acting as a warehouse for natural selection
Functional elements vary greatly in their sequence variability across the human population

Transcription

GENCODE: integrated annotation that combines manual review of cDNA and protein evidence with experimental testing

Presence of a large number of unannotated transcribed elements
1. validated by RT-PCR (40%)
2. RACE extension of TxFrags (transcribed fragments) to GENCODE-annotated genes usually spans 50-200 kb
Presence of pseudogenes
Presence of non-protein-coding RNA
Primary transcripts
1. Combining all three technologies (GENCODE, RACE and PET tags) covers more of the ENCODE regions than any single technology.
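
The coverage comparison above can be sketched as a union of intervals: each technology marks some bases of a region as belonging to a primary transcript, and the combined coverage is the number of bases marked by at least one of them. This is only an illustration under assumptions; the function names, the track labels and the toy coordinates are hypothetical and are not data or methods from the paper.

```python
def merge(intervals):
    """Merge overlapping half-open (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(intervals):
    """Number of bases covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))


# Hypothetical toy intervals on a 1,000-base region (not data from the paper).
REGION_LENGTH = 1000
tracks = {
    "annotation": [(100, 300), (600, 700)],
    "RACE": [(250, 450)],
    "PET": [(50, 150), (650, 900)],
}

# Fraction of the region covered by each technology on its own.
per_track = {name: covered_bases(ivs) / REGION_LENGTH for name, ivs in tracks.items()}

# Fraction covered when the intervals of all technologies are pooled.
all_intervals = [iv for ivs in tracks.values() for iv in ivs]
combined = covered_bases(all_intervals) / REGION_LENGTH

if __name__ == "__main__":
    for name, fraction in per_track.items():
        print(f"{name}: {fraction:.2f}")
    print(f"combined: {combined:.2f}")
```

Because the combined set is a union, its coverage can never be lower than that of the best single technology; how much higher it is depends on how little the tracks overlap.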
Regulation of transcripts
1. various methods were used to identify regulatory elements and build a transcription start site (TSS) catalogue
2. different categories of TSS
Replication
Chromatin organisation

Evolutionary constraint and population variability

Data
- 206 Mb of sequence orthologous to the ENCODE regions from 14 mammalian species
- Targeted sequencing strategy: individual BACs isolated and sequenced
- Alignment with TBA [94], MAVID [95] and MLAGAN
- GERP [87], SCONE [98] and BinCons used to identify sequences under constraint
- Intra-species variation assessed from SNP data

Constrained vs non-constrained
- examined measures of human variation (heterozygosity, derived-allele-frequency spectra and indel rates) within the sequences of the experimentally identified functional elements
  - only a small portion of sequence is constrained; of it, 32% is coding sequence and 40% is unannotated sequence
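
Two of the variation measures named above, heterozygosity and the derived-allele-frequency spectrum, can be computed from per-site allele counts. The sketch below assumes biallelic SNPs, uses the standard expected heterozygosity 2p(1 - p), and bins frequencies into equal-width classes; the function names, the sample size and the two toy site sets are hypothetical and do not reproduce the paper's data or pipeline.

```python
def expected_heterozygosity(derived_freq):
    """Expected heterozygosity 2p(1 - p) at a biallelic site with derived-allele frequency p."""
    return 2.0 * derived_freq * (1.0 - derived_freq)


def mean_heterozygosity(derived_counts, n_chromosomes):
    """Average expected heterozygosity over a set of sites."""
    values = [expected_heterozygosity(c / n_chromosomes) for c in derived_counts]
    return sum(values) / len(values)


def daf_spectrum(derived_counts, n_chromosomes, n_bins=5):
    """Count sites per derived-allele-frequency bin (equal-width bins on [0, 1])."""
    spectrum = [0] * n_bins
    for count in derived_counts:
        # Integer arithmetic avoids floating-point surprises at bin edges.
        index = min(count * n_bins // n_chromosomes, n_bins - 1)
        spectrum[index] += 1
    return spectrum


# Hypothetical derived-allele counts out of 20 sampled chromosomes.
N_CHROMOSOMES = 20
set_a = [1, 1, 2, 1, 3]    # toy set skewed towards rare derived alleles
set_b = [4, 10, 7, 15, 2]  # toy set with more common derived alleles

if __name__ == "__main__":
    for label, counts in (("set_a", set_a), ("set_b", set_b)):
        print(label, mean_heterozygosity(counts, N_CHROMOSOMES),
              daf_spectrum(counts, N_CHROMOSOMES))
```

A site set skewed towards rare derived alleles, as in the toy `set_a`, yields a lower mean heterozygosity and a spectrum concentrated in the first bin; this is the kind of signal usually read as evidence of negative selection.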

Experimentally identified functional elements and genetic variation
- within constrained sequence, coding sequences show excessive polymorphism
- in general, non-coding sequences show excessive polymorphism

Unexplained constrained sequences
- 40% of the ENCODE-region sequences identified as constrained are not associated with any experimental evidence of function.

Unconstrained experimentally identified functional elements
- an unexpectedly large fraction of experimentally identified functional elements shows no evidence of evolutionary constraint, ranging from 93% for Un.TxFrags to 12% for CDS.
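
A percentage like those above can be read as the share of an element class's bases that falls outside the constrained sequence. The sketch below shows one way to compute such a share from two interval sets; it is a minimal illustration under assumptions, with hypothetical function names and toy coordinates, and it is not the procedure used in the paper.

```python
def merge(intervals):
    """Merge overlapping half-open (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_bases(a, b):
    """Number of bases shared by two interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            total += end - start
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def unconstrained_fraction(elements, constrained):
    """Share of element bases that do not overlap any constrained interval."""
    element_bases = sum(end - start for start, end in merge(elements))
    return (element_bases - overlap_bases(elements, constrained)) / element_bases


# Hypothetical toy coordinates (not data from the paper).
constrained = [(0, 50), (200, 260)]
elements = [(40, 140), (250, 350)]

if __name__ == "__main__":
    print(unconstrained_fraction(elements, constrained))
```

Swapping the two arguments gives the complementary question: the share of constrained bases that is not covered by any experimentally identified element.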

Hypotheses for the presence of unconstrained functional elements

Presence of miRNA: the parent transcript of an intronic miRNA harbours the constrained bases
Transcription of intergenic regions or specific factor binding
General: the presence of neutral (or near-neutral) biochemical elements, of lineage-specific functional elements, and of functionally conserved but non-orthologous elements
