Gain and Loss of Functional TFBs
Objective
- Developed a model that detects loss of constraint on individual transcription factor binding sites (TFBs)
- Found lineage-specific loss of functional constraint on TFBs in 3 yeast species
- There are also cases with a high rate of binding-site gain.
- Concluded that functional cis-regulatory binding sites are often NOT CONSERVED across species
- This lack of conservation contributes to species-specific gene expression
Introduction
- Changes in cis-regulatory sequences are common
- Mostly characterized by insertions/deletions of repeats and transposable elements
- Even when gene expression is phenotypically conserved, the TFBs may not be conserved, which implies a complex relationship between sequence divergence and functional divergence
- Investigating individual TFBs is necessary to understand their evolution.
- To compute the loss or gain of TFBs,
- the model should account for sequence-specific variation that has no effect on function or fitness
- Combining models of neutral evolution with models of conserved binding sites can estimate the gain or loss of TFBs (e.g., in Drosophila)
- A phylogenetic approach can also be used to identify the loss of functional TFBs.
- Here, a phylogenetic approach is applied to 4 yeast species (Saccharomyces species)
- Together, the 4 species provide sufficient signal to identify individual binding sites
- Found that binding sites for 44/91 different transcription factors are NOT conserved across species, and in some cases this affects genome regulation
Results
- Model to identify semi-conserved TFBs
- Semi-conserved sites are usually identified by their pattern of substitution rates
- By combining neutral, conserved, and semi-conserved models, semi-conserved sites can be identified.
- Statistics are:
- substitution rate
- substitution rate -> indicates the strength of selection
- synonymous substitution rate -> serves as the neutral substitution rate
- Frequency of semi-conserved binding sites in the 4 genomes
- Aligned the sequences -> used 91 TFBs models -> for 2000 positions
- 55% -> best explained by Conserved model
- 31% -> best explained by Semi-conserved model
- 14% -> best explained by Neutral model
- Plots showing the frequency of conserved/semi-conserved binding sites
- Characterization of semi-conserved sites
- In principle, conserved and semi-conserved sites correspond to sites with high and low binding energy, respectively
- Graph illustrating that conserved binding sites have higher binding energy
How do you measure the binding energy for each binding site computationally?
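One common computational proxy for binding energy is the log-odds score of a site under a position weight matrix (PWM): sequences closer to the factor's preferred motif score higher. The sketch below is an illustration only and is not necessarily the method used in the paper; the 4-position matrix, the uniform background frequencies, and the function name `log_odds_score` are all hypothetical.

```python
import math

# Hypothetical 4-position motif: per-position base probabilities (each row sums to 1).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

# Assumed uniform background; a real genome would need measured base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_score(site, pwm=PWM, background=BACKGROUND):
    """Sum of per-position log2(motif probability / background probability)."""
    site = site.upper()
    if len(site) != len(pwm):
        raise ValueError("site length must match the PWM width")
    return sum(
        math.log2(column[base] / background[base])
        for column, base in zip(pwm, site)
    )


# The consensus sequence scores higher than a site with one mismatch.
consensus_score = log_odds_score("ACGT")
mismatch_score = log_odds_score("ACGA")
```

Under this proxy, a site that has accumulated mismatches in one lineage would show a lower score in that species than its orthologs in the others.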
- Evolution of semi-conserved sites
- Lineage specific loss is due to:
- a species experiences a new environment
- Gain of redundant binding sites within a promoter enables loss of a previously constrained binding site
- Table 1 shows the substitution rates in conserved/semi-conserved sites
- In comparison, species-specific sites are present within 50% of the promoters with conserved sites
- species-specific sites are present within 47% of the promoters with semi-conserved sites
- Rate of turnover differs among species
- 50% of sites show turnover in S. cerevisiae
- 55% of sites show turnover in S. paradoxus
- 60% of sites show turnover in S. mikatae
- similarly, the loss of sites also differs for each species
- Substitutions that cause site loss lead to changes in expression
- Limited conservation of Experimentally identified TFBs
- A substantial fraction of experimentally identified binding sites appear to be species-specific or weakly conserved across species, suggesting that binding site gain may be common
- Conclusion:
- Provides an efficient method to identify loss of constraint on a putative binding site sequence
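The idea of assigning each site to the best-fitting of the neutral, conserved, and semi-conserved models can be illustrated with a toy likelihood comparison. This is a simplified sketch under stated assumptions, not the paper's model: substitution counts per branch are treated as Poisson, the semi-conserved model relaxes constraint on exactly one branch, and the names `classify_site`, `neutral_rate`, and `constraint` (with their default values) are hypothetical.

```python
import math


def poisson_loglik(k, expected):
    """Log-probability of k substitutions when `expected` are expected."""
    return k * math.log(expected) - expected - math.lgamma(k + 1)


def classify_site(subs, branch_lengths, neutral_rate=1.0, constraint=0.2):
    """Return (best_model, log_likelihoods) for one aligned binding site.

    subs[i] is the substitution count on branch i; branch_lengths[i] is the
    number of substitutions expected on that branch under neutrality.
    `constraint` scales the neutral rate on branches under selection (assumed).
    """
    def loglik(relaxed_branch=None, all_neutral=False):
        total = 0.0
        for i, (k, t) in enumerate(zip(subs, branch_lengths)):
            neutral_here = all_neutral or i == relaxed_branch
            rate = neutral_rate if neutral_here else neutral_rate * constraint
            total += poisson_loglik(k, rate * t)
        return total

    scores = {
        "neutral": loglik(all_neutral=True),
        "conserved": loglik(),
        # Toy simplification: try each branch as the relaxed one, keep the best.
        "semi-conserved": max(loglik(relaxed_branch=i) for i in range(len(subs))),
    }
    return max(scores, key=scores.get), scores


# Substitutions concentrated on one branch favor the semi-conserved model.
best_model, model_scores = classify_site([0, 0, 0, 3], [1.0, 1.0, 1.0, 1.0])
```

Because the semi-conserved model here gets to choose its relaxed branch without any penalty, a real analysis would need a proper statistical test before calling a site semi-conserved.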