Gain and Loss of Functional TFBs

From HORTS 1993


Objective

  • Developed a model that detects loss of constraint on individual transcription factor binding sites (TFBs)
  • Found lineage-specific loss of functional constraint on TFBs in 3 yeast species
  • There is also evidence of a high rate of binding site gain.
  • Conclude that functional cis-regulatory binding sites are NOT CONSERVED
  • This lack of conservation contributes to species-specific gene expression

Introduction

  • Changes in cis-regulatory sequences are very common
  • They are mostly characterized by insertions/deletions of repeats and transposable elements
  • Even if gene expression is conserved phenotypically, the TFBs are not necessarily conserved, meaning that there is a complex relationship between divergence in sequence and divergence in function
  • Investigating individual TFBs is necessary to understand their evolution.
  • To compute the loss or gain of TFBs,
    • the model should account for sequence-specific variation that has no effect on function or fitness
    • Combining models for neutral evolution with models for conserved binding sites can estimate the gain or loss of TFBs (e.g. in Drosophila)
    • A phylogenetic approach can also be used to identify the loss of functional TFBs.
  • Here, a phylogenetic approach is applied to 4 species of yeast (Saccharomyces species)
    • Together, the 4 species provide sufficient signal to identify individual binding sites
    • Found that 44/91 different transcription factors are NOT conserved across species, and in some cases this affects genome regulation

Results

  1. Model to identify semi-conserved TFBs
    1. Semi-conserved sites are usually identified by their pattern of substitution rates
    2. By combining neutral, conserved, and semi-conserved models, semi-conserved sites can be identified.
    3. Statistics are:
      1. substitution rate
      2. substitution rate -> strength of selection
      3. synonymous substitution rate -> neutral substitution rate
  2. Frequency of semi-conserved binding sites in the 4 genomes
    1. Aligned the sequences -> used 91 TFBs models -> for 2000 positions
      1. 55% -> best explained by Conserved model
      2. 31% -> best explained by Semi-conserved model
      3. 14% -> best explained by Neutral model
      4. Plots showing the frequency of conserved/semi-conserved binding sites
  3. Characterization of semi-conserved sites
    1. In principle, conserved and semi-conserved sites correspond to sites with high and low binding energy, respectively
    2. Graph illustrating that conserved binding sites have higher binding energy
How do you measure the binding energy for each binding site computationally?
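
One common way to approach this question, not necessarily the method used in the paper, is to score each candidate site against a position weight matrix (PWM) and treat the log-odds score as a proxy for binding energy. The sketch below assumes a made-up 4-bp motif (`PFM`) and an assumed AT-rich background (`BACKGROUND`); both are hypothetical values chosen only for illustration, and a higher score is read as stronger predicted binding.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif with consensus ACGT.
# Each row gives the assumed base frequencies at one motif position.
PFM = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]

# Assumed background base frequencies (illustrative, not measured here).
BACKGROUND = {"A": 0.31, "C": 0.19, "G": 0.19, "T": 0.31}


def log_odds_score(site, pfm=PFM, background=BACKGROUND):
    """Sum of per-position log2(motif frequency / background frequency)."""
    if len(site) != len(pfm):
        raise ValueError("site length must match motif length")
    return sum(
        math.log2(pfm[i][base] / background[base])
        for i, base in enumerate(site.upper())
    )


def best_site(sequence, pfm=PFM, background=BACKGROUND):
    """Scan every window of motif length; return (score, offset, site) of the best one."""
    width = len(pfm)
    windows = (
        (log_odds_score(sequence[i:i + width], pfm, background), i, sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )
    return max(windows)


score, offset, site = best_site("TTACGTTT")
print(site, offset, round(score, 2))
```

Under this convention a site matching the consensus gets the highest score, and each mismatch lowers it, which is one way the "high" versus "low" binding energy distinction above could be made quantitative.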
  1. Evolution of semi-conserved sites
    1. Lineage-specific loss is due to:
      1. a species experiencing a new environment
      2. Gain of redundant binding sites within a promoter, which enables loss of a previously constrained binding site
    2. Table 1 shows the substitution rates in conserved/semi-conserved sites
      1. In comparison, species-specific sites are present within 50% of the promoters with conserved sites
      2. species-specific sites are present within 47% of the promoters with semi-conserved sites
    3. The rate of turnover differs between species
      1. 50% of sites show turnover in S. cerevisiae
      2. 55% of sites show turnover in S. paradoxus
      3. 60% of sites show turnover in S. mikatae
    4. Similarly, the loss of sites also differs between species
  2. Substitutions associated with site loss cause changes in expression
  3. Limited conservation of experimentally identified TFBs
    1. A substantial fraction of experimentally identified binding sites appear to be species-specific or weakly conserved across species, suggesting that binding site gain may be common
  4. Conclusion:
    1. Provide an efficient method to identify loss of constraint on a putative binding site sequence
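
The three-way comparison of neutral, conserved, and semi-conserved models in the Results can be illustrated with a toy likelihood sketch. This is not the paper's model: it assumes substitution counts on each lineage are Poisson distributed, that a constrained lineage evolves at a fixed fraction (`constraint`) of the neutral rate, and that the semi-conserved model frees exactly one lineage from constraint. The branch lengths, site length, and constraint value are all hypothetical, and no penalty is applied for the extra freedom of the semi-conserved model.

```python
import math

# Hypothetical neutral branch lengths (expected substitutions per site).
BRANCH_LENGTHS = {
    "S. cerevisiae": 0.1,
    "S. paradoxus": 0.1,
    "S. mikatae": 0.2,
    "S. bayanus": 0.3,
}


def poisson_logpmf(k, mu):
    """Log probability of observing k events under a Poisson with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def log_likelihood(counts, multipliers, site_length):
    """Sum Poisson log-likelihoods over lineages for given rate multipliers."""
    total = 0.0
    for lineage, branch_length in BRANCH_LENGTHS.items():
        mu = site_length * multipliers[lineage] * branch_length
        total += poisson_logpmf(counts[lineage], mu)
    return total


def classify_site(counts, site_length=10, constraint=0.1):
    """Return (model, unconstrained_lineage) with the highest log-likelihood."""
    lineages = list(BRANCH_LENGTHS)
    scores = {
        ("neutral", None): log_likelihood(
            counts, {l: 1.0 for l in lineages}, site_length),
        ("conserved", None): log_likelihood(
            counts, {l: constraint for l in lineages}, site_length),
    }
    # Semi-conserved: one lineage evolves neutrally, the rest stay constrained.
    for free in lineages:
        multipliers = {l: (1.0 if l == free else constraint) for l in lineages}
        scores[("semi-conserved", free)] = log_likelihood(
            counts, multipliers, site_length)
    return max(scores, key=scores.get)


# Substitutions observed only on the S. mikatae lineage (made-up counts).
example = {"S. cerevisiae": 0, "S. paradoxus": 0, "S. mikatae": 3, "S. bayanus": 0}
print(classify_site(example))
```

Tallying the winning model over all aligned sites would give percentages of the kind reported above for the conserved, semi-conserved, and neutral models.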