LINEAR REGRESSION MODELS PREDICTING STRENGTH OF TRANSCRIPTIONAL ACTIVITY OF PROMOTERS

Author(s)

Abstract

We developed linear regression models that predict the strength of transcriptional activity of promoters from their sequences. The models were built from intrinsic transcriptional strength data for 451 human promoter sequences, measured by systematic luciferase reporter gene assays in three cell lines (HEK293, MCF7 and 3T3). Each model sums the contributions of CG dinucleotide content and transcription factor binding sites (TFBSs) to transcriptional strength. We evaluated the prediction accuracy of the models by cross-validation and found that, despite their simple formulation, they predict the transcriptional strength of promoters adequately. We also evaluated the statistical significance of the individual contributions and propose a picture of the regulatory code hidden in promoter sequences: CG dinucleotide content mainly determines the strength of transcriptional activity under ubiquitous environments, whereas TFBSs mainly determine it under specific environments.
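The kind of model the abstract describes, a weighted sum of CG dinucleotide content and TFBS contributions checked by cross-validation, might be sketched as follows. This is a minimal illustration only, not the authors' implementation. The abstract does not give the feature definitions, the fitting procedure, or the cross-validation scheme, so these are assumptions: CG content is taken as the fraction of dinucleotide positions equal to "CG", TFBSs are counted as exact matches to hypothetical consensus strings, the weights come from ordinary least squares with a tiny stabilizing term, and the validation is leave-one-out.

```python
def cg_content(seq):
    """Fraction of adjacent dinucleotide positions equal to 'CG' (assumed definition)."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    hits = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return hits / (len(seq) - 1)


def count_sites(seq, motif):
    """Count exact, possibly overlapping matches of a motif (a stand-in for TFBS scanning)."""
    seq, motif = seq.upper(), motif.upper()
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def features(seq, motifs):
    """Feature vector: intercept, CG dinucleotide content, one count per motif."""
    return [1.0, cg_content(seq)] + [float(count_sites(seq, m)) for m in motifs]


def predict(weights, x):
    """Predicted strength is the sum of the weighted contributions."""
    return sum(w * v for w, v in zip(weights, x))


def fit_ols(X, y, ridge=1e-8):
    """Least-squares weights via the normal equations.

    The small `ridge` term is an assumption added only to keep the system
    solvable when features are collinear; it is not from the paper.
    """
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w


def loo_cv_errors(X, y):
    """Leave-one-out residuals: fit on all promoters but one, predict the held-out one."""
    errors = []
    for k in range(len(X)):
        w = fit_ols(X[:k] + X[k + 1:], y[:k] + y[k + 1:])
        errors.append(y[k] - predict(w, X[k]))
    return errors
```

In use, one would build `X = [features(s, motifs) for s in sequences]` from promoter sequences and a chosen motif list, take `y` as the measured reporter activities for one cell line, and inspect `loo_cv_errors(X, y)`. Fitting a separate weight vector per cell line would be one way to compare contributions across environments.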

Journal

  • Genome Informatics

    Genome Informatics 25(1), 53-60, 2011

    Japanese Society for Bioinformatics

Codes

  • NII Article ID (NAID)
    130004567843
  • Text Lang
    ENG
  • ISSN
    0919-9454
  • Data Source
    J-STAGE 