Improvement in Speed and Accuracy of Multiple Sequence Alignment Program PRIME

  • Yamada Shinsuke
    Waseda University; Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST)
  • Gotoh Osamu
    Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST); Kyoto University
  • Yamana Hayato
    Waseda University

Abstract

Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in both accuracy and speed. We have developed the MSA program PRIME, whose crucial feature is a group-to-group sequence alignment algorithm with a piecewise linear gap cost, and have shown that it is one of the most accurate MSA programs currently available. However, PRIME is slower than other leading MSA programs. To improve its computational performance, we incorporate anchoring and grouping heuristics into PRIME. The anchoring method locates well-conserved regions in a given MSA and uses them as anchor points to restrict the region of the dynamic programming (DP) matrix that must be examined. The grouping method detects conserved subfamily alignments, as specified by a phylogenetic tree, in a given MSA to reduce the number of iterative refinement steps. Benchmark tests on BAliBASE 3.0 and PREFAB 4 indicate that these heuristics reduce the computation time of PRIME by more than 60%, while the average alignment accuracy measures decrease by at most 2%. Additionally, we evaluated the effectiveness of an iterative refinement algorithm based on maximal expected accuracy (MEA). Our experiments revealed that, when many sequences are aligned, the MEA-based algorithm significantly improves alignment accuracy compared with the standard version of PRIME, at the expense of a considerable increase in computation time.
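
The anchoring idea in the abstract can be illustrated with a minimal sketch. It is not PRIME's implementation: it assumes a pairwise alignment with a simple linear gap cost (PRIME uses group-to-group alignment with a piecewise linear gap cost), and the anchors are supplied by hand rather than detected from a given MSA. The function names `nw_score` and `anchored_score` and all scoring parameters are hypothetical, chosen only for illustration.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap cost).

    Returns (score, number_of_dp_cells_filled).
    """
    prev = [j * gap for j in range(len(b) + 1)]  # first DP row
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # first DP column
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1], (len(a) + 1) * (len(b) + 1)


def anchored_score(a, b, anchors, match=1, mismatch=-1, gap=-2):
    """Align only the segments between anchors.

    anchors: list of (i, j, n) with a[i:i+n] == b[j:j+n], sorted and
    colinear. This is an assumed input format; a real tool would detect
    conserved regions itself.
    """
    score, cells = 0, 0
    pa = pb = 0  # end of the previous anchor in a and in b
    for i, j, n in anchors:
        if i < pa or j < pb or a[i:i + n] != b[j:j + n]:
            raise ValueError("anchors must be colinear exact-match blocks")
        s, c = nw_score(a[pa:i], b[pb:j], match, mismatch, gap)
        score += s + n * match  # segment score plus the fixed anchor block
        cells += c
        pa, pb = i + n, j + n
    s, c = nw_score(a[pa:], b[pb:], match, mismatch, gap)
    return score + s, cells + c


a = "ACGTTTGACCA"
b = "ACGTTGACGCA"
anchors = [(0, 0, 4), (6, 5, 3)]  # hand-picked blocks "ACGT" and "GAC"

full, full_cells = nw_score(a, b)
anch, anch_cells = anchored_score(a, b, anchors)
print("full DP:     score", full, "cells", full_cells)
print("anchored DP: score", anch, "cells", anch_cells)
```

Because the anchored alignment is one particular global alignment, its score can never exceed the unconstrained optimum, while the number of DP cells drops sharply. This mirrors the trade-off reported in the abstract: a large reduction in computation for a small possible loss in accuracy.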


Details

  • CRID
    1390001205264232192
  • NII Article ID
    130000120674
  • DOI
    10.11185/imt.4.317
  • ISSN
    18810896
  • Text language code
    en
  • Data source type
    • JaLC
    • CiNii Articles
  • Abstract license flag
    Use not permitted
