Constructing a predictive model for the mutagenicity of organic compounds

Abstract

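We built a statistical model that can determine, with high accuracy, whether an arbitrary organic compound is mutagenic. We constructed multiple SVM models that take structural descriptors as input and showed that integrating their outputs allows the results of the reverse mutation test to be predicted with high accuracy. The model was built and evaluated on a data set of 6,512 compounds, and the prediction accuracy on the test set was 79.60%.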

The objective of this study is to construct a model that can predict the results of the reverse mutation test with high accuracy. To this end, we propose a novel ensemble modeling method in which many support vector machine (SVM) models are constructed as sub-models and integrated to predict mutagenicity. Each sub-model is built from a randomly selected part of the original data matrix, using randomly determined SVM parameters. After the sub-models are constructed, a fixed number of those with the highest accuracy are selected and integrated to predict mutagenicity. To evaluate the proposed method, we constructed an ensemble model using the reverse mutation test data set collected by Hansen et al. [K. Hansen, et al., J. Chem. Inf. Model., 49, 2077-2081]. The resulting ensemble model achieved an accuracy of 79.6%. The area under the ROC curve (AUC) is 0.866, slightly better than that reported by Hansen et al. We therefore conclude that ensemble modeling with SVM sub-models is a promising method for predicting the mutagenicity of organic molecules.
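The procedure described in the abstract (random sub-matrices, random SVM parameters, selection of the most accurate sub-models, integration) can be illustrated with a minimal sketch. Several choices here are assumptions, not details from the paper: the sub-models are linear SVMs trained with a Pegasos-style stochastic subgradient method as a dependency-free stand-in (the abstract does not state the kernel or the software used), a "part of the data matrix" is taken to mean a random subset of both rows and columns, sub-models are ranked by accuracy on a held-out validation split, and integration is a simple majority vote. All function names are hypothetical, and the data are synthetic, not the Hansen et al. benchmark.

```python
import random

def train_linear_svm(X, y, lam, epochs, rng):
    """Pegasos-style SGD for a linear SVM; labels are +1/-1.
    A constant 1.0 is appended to each row, so the last weight acts as the bias."""
    rows = [list(x) + [1.0] for x in X]
    w = [0.0] * len(rows[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(rows)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, rows[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, rows[i])]
    return w

def predict_one(model, x):
    """A sub-model is (column indices, weights); weights[-1] is the bias."""
    cols, w = model
    score = sum(wj * x[c] for wj, c in zip(w, cols)) + w[-1]
    return 1 if score >= 0 else -1

def accuracy(predict, X, y):
    return sum(predict(x) == yi for x, yi in zip(X, y)) / len(y)

def build_ensemble(X_tr, y_tr, X_val, y_val, n_sub=30, n_keep=5, seed=0):
    """Train n_sub randomized sub-models and keep the n_keep most accurate."""
    rng = random.Random(seed)
    n_rows, n_cols = len(X_tr), len(X_tr[0])
    candidates = []
    for _ in range(n_sub):
        # Random part of the data matrix: a subset of rows and of columns.
        rows = rng.sample(range(n_rows), max(2, int(0.7 * n_rows)))
        cols = sorted(rng.sample(range(n_cols), max(1, n_cols // 2)))
        lam = 10 ** rng.uniform(-3, -1)  # randomly drawn SVM parameter
        X_sub = [[X_tr[r][c] for c in cols] for r in rows]
        y_sub = [y_tr[r] for r in rows]
        model = (cols, train_linear_svm(X_sub, y_sub, lam, epochs=5, rng=rng))
        acc = accuracy(lambda x: predict_one(model, x), X_val, y_val)
        candidates.append((acc, model))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [m for _, m in candidates[:n_keep]]

def ensemble_predict(ensemble, x):
    """Majority vote over the selected sub-models (a tie counts as +1)."""
    votes = sum(predict_one(m, x) for m in ensemble)
    return 1 if votes >= 0 else -1

def make_toy_data(n, rng):
    """Synthetic stand-in for descriptor data: 4 informative + 4 noise columns."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        informative = [2.0 * label + rng.gauss(0, 1) for _ in range(4)]
        noise = [rng.gauss(0, 1) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y

rng = random.Random(42)
X, y = make_toy_data(300, rng)
X_tr, y_tr = X[:150], y[:150]
X_val, y_val = X[150:225], y[150:225]
X_te, y_te = X[225:], y[225:]

ensemble = build_ensemble(X_tr, y_tr, X_val, y_val)
test_acc = accuracy(lambda x: ensemble_predict(ensemble, x), X_te, y_te)
print(f"toy test accuracy: {test_acc:.3f}")
```

Ranking sub-models on a split that is separate from the final test set keeps the reported test accuracy from being inflated by the selection step. Whether the authors used such a split is not stated in the abstract.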

Journal

  • Proceedings of the Symposium on Chemoinformatics

    Proceedings of the Symposium on Chemoinformatics 2011(0), O1-O1, 2011

    Division of Chemical Information and Computer Sciences, The Chemical Society of Japan
