Using Semi-supervised Learning for Question Classification

  • Nguyent Tri Thanh
    School of Information Science, Japan Advanced Institute of Science and Technology
  • Nguyent Le Minh
    School of Information Science, Japan Advanced Institute of Science and Technology
  • Shimazu Akira
    School of Information Science, Japan Advanced Institute of Science and Technology

Abstract

Question classification, an important phase in question answering systems, is the task of identifying the type of a given question among a set of predefined types. This study combines unlabeled questions with labeled questions in semi-supervised learning to improve the precision of question classification. As the semi-supervised algorithm, we selected Tri-training because it is a simple but efficient co-training-style algorithm. However, Tri-training is not well suited to question data, so we propose two modifications to make it more suitable. To give its three classifiers different initial hypotheses, Tri-training bootstrap-samples the original labeled set to obtain a different training set for each classifier. This bootstrap sampling lowers the precision of the three classifiers. To avoid this drawback, each classifier should be trained initially on the full original labeled set while the diversity of the three classifiers is still ensured. Our first proposal is therefore to use multiple algorithms for the classifiers in Tri-training; the second proposal is to use multiple algorithms for the classifiers in combination with multiple views. Our experiments show promising results.
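
The first proposal can be illustrated with a minimal sketch. It is not the authors' implementation: the three toy classifiers, the question-type labels, and the sample questions are all hypothetical, and the error-estimate checks that the full Tri-training algorithm applies before accepting newly labeled examples are omitted. The sketch shows only the core idea: three classifiers built with different algorithms are each trained on the full labeled set (no bootstrap sampling), and an unlabeled question is added to one classifier's training set when the other two agree on its label.

```python
from collections import Counter, defaultdict

def tokens(question):
    return question.lower().split()

class FirstWordClassifier:
    """Hypothetical classifier: most common label seen for the first word."""
    def fit(self, questions, labels):
        self.default = Counter(labels).most_common(1)[0][0]
        self.by_word = defaultdict(Counter)
        for q, y in zip(questions, labels):
            self.by_word[tokens(q)[0]][y] += 1
        return self

    def predict(self, q):
        counts = self.by_word.get(tokens(q)[0])
        return counts.most_common(1)[0][0] if counts else self.default

class OverlapNearestNeighbor:
    """Hypothetical classifier: 1-nearest neighbour by shared-word count."""
    def fit(self, questions, labels):
        self.examples = [(set(tokens(q)), y) for q, y in zip(questions, labels)]
        return self

    def predict(self, q):
        words = set(tokens(q))
        return max(self.examples, key=lambda ex: len(words & ex[0]))[1]

class WordVoteClassifier:
    """Hypothetical classifier: each word votes for labels it was seen with."""
    def fit(self, questions, labels):
        self.default = Counter(labels).most_common(1)[0][0]
        self.votes = defaultdict(Counter)
        for q, y in zip(questions, labels):
            for w in tokens(q):
                self.votes[w][y] += 1
        return self

    def predict(self, q):
        total = Counter()
        for w in tokens(q):
            total.update(self.votes.get(w, {}))
        return total.most_common(1)[0][0] if total else self.default

def tri_training_round(classifiers, labeled, unlabeled):
    """One simplified round; the error-rate conditions of full Tri-training
    are deliberately left out of this sketch."""
    questions, labels = zip(*labeled)
    for clf in classifiers:
        clf.fit(questions, labels)  # full labeled set, no bootstrap sampling
    augmented = []
    for i in range(3):
        j, k = [c for n, c in enumerate(classifiers) if n != i]
        # Keep only the unlabeled questions on which the other two agree.
        extra = [(q, j.predict(q)) for q in unlabeled
                 if j.predict(q) == k.predict(q)]
        augmented.append(list(labeled) + extra)
    for clf, data in zip(classifiers, augmented):
        clf.fit(*zip(*data))  # retrain on labeled plus agreed-upon examples
    return augmented

def vote(classifiers, question):
    """Final prediction by majority vote of the three classifiers."""
    return Counter(c.predict(question) for c in classifiers).most_common(1)[0][0]

# Hypothetical toy data, for illustration only.
labeled = [
    ("Who wrote Hamlet ?", "HUMAN"),
    ("Where is Hanoi ?", "LOCATION"),
    ("When did the war end ?", "NUMERIC"),
    ("Who founded the company ?", "HUMAN"),
    ("Where was the treaty signed ?", "LOCATION"),
]
unlabeled = ["Who painted the ceiling ?", "Where is the museum ?"]

classifiers = [FirstWordClassifier(), OverlapNearestNeighbor(), WordVoteClassifier()]
augmented = tri_training_round(classifiers, labeled, unlabeled)
print(vote(classifiers, "Who discovered penicillin ?"))
```

Diversity here comes from the different algorithms rather than from different training samples, so no classifier loses labeled data at initialization. The second proposal would additionally give each classifier a different feature view of the same question, which this sketch does not attempt.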

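Published in

  • Journal of Natural Language Processing (自然言語処理)

    Journal of Natural Language Processing 15 (1), 3-21, 2008

    The Association for Natural Language Processing (一般社団法人 言語処理学会)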
