-
- Nguyen Tri Thanh
- School of Information Science, Japan Advanced Institute of Science and Technology
-
- Nguyen Le Minh
- School of Information Science, Japan Advanced Institute of Science and Technology
-
- Shimazu Akira
- School of Information Science, Japan Advanced Institute of Science and Technology
Abstract
Question classification, an important phase in question answering systems, is the task of identifying the type of a given question among a set of predefined types. This study uses unlabeled questions in combination with labeled questions for semi-supervised learning to improve the precision of the question classification task. As the semi-supervised algorithm, we selected Tri-training because it is a simple but efficient co-training style algorithm. However, Tri-training is not well suited to question data, so we make two proposals to modify it. To give its three classifiers different initial hypotheses, Tri-training bootstrap-samples the original labeled set to obtain a different training set for each classifier. This bootstrap sampling lowers the precision of the three classifiers. To avoid this drawback, we allow each classifier to be trained initially on the full original labeled set while still ensuring diversity among the three classifiers. Our first proposal is to use multiple algorithms for the classifiers in Tri-training; the second is to use multiple algorithms for the classifiers in combination with multiple views. Our experiments show promising results.
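The first proposal in the abstract (three classifiers built with different algorithms, each trained initially on the whole labeled set, then extended with unlabeled questions on which the other two agree) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three toy classifiers (`NaiveBayes`, `NearestNeighbor`, `FirstWord`), the labels `HUM` and `LOC`, and the example questions are hypothetical and are not the algorithms or data used in the paper. The error-rate conditions of the original Tri-training algorithm are also omitted, so this is a simplified co-labeling loop rather than a full implementation.

```python
import math
from collections import Counter, defaultdict

def tokens(question):
    return question.lower().split()

class NaiveBayes:
    """Multinomial naive Bayes over words with add-one smoothing."""
    def fit(self, X, y):
        self.prior = Counter(y)
        self.counts = defaultdict(Counter)
        for q, label in zip(X, y):
            self.counts[label].update(tokens(q))
        self.vocab = {w for c in self.counts.values() for w in c}
        return self
    def predict(self, q):
        def score(label):
            total = sum(self.counts[label].values()) + len(self.vocab)
            return math.log(self.prior[label]) + sum(
                math.log((self.counts[label][w] + 1) / total) for w in tokens(q))
        return max(sorted(self.prior), key=score)

class NearestNeighbor:
    """1-nearest neighbor using Jaccard similarity of word sets."""
    def fit(self, X, y):
        self.data = [(set(tokens(q)), label) for q, label in zip(X, y)]
        return self
    def predict(self, q):
        t = set(tokens(q))
        return max(self.data, key=lambda d: len(t & d[0]) / len(t | d[0]))[1]

class FirstWord:
    """Predicts the most frequent label seen for the question's first word."""
    def fit(self, X, y):
        self.by_head = defaultdict(Counter)
        for q, label in zip(X, y):
            self.by_head[tokens(q)[0]][label] += 1
        self.default = Counter(y).most_common(1)[0][0]
        return self
    def predict(self, q):
        head = tokens(q)[0]
        if head in self.by_head:
            return self.by_head[head].most_common(1)[0][0]
        return self.default

def tri_train(labeled, unlabeled, max_rounds=10):
    # Diversity comes from using three different algorithms, so every
    # classifier starts from the full labeled set (no bootstrap sampling).
    makers = [NaiveBayes, NearestNeighbor, FirstWord]
    X = [q for q, _ in labeled]
    y = [label for _, label in labeled]
    clfs = [make().fit(X, y) for make in makers]
    previous = [None, None, None]
    for _ in range(max_rounds):
        preds = [[c.predict(q) for q in unlabeled] for c in clfs]
        updated = False
        for i, make in enumerate(makers):
            j, k = [n for n in range(3) if n != i]
            # Questions on which the other two classifiers agree become
            # pseudo-labeled training data for classifier i.
            pseudo = [(q, preds[j][t]) for t, q in enumerate(unlabeled)
                      if preds[j][t] == preds[k][t]]
            if pseudo != previous[i]:
                previous[i] = pseudo
                clfs[i] = make().fit(X + [q for q, _ in pseudo],
                                     y + [label for _, label in pseudo])
                updated = True
        if not updated:
            break
    return clfs

def classify(clfs, question):
    # Final prediction is a majority vote of the three classifiers.
    return Counter(c.predict(question) for c in clfs).most_common(1)[0][0]

# Hypothetical toy data, for illustration only.
LABELED = [("who wrote hamlet", "HUM"), ("who discovered penicillin", "HUM"),
           ("where is tokyo", "LOC"), ("where is the nile river", "LOC")]
UNLABELED = ["who painted the mona lisa", "where is mount fuji",
             "who founded the company"]
clfs = tri_train(LABELED, UNLABELED)
print(classify(clfs, "who invented the telephone"))
print(classify(clfs, "where is kyoto"))
```

In this sketch the only source of diversity is the choice of algorithm. The second proposal in the abstract additionally gives the classifiers different views of each question, which would correspond to swapping the shared `tokens` function for a separate feature extractor per classifier.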
Published in
-
- Journal of Natural Language Processing (自然言語処理)
-
Journal of Natural Language Processing 15 (1), 3-21, 2008
The Association for Natural Language Processing (一般社団法人 言語処理学会)
Details
-
- CRID
- 1390001204474202368
-
- NII Article ID
- 130004291937
- 10021115853
-
- NII Bibliographic ID
- AN10472659
-
- ISSN
- 21858314
- 13407619
-
- NDL Bibliographic ID
- 9362145
-
- Text language code
- en
-
- Data source type
-
- JaLC
- NDL
- Crossref
- CiNii Articles
-
- Abstract license flag
- Use not permitted