Estimation of Nativeness of Documents Based on Skew Divergence
-
- FUJII HIROSHI
- Hitachi Ltd., Software Division
-
- TOMIURA YOICHI
- Graduate School of Information Science and Electrical Engineering, Kyushu University
-
- TANAKA SHOSAKU
- Ritsumeikan University, College of Letters
Bibliographic Information
- Other Title
-
- Skew Divergenceに基づく文書の母語話者性の推定
Abstract
The automatic discrimination between documents written by native speakers and those written by non-native speakers is an important technique for constructing a high-quality corpus, helping native speakers with writing, and gathering useful knowledge for Second Language Acquisition. This paper proposes a method for such discrimination based on the similarity of part-of-speech trigram distributions. The distributional similarity is measured by Skew Divergence. Skew Divergence is an improved variant of KL Divergence that does not suffer from the zero-frequency problem. Using Skew Divergence requires choosing a value for its parameter α. However, how to choose this value has not been sufficiently discussed. This paper also proposes a method for setting the parameter α. The experimental results show the effectiveness of the proposed method.
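To illustrate the kind of measure the abstract refers to, here is a minimal sketch. It assumes the commonly used definition of skew divergence, KL(p || αq + (1 − α)p), which may differ in orientation or detail from the paper's own formulation. The function names, the toy tag sequences, and the default α = 0.99 are all hypothetical choices for illustration; they are not taken from the paper, and the paper's method for setting α is not reproduced here.

```python
import math
from collections import Counter


def trigram_distribution(tags):
    """Relative frequencies of consecutive part-of-speech trigrams."""
    counts = Counter(zip(tags, tags[1:], tags[2:]))
    total = sum(counts.values())
    return {tri: c / total for tri, c in counts.items()}


def skew_divergence(p, q, alpha=0.99):
    """KL(p || alpha*q + (1-alpha)*p) for dicts mapping events to probabilities.

    Assumed definition; alpha < 1 keeps the mixture positive wherever
    p is positive, so trigrams unseen in q do not yield an infinite value.
    """
    total = 0.0
    for event, p_x in p.items():
        mixed = alpha * q.get(event, 0.0) + (1.0 - alpha) * p_x
        total += p_x * math.log(p_x / mixed)
    return total


# Toy POS tag sequences (hypothetical data, for illustration only).
doc_a = ["DT", "NN", "VBZ", "DT", "JJ", "NN", "IN", "DT", "NN"]
doc_b = ["DT", "JJ", "NN", "VBZ", "IN", "DT", "NN", "CC", "NN"]

dist_a = trigram_distribution(doc_a)
dist_b = trigram_distribution(doc_b)
print(skew_divergence(dist_a, dist_b))
```

Several trigrams of `doc_a` never occur in `doc_b`, so plain KL divergence would be infinite here; mixing a small share of `p` into `q` is what keeps the value finite.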
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 12 (4), 79-96, 2005
The Association for Natural Language Processing
Details
-
- CRID
- 1390282679453143296
-
- NII Article ID
- 130004291856
-
- ISSN
- 21858314
- 13407619
- http://id.crossref.org/issn/13407619
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- Crossref
- CiNii Articles
-
- Abstract License Flag
- Disallowed