Japanese Sentence Parsing Using an Integrated Probabilistic Model

白井 清昭, 乾 健太郎, 徳永 健伸, 田中 穂積

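We have proposed an integrated probabilistic language model that handles syntactic preferences and lexical dependencies simultaneously. The distinguishing feature of this model is that it treats syntactic preferences and lexical dependencies independently of each other. As a result, the two can be trained separately, and it is easy to evaluate how effectively each contributes to disambiguation. In this paper, we report the results of a dependency parsing experiment on Japanese sentences using this model, and show that syntactic preferences and lexical dependencies each contribute substantially to improving the accuracy with which the modifiee of each bunsetsu phrase is identified. We also investigate the causes of parsing failures and discuss the main factors and how they can be addressed within the proposed model.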
We propose a new statistical language model that integrates lexical association statistics with syntactic preferences while keeping these two types of statistics modular, which facilitates both the training of the model and the analysis of its behavior. In this paper, we report the results of an empirical evaluation in which the model is applied to the disambiguation of Japanese sentence dependency structures. The results show that both syntactic preferences and lexical associations significantly raise the accuracy, defined as the ratio of the number of bunsetsu phrases whose modifiee is correctly identified to the total number of bunsetsu phrases. We also discuss further room for improvement based on our error analysis.
