文脈的手がかりを考慮した機械学習による日本語ゼロ代名詞の先行詞同定(自然言語)
Identifying Antecedents of Japanese Zero-pronouns Using a Machine Learning Model with Contextual Cues (Natural-Language Processing)

    • 飯田 龍 (IIDA Ryu), Graduate School of Information Science, Nara Institute of Science and Technology
    • 乾 健太郎 (INUI Kentaro), Graduate School of Information Science, Nara Institute of Science and Technology

Abstract

We propose a method for integrating linguistic insights, such as those of centering theory, into machine-learning-based anaphora resolution. Conventional approaches to anaphora resolution (or, more generally, coreference resolution) fall into rule-based approaches and corpus-based statistical approaches, which have been studied largely independently. Rule-based approaches encode linguistic cues manually as a set of rules, but it is prohibitively difficult to write rules that cover anaphoric phenomena exhaustively. Machine learning approaches, on the other hand, can take into account combinations of features that are impractical to handle by hand, but they make little use of linguistic cues. Combining the two could draw on the advantages of both and is expected to improve accuracy further. To this end, we propose (i) features that capture local context based on centering theory (centering features) and (ii) a model that compares antecedent candidates with each other (the tournament model). We apply the method to identifying antecedents of Japanese zero-pronouns and report that it identifies antecedents more accurately than earlier machine learning approaches.
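The abstract names a tournament model (トーナメントモデル) that compares antecedent candidates with each other but gives no implementation details. The following is only a minimal sketch of one way such a pairwise tournament could be organized. The `Candidate` fields, the `toy_comparator` heuristic, and the example noun phrases are all hypothetical; in the paper the pairwise decision would come from a trained classifier, not a hand-written rule.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    """A noun phrase preceding the zero-pronoun (hypothetical representation)."""
    text: str
    sentence_distance: int  # sentences between the candidate and the zero-pronoun
    is_topic: bool          # crude stand-in for local-context salience (assumption)


# A pairwise comparator returns True when `right` is the more likely antecedent.
Comparator = Callable[[Candidate, Candidate], bool]


def toy_comparator(left: Candidate, right: Candidate) -> bool:
    """Hand-written stand-in for a trained pairwise classifier (assumption)."""
    if left.is_topic != right.is_topic:
        return right.is_topic  # prefer the topic-marked candidate
    # Otherwise prefer the candidate closer to the zero-pronoun.
    return right.sentence_distance <= left.sentence_distance


def tournament(candidates: Sequence[Candidate], prefer_right: Comparator) -> Candidate:
    """Play candidates against each other in order; the last survivor wins."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    winner = candidates[0]
    for challenger in candidates[1:]:
        if prefer_right(winner, challenger):
            winner = challenger
    return winner


candidates = [
    Candidate("Taro", sentence_distance=2, is_topic=True),
    Candidate("hon", sentence_distance=1, is_topic=False),
    Candidate("Hanako", sentence_distance=0, is_topic=False),
]
best = tournament(candidates, toy_comparator)
print(best.text)  # prints "Taro"
```

With a comparator of this shape, each decision is relative to another candidate, in contrast to scoring every candidate in isolation. That relative decision is the property the abstract describes as "comparison between candidates".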

Journal

Transactions of Information Processing Society of Japan 45(3), 906-918, 2004-03-15

Information Processing Society of Japan (IPSJ)

References:  24

Cited by:  13

Codes

  • NII Article ID (NAID) :
    110002712141
  • NII NACSIS-CAT ID (NCID) :
    AN00116647
  • Text Lang :
    JPN
  • Article Type :
    Journal Article
  • ISSN :
    03875806
  • NDL Article ID :
    6885241
  • NDL Source Classification :
    ZM13 (Science and technology -- Science and technology in general -- Data processing and computers)
  • NDL Call No. :
    Z14-741
  • Databases :
    CJP  CJPref  NDL  NII-ELS 
