Named Entity Extraction Using Error-Driven Learning of Finite-State Transducers (有限状態変換器の誤り駆動型学習を用いた固有表現抽出)

Abstract

This paper describes a method for extracting named entities from Japanese text based on Eric Brill's transformation-based error-driven learning. We built an extraction system that uses a morphological analyzer and machine-learned finite-state transducers (FSTs), and evaluated it on the formal run (general domain) of the IREX (Information Retrieval and Extraction Exercise) named entity (NE) task. The system learned 1,428 FSTs from the CRL NE data, which contains about 10,000 sentences, and achieved an overall F-measure of 71.28. This score is lower than that of hand-crafted FSTs, but the machine-learned FSTs outperformed half of the systems that participated in the IREX NE task. We also confirmed that overfitting did not occur during learning.
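As a rough illustration of the transformation-based error-driven learning loop mentioned in the abstract, the following is a minimal sketch on a toy example. It is not the authors' system: the paper learns FSTs over morphological-analyzer output, whereas this sketch uses two simple, hypothetical rule templates (trigger on the current word or on the next word), a made-up English token sequence, and invented names (`learn_rules`, `apply_rule`, `WORDS`, `GOLD`). It only shows the general idea: start from a baseline tagging, propose rules from the current errors, keep the rule that removes the most errors, and repeat.

```python
from typing import List, Tuple

# Hypothetical rule format: (from_tag, to_tag, feature_name, feature_value).
Rule = Tuple[str, str, str, str]

def features(words: List[str], i: int) -> dict:
    """Context features for position i (assumed templates: word, next word)."""
    nxt = words[i + 1] if i + 1 < len(words) else "</s>"
    return {"word": words[i], "next": nxt}

def apply_rule(rule: Rule, words: List[str], tags: List[str]) -> List[str]:
    """Rewrite from_tag to to_tag wherever the rule's feature matches."""
    from_tag, to_tag, feat, value = rule
    return [
        to_tag if t == from_tag and features(words, i)[feat] == value else t
        for i, t in enumerate(tags)
    ]

def count_errors(tags: List[str], gold: List[str]) -> int:
    return sum(t != g for t, g in zip(tags, gold))

def learn_rules(words: List[str], gold: List[str],
                initial_tag: str = "O", max_rules: int = 10):
    """Greedy error-driven loop: keep the rule with the largest net gain."""
    tags = [initial_tag] * len(words)  # baseline: nothing is an entity
    rules: List[Rule] = []
    while len(rules) < max_rules:
        # Propose candidate rules only at positions that are currently wrong.
        candidates = set()
        for i, (t, g) in enumerate(zip(tags, gold)):
            if t != g:
                for feat, value in features(words, i).items():
                    candidates.add((t, g, feat, value))
        base = count_errors(tags, gold)
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):  # sorted for deterministic ties
            gain = base - count_errors(apply_rule(rule, words, tags), gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule reduces the error count: stop
            break
        tags = apply_rule(best_rule, words, tags)
        rules.append(best_rule)
    return rules, tags

# Toy data (invented for illustration, not from the CRL NE corpus).
WORDS = ["Tanaka", "san", "visited", "Kyoto", "and",
         "Suzuki", "san", "left", "Kyoto"]
GOLD = ["PERSON", "O", "O", "LOCATION", "O",
        "PERSON", "O", "O", "LOCATION"]

rules, tags = learn_rules(WORDS, GOLD)
for r in rules:
    print(r)
```

The learned rules are applied in the order they were acquired, which is what makes the result an ordered sequence of transformations rather than an unordered rule set.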

Published in

  • IPSJ SIG Technical Reports: Natural Language Processing (NL) (情報処理学会研究報告自然言語処理(NL))

    1999(62(1999-NL-132)), 1-8, 1999-07-22

    Information Processing Society of Japan (一般社団法人情報処理学会)

References: 17

Cited by: 1

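Codes

  • NII Article ID (NAID)
    110002935140
  • NII Bibliographic ID (NCID)
    AN10115061
  • Text language code
    JPN
  • Material type
    Technical Report
  • ISSN
    0919-6072
  • NDL article registration ID
    5337781
  • NDL journal classification
    ZM13 (Science and technology -- General science and technology -- Data processing and computers)
  • NDL call number
    Z14-1121
  • Data sources
    CJP bibliography, CJP citations, NDL, NII-ELS, IPSJ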