Extraction of Bilingual Expressions Using Support Vector Machines (サポートベクタマシンを用いた対訳表現の抽出)

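  • Kengo Sato (佐藤 健吾)
    School of Science for Open and Environmental Systems, Graduate School of Science and Technology, Keio University
  • Hiroaki Saito (斎藤 博昭)
    School of Science for Open and Environmental Systems, Graduate School of Science and Technology, Keio University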

Bibliographic Information

Alternative Titles
  • Extracting Word Sequence Correspondences Based on Support Vector Machines.
  • サポート ベクタ マシン オ モチイタ タイヤク ヒョウゲン ノ チュウシュツ


Abstract

This paper proposes a method for learning and extracting bilingual word sequence correspondences from aligned parallel corpora using Support Vector Machines (SVMs). SVMs generalize well, which makes them robust against data sparseness, and they can learn dependencies among features through a kernel function. The method learns a translation model from features such as translation dictionaries, the number of words, parts of speech, constituent words, and neighboring words, and then extracts bilingual word sequence correspondences using an SVM-based correspondence level. Conventional methods base the correspondence level on word co-occurrence, so data sparseness prevents them from extracting correspondences that appear infrequently. Our method can extract such correspondences because it relies on a model already learned from training corpora.
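The idea in the abstract can be illustrated with a minimal sketch. It assumes that each candidate pair of word sequences is encoded as a set of binary features and scored by an SVM decision function with a polynomial kernel. The abstract does not give the kernel, the feature encoding, or any trained parameters, so the feature names, the helper functions `extract_features`, `poly_kernel`, and `correspondence_level`, and the toy support vectors and weights below are all hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Features = FrozenSet[str]


def extract_features(src: List[str], tgt: List[str],
                     dictionary: Set[Tuple[str, str]]) -> Features:
    """Encode a candidate (source, target) word-sequence pair as binary features.

    Feature names are illustrative only, not the paper's actual feature set.
    """
    feats = {f"src_len={len(src)}", f"tgt_len={len(tgt)}"}
    feats.update(f"src_word={w}" for w in src)
    feats.update(f"tgt_word={w}" for w in tgt)
    # Dictionary feature: fires if any word pair is a dictionary entry.
    if any((s, t) in dictionary for s in src for t in tgt):
        feats.add("dict_match")
    return frozenset(feats)


def poly_kernel(x: Features, y: Features, degree: int = 2) -> float:
    """Polynomial kernel on binary feature sets: (1 + |x & y|) ** degree.

    With degree 2 the kernel implicitly covers pairs of features, which is
    one way a kernel can capture dependencies between features.
    """
    return float((1 + len(x & y)) ** degree)


def correspondence_level(x: Features,
                         support_vectors: List[Tuple[float, int, Features]],
                         bias: float) -> float:
    """SVM decision value f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(alpha * label * poly_kernel(sv, x)
               for alpha, label, sv in support_vectors) + bias


# Toy data (hypothetical): a two-entry dictionary and two "support vectors".
dictionary = {("red", "akai"), ("car", "kuruma")}
positive = extract_features(["red", "car"], ["akai", "kuruma"], dictionary)
negative = extract_features(["red", "car"], ["inu"], dictionary)
model = [(0.5, +1, positive), (0.5, -1, negative)]  # (alpha, label, features)

# A pair that shares no full word sequence with the toy support vectors
# still receives a score, because the model generalizes over features.
candidate = extract_features(["red", "house"], ["akai", "ie"], dictionary)
score = correspondence_level(candidate, model, bias=0.0)
```

In this sketch a positive `score` would mark the candidate as a correspondence. A real system would obtain the support vectors, weights, and bias by training an SVM on labeled pairs from an aligned corpus instead of setting them by hand.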

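Journal

  • Journal of Natural Language Processing (自然言語処理)

    Journal of Natural Language Processing 10 (4), 109-124, 2003

    The Association for Natural Language Processing (一般社団法人 言語処理学会)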
