Development and Evaluation of Chinese Conversational Corpus for a Speech-to-Speech Translation System (音声翻訳のための中国語対話コーパスの整備とその評価)

Abstract

Statistical language models are now widely used for both speech recognition and translation in speech-to-speech translation systems for spontaneous conversation. To yield reliable statistics, such models need a large, high-quality training corpus, and their performance depends heavily on the quantity and quality of that data. Developing a training corpus is therefore one of the most important issues for speech-to-speech translation systems. This report describes how we built our conversational Chinese morphological corpora (annotated with word segmentation and part-of-speech tags), which are used mainly in our Japanese-Chinese speech translation system, and presents their statistical characteristics. We also report evaluation results for language models trained on these corpora, measured by perplexity and by continuous speech recognition performance.

Published in

  • IPSJ SIG Technical Reports, Natural Language Processing (NL) (情報処理学会研究報告自然言語処理(NL))

    2005(50(2005-NL-167)), 47-52, 2005-05-26

    Information Processing Society of Japan (一般社団法人情報処理学会)

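Codes

  • NII Article ID (NAID)
    110002949459
  • NII Bibliographic ID (NCID)
    AN10115061
  • Text language code
    JPN
  • Material type
    Technical Report
  • ISSN
    09196072
  • NDL Article ID
    7380326
  • NDL journal classification
    ZM13 (Science and technology -- General science and technology -- Data processing and computers)
  • NDL call number
    Z14-1121
  • Data sources
    CJP bibliography  CJP citations  NDL  NII-ELS  IPSJ 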