Dynamic Pronunciation Modeling Using Confusion Networks and Unconstrained Speech Recognition [in Japanese]

Abstract

How to model pronunciation variation is one of the key issues that determine recognition performance in spontaneous speech recognition. The most common approach works at the lexicon level, adding phoneme sequences that represent pronunciation variants to the pronunciation dictionary. However, simply adding many variants increases acoustic confusability between words and can degrade recognition performance. To address this problem, we focus on dynamic pronunciation modeling, which generates a pronunciation dictionary specific to each utterance. We propose a new dynamic pronunciation modeling method that uses confusion networks and speech recognition without vocabulary constraints (unconstrained speech recognition). Spontaneous continuous speech recognition experiments on the Corpus of Spontaneous Japanese (CSJ) confirmed the effectiveness of the proposed method.

Journal

  • IPSJ SIG Notes

    IPSJ SIG Notes 2008(68(2008-SLP-072)), 7-12, 2008-07-11

    Information Processing Society of Japan (IPSJ)

References:  13

Codes

  • NII Article ID (NAID)
    110006862653
  • NII NACSIS-CAT ID (NCID)
    AN10442647
  • Text Lang
    JPN
  • Article Type
    Technical Report
  • ISSN
    09196072
  • NDL Article ID
    9606249
  • NDL Source Classification
    ZM13 (Science and Technology -- General Science and Technology -- Data Processing and Computers)
  • NDL Call No.
    Z14-1121
  • Data Source
    CJP  NDL  NII-ELS  IPSJ 