Syllable-Based Acoustical Modeling for Japanese Spontaneous Speech Recognition (日本語話し言葉音声認識のための音節に基づく高精度な音響モデルの検討) [in Japanese]

Abstract

We study a syllable-based acoustic modeling method for Japanese spontaneous speech recognition. Mora-based subword acoustic models have previously been studied and shown to be effective for recognition of read Japanese speech. In this report, we clearly distinguish the syllable from the mora by definition and show that the syllable is the more suitable acoustic modeling unit for spontaneous speech. Vowel lengthening occurs frequently in spontaneous speech and strongly affects recognition accuracy. We therefore propose a syllable model that explicitly accounts for vowel lengthening, together with a technique for modeling it accurately in syllable-based HMMs. In recognition experiments on academic lecture speech, the proposed model outperformed both the conventional cross-word triphone model and the mora-based model.

Journal

  • IEICE technical report. Speech 102(529), 49-54, 2002-12-12

    The Institute of Electronics, Information and Communication Engineers

References:  17

Cited by:  3

Codes

  • NII Article ID (NAID)
    110003295578
  • NII NACSIS-CAT ID (NCID)
    AN10013221
  • Text Lang
    JPN
  • Article Type
    Journal Article
  • ISSN
    09135685
  • NDL Article ID
    6434985
  • NDL Source Classification
    ZN33 (Science and technology -- Electrical engineering and electrical machinery industry -- Electronics and telecommunications)
  • NDL Call No.
    Z16-940
  • Data Source
    CJP  CJPref  NDL  NII-ELS 