Improvements of Spontaneous Speech Recognition by Using Automatic Filled and Silent Pause Detection [in Japanese]

Abstract

The accuracy of spontaneous speech recognition is degraded by many factors, including unclear pronunciation, colloquial expressions, hesitations, and variation in speaking rate. This work focuses on two kinds of non-linguistic information that current speech recognizers handle poorly: filled pauses and silent pauses. We propose a recognition method that is robust to both. The acoustic characteristics of filled and silent pauses in spontaneous speech are detected by bottom-up signal processing, and the detection results are incorporated into the decoding process. Continuous speech recognition experiments on spontaneous speech from the CIAIR in-car speech corpus confirmed the effectiveness of the proposed method.

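As a rough illustration of what bottom-up silent pause detection can look like, the sketch below marks spans whose short-time log energy stays under a threshold for a minimum duration. This record does not say which acoustic features the authors use, so the energy criterion, the function names, and every parameter value (`threshold_db`, `min_pause_ms`, frame sizes) are assumptions made for illustration. Filled pause detection and the integration with the decoder are not sketched here.

```python
import math


def frame_log_energy(samples, frame_len, hop):
    """Return the log energy in dB of each full frame of `samples`."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def detect_silent_pauses(samples, sample_rate, frame_ms=25, hop_ms=10,
                         threshold_db=-40.0, min_pause_ms=200):
    """Return (start_sec, end_sec) spans of sustained low energy.

    All defaults are illustrative assumptions, not values from the paper.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    energies = frame_log_energy(samples, frame_len, hop)
    min_frames = max(1, min_pause_ms // hop_ms)

    pauses = []
    run_start = None
    # The infinite sentinel closes a low-energy run that reaches the end.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < threshold_db:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                start_sec = run_start * hop / sample_rate
                end_sec = ((i - 1) * hop + frame_len) / sample_rate
                pauses.append((start_sec, end_sec))
            run_start = None
    return pauses
```

Detected spans of this kind could then be handed to a decoder as candidate pause regions. A fixed threshold is the simplest choice and would likely need adapting to the noise level of in-car recordings.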

Journal

  • IPSJ SIG Notes

    IPSJ SIG Notes 2006(73(2006-SLP-062)), 1-6, 2006-07-07

    Information Processing Society of Japan (IPSJ)

References:  13

Codes

  • NII Article ID (NAID)
    110004849723
  • NII NACSIS-CAT ID (NCID)
    AN10442647
  • Text Lang
    JPN
  • Article Type
    Technical Report
  • ISSN
    09196072
  • NDL Article ID
    8003090
  • NDL Source Classification
    ZM13 (Science and Technology -- General Science and Technology -- Data Processing and Computers)
  • NDL Call No.
    Z14-1121
  • Data Source
    CJP  NDL  NII-ELS  IPSJ 