Effectiveness of discriminative approaches for speech recognition under noisy environments on the 2nd CHiME Challenge (Noise countermeasures, recognition, understanding, dialogue, general) [in Japanese]

Abstract

The 2nd CHiME Challenge is a difficult two-microphone speech recognition task with non-stationary interference. We investigate the effectiveness of state-of-the-art ASR techniques, such as discriminative training, various feature transformations, and deep neural networks, for reverberant and noisy speech recognition. For noise suppression, we use a simple method that estimates the angle of arrival of the source and applies prior-based binary masking. We also propose two techniques: an augmented discriminative feature transformation, which allows arbitrary features to be introduced into a discriminative feature transform, and an efficient method for combining discriminative language modeling with minimum Bayes risk decoding in the ASR post-processing stage. These techniques proved effective on the medium-vocabulary sub-task (Track 2) of the CHiME Challenge, where our system achieved the best performance among the participants.

Journal

  • IEICE technical report. Speech 113(161), 13-18, 2013-07-18

    The Institute of Electronics, Information and Communication Engineers

Codes

  • NII Article ID (NAID)
    110009778099
  • NII NACSIS-CAT ID (NCID)
    AN10013221
  • Text Lang
    JPN
  • Data Source
    NII-ELS 