A 40-nm, 54-mW Dedicated Speech Recognition Processor for 3×-Real-Time 60-kWord Continuous Speech Recognition (Image Engineering)  [in Japanese] A 2.4x-Real-Time VLSI Processor for 60-kWord Continuous Speech Recognition  [in Japanese]

Abstract

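This paper describes a low-power VLSI chip for real-time 60-kWord continuous speech recognition. To achieve 60-kWord continuous speech recognition at high speed, high accuracy, and low power, we build on the techniques proposed for our previously prototyped speech recognition processor and implement a highly parallel 8-pass Viterbi transition architecture, which further accelerates the Viterbi stage, the bottleneck of overall processing speed. In addition, using a tri-gram in the second pass of the search improves recognition accuracy by about 2% over the bi-gram-only case. We designed a dedicated processor for 60-kWord continuous speech recognition with a circuit size of 2.98 M transistors and 4.29 Mbit of on-chip SRAM, and fabricated a prototype in a 40 nm process. With the bi-gram only, power consumption at 62.5 MHz, the frequency required for real-time processing, was 54.8 mW. Operation at up to 200 MHz (177.4 mW) was confirmed at the nominal voltage (1.1 V), achieving 3× real-time operation. With the tri-gram, the maximum processing speed at 200 MHz was 2.25× real time, with a power consumption of 174.56 mW.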

This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). We implement a parallel and pipelined architecture for GMM computation and Viterbi processing. It includes an 8-path Viterbi transition architecture to maximize processing speed and adopts a tri-gram language model to improve recognition accuracy. A two-level cache architecture is implemented for the demo system. The test chip, fabricated in 40 nm CMOS technology, occupies 1.77 mm × 2.18 mm and contains 2.98 M transistors for logic and 4.29 Mbit of on-chip memory. Measured results show that, compared to the previous work, our implementation reduces the frequency required for 60-kWord real-time continuous speech recognition by 25% (to 62.5 MHz) and the power consumption by 26% (to 54.8 mW). At 200 MHz, the chip runs at up to 3.02× and 2.25× real time using the bi-gram and tri-gram language models, respectively.
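
The figures quoted in the abstract can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the original record: the function names are hypothetical, the "ideal" speedup assumes throughput scales linearly with clock frequency, and the previous-work baseline is back-calculated from the stated percentage reductions rather than reported in the abstract.

```python
# Measured values quoted in the abstract.
REALTIME_FREQ_MHZ = 62.5    # clock needed for 1x real time (bi-gram)
REALTIME_POWER_MW = 54.8    # power at 62.5 MHz (bi-gram)
MAX_FREQ_MHZ = 200.0        # maximum confirmed clock at 1.1 V
BIGRAM_SPEEDUP = 3.02       # real-time factor at 200 MHz, bi-gram
BIGRAM_POWER_MW = 177.4
TRIGRAM_SPEEDUP = 2.25      # real-time factor at 200 MHz, tri-gram
TRIGRAM_POWER_MW = 174.56


def ideal_speedup(freq_mhz, realtime_freq_mhz):
    """Speedup over real time if throughput scaled linearly with clock.

    Assumption for illustration only; memory stalls and other effects
    can make the measured speedup lower.
    """
    return freq_mhz / realtime_freq_mhz


def energy_per_audio_second_mj(power_mw, speedup):
    """Energy (mJ) spent to recognize one second of speech.

    One second of audio takes 1/speedup seconds of processing,
    and mW multiplied by seconds gives mJ.
    """
    return power_mw / speedup


def implied_baseline(value, reduction):
    """Back-calculate the previous-work value from a fractional reduction.

    Derived estimate only; the abstract does not state the baseline.
    """
    return value / (1.0 - reduction)


if __name__ == "__main__":
    print("ideal speedup at 200 MHz:",
          ideal_speedup(MAX_FREQ_MHZ, REALTIME_FREQ_MHZ))
    print("energy per audio second, bi-gram at 62.5 MHz (mJ):",
          round(energy_per_audio_second_mj(REALTIME_POWER_MW, 1.0), 2))
    print("energy per audio second, bi-gram at 200 MHz (mJ):",
          round(energy_per_audio_second_mj(BIGRAM_POWER_MW, BIGRAM_SPEEDUP), 2))
    print("energy per audio second, tri-gram at 200 MHz (mJ):",
          round(energy_per_audio_second_mj(TRIGRAM_POWER_MW, TRIGRAM_SPEEDUP), 2))
    print("implied previous-work frequency (MHz):",
          round(implied_baseline(REALTIME_FREQ_MHZ, 0.25), 1))
    print("implied previous-work power (mW):",
          round(implied_baseline(REALTIME_POWER_MW, 0.26), 1))
```

Under the linear-scaling assumption, 200 MHz would give 200 / 62.5 = 3.2× real time, slightly above the measured 3.02× for the bi-gram model.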

Journal

  • IEICE technical report. Image engineering

    IEICE technical report. Image engineering 113(237), 29-34, 2013-10-07

    The Institute of Electronics, Information and Communication Engineers

Codes

  • NII Article ID (NAID)
    110009817868
  • NII NACSIS-CAT ID (NCID)
    AN10013006
  • Text Lang
    JPN
  • ISSN
    0913-5685
  • NDL Article ID
    025001536
  • NDL Call No.
    Z16-940
  • Data Source
    NDL, NII-ELS