Search Results 1-20 of 429

  • Development of a Communication Tool that Uses Handwriting Animations to Express User Presence  [in Japanese]

    YONEZAWA Takashi , TANAKA Fumihide

    Educational SNS makes communication between teachers and parents efficient, but gives an inorganic impression compared to the communication using handwritten characters that were done in the …

    Proceedings of the Annual Conference of JSAI JSAI2020(0), 3Rin414-3Rin414, 2020

    J-STAGE 

  • Deep Neural Network Training Emphasizing Central Frames for Speech Recognition  [in Japanese]

    倉田 岳人 , ヴィレット ダニエル

    It is a standard approach to concatenate several consecutive frames of acoustic features as input of a Deep Neural Network (DNN) for an acoustic model in speech recognition. A DNN is trained to map th …

    情報処理学会論文誌 58(5), 1207-1217, 2017-05-15

    IPSJ 

  • Improving Feature-space Discriminative Training and Adaptation Using Regularization Process  [in Japanese]

    福田 隆 , 市川 治 , 立花 隆輝

    In GMM/HMM systems, model-space adaptation techniques such as MAP are often used for porting old acoustic models into new domains. Although modern ASR systems leverage feature-space discriminative tra …

    情報処理学会論文誌 58(1), 288-296, 2017-01-15

    IPSJ 

  • Automatic generation of abbreviated named entities for localized speech recognition  [in Japanese]

    志賀 健太 , 能勢 隆 , 伊藤 彰則

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 115(184), 7-12, 2015-08-21

  • Improvement of Spoken Term Detection by Combining LVCSR and Syllable-based N-best Speech Recognition Results  [in Japanese]

    長野 徹 , 倉田 岳人 , 鈴木 雅之 , 立花 隆輝 , 西村 雅史

    In contact centers, it is common to check the call conversations of the call agents with the customers for quality monitoring. Recently, more and more companies have come to use Automatic Speech Recog …

    情報処理学会論文誌 56(8), 1646-1656, 2015-08-15

    IPSJ 

  • A generalized discriminative training framework for system combination  [in Japanese]

    TACHIOKA Yuuki , WATANABE Shinji , LE ROUX Jonathan , HERSHEY John R.

    This paper proposes a generalized discriminative training framework for system combination, which encompasses acoustic modeling including Gaussian mixture models and deep neural networks, and discrimi …

    IEICE technical report. Speech 114(151), 13-18, 2014-07-24

  • Investigation of Combining Multiple Language Modeling Techniques in Japanese Spontaneous Speech Recognition  [in Japanese]

    MASUMURA Ryo , ASAMI Taichi , OBA Takanobu , MASATAKI Hirokazu , SAKAUCHI Sumitaka

    Recent large vocabulary speech recognition systems consist of two statistical models, the acoustic and language models. In acoustic modeling, deep neural networks have realized a breakthrough and sign …

    IEICE technical report. Speech 114(151), 1-6, 2014-07-24

  • Improvement of Spoken Term Detection by Combining LVCSR and Syllable-based N-best Speech Recognition Results  [in Japanese]

    Tohru Nagano , Gakuto Kurata , Masayuki Suzuki , Ryuki Tachibana , Masafumi Nishimura

    In contact centers, it is common to check the call conversations of the call agents with the customers for quality monitoring. Recently, more and more companies have come to use Automatic Speech Recog …

    IPSJ SIG Notes 2014-SLP-102(10), 1-6, 2014-07-17

  • Distant-talking Speech Recognition with Asynchronous Speech Recording  [in Japanese]

    TERAOKA Shunta , UEDA Yuma , WANG Longbiao , KAI Atsuhiko , FUKUSHIMA Taku

    Although applications using mobile terminals have attracted increasing attention, there are few studies that focus on distant-talking speech recognition with asynchronous recording using several mobil …

    IEICE technical report. Speech 114(52), 153-157, 2014-05-24

  • A 2.4x-Real-Time VLSI Processor for 60-kWord Continuous Speech Recognition  [in Japanese]

    何 光霽 , 宮本 優貴 , 松田 薫平 [et al.]

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113(235), 29-34, 2013-10-07

  • A 2.4x-Real-Time VLSI Processor for 60-kWord Continuous Speech Recognition  [in Japanese]

    HE Guangji , MIYAMOTO Yuki , 松田 薫平 [et al.] , IZUMI Shintaro , KAWAGUCHI Hiroshi , YOSHIMOTO Masahiko

    This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). We implement parallel and pipelined ar …

    Technical report of IEICE. ICD 113(236), 29-34, 2013-10-07

  • A 2.4x-Real-Time VLSI Processor for 60-kWord Continuous Speech Recognition  [in Japanese]

    HE Guangji , MIYAMOTO Yuki , MATSUDA Kumpei , IZUMI Shintaro , KAWAGUCHI Hiroshi , YOSHIMOTO Masahiko

    This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). We implement parallel and pipelined ar …

    IEICE technical report. Image engineering 113(237), 29-34, 2013-10-07

  • Implementation of Minimum Bayes-Risk Decoding Function into Open-Source Speech Recognition Engine Julius  [in Japanese]

    NANJO Hiroaki , FURUTANI Ryo , NISHIDA Masafumi

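    We have built a general-purpose speech recognition engine that focuses on important words and minimizes errors on them, and we describe its implementation and evaluation. We have previously confirmed the effectiveness of a speech recognition method (MBR speech recognition) that minimizes the Weighted Word Error Rate (WWER), an error rate that takes the importance of each word into account, based on Minimum Bayes-Risk (MBR). However, to realize this …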

    The IEICE transactions on information and systems (Japanese edition) 96(10), 2530-2539, 2013-10

  • A 2.4x-Real-Time VLSI Processor for 60-kWord Continuous Speech Recognition  [in Japanese]

    HE Guangji , MIYAMOTO Yuki , 松田 薫平 , IZUMI Shintaro , KAWAGUCHI Hiroshi , YOSHIMOTO Masahiko

    This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). We implement parallel and pipelined ar …

    Technical report of IEICE. VLD 113(235), 29-34, 2013-09-30

  • N-gram Approximation of the Latent Words Language Model Using a Generative Approach  [in Japanese]

    増村亮 , 政瀧浩和 , 大庭隆伸 , 吉岡理 , 高橋敏

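    … In today's large vocabulary continuous speech recognition, N-gram models are used as the most practical language models because of their compatibility with decoding. N-gram models are known to suffer from data sparseness caused by their enormous number of model parameters, and various approaches based on smoothing and dimensionality reduction have been studied to address this problem. In contrast, we newly generate the training data itself and, based on the generated data, N-gram …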

    IPSJ SIG Notes 2013-SLP-97(5), 1-8, 2013-07-18

  • A Study of Large Vocabulary Speech Recognition under Reverberation Using a Denoising Autoencoder  [in Japanese]

    小宮山大樹 , 石井敬章 , 篠崎隆宏 , 堀内靖雄 , 黒岩眞吾

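    … we propose an extended method that combines two analysis windows of different lengths, with the aim of capturing the effect of reverberation with a large time constant more accurately while maintaining the sub-phoneme-level time resolution required for speech recognition. In experiments, digit recognition on CENSREC-4 shows that the proposed method is more effective than the conventional method. Furthermore, speech recognition experiments on JNAS show that the proposed method is also effective as a reverberation-robust front end for large vocabulary continuous speech recognition. …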

    IPSJ SIG Notes 2013-SLP-97(1), 1-6, 2013-07-18

  • An n-gram Language Model based on BPD Backoff Method and W-B Discount  [in Japanese]

    YOSHIDA Shotaro , KAWABATA Takeshi

    A new n-gram language model is proposed for the large vocabulary continuous speech recognition system. Two approaches are combined. The probability estimation method based on the inheritance of Binomi …

    IEICE technical report. Speech 112(450), 19-20, 2013-02-21

  • Front-ending Spoken Document Retrieval with Spoken Term Detection Robust for OOV and Missrecognized Words  [in Japanese]

    瀧上智子 , 秋葉 友良

    How to deal with speech recognition errors and out-of-vocabulary (OOV) words is one of the challenging problems in spoken document processing. To deal with the problem in spoken document retrieval (SD …

    情報処理学会論文誌 54(2), 506-517, 2013-02-15

    IPSJ 

  • An Investigation of Clustering Methods using Speaker-Class Models in Lecture Speech Recognition  [in Japanese]

    今野 和樹 , 大山 拓也 , 加藤 正治 [et al.] , 小坂 哲夫

    In this paper, we have examined speaker clustering method using more than 100 clusters in order to improve the performance of spontaneous speech recognition. In this method, we use a soft clustering a …

    IEICE technical report. Speech 112(369), 125-130, 2012-12-20

  • A 2.4x-Real-Time VLSI Processor for 60-k Word Continuous Speech Recognition  [in Japanese]

    MIYAMOTO Yuuki , HE Guangji , IZUMI Shintaro , KAWAGUCHI Hiroshi , YOSHIMOTO Masahiko

    This paper describes a low-power VLSI chip for 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). Our implementation includes a compression–decodi …

    Technical report of IEICE. ICD 112(365), 49-53, 2012-12-17
