-
- NAKANO Alberto Yoshihiro
- Department of Information and Computer Sciences, Toyohashi University of Technology
-
- NAKAGAWA Seiichi
- Department of Information and Computer Sciences, Toyohashi University of Technology
-
- YAMAMOTO Kazumasa
- Department of Information and Computer Sciences, Toyohashi University of Technology
この論文をさがす
抄録
In this work, spatial information consisting of the position and orientation angle of an acoustic source is estimated by an artificial neural network (ANN). The estimated position of a speaker in an enclosed space is used to refine the estimated time delays for a delay-and-sum beamformer, thus enhancing the output signal. On the other hand, the orientation angle is used to restrict the lexicon used in the recognition phase, assuming that the speaker faces a particular direction while speaking. To compensate the effect of the transmission channel inside a short frame analysis window, a new cepstral mean normalization (CMN) method based on a Gaussian mixture model (GMM) is investigated and shows better performance than the conventional CMN for short utterances. The performance of the proposed method is evaluated through Japanese digit/command recognition experiments.
収録刊行物
-
- IEICE Transactions on Information and Systems
-
IEICE Transactions on Information and Systems E93-D (9), 2451-2462, 2010
一般社団法人 電子情報通信学会
- Tweet
キーワード
詳細情報 詳細情報について
-
- CRID
- 1390001204377453952
-
- NII論文ID
- 10027640401
-
- NII書誌ID
- AA10826272
-
- ISSN
- 17451361
- 09168532
-
- 本文言語コード
- en
-
- データソース種別
-
- JaLC
- Crossref
- CiNii Articles
-
- 抄録ライセンスフラグ
- 使用不可