-
- AI Haojun
- National Engineering Research Center for Multimedia Software, Wuhan University; Graduate School of Information Science and Technology, Hokkaido University
-
- HASEYAMA Miki
- Graduate School of Information Science and Technology, Hokkaido University
Abstract
This paper presents an implementation of a low-complexity speaker identification algorithm that works in the compressed audio domain. The goal is to extract speaker-dependent features for speaker modeling and identification without decoding the AAC bitstream, thus saving significant system resources. Silence detection and MFCC parameters are calculated from MDCT coefficients rather than from the FFT spectrum. Each speaker is modeled by a GMM, which is trained using the EM algorithm to refine the weight and parameters of each component. The recognition accuracy of our algorithm reaches 97% on the ARCTIC database with a 16% CPU load compared to algorithms based on analysis of the decoded PCM signals.
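The abstract gives no implementation details, so the following is only a minimal sketch of the general pipeline it describes: cepstral features taken from MDCT coefficients, one GMM per speaker trained with EM, and identification by highest average log-likelihood. All function names, the equal-width bands used in place of a mel filterbank, the diagonal covariances, and the default parameters are assumptions made for illustration, not the authors' method.

```python
import math
import random

def mdct_to_cepstrum(mdct, n_bands=8, n_ceps=4):
    """MFCC-like cepstrum from one frame of MDCT coefficients.
    Assumption: equal-width bands stand in for a mel filterbank."""
    width = len(mdct) // n_bands
    log_e = [math.log(sum(c * c for c in mdct[b * width:(b + 1) * width]) + 1e-10)
             for b in range(n_bands)]
    # DCT-II of the log band energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * k * (b + 0.5) / n_bands)
                for b, e in enumerate(log_e)) for k in range(n_ceps)]

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def train_gmm(frames, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    n, dim = len(frames), len(frames[0])
    # Initialise means from random frames and variances from the global variance.
    means = [list(f) for f in rng.sample(frames, n_components)]
    gmean = [sum(f[d] for f in frames) / n for d in range(dim)]
    gvar = [max(sum((f[d] - gmean[d]) ** 2 for f in frames) / n, var_floor)
            for d in range(dim)]
    variances = [list(gvar) for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each frame.
        resp = []
        for x in frames:
            logp = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logp)
            resp.append([math.exp(lp - norm) for lp in logp])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            weights[k] = nk / n
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                        for d in range(dim)]
            variances[k] = [max(sum(r[k] * (x[d] - means[k][d]) ** 2
                                    for r, x in zip(resp, frames)) / nk, var_floor)
                            for d in range(dim)]
    return weights, means, variances

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under one speaker's GMM."""
    weights, means, variances = gmm
    return sum(logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                          for w, m, v in zip(weights, means, variances)])
               for x in frames) / len(frames)

def identify(frames, models):
    """Return the speaker name whose GMM scores the frames highest."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))
```

In use, each frame's MDCT coefficients would be passed through `mdct_to_cepstrum`, the resulting vectors for each enrolled speaker would be given to `train_gmm`, and an unknown utterance would be labelled with `identify(frames, models)`, where `models` maps speaker names to trained GMMs.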
Journal
-
- ITE Technical Report (映像情報メディア学会技術報告)
-
ITE Technical Report 32.46 (0), 31-34, 2008
The Institute of Image Information and Television Engineers
Details
-
- CRID
- 1390001204525609344
-
- NII Article ID
- 110007385516
-
- NII Book ID
- AN1059086X
-
- ISSN
- 24241970
- 13426893
-
- NDL BIB ID
- 9709697
-
- Text language code
- en
-
- Data source
-
- JaLC
- NDL
- CiNii Articles
-
- Abstract license flag
- Disallowed