Low Complexity Speaker Identification in AAC Domain

  • AI Haojun
    National Engineering Research Center for Multimedia Software, Wuhan University; Graduate School of Information Science and Technology, Hokkaido University
  • HASEYAMA Miki
    Graduate School of Information Science and Technology, Hokkaido University


Abstract

This paper presents an implementation of a low-complexity speaker identification algorithm that works in the compressed audio domain. The goal is to perform speaker modeling and identification without having to decode the AAC bitstream in order to extract speaker-dependent features, thereby saving significant system resources. Silence detection and the MFCC parameters are computed from the MDCT coefficients rather than from the FFT spectrum. Each speaker is modeled by a Gaussian mixture model (GMM), which is trained with the expectation-maximization (EM) algorithm to refine the weights and parameters of each component. The recognition accuracy of our algorithm reaches 97% on the ARCTIC database, with a CPU load of 16% compared with algorithms that analyze the decoded PCM signals.
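
The feature path described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the filter count, the silence threshold, and the choice of diagonal covariances are all assumptions made for illustration, and the MDCT coefficients are assumed to be already available as a list of floats for each frame. The sketch applies a mel filterbank to the squared MDCT coefficients (in place of an FFT power spectrum), takes a DCT of the log energies to obtain MFCC-like features, and scores a feature vector against a GMM.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular mel filters over n_bins MDCT bins covering 0..sample_rate/2."""
    nyquist = sample_rate / 2.0
    top_mel = hz_to_mel(nyquist)
    edges = [mel_to_hz(top_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    # Centre frequency of MDCT bin k is (k + 0.5) * nyquist / n_bins.
    bin_hz = [(k + 0.5) * nyquist / n_bins for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def is_silent(mdct_frame, threshold=1e-4):
    """Energy-based silence flag; the threshold value is an assumption."""
    energy = sum(c * c for c in mdct_frame) / len(mdct_frame)
    return energy < threshold

def mdct_mfcc(mdct_frame, bank, n_ceps=12):
    """MFCC-like features: mel-weighted MDCT power, log, then DCT-II."""
    power = [c * c for c in mdct_frame]
    log_e = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
             for row in bank]
    n = len(log_e)
    return [sum(log_e[j] * math.cos(math.pi * i * (j + 0.5) / n)
                for j in range(n))
            for i in range(1, n_ceps + 1)]

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one vector under a diagonal-covariance GMM (assumed)."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))
```

In such a scheme, identification would sum `gmm_log_likelihood` over the non-silent frames of an utterance for each enrolled speaker's model and select the speaker with the highest total. The EM training step is omitted here.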
