Low Complexity Speaker Identification in AAC Domain

AI Haojun, HASEYAMA Miki

doi:10.11485/itetr.32.46.0_31

Abstract

This paper presents an implementation of a low-complexity speaker identification algorithm that works in the compressed audio domain. The goal is to perform speaker modeling and identification without decoding the AAC bitstream to extract speaker-dependent features, thus saving significant system resources. Silence detection and the MFCC parameters are calculated from the MDCT coefficients rather than from the FFT spectrum. Each speaker is modeled by a GMM, which is trained using the EM algorithm to refine the weight and the parameters of each component. The recognition accuracy of our algorithm reaches 97% on the ARCTIC database with 16% CPU overload compared with algorithms based on analysis of the decoded PCM signals.
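The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the triangular mel filterbank, the 16 kHz sample rate, the 1024-bin frame, the number of filters and cepstral coefficients, and the energy threshold for silence are all assumptions chosen for illustration. The sketch takes one frame of MDCT coefficients (as would be available from a partially parsed AAC bitstream), applies mel filters to the squared coefficients, and takes a DCT-II of the log filter energies to obtain MFCC-like features.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular mel filters over n_bins MDCT bins covering 0..sample_rate/2."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            f = (k + 0.5) * nyquist / n_bins  # centre frequency of MDCT bin k
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mdct_cepstrum(mdct_frame, bank, n_ceps):
    """MFCC-like features: log mel energies of squared MDCT bins, then DCT-II."""
    power = [c * c for c in mdct_frame]
    log_e = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
             for row in bank]
    n = len(log_e)
    return [sum(log_e[j] * math.cos(math.pi * i * (j + 0.5) / n)
                for j in range(n))
            for i in range(n_ceps)]

def is_silence(mdct_frame, threshold):
    """Hypothetical silence test: mean squared MDCT coefficient below threshold."""
    return sum(c * c for c in mdct_frame) / len(mdct_frame) < threshold

# Example with a synthetic frame (assumed 1024 bins, 16 kHz audio)
bank = mel_filterbank(n_filters=20, n_bins=1024, sample_rate=16000)
frame = [math.sin(0.01 * k) for k in range(1024)]
ceps = mdct_cepstrum(frame, bank, n_ceps=13)
```

In a full system, the per-frame feature vectors of the non-silent frames would be used to train one GMM per speaker with EM, and a test utterance would be assigned to the speaker whose model gives the highest likelihood.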

Journal

ITE Technical Report (映像情報メディア学会技術報告) 32.46 (0), 31-34, 2008

The Institute of Image Information and Television Engineers (一般社団法人映像情報メディア学会)

Details

CRID: 1390001204525609344

NII Article ID: 110007385516

NII Book ID: AN1059086X

DOI: 10.11485/itetr.32.46.0_31

ISSN: 24241970; 13426893

NDL BIB ID: 9709697

Web Site: http://id.ndl.go.jp/bib/9709697; https://ndlsearch.ndl.go.jp/books/R000000004-I9709697

Text Lang: en

Data Source

JaLC
NDL
CiNii Articles

Abstract License Flag: Disallowed
