A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features

ZHOU Yu, LI Junfeng, SUN Yanqing, ZHANG Jianping, YAN Yonghong, AKAGI Masato

doi:10.1587/transinf.e93.d.2813

Abstract

In this paper, we present a hybrid speech emotion recognition system that exploits both spectral and prosodic features of speech. To capture emotional information in the spectral domain, we propose a new spectral feature extraction method that applies a novel non-uniform subband processing in place of the mel-frequency subbands used in Mel-Frequency Cepstral Coefficients (MFCC). For prosodic features, a set of features that are closely correlated with emotional states in speech is selected. Because these two kinds of features have inherently different characteristics (e.g., data size), the proposed hybrid system models the newly extracted spectral features with a Gaussian Mixture Model (GMM) and the selected prosodic features with a Support Vector Machine (SVM). The final result is obtained by combining the results of these two subsystems. Experimental results show that (1) the proposed non-uniform spectral features are more effective than traditional MFCC features for emotion recognition; and (2) the proposed hybrid system using both spectral and prosodic features yields a relative recognition error reduction of 17.0% over traditional recognition systems using only spectral features, and 62.3% over those using only prosodic features.
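The abstract reports results as relative error reduction and says only that the outputs of the two subsystems are combined, without stating the combination rule. The following is a minimal Python sketch of both ideas. It assumes a weighted sum of per-emotion scores as the fusion rule, which may not be the authors' method. The function names, emotion labels, scores, and error rates are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


def fuse_scores(spectral: Dict[str, float],
                prosodic: Dict[str, float],
                weight: float = 0.5) -> str:
    """Assumed fusion rule: weighted sum of per-emotion scores.

    `spectral` stands in for normalized scores from a GMM subsystem and
    `prosodic` for normalized scores from an SVM subsystem. Returns the
    label with the highest fused score.
    """
    fused = {
        label: weight * spectral[label] + (1.0 - weight) * prosodic[label]
        for label in spectral
    }
    return max(fused, key=fused.get)


# Hypothetical scores for one utterance (illustrative only).
spectral_scores = {"anger": 0.50, "joy": 0.30, "sadness": 0.20}
prosodic_scores = {"anger": 0.20, "joy": 0.60, "sadness": 0.20}
print(fuse_scores(spectral_scores, prosodic_scores, weight=0.5))

# Hypothetical error rates: a drop from 20.0% to 16.6% error is a
# relative reduction of about 17%.
print(round(relative_error_reduction(0.20, 0.166), 3))
```

A relative reduction of 17.0% therefore means the hybrid system removes about one sixth of the errors made by the spectral-only baseline, whatever the absolute error rates are.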

Journal

IEICE Transactions on Information and Systems

IEICE Transactions on Information and Systems E93-D (10), 2813-2821, 2010

The Institute of Electronics, Information and Communication Engineers (IEICE)

Details

CRID: 1390001204379551488

NII Article ID: 10027641285

NII Bibliographic ID: AA10826272

DOI: 10.1587/transinf.e93.d.2813

ISSN: 17451361; 09168532

Web Site: http://hdl.handle.net/10119/9950; http://www.jstage.jst.go.jp/article/transinf/E93.D/10/E93.D_10_2813/_pdf

Text language code: en

Data source types

JaLC
IRDB
Crossref
CiNii Articles

Abstract license flag: use not permitted

References: 37