The Mutual Information as a Scoring Function for Speech Recognition

  • OZEKI Kazuhiko
    CSTR, The University of Edinburgh 80 South Bridge, Edinburgh EH1 1HN, U.K.


Abstract

In recent speech recognition technology, the score of a hypothesis is often defined on the basis of the likelihood calculated with an HMM. As is well known, however, the direct use of likelihood as a score causes difficult problems, especially in continuous speech recognition. In this work, the mutual information between a speech segment and a hypothesized word was employed as a scoring function, and its performance was tested from various points of view. The mutual information is obtained by normalizing the likelihood by a speech probability; to estimate the speech probability, an ergodic HMM was utilized. A number of experiments confirmed that the mutual information is a significantly better scoring function than the log-likelihood. In another well-known normalization method, the likelihood is normalized by a speech probability estimated with an all-phone model. A comparison of the two normalization methods was also carried out, leading to the conclusion that the speech probability estimated with an ergodic HMM yields a better scoring function than that estimated with an all-phone model.
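In log form, the normalization the abstract describes amounts to scoring a hypothesized word W over a segment X by log P(X|W) − log P(X), i.e. the pointwise mutual information. A minimal sketch of why this helps, assuming natural-log probabilities (the function name and numeric values are illustrative only, not taken from the paper; in practice log P(X|W) would come from the word HMM and log P(X) from the ergodic HMM):

```python
def mi_score(log_p_x_given_w: float, log_p_x: float) -> float:
    """Pointwise mutual information: log P(X|W) - log P(X).

    The word-conditional likelihood is normalized by the probability
    of the speech segment itself (e.g. estimated with an ergodic HMM).
    """
    return log_p_x_given_w - log_p_x

# Raw log-likelihood favors shorter segments simply because fewer frames
# accumulate less negative log-probability; subtracting log P(X) removes
# that length bias, so hypotheses spanning different segments compare fairly.
short = mi_score(log_p_x_given_w=-50.0, log_p_x=-60.0)    # short segment
long_ = mi_score(log_p_x_given_w=-120.0, log_p_x=-140.0)  # long segment
# Raw likelihood prefers the short hypothesis (-50 > -120),
# while the MI score prefers the long one (20 > 10).
```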



Details

  • CRID
    1573387452152863488
  • NII Article ID
    110003298313
  • NII Bibliographic ID
    AN10013221
  • Language code
    en
  • Data source
    • CiNii Articles

