Novel Confidence Feature Extraction Algorithm Based on Latent Topic Similarity

Author(s)

    • CHEN Wei
      Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications
    • LIU Gang
      Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications
    • GUO Jun
      Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications
    • OMACHI Masako
      Faculty of Science and Technology, Tohoku Bunka Gakuen University
    • GUO Yujing
      Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications

Abstract

In speech recognition, confidence annotation uses a single confidence feature or a combination of different features for classification. These confidence features are always extracted from decoding information. However, it has been shown that about 30% of the knowledge used in human speech understanding is mainly derived from high-level information. How to extract a high-level confidence feature that is statistically independent of decoding information is therefore worth researching in speech recognition. In this paper, a novel confidence feature extraction algorithm based on latent topic similarity is proposed. The topic distribution of each word and the topic distribution of its context in a recognition result are first obtained using the latent Dirichlet allocation (LDA) topic model; the proposed word confidence feature is then extracted by determining the similarity between these two topic distributions. Experiments show that the proposed feature adds a new information source for confidence features, complements the existing ones well, and effectively improves the performance of confidence annotation when combined with confidence features from decoding information.
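
The idea of comparing a word's topic distribution with that of its context can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy topic-word table `TOPIC_WORD` stands in for a trained LDA model, the word topic distribution is obtained by Bayes' rule, the context distribution is a plain average over the other words in the hypothesis, and cosine similarity is used as the similarity measure. The abstract does not state which of these choices the authors made.

```python
import math

# Hypothetical P(word | topic) tables standing in for a trained LDA model.
TOPIC_WORD = [
    {"bank": 0.40, "loan": 0.40, "money": 0.19, "river": 0.01},  # topic 0
    {"river": 0.50, "tree": 0.30, "bank": 0.10, "water": 0.10},  # topic 1
]
TOPIC_PRIOR = [0.5, 0.5]  # assumed uniform P(topic)
FLOOR = 1e-9              # small probability for words unseen in a topic


def word_topic_distribution(word):
    """P(topic | word), computed with Bayes' rule from the toy tables."""
    joint = [prior * table.get(word, FLOOR)
             for prior, table in zip(TOPIC_PRIOR, TOPIC_WORD)]
    total = sum(joint)
    return [value / total for value in joint]


def context_topic_distribution(words, skip_index):
    """Average topic distribution of every word except the one at skip_index."""
    others = [w for i, w in enumerate(words) if i != skip_index]
    if not others:  # no context available: fall back to the prior
        return list(TOPIC_PRIOR)
    dists = [word_topic_distribution(w) for w in others]
    return [sum(d[t] for d in dists) / len(dists)
            for t in range(len(TOPIC_PRIOR))]


def cosine(p, q):
    """Cosine similarity between two topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm


def topic_confidence(words):
    """One similarity score per word; low scores flag topically odd words."""
    return [cosine(word_topic_distribution(w),
                   context_topic_distribution(words, i))
            for i, w in enumerate(words)]


hypothesis = ["bank", "loan", "money", "tree"]
scores = topic_confidence(hypothesis)
for word, score in zip(hypothesis, scores):
    print(f"{word}: {score:.3f}")
```

In this toy hypothesis, "tree" receives the lowest score because its topic distribution disagrees with the finance-dominated context. In a confidence annotation system, such a score would be one input to the classifier alongside features derived from decoding information.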

Journal

  • IEICE Transactions on Information and Systems

    IEICE Transactions on Information and Systems 93(8), 2243-2251, 2010-08-01

    The Institute of Electronics, Information and Communication Engineers

References:  30
