Speech Recognition of English by Japanese Using Lexicon Represented by Multiple Reduced Phoneme Sets

Access this Article



Recognition of second language (L2) speech is still a challenging task even for state-of-the-art automatic speech recognition (ASR) systems, partly because pronunciation by L2 speakers is usually significantly influenced by the mother tongue of the speakers. The authors previously proposed using a reduced phoneme set (RPS) instead of the canonical one of L2 when the mother tongue of speakers is known, and demonstrated that this reduced phoneme set improved the recognition performance through experiments using English utterances spoken by Japanese. However, the proficiency of L2 speakers varies widely, as does the influence of the mother tongue on their pronunciation. As a result, the effect of the reduced phoneme set is different depending on the speakers' proficiency in L2. In this paper, the authors examine the relation between proficiency of speakers and a reduced phoneme set customized for them. The experimental results are then used as the basis of a novel speech recognition method using a lexicon in which the pronunciation of each lexical item is represented by multiple reduced phoneme sets, and the implementation of a language model most suitable for that lexicon is described. Experimental results demonstrate the high validity of the proposed method.


  • IEICE Transactions on Information and Systems

    IEICE Transactions on Information and Systems E98.D(12), 2271-2279, 2015

    The Institute of Electronics, Information and Communication Engineers


Page Top