Probabilistic Concatenation Modeling for Corpus-Based Speech Synthesis

Authors

    • SAKAI Shinsuke
    • Academic Center for Computing and Media Studies, Kyoto University
    • KAWAI Hisashi
    • Knowledge Creating Communication Research Center, National Institute of Information and Communications Technology

Abstract

The measure of the goodness, or inversely the cost, of concatenating synthesis units plays an important role in concatenative speech synthesis. In this paper, we present a probabilistic approach to concatenation modeling in which the goodness of concatenation is measured by the conditional probability of observing the spectral shape of the current candidate unit given the previous unit and the current phonetic context. This conditional probability is modeled by a conditional Gaussian density whose mean vector is a linear transform of the past spectral shape. Decision tree-based parameter tying is performed to achieve robust training that balances model complexity against the amount of training data available. The concatenation models are implemented for a corpus-based speech synthesizer, and the effectiveness of the proposed method was confirmed by an objective evaluation as well as a subjective listening test. We also demonstrate that the proposed method generalizes some popular conventional methods, in that those methods can be derived as special cases of the proposed method.
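As a rough illustration of the conditional Gaussian idea described in the abstract, the sketch below scores a join by the negative log-density of the current unit's spectral vector given the previous one. It is not the authors' implementation. The names `concat_log_prob` and `concat_cost`, the bias term `b`, and the diagonal covariance `var` are assumptions introduced here; in the paper's setting the parameters would presumably be selected per phonetic context through the tied decision-tree leaves, which this sketch does not model.

```python
import math

def concat_log_prob(x_cur, x_prev, A, b, var):
    """Log density of N(x_cur; A @ x_prev + b, diag(var)).

    A is a list of rows (hypothetical transform matrix), b a bias vector,
    and var a vector of per-dimension variances. All three are assumed
    parameters for illustration only.
    """
    # Mean of the current spectral shape is a linear transform of the past one.
    mean = [sum(a_ij * xp for a_ij, xp in zip(row, x_prev)) + b_i
            for row, b_i in zip(A, b)]
    # Diagonal covariance: the log density is a sum over dimensions.
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(x_cur, mean, var)
    )

def concat_cost(x_cur, x_prev, A, b, var):
    """Concatenation cost taken as the negative log conditional probability."""
    return -concat_log_prob(x_cur, x_prev, A, b, var)

# Toy parameters: identity transform, zero bias, unit variance.
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
unit_var = [1.0, 1.0]

smooth_join = concat_cost([1.0, 2.0], [1.0, 2.0], identity, zero_bias, unit_var)
rough_join = concat_cost([2.0, 2.0], [1.0, 2.0], identity, zero_bias, unit_var)
```

With the toy parameters above (identity transform, zero bias, unit variance), the cost equals half the squared Euclidean distance between the two spectral vectors plus a constant, so `rough_join` exceeds `smooth_join` by exactly that half squared distance. This shows how a plain distance-based join cost falls out of the Gaussian form as one particular parameter setting.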

Published in

  • IEICE transactions on information and systems

    IEICE transactions on information and systems 94(10), 2006-2014, 2011-10-01

    The Institute of Electronics, Information and Communication Engineers (IEICE)

References: 24 in total

Codes

  • NII Article ID (NAID)
    10030193499
  • NII Bibliographic ID (NCID)
    AA10826272
  • Text language code
    ENG
  • Material type
    ART
  • ISSN
    09168532
  • Data source
    CJP bibliography, J-STAGE