Speech Recognition with Primarily Temporal Cues

  • Robert V. Shannon
    House Ear Institute, 2100 West Third Street, Los Angeles, CA 90057, USA.
  • Fan-Gang Zeng
    House Ear Institute, 2100 West Third Street, Los Angeles, CA 90057, USA.
  • Vivek Kamath
    House Ear Institute, 2100 West Third Street, Los Angeles, CA 90057, USA.
  • John Wygonski
    House Ear Institute, 2100 West Third Street, Los Angeles, CA 90057, USA.
  • Michael Ekelid
    House Ear Institute, 2100 West Third Street, Los Angeles, CA 90057, USA.

Abstract

Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased; high speech recognition performance was obtained with only three bands of modulated noise. Thus, the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
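
The manipulation described in the abstract (split the speech into broad bands, extract each band's temporal envelope, and use it to modulate noise of the same bandwidth) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function names, the band edges, the 50 Hz envelope cutoff, the half-wave rectification step, and the brick-wall FFT filters are all assumptions chosen for brevity, since the abstract does not specify them.

```python
import numpy as np


def bandpass_fft(x, fs, lo, hi):
    """Keep only components with lo <= f < hi (brick-wall FFT mask, a simplification)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def noise_vocode(speech, fs, band_edges, env_cutoff=50.0, seed=0):
    """Replace the fine structure in each band with envelope-modulated noise.

    band_edges and env_cutoff are hypothetical parameters (in Hz), not values
    taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    speech = np.asarray(speech, dtype=float)
    out = np.zeros_like(speech)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        # 1. Isolate one broad frequency band of the speech.
        band = bandpass_fft(speech, fs, lo, hi)
        # 2. Temporal envelope: half-wave rectify, then low-pass filter.
        env = bandpass_fft(np.maximum(band, 0.0), fs, 0.0, env_cutoff)
        env = np.maximum(env, 0.0)  # clip small negative ripple from filtering
        # 3. Noise carrier limited to the same band.
        carrier = bandpass_fft(rng.standard_normal(len(speech)), fs, lo, hi)
        # 4. Modulate the noise with the envelope and sum across bands.
        out += env * carrier
    return out


# Usage: a three-band version, with illustrative (assumed) band edges in Hz.
fs = 16000
t = np.arange(fs) / fs
toy_signal = (1.0 + 0.9 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
vocoded = noise_vocode(toy_signal, fs, band_edges=[100, 800, 1500, 4000])
```

With few bands the output keeps the slow amplitude fluctuations in each region but discards the spectral detail within it, which is the trade-off the abstract describes. A more careful version would likely use causal or zero-phase IIR filters rather than FFT masking.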

Published in

  • Science

    Science 270 (5234), 303-304, 1995-10-13

    American Association for the Advancement of Science (AAAS)

Cited by (84)
