パラ言語の理解能力を有する対話ロボット / Dialogue Robot with an Ability to Understand Para-Linguistic Information

Abstract

Human-human interaction in spoken dialogue appears to rely not only on the linguistic information in utterances but also on additional information that supports it. We call this additional information, which is produced alongside an utterance and helps its linguistic content to be conveyed smoothly, "para-linguistic information". In this paper, we present a method for recognizing attitudes from prosodic information and a method for recognizing head gestures from image information. The former classifies the speaker's attitude as "positive" or "negative", using the F0 pattern and phoneme alignment as features. The latter recognizes three gestures, "nod", "tilt" and "shake", using optical flow as the feature and a left-to-right HMM as the probabilistic model. Experimental results show that these methods perform well enough to capture the user's attitude as para-linguistic information. Finally, we present a prototype spoken dialogue system that uses para-linguistic information and show how this information contributes to efficient conversation.

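The head-gesture method named in the abstract (optical-flow features scored by left-to-right HMMs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the five-symbol alphabet of quantized flow directions, the three-state models, the hand-set probabilities, and the function names. Each gesture gets its own left-to-right HMM, a sequence is scored with the scaled forward algorithm, and the highest-scoring model gives the label.

```python
import math

# Hypothetical observation alphabet: dominant optical-flow direction per frame.
UP, DOWN, LEFT, RIGHT, STILL = range(5)

def left_to_right_transitions(n_states, stay=0.6):
    """Transition matrix in which a state either repeats or moves to the next one."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            A[i][i] = stay
            A[i][i + 1] = 1.0 - stay
        else:
            A[i][i] = 1.0  # the final state is absorbing
    return A

def emissions(preferred, n_symbols=5, peak=0.8):
    """One emission row per state, peaked on that state's preferred symbol."""
    rest = (1.0 - peak) / (n_symbols - 1)
    return [[peak if s == p else rest for s in range(n_symbols)] for p in preferred]

def log_likelihood(obs, A, B):
    """Scaled forward algorithm; the model is assumed to start in state 0."""
    n = len(A)
    alpha = [B[0][obs[0]]] + [0.0] * (n - 1)
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob

# Hand-set toy models (assumed, not trained): the flow direction each state prefers.
MODELS = {
    name: (left_to_right_transitions(3), emissions(preferred))
    for name, preferred in {
        "nod": (DOWN, UP, STILL),
        "tilt": (LEFT, STILL, RIGHT),
        "shake": (LEFT, RIGHT, LEFT),
    }.items()
}

def classify(obs):
    """Return the gesture whose HMM assigns the sequence the highest likelihood."""
    return max(MODELS, key=lambda name: log_likelihood(obs, *MODELS[name]))
```

For example, `classify([DOWN, DOWN, UP, UP, STILL])` selects the "nod" model under these toy parameters. A real system would estimate the transition and emission probabilities from labeled gesture data and would likely use continuous flow features rather than five discrete symbols.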

Journal

  • IPSJ SIG Notes

    IPSJ SIG Notes 2003(104(2003-SLP-048)), 13-20, 2003-10-17

    Information Processing Society of Japan (IPSJ)

References:  12

Cited by:  5

Codes

  • NII Article ID (NAID)
    110002913710
  • NII NACSIS-CAT ID (NCID)
    AN10442647
  • Text Lang
    JPN
  • Article Type
    Journal Article
  • ISSN
    09196072
  • NDL Article ID
    6737596
  • NDL Source Classification
    ZM13 (Science and Technology -- Science and Technology in General -- Data Processing / Computers)
  • NDL Call No.
    Z14-1121
  • Data Source
    CJP  CJPref  NDL  NII-ELS  IPSJ 