Detecting Robot-Directed Speech by Situated Understanding in Physical Interaction

Author(s)

    • Zuo Xiang
    • Advanced Telecommunication Research Labs and Kyoto Institute of Technology
    • Iwahashi Naoto
    • Advanced Telecommunication Research Labs and National Institute of Information and Communications Technology
    • Taguchi Ryo
    • Advanced Telecommunication Research Labs and Nagoya Institute of Technology
    • Sugiura Komei
    • National Institute of Information and Communications Technology

Abstract

In this paper, we propose a novel method for a robot to detect robot-directed speech, that is, to distinguish speech that users direct to the robot from speech that they direct to other people or to themselves. The originality of this work is the introduction of a multimodal semantic confidence (MSC) measure, which is used for domain classification of input speech by deciding whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. This measure is calculated by integrating speech, object, and motion confidence measures with weightings that are optimized by logistic regression. We then integrate this measure with gaze tracking and conduct experiments under conditions of natural human-robot interaction. Experimental results show that the proposed method achieves average recall and precision rates of 94% and 96%, respectively, for robot-directed speech detection.
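
As a rough illustration of the integration step described in the abstract, the sketch below combines three confidence values into a single score using weights fitted by logistic regression. It is not the authors' implementation. The function names (`msc_score`, `fit_weights`, `is_robot_directed`), the assumption that each confidence is a scalar in [0, 1], the 0.5 decision threshold, and the toy training data are all hypothetical and chosen only to make the idea concrete. Gaze tracking is not modeled here.

```python
import math

def msc_score(speech, obj, motion, weights, bias):
    """Combine three confidences into one score via a logistic function.

    Assumption: each confidence is a scalar; the paper's exact
    definitions of the three measures are not reproduced here.
    """
    z = bias + weights[0] * speech + weights[1] * obj + weights[2] * motion
    return 1.0 / (1.0 + math.exp(-z))

def fit_weights(samples, labels, lr=0.5, epochs=2000):
    """Fit the weights and bias by batch gradient descent on the logistic loss."""
    w = [0.0, 0.0, 0.0]
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0, 0.0, 0.0]
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = msc_score(x[0], x[1], x[2], w, b) - y
            for i in range(3):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def is_robot_directed(speech, obj, motion, weights, bias, threshold=0.5):
    """Classify an utterance; the 0.5 threshold is an assumption."""
    return msc_score(speech, obj, motion, weights, bias) >= threshold

# Hypothetical toy data: (speech, object, motion) confidences.
# Label 1 = robot-directed, 0 = out-of-domain speech.
samples = [
    (0.90, 0.80, 0.85), (0.80, 0.90, 0.70), (0.85, 0.75, 0.90),
    (0.30, 0.20, 0.10), (0.40, 0.10, 0.30), (0.20, 0.30, 0.20),
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_weights(samples, labels)
```

With fitted parameters, a call such as `is_robot_directed(0.9, 0.85, 0.8, weights, bias)` returns a boolean decision for one utterance. In practice the threshold would be tuned against the desired trade-off between recall and precision.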

Journal

  • Transactions of the Japanese Society for Artificial Intelligence

    Transactions of the Japanese Society for Artificial Intelligence 25(6), 670-682, 2010

    The Japanese Society for Artificial Intelligence

Cited by:  1

Codes

  • NII Article ID (NAID)
    130000341877
  • Text Lang
    ENG
  • Article Type
    Journal Article
  • ISSN
    1346-0714
  • Data Source
    CJPref  J-STAGE 