Temporal Characteristics of Utterance Units and Topic Structure of Spoken Dialogs


Abstract

Spoken dialogs are difficult to process because of acoustic, phonetic, and grammatical ill-formedness, and because of interactions among participants. This paper describes temporal characteristics of utterances in human-human task-oriented dialogs and interactions between the participants, analyzed in relation to the topic structure of the dialog. We analyzed 12 simulated task-oriented dialogs from the ASJ continuous speech corpus, conducted by 13 different participants, with a total length of 66 minutes. Speech data was segmented into utterance units, each of which is a speech interval delimited by pauses. There were 3876 utterance units, and 38.9% of them were interjections, fillers, false starts, and chiming utterances. Each dialog consisted of 6 to 15 topic segments, in each of which participants exchanged specific information about the task. Eighty-six out of 119 new topic segments started with interjectory utterances and filled pauses. The durations of turn-taking interjections and fillers, including the preceding silent pause, were significantly longer at topic boundaries than at other positions. The results indicate that the duration of interjection words and filled pauses is a sign of a topic shift in spoken dialogs. In natural conversations, participants' speaking modes change dynamically as the conversation develops. The response time of both client-role and agent-role speakers became shorter as the dialog proceeded, which indicates that interactions between the participants become more active as the dialog proceeds. Speech rate was also affected by the dialog structure. It was generally fast in the initiating and terminating parts, where most utterances are fixed expressions, and slow in topic segments in the body of the dialog, where both client and agent participants hesitated while retrieving task knowledge. The results can be utilized in man-machine dialog systems, e.g., to detect topic shifts in a dialog and to make the speech interface of dialog systems more natural to a human participant.
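The pause-based segmentation described in the abstract can be illustrated with a minimal sketch. It assumes speech has already been detected as time intervals, and it merges intervals separated by a silence shorter than a threshold into one utterance unit. The function name `segment_utterance_units` and the 0.4-second default for `min_pause` are hypothetical choices for illustration; the abstract does not state the pause threshold or procedure the authors used.

```python
from typing import List, Tuple

# (start_sec, end_sec) of a detected stretch of speech
Interval = Tuple[float, float]


def segment_utterance_units(
    speech: List[Interval], min_pause: float = 0.4
) -> List[Interval]:
    """Group speech intervals into utterance units delimited by pauses.

    Two consecutive intervals belong to the same unit when the silence
    between them is shorter than min_pause (an assumed value, not one
    taken from the paper).
    """
    units: List[Interval] = []
    for start, end in sorted(speech):
        if units and start - units[-1][1] < min_pause:
            # Gap is too short to count as a pause: extend the current unit.
            units[-1] = (units[-1][0], max(units[-1][1], end))
        else:
            # Gap is a pause (or this is the first interval): start a new unit.
            units.append((start, end))
    return units


if __name__ == "__main__":
    detected = [(0.0, 1.0), (1.1, 2.0), (3.0, 3.5)]
    print(segment_utterance_units(detected))
```

With such units in hand, the duration of a turn-initial filler including its preceding silent pause could be measured as the filler's end time minus the end time of the previous unit, which is the quantity the abstract compares between topic boundaries and other positions.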


Details

  • CRID
    1570291227426013312
  • NII Article ID
    110003209473
  • NII Bibliographic ID
    AA10826272
  • ISSN
    09168532
  • Text language code
    en
  • Data source type
    • CiNii Articles
