Audio and Visual Information Integration for Speaker's Localization in Automatic Shooting of Lecture (講義自動撮影における話者位置推定のための視聴覚情報の統合)

Abstract

Estimating the location of a speaker in a lecture room is useful for automatic video shooting of lectures. The captured videos are used in distance learning and lecture archiving systems. To estimate the location of a speaker in a large lecture room, multiple cameras and multiple microphones are used. However, it is difficult to estimate the speaker's location precisely using only visual or only acoustic sensors because of calibration problems, noise, and other interference. We therefore propose a method that integrates audio and visual information about a speaker in the lecture room. A lecturer's cell and a student's cell are introduced as the units for estimating the speaker's location. We defined 120 cells in a real lecture room and applied our multi-modal method to them. With the integrated method, the speaker's location is estimated accurately enough for automatic video shooting of a speaker in a lecture room.

Journal

  • IEEJ Transactions on Electronics, Information and Systems

    IEEJ Transactions on Electronics, Information and Systems 124(3), 729-739, 2004-03-01

    The Institute of Electrical Engineers of Japan

References:  19

Cited by:  12

Codes

  • NII Article ID (NAID)
    10012646473
  • NII NACSIS-CAT ID (NCID)
    AN10065950
  • Text Lang
    JPN
  • Article Type
    Journal Article
  • ISSN
    03854221
  • NDL Article ID
    6868104
  • NDL Source Classification
    ZN31 (Science and technology -- Electrical engineering, electrical machinery industry)
  • NDL Call No.
    Z16-795
  • Data Source
    CJP  CJPref  NDL  J-STAGE 