Augmenting Training Samples with a Large Number of Rough Segmentation Datasets





We revisit the problem of generic object recognition from the viewpoint of human-computer interaction. Many existing algorithms for generic object recognition first try to detect target objects and only then extract and classify features. Our work is motivated by the belief that automatic detection is not always necessary in practical situations, such as those involving mobile recognition systems equipped with touch displays and cameras. It is natural for such systems to ask users to input segmentation data for the targets through the touch display. For the sake of usability, these systems should accept rough segmentation to reduce the user's workload. In this setting, different people provide different segmentation data, which raises an interesting question: if multiple training samples are generated from a single image by using segmentation data created by different people, how is classification accuracy affected? To answer this question, we created the "20 wild bird datasets," which contain a large number of rough segmentation data made by 383 people. Our experiments revealed two interesting facts: (i) generating multiple training samples from a single image had a positive effect on classification accuracy, especially when image features that include spatial information were used, and (ii) augmenting training samples with artificial segmentation data synthesized with a morphing technique also had a slightly positive effect on classification accuracy.
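The idea of generating several training samples from one image can be illustrated with a minimal sketch. It is not the authors' implementation: the list-of-lists image representation, the function names `apply_mask` and `augment_with_masks`, and the label "sparrow" are all assumptions made for illustration, and the morphing-based synthesis of artificial segmentation data is not covered.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major (hypothetical representation)
Mask = List[List[int]]   # 1 = inside a user's rough segmentation, 0 = outside


def apply_mask(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Keep pixels inside the mask and replace everything else with `fill`."""
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [pixel if inside else fill for pixel, inside in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def augment_with_masks(
    image: Image, label: str, masks: List[Mask]
) -> List[Tuple[Image, str]]:
    """Produce one labeled training sample per rough segmentation of the same image."""
    return [(apply_mask(image, mask), label) for mask in masks]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Two annotators outline the same target slightly differently (made-up masks).
mask_a = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
mask_b = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]

# One source image yields two training samples that share a class label.
samples = augment_with_masks(image, "sparrow", [mask_a, mask_b])
```

Each sample would then go through feature extraction separately, so features that encode spatial layout see a different masked region for each annotator.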


  • IEICE Transactions on Information and Systems 94(10), 1880-1888, 2011-10-01

    The Institute of Electronics, Information and Communication Engineers

  • Data source
    CJP bibliography, J-STAGE