Automated categorization in the international patent classification

  • C. J. Fall
    ELCA Informatique SA, Avenue de la Harpe 22-24, CH-1000 Lausanne 13, Switzerland
  • A. Törcsvári
    Arcanum Development, Baranyai utca 10, H-1117 Budapest, Hungary
  • K. Benzineb
    Metaread SA, 9 rue Boissonnas, CH-1227 Genève-Acacias, Switzerland
  • G. Karetka
    World Intellectual Property Organization, 34 Chemin des Colombettes, CH-1211 Genève 20, Switzerland

Abstract

A new reference collection of patent documents for training and testing automated categorization systems is established and described in detail. This collection is tailored for automating the attribution of international patent classification codes to patent applications and is made publicly available for future research work. We report the results of applying a variety of machine learning algorithms to the automated categorization of English-language patent documents. This procedure involves a complex hierarchical taxonomy, within which we classify documents into 114 classes and 451 subclasses. Several measures of categorization success are described and evaluated. We investigate how best to resolve the training problems related to the attribution of multiple classification codes to each patent document.

Published in

  • ACM SIGIR Forum

    ACM SIGIR Forum 37 (1), 10-25, 2003-04

    Association for Computing Machinery (ACM)
