-
- C. J. Fall
- ELCA Informatique SA, Avenue de la Harpe 22-24, CH-1000 Lausanne 13, Switzerland
-
- A. Törcsvári
- Arcanum Development, Baranyai utca 10, H-1117 Budapest, Hungary
-
- K. Benzineb
- Metaread SA, 9 rue Boissonnas, CH-1227 Genève-Acacias, Switzerland
-
- G. Karetka
- World Intellectual Property Organization, 34 Chemin des Colombettes, CH-1211 Genève 20, Switzerland
Abstract
A new reference collection of patent documents for training and testing automated categorization systems is established and described in detail. This collection is tailored for automating the attribution of international patent classification codes to patent applications and is made publicly available for future research work. We report the results of applying a variety of machine learning algorithms to the automated categorization of English-language patent documents. This procedure involves a complex hierarchical taxonomy, within which we classify documents into 114 classes and 451 subclasses. Several measures of categorization success are described and evaluated. We investigate how best to resolve the training problems related to the attribution of multiple classification codes to each patent document.
Published in
-
- ACM SIGIR Forum
-
ACM SIGIR Forum 37 (1), 10-25, 2003-04
Association for Computing Machinery (ACM)
Details
-
- CRID
- 1360298763530296064
-
- NII Article ID
- 30025057397
-
- ISSN
- 01635840
-
- Data source type
-
- Crossref
- CiNii Articles