Topic Representation of Researchers' Interests in a Large-Scale Academic Database and Its Application to Author Disambiguation
-
- KATSURAI Marie
- Doshisha University
-
- OHMUKAI Ikki
- National Institute of Informatics
-
- TAKEDA Hideaki
- National Institute of Informatics
抄録
It is crucial to promote interdisciplinary research and recommend collaborators from different research fields via academic database analysis. This paper addresses a problem to characterize researchers' interests with a set of diverse research topics found in a large-scale academic database. Specifically, we first use latent Dirichlet allocation to extract topics as distributions over words from a training dataset. Then, we convert the textual features of a researcher's publications to topic vectors, and calculate the centroid of these vectors to summarize the researcher's interest as a single vector. In experiments conducted on CiNii Articles, which is the largest academic database in Japan, we show that the extracted topics reflect the diversity of the research fields in the database. The experiment results also indicate the applicability of the proposed topic representation to the author disambiguation problem.
収録刊行物
-
- IEICE Transactions on Information and Systems
-
IEICE Transactions on Information and Systems E99.D (4), 1010-1018, 2016
一般社団法人 電子情報通信学会
- Tweet
詳細情報 詳細情報について
-
- CRID
- 1390282679354273280
-
- NII論文ID
- 130005141367
-
- ISSN
- 17451361
- 09168532
-
- 本文言語コード
- en
-
- データソース種別
-
- JaLC
- Crossref
- CiNii Articles
- KAKEN
-
- 抄録ライセンスフラグ
- 使用不可