Accuracy and Standardized Judgment Procedures for Author Identification by Text Mining

  • Zaitsu Wataru
    Forensic Science Laboratory, Toyama Prefectural Police Headquarters
  • Jin Mingzhe
    Faculty of Culture and Information Science, Doshisha University

Bibliographic Information

Other Title
  • テキストマイニングによる筆者識別の正確性ならびに判定手続きの標準化
  • テキストマイニング ニ ヨル ヒッシャ シキベツ ノ セイカクセイ ナラビニ ハンテイ テツズキ ノ ヒョウジュンカ

Abstract

This study examined the accuracy of author identification by text mining. We conducted 16 analyses (four stylistic features × four multivariate methods) on texts written by 100 bloggers, each approximately 1,000 characters long. Specifically, we applied (1) principal component analysis, (2) correspondence analysis, (3) multidimensional scaling, and (4) hierarchical cluster analysis to each of four stylistic features: (1) usage rate of non-independent words, (2) part-of-speech bigrams, (3) postpositional-particle bigrams, and (4) comma positioning. Accuracy was high: 100% sensitivity and 95.1% specificity. Furthermore, neither age nor gender affected the accuracy of author identification.
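As an illustration of two quantities the abstract mentions, the following minimal sketch computes part-of-speech bigram rates from a tag sequence and derives sensitivity and specificity from confusion counts. It is not the authors' procedure: the function names, the tag sequence, and the counts are all hypothetical, and the sketch assumes the text has already been tagged by some morphological analyzer.

```python
from collections import Counter


def bigram_rates(tags):
    """Return the relative frequency of each adjacent tag pair."""
    pairs = list(zip(tags, tags[1:]))
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical part-of-speech tags for one short text (not data from the study).
tags = ["noun", "particle", "verb", "noun", "particle", "verb"]
rates = bigram_rates(tags)

# Hypothetical confusion counts (not the study's results).
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=18, fp=2)
```

Each text's bigram rates form one feature vector; vectors of this kind are the usual input to the multivariate methods listed above.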
