Investigating topic influence in authorship attribution

George K. Mikros, Eleni K. Argiri

Research output: Contribution to journal; Conference article; peer-review

19 Citations (Scopus)

Abstract

The aim of this paper is to explore the influence of text topic on authorship attribution. Specifically, we test the widely accepted belief that the stylometric variables commonly used in authorship attribution are topic-neutral and can be used in multi-topic corpora. To investigate this hypothesis, we created a special corpus controlled for topic and author simultaneously. The corpus consists of 200 Modern Greek newswire articles written by two authors on two different topics. We calculated many commonly used stylometric variables and, for each one, performed a two-way ANOVA test to estimate the main effects of author and topic and the interaction between them. The results showed that most of the variables exhibit considerable correlation with the text topic, so they should be used with caution in authorship analysis.
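
The two-way ANOVA described in the abstract can be illustrated with a minimal sketch. It assumes a balanced design (the same number of texts in every author-by-topic cell) and computes only the F statistics, not p-values. The labels `author_1`, `topic_1`, the function name `two_way_anova`, and the toy measurements are all hypothetical; they are not the paper's data, variables, or code.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction (illustrative sketch).

    cells maps (author, topic) -> list of values of one stylometric
    variable. Returns the F statistic for each effect.
    """
    authors = sorted({a for a, _ in cells})
    topics = sorted({t for _, t in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    if any(len(v) != n for v in cells.values()):
        raise ValueError("this sketch only handles balanced designs")
    a, b = len(authors), len(topics)

    grand = mean(x for v in cells.values() for x in v)
    author_mean = {i: mean(x for j in topics for x in cells[i, j]) for i in authors}
    topic_mean = {j: mean(x for i in authors for x in cells[i, j]) for j in topics}
    cell_mean = {k: mean(v) for k, v in cells.items()}

    # Sums of squares for the two main effects, the interaction, and the error.
    ss_author = b * n * sum((author_mean[i] - grand) ** 2 for i in authors)
    ss_topic = a * n * sum((topic_mean[j] - grand) ** 2 for j in topics)
    ss_inter = n * sum(
        (cell_mean[i, j] - author_mean[i] - topic_mean[j] + grand) ** 2
        for i in authors
        for j in topics
    )
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    ms_error = ss_error / (a * b * (n - 1))
    return {
        "author": (ss_author / (a - 1)) / ms_error,
        "topic": (ss_topic / (b - 1)) / ms_error,
        "author:topic": (ss_inter / ((a - 1) * (b - 1))) / ms_error,
    }


# Hypothetical toy values of one variable (e.g. mean word length),
# three texts per author-by-topic cell.
cells = {
    ("author_1", "topic_1"): [4.0, 5.0, 6.0],
    ("author_1", "topic_2"): [6.0, 7.0, 8.0],
    ("author_2", "topic_1"): [5.0, 6.0, 7.0],
    ("author_2", "topic_2"): [7.0, 8.0, 9.0],
}
f_stats = two_way_anova(cells)
print(f_stats)
```

In this toy data the F statistic for topic is larger than the one for author, which is the kind of pattern that would flag a variable as topic-sensitive rather than topic-neutral. A full analysis would also convert each F statistic to a p-value using the F distribution.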

Original language: English
Pages (from-to): 29-35
Number of pages: 7
Journal: CEUR Workshop Proceedings
Volume: 276
Publication status: Published - 2007
Externally published: Yes
Event: SIGIR 2007 International Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection, PAN 2007 - Genoa, Italy
Duration: 5 Dec 2007 to 5 Dec 2007

Keywords

  • Authorship attribution
  • Stylometry
  • Topic-neutral features
