When rank order isn't enough: New statistical-significance-aware correlation measures

Mucahid Kutlu, Tamer Elsayed, Maram Hasanain, Matthew Lease

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

4 Citations (Scopus)

Abstract

Because it is expensive to construct test collections for Cranfield-based evaluation of information retrieval systems, a variety of lower-cost methods have been proposed. The reliability of these methods is often validated by measuring rank correlation (e.g., Kendall's t) between known system rankings on the full test collection vs. observed system rankings on the lower-cost one. However, existing rank correlation measures do not consider the statistical significance of score differences between systems in the observed rankings. To address this, we propose two statistical-significance-aware rank correlation measures, one of which is a head-weighted version of the other. We first show empirical differences between our proposed measures and existing ones. We then compare the measures while benchmarking four system evaluation methods: pooling, crowdsourcing, evaluation with incomplete judgments, and automatic system ranking. We show that use of our measures can lead to different experimental conclusions regarding reliability of alternative low-cost evaluation methods.

Original languageEnglish
Title of host publicationCIKM 2018 - Proceedings of the 27th ACM International Conference on Information and Knowledge Management
EditorsNorman Paton, Selcuk Candan, Haixun Wang, James Allan, Rakesh Agrawal, Alexandros Labrinidis, Alfredo Cuzzocrea, Mohammed Zaki, Divesh Srivastava, Andrei Broder, Assaf Schuster
PublisherAssociation for Computing Machinery
Pages397-406
Number of pages10
ISBN (Electronic)9781450360142
DOIs
Publication statusPublished - 17 Oct 2018
Externally publishedYes
Event27th ACM International Conference on Information and Knowledge Management, CIKM 2018 - Torino, Italy
Duration: 22 Oct 201826 Oct 2018

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Conference

Conference27th ACM International Conference on Information and Knowledge Management, CIKM 2018
Country/TerritoryItaly
CityTorino
Period22/10/1826/10/18

Keywords

  • Evaluation
  • IR System Ranking
  • Rank Correlation

Fingerprint

Dive into the research topics of 'When rank order isn't enough: New statistical-significance-aware correlation measures'. Together they form a unique fingerprint.

Cite this