The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

  • Nailia Mirzakhmedova*
  • Johannes Kiesel
  • Milad Alshomary
  • Maximilian Heinrich
  • Nicolas Handke
  • Xiaoni Cai
  • Valentin Barriere
  • Doratossadat Dastgheib
  • Omid Ghahroodi
  • Mohammad Ali Sadraei Javaheri
  • Ehsaneddin Asgari
  • Lea Kawaletz
  • Henning Wachsmuth
  • Benno Stein

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

5 Citations (Scopus)

Abstract

While human values play a crucial role in making arguments persuasive, we currently lack the extensive datasets needed to develop methods for analyzing the values underlying these arguments on a large scale. To address this gap, we present the Touché23-ValueEval dataset, an expansion of the Webis-ArgValues-22 dataset. We collected and annotated 4 780 additional arguments, doubling the dataset's size to 9 324 arguments. These arguments were drawn from six diverse sources, covering religious texts, community discussions, free-text arguments, newspaper editorials, and political debates. Each argument is annotated by three crowdworkers for 54 human values, following the methodology established in the original dataset. The Touché23-ValueEval dataset was used in SemEval 2023 Task 4, ValueEval: Identification of Human Values behind Arguments, where an ensemble of transformer models demonstrated state-of-the-art performance. Furthermore, our experiments show that a fine-tuned large language model, Llama-2-7B, achieves comparable results.

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 16121-16134
Number of pages: 14
ISBN (Electronic): 9782493814104
Publication status: Published - May 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 – 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 – 25/05/24

Keywords

  • Corpus (Creation, Annotation, etc.)
  • Document Classification
  • Text categorisation
