GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge

Liam Dugan, Andrew Zhu, Firoj Alam, Preslav Nakov, Marianna Apidianaki, Chris Callison-Burch

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Recently, there have been many shared tasks targeting the detection of generated text from Large Language Models (LLMs). However, these shared tasks tend to focus either on cases where text is limited to one particular domain or on cases where text can be from many domains, some of which may not be seen during test time. In this shared task, using the newly released RAID benchmark, we aim to answer whether or not models can detect generated text from a large, yet fixed, number of domains and LLMs, all of which are seen during training. Over the course of three months, our task was attempted by 9 teams with 23 detector submissions. We find that multiple participants were able to obtain accuracies of over 99% on machine-generated text from RAID while maintaining a 5% False Positive Rate, suggesting that detectors are able to robustly detect text from many domains and models simultaneously. We discuss potential interpretations of this result and provide directions for future research.
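The headline metric in the abstract, accuracy on machine-generated text at a fixed 5% False Positive Rate, can be illustrated with a minimal sketch. It assumes each detector outputs a numeric score where higher means "more likely machine-generated"; the function names `threshold_at_fpr` and `detection_rate` are hypothetical and are not taken from RAID or the shared task's evaluation code, which may compute the metric differently.

```python
def threshold_at_fpr(human_scores, target_fpr=0.05):
    """Pick a score threshold so that at most `target_fpr` of the
    human-written texts score strictly above it (false positives).

    Hypothetical helper for illustration only.
    """
    if not human_scores:
        raise ValueError("need at least one human-written score")
    ranked = sorted(human_scores, reverse=True)
    # Maximum number of human texts we may wrongly flag.
    allowed = int(target_fpr * len(ranked))
    # Clamp so the index stays valid even when target_fpr is 1.0.
    allowed = min(allowed, len(ranked) - 1)
    # At most `allowed` scores are strictly greater than ranked[allowed].
    return ranked[allowed]


def detection_rate(machine_scores, threshold):
    """Fraction of machine-generated texts flagged (score > threshold)."""
    if not machine_scores:
        raise ValueError("need at least one machine-generated score")
    flagged = sum(1 for s in machine_scores if s > threshold)
    return flagged / len(machine_scores)


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the functions.
    human = [i / 20 for i in range(20)]
    machine = [0.99, 0.97, 0.93, 0.5]
    t = threshold_at_fpr(human, target_fpr=0.05)
    print(t, detection_rate(machine, t))
```

Fixing the threshold on human-written text first, and only then measuring how much machine-generated text is caught, keeps results comparable across detectors whose raw scores live on different scales.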

Original language: English
Title of host publication: GenAIDetect 2025 - Proceedings of the 1st Workshop on GenAI Content Detection, Proceedings of the Workshop - 31st International Conference on Computational Linguistics, COLING 2025
Editors: Firoj Alam, Preslav Nakov, Nizar Habash, Iryna Gurevych, Shammur Chowdhury, Artem Shelmanov, Yuxia Wang, Ekaterina Artemova, Mucahid Kutlu, George Mikros
Publisher: Association for Computational Linguistics (ACL)
Pages: 377-388
Number of pages: 12
ISBN (Electronic): 9798891762053
Publication status: Published - 15 Jan 2025
Event: 1st Workshop on GenAI Content Detection, GenAIDetect 2025 - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 → …

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
ISSN (Print): 2951-2093

Conference

Conference: 1st Workshop on GenAI Content Detection, GenAIDetect 2025
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 19/01/25 → …
