TY - GEN
T1 - GREENER
T2 - 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
AU - Panayotov, Panayot
AU - Shukla, Utsav
AU - Sencar, Husrev T.
AU - Nabeel, Mohamed
AU - Nakov, Preslav
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. This is an important but under-studied problem related to disinformation and “fake news” detection, but it addresses the issue at a coarser granularity compared to looking at an individual article or an individual claim. This is useful as it allows to profile entire media outlets in advance. Unlike previous work, which has focused primarily on text (e.g., on the articles published by the target website, or on the textual description in their social media profiles or in Wikipedia), here we focus on modeling the similarity between media outlets based on the overlap of their audience. This is motivated by homophily considerations, i.e., the tendency of people to have connections to people with similar interests, which we extend to media, hypothesizing that similar types of media would be read by similar kinds of users. In particular, we propose GREENER (GRaph nEural nEtwork for News mEdia pRofiling), a model that builds a graph of inter-media connections based on their audience overlap, and then uses graph neural networks to represent each medium. We find that such representations are quite useful for predicting the factuality and the bias of news media outlets, yielding improvements over state-ofthe-art results reported on two datasets. When augmented with conventionally used representations obtained from news articles, Twitter, YouTube, Facebook, and Wikipedia, we improve over previous work by 2.5-27 macro-F1 points absolute for the two tasks and datasets.
AB - We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. This is an important but under-studied problem related to disinformation and “fake news” detection, but it addresses the issue at a coarser granularity compared to looking at an individual article or an individual claim. This is useful as it allows to profile entire media outlets in advance. Unlike previous work, which has focused primarily on text (e.g., on the articles published by the target website, or on the textual description in their social media profiles or in Wikipedia), here we focus on modeling the similarity between media outlets based on the overlap of their audience. This is motivated by homophily considerations, i.e., the tendency of people to have connections to people with similar interests, which we extend to media, hypothesizing that similar types of media would be read by similar kinds of users. In particular, we propose GREENER (GRaph nEural nEtwork for News mEdia pRofiling), a model that builds a graph of inter-media connections based on their audience overlap, and then uses graph neural networks to represent each medium. We find that such representations are quite useful for predicting the factuality and the bias of news media outlets, yielding improvements over state-ofthe-art results reported on two datasets. When augmented with conventionally used representations obtained from news articles, Twitter, YouTube, Facebook, and Wikipedia, we improve over previous work by 2.5-27 macro-F1 points absolute for the two tasks and datasets.
UR - https://www.scopus.com/pages/publications/85140193803
U2 - 10.18653/v1/2022.emnlp-main.506
DO - 10.18653/v1/2022.emnlp-main.506
M3 - Conference contribution
AN - SCOPUS:85140193803
T3 - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
SP - 7470
EP - 7480
BT - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
A2 - Goldberg, Yoav
A2 - Kozareva, Zornitsa
A2 - Zhang, Yue
PB - Association for Computational Linguistics (ACL)
Y2 - 7 December 2022 through 11 December 2022
ER -