TY - GEN
T1 - Latent Concept-based Explanation of NLP Models
AU - Yu, Xuemin
AU - Dalvi, Fahim
AU - Durrani, Nadir
AU - Nouri, Marzia
AU - Sajjad, Hassan
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024/11
Y1 - 2024/11
N2 - Interpreting and understanding the predictions made by deep learning models poses a formidable challenge due to their inherently opaque nature. Many previous efforts to explain these predictions rely on input features, specifically, the words within NLP models. However, such explanations are often less informative due to the discrete nature of the words and their lack of contextual verbosity. To address this limitation, we introduce Latent Concept Attribution (LACOAT), which generates explanations for predictions based on latent concepts. Our intuition is that a word can exhibit multiple facets depending on the context in which it is used. Therefore, given a word in context, the latent space derived from our training process reflects a specific facet of that word. LACOAT functions by mapping the representations of salient input words into the training latent space, enabling it to provide latent context-based explanations of the prediction.
UR - https://www.scopus.com/pages/publications/85217789240
U2 - 10.18653/v1/2024.emnlp-main.692
DO - 10.18653/v1/2024.emnlp-main.692
M3 - Conference contribution
AN - SCOPUS:85217789240
T3 - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
SP - 12435
EP - 12459
BT - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
PB - Association for Computational Linguistics (ACL)
T2 - 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024
Y2 - 12 November 2024 through 16 November 2024
ER -