The Art of Saying “Maybe”: A Conformal Lens for Uncertainty Benchmarking in VLMs

  • Asif Azad*
  • Mohammad Sadat Hossain
  • M. D. Sadik Hossain Shanto
  • M. Saifur Rahman
  • Md Rizwan Parvez
  • *Corresponding author for this work
  • Bangladesh University of Engineering and Technology

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Vision-Language Models (VLMs) have achieved remarkable progress in complex visual understanding across scientific and reasoning tasks. While performance benchmarking has advanced our understanding of these capabilities, the critical dimension of uncertainty quantification has received insufficient attention. Therefore, unlike prior conformal prediction studies that focused on limited settings, we conduct a comprehensive uncertainty benchmarking study, evaluating 18 state-of-the-art VLMs (open and closed-source) across 6 multimodal datasets with 3 distinct scoring functions. For closed-source models lacking token-level logprob access, we develop and validate instruction-guided likelihood proxies. Our findings demonstrate that larger models consistently exhibit better uncertainty quantification; models that know more also know better what they don’t know. More certain models achieve higher accuracy, while mathematical and reasoning tasks elicit poorer uncertainty performance across all models compared to other domains. This work establishes a foundation for reliable uncertainty evaluation in multimodal systems.
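As a rough illustration of the conformal prediction setup the abstract refers to, here is a minimal sketch of split conformal prediction for a multiple-choice task. It assumes one common nonconformity score (1 minus the probability the model assigns to the correct option); the abstract does not name the three scoring functions the study uses, so this choice, along with the function names `conformal_threshold` and `prediction_set`, is an assumption made for illustration and not the authors' implementation.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold on held-out examples.

    cal_probs: list of per-option probability lists (one list per example).
    cal_labels: index of the correct option for each example.
    alpha: target error rate (sets should cover the truth ~1 - alpha of the time).
    """
    # Assumed nonconformity score: 1 - probability of the true option.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: fall back to the full set.
        return float("inf")
    return scores[k - 1]

def prediction_set(probs, qhat):
    """Return indices of all options whose score is within the threshold."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= qhat]

# Toy calibration data (hypothetical numbers, four options per question).
cal_probs = [
    [0.5, 0.25, 0.125, 0.125],
    [0.125, 0.75, 0.0625, 0.0625],
    [0.25, 0.25, 0.25, 0.25],
    [0.0625, 0.0625, 0.875, 0.0],
]
cal_labels = [0, 1, 3, 2]

qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
answer_set = prediction_set([0.5, 0.25, 0.125, 0.125], qhat)
```

Under this setup, a smaller average prediction set at the same coverage level indicates a more certain model, which is the sense in which set size can serve as an uncertainty measure.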

Original language: English
Title of host publication: 19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026
Publisher: Association for Computational Linguistics (ACL)
Pages: 5185-5201
Number of pages: 17
ISBN (Electronic): 9798891763869
Publication status: Published - 2026
Event: 19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026 - Rabat, Morocco
Duration: 24 Mar 2026 to 29 Mar 2026

Publication series

Name: 19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026

Conference

Conference: 19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026
Country/Territory: Morocco
City: Rabat
Period: 24/03/26 to 29/03/26
