Q-SAM: A HYBRID QUANTUM-CLASSICAL SEGMENT ANYTHING MODEL FOR SHAPE-BASED IMAGE SEGMENTATION

  • Maryam Al-Marri

Student thesis: Master's Dissertation

Abstract

This thesis investigates the integration of quantum computing into large-scale vision foundation models, with a particular focus on prompt-driven segmentation. The Segment Anything Model (SAM), developed by Meta AI, represents a recent breakthrough in general-purpose segmentation, capable of producing high-quality masks from simple prompts (such as points or bounding boxes) without the need for retraining. In parallel, quantum machine learning (QML) has shown increasing potential to improve high-dimensional data processing by leveraging principles such as superposition and entanglement. However, quantum approaches to segmentation have so far been limited to small-scale, task-specific models, leaving their role in general-purpose segmentation largely unexplored.

To address this gap, we propose Q-SAM, the first hybrid quantum-classical variant of the Segment Anything Model. Unlike prior SAM extensions, which remain entirely classical, Q-SAM incorporates a lightweight quantum adapter into SAM's Vision Transformer (ViT) encoder while preserving the model's pretrained weights and zero-shot prompt-based structure. The adapter introduces quantum feature transformations non-invasively, allowing the model to benefit from quantum processing without compromising its original segmentation capabilities.

We evaluate Q-SAM through a series of controlled experiments on a highly customizable, procedurally generated dataset designed to simulate a range of segmentation scenarios under constrained conditions. The results show that Q-SAM consistently outperforms classical SAM in low-resolution and low-data settings, demonstrating measurable gains where classical models typically degrade. Although classical SAM retains an edge at higher resolutions and larger dataset sizes, Q-SAM remains competitive and stable, confirming the feasibility of integrating quantum components into large-scale pretrained vision systems. This thesis provides the first demonstration of quantum-enhanced segmentation within a foundation model and opens promising research directions in quantum-vision integration, including increased qubit capacity, more expressive quantum layers, and application to real-world data.
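The kind of quantum adapter described above can be sketched in miniature with a plain-NumPy state-vector simulation, with no quantum SDK required. The circuit shown here (RY angle encoding of features, a CNOT entangling ring, a trainable RY layer, Pauli-Z readout, and a residual connection back into the feature vector) is an illustrative assumption about what such an adapter might look like, not the thesis's actual architecture; all names are hypothetical.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_single(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit of an n-qubit state vector."""
    state = state.reshape([2] * n)
    state = np.tensordot(gate, state, axes=([1], [qubit]))
    state = np.moveaxis(state, 0, qubit)
    return state.reshape(-1)

def apply_cnot(state, control, target, n):
    """Swap the target qubit's amplitudes wherever the control qubit is 1."""
    state = state.reshape([2] * n).copy()
    idx0 = [slice(None)] * n
    idx1 = [slice(None)] * n
    idx0[control], idx0[target] = 1, 0
    idx1[control], idx1[target] = 1, 1
    tmp = state[tuple(idx0)].copy()
    state[tuple(idx0)] = state[tuple(idx1)]
    state[tuple(idx1)] = tmp
    return state.reshape(-1)

def quantum_adapter(features, weights):
    """Angle-encode `features` as RY rotations, entangle with a CNOT ring,
    apply a trainable RY layer, and return per-qubit <Z> expectations."""
    n = len(features)
    state = np.zeros(2 ** n)
    state[0] = 1.0                              # start in |0...0>
    for q, f in enumerate(features):            # data-encoding layer
        state = apply_single(state, ry(f), q, n)
    for q in range(n):                          # entangling CNOT ring
        state = apply_cnot(state, q, (q + 1) % n, n)
    for q, w in enumerate(weights):             # trainable rotation layer
        state = apply_single(state, ry(w), q, n)
    probs = np.abs(state) ** 2
    # <Z_q> = sum of |amplitude|^2 weighted +1 (bit 0) / -1 (bit 1)
    exps = []
    for q in range(n):
        bits = (np.arange(2 ** n) >> (n - 1 - q)) & 1
        exps.append(np.sum(probs * (1 - 2 * bits)))
    return np.array(exps)

# Residual use, mimicking a non-invasive adapter on a (tiny) feature vector:
x = np.array([0.3, -0.7, 1.1])
x_adapted = x + 0.1 * quantum_adapter(x, np.zeros(3))
```

In a real hybrid model the expectation values would be differentiated through (e.g. via the parameter-shift rule) so `weights` can be trained jointly with the classical layers, while the frozen SAM encoder weights are left untouched; an actual implementation would also need to map the ViT's high-dimensional tokens down to the few available qubits and back.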
Date of Award: 2025
Original language: American English
Awarding Institution
  • HBKU College of Science and Engineering

Keywords

  • None
