This thesis investigates the integration of quantum computing into large-scale vision
foundation models, with a particular focus on prompt-driven segmentation. The Seg-
ment Anything Model (SAM), developed by Meta AI, represents a recent breakthrough
in general-purpose segmentation, capable of producing high-quality masks from sim-
ple prompts (such as points or bounding boxes) without the need for retraining. In
parallel, quantum machine learning (QML) has shown increasing potential to improve
high-dimensional data processing by leveraging principles such as superposition and
entanglement. However, quantum approaches to segmentation have so far been lim-
ited to small-scale, task-specific models, leaving their role in general-purpose segmen-
tation largely unexplored. To address this gap, we propose Q-SAM, the first hybrid
quantum-classical variant of the Segment Anything Model. Unlike prior SAM ex-
tensions, which remain entirely classical, Q-SAM incorporates a lightweight quantum
adapter into SAM’s Vision Transformer (ViT) encoder while preserving the model’s
pretrained weights and zero-shot prompt-based structure. This adapter enables quantum
feature transformations to be introduced non-invasively, allowing the model to benefit
from quantum processing without compromising its original segmentation capabilities.
We evaluate Q-SAM through a series of controlled experiments using a highly cus-
tomizable, procedurally generated dataset designed to simulate a range of segmentation
scenarios under constrained conditions. The results show that Q-SAM consistently outperforms classical SAM in low-resolution and low-data settings, delivering measurable gains in precisely the regimes where classical models degrade. Although classical SAM retains an edge at higher resolutions and dataset sizes, Q-SAM remains competitive and
stable, confirming the feasibility of integrating quantum components into large-scale
pretrained vision systems. This thesis provides the first demonstration of quantum-
enhanced segmentation within a foundation model and opens promising research directions in quantum-vision integration, including increased qubit capacity, more expressive quantum layers, and applications to real-world data.
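The quantum adapter described above can be illustrated with a small parameterized circuit. The following is a minimal NumPy statevector simulation; the specific circuit layout (RY angle encoding of features, a ring of CNOTs for entanglement, and a trainable RY layer read out via per-wire Pauli-Z expectations) is an illustrative assumption, not the thesis's exact design:

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])
P0 = np.array([[1., 0.], [0., 0.]])  # projector |0><0|
P1 = np.array([[0., 0.], [0., 1.]])  # projector |1><1|

def ry(theta):
    """Single-qubit Y-rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def kron_all(ops):
    """Kronecker product of a list of single-qubit operators."""
    out = ops[0]
    for op in ops[1:]:
        out = np.kron(out, op)
    return out

def one_qubit(gate, wire, n):
    """Embed a single-qubit gate acting on `wire` into an n-qubit operator."""
    ops = [I2] * n
    ops[wire] = gate
    return kron_all(ops)

def cnot(control, target, n):
    """n-qubit CNOT built from projectors on the control wire."""
    ops0 = [I2] * n; ops0[control] = P0
    ops1 = [I2] * n; ops1[control] = P1; ops1[target] = X
    return kron_all(ops0) + kron_all(ops1)

def quantum_adapter(features, weights):
    """Angle-encode features, entangle, rotate, and read out <Z> per wire."""
    n = len(features)
    state = np.zeros(2 ** n); state[0] = 1.0   # start in |0...0>
    for w, f in enumerate(features):           # data encoding: RY(f_w)
        state = one_qubit(ry(f), w, n) @ state
    for w in range(n):                         # ring of CNOTs for entanglement
        state = cnot(w, (w + 1) % n, n) @ state
    for w, th in enumerate(weights):           # trainable rotation layer
        state = one_qubit(ry(th), w, n) @ state
    return np.array([state @ one_qubit(Z, w, n) @ state for w in range(n)])
```

In a hybrid setting, a layer of this kind would map a low-dimensional slice of a ViT token embedding through the circuit and back, with `weights` trained by backpropagation through a differentiable simulator while the frozen pretrained encoder weights remain untouched.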
| Date of Award | 2025 |
|---|---|
| Original language | American English |
| Awarding Institution | HBKU College of Science and Engineering |
Q-SAM: A HYBRID QUANTUM-CLASSICAL SEGMENT ANYTHING MODEL FOR SHAPE-BASED IMAGE SEGMENTATION
Al-Marri, M. (Author). 2025
Student thesis: Master's Dissertation