Generative AI, particularly text-to-image models, has rapidly transformed content creation, but these models often introduce biases, misinterpret linguistic ambiguity, and lack cultural sensitivity, leading to distorted or underrepresented identities. This dissertation addresses these gaps by reviewing the literature to identify key challenges and by introducing methods that enhance cultural relevance through improved prompt engineering, ambiguity resolution, and cultural assessment.
First, it examines linguistic ambiguity in text-to-image models, highlighting their struggles with polysemy, syntactic complexity, and figurative language. To evaluate these issues, it introduces the Visual Linguistic Ambiguity Benchmark (V-LAB), which is used to identify failure modes and to propose mitigation guidelines for multimodal AI systems.
Second, the dissertation develops prompt augmentation techniques, using large language models to refine prompts for better cultural alignment, particularly for Arabic content. Structured prompt engineering is shown to produce more accurate, diverse, and culturally rich imagery, addressing cultural distortions.
The work also advances cultural evaluation in generative models by introducing the Cultural Relevance Index (CRI), a metric combining human cultural perceptions with vision-language models. CRI is extended into CRIX by incorporating context analysis and visual question answering, with extensive evaluations confirming CRIX’s superior performance. Image similarity metrics are also developed to quantify cultural distance and representation quality.
Finally, applied frameworks like CulEval and Ara-Pic integrate these solutions. CulEval combines prompt augmentation with CRIX to assess cultural relevance before and after prompt refinement, while Ara-Pic iteratively refines prompts to enhance Arabic cultural representation. Together, they bridge gaps in cultural representation, promoting more inclusive
| Date of Award | 2025 |
|---|---|
| Original language | American English |
| Awarding Institution | HBKU College of Science and Engineering |
- Cultural Analysis
- Generative models
- Model Evaluation
- Text-to-image models
Bridging AI and Culture: Enhancing and Evaluating Culture Representation in AI-Generated Images
Elsharif, W. (Author). 2025
Student thesis: Doctoral Dissertation