ArtInsight: A Multimodal AI Framework for Interpreting Children's Drawings and Enhancing Emotional Understanding

Uzair Shah, Naseem Khan, Mahmood Alzubaidi, Marco Agus, Mowafa Househ

Research output: Contribution to journal › Article › peer-review

Abstract

Recent advancements in multimodal image-to-text models have greatly enhanced the interpretation of children's drawings for emotional understanding. This paper introduces a framework that analyzes these drawings and automatically generates detailed reports covering art descriptions, emotional themes, assessments, and personalized recommendations. Our approach involved annotating 5,000 images with a Large Language Model (ChatGPT) and fine-tuning the BLIP (Bootstrapping Language-Image Pre-training) multimodal model on the annotated data. We performed fine-tuning in two steps: 1) we applied Low-Rank Adaptation (LoRA) to the image encoder to preserve its pre-trained features while adapting it to our task, and 2) we refined the text decoder to capture the language patterns needed for comprehensive assessments. The system takes a child's artwork as input and uses multimodal image-to-text techniques to derive meaningful insights. Although these reports are initial evaluations rather than formal clinical assessments, they provide a valuable starting point for understanding children's emotional and psychological states. This tool can assist art therapists, educators, and parents in gaining a deeper understanding of children's inner worlds. Our research highlights the intersection of artificial intelligence and child psychology, showing how technology can complement human expertise in nurturing children's emotional well-being. By offering a structured, AI-driven analysis of children's drawings, this framework creates new opportunities for early intervention, personalized support, and enhanced communication between children and their caregivers. The impact of this work may extend beyond individual assessments, potentially informing broader strategies in child development, art therapy, and educational practices.
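The first fine-tuning step relies on the general LoRA idea: the pre-trained weight matrix stays frozen and only a low-rank update is trained. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The class name `LoRALinear`, the helper `matvec`, and the `rank` and `alpha` values are illustrative assumptions, and plain lists stand in for the tensors a real BLIP image encoder would use.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Linear layer with a frozen weight W and a trainable low-rank update.

    Forward pass: y = W x + (alpha / rank) * B (A x)
    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # frozen pre-trained weight, never updated
        d_out, d_in = len(weight), len(weight[0])
        self.scale = alpha / rank
        # A starts small and random, B starts at zero, so the layer
        # initially reproduces the pre-trained output exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Count only the entries of A and B; W is frozen."""
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)


if __name__ == "__main__":
    w = [[float(i + j) for j in range(8)] for i in range(8)]
    layer = LoRALinear(w, rank=2, alpha=4.0)
    x = [1.0] * 8
    print(layer.forward(x) == matvec(w, x))   # unchanged at initialization
    print(layer.trainable_params(), "trainable vs", 8 * 8, "frozen")
```

Because `B` is initialized to zero, the adapted layer starts out identical to the pre-trained one, which is one way to read the abstract's remark that LoRA preserves the encoder's pre-trained features while adapting it to the task.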

Original language: English
Pages (from-to): 808-812
Number of pages: 5
Journal: Studies in Health Technology and Informatics
Volume: 327
Publication status: Published - 15 May 2025

Keywords

  • Art Therapy
  • Children’s Drawings
  • Emotional Assessment
  • Image-to-Text Models
