TY - GEN
T1 - ArtInsight
T2 - 35th Medical Informatics Europe Conference, MIE 2025
AU - Shah, Uzair
AU - Khan, Naseem
AU - Alzubaidi, Mahmood
AU - Agus, Marco
AU - Househ, Mowafa
N1 - Publisher Copyright: © 2025 The Authors.
PY - 2025/5/15
Y1 - 2025/5/15
N2 - Recent advancements in multimodal image-to-text models have greatly enhanced the interpretation of children's drawings for emotional understanding. This paper introduces a framework that analyzes these drawings and generates detailed reports fully automatically, covering art descriptions, emotional themes, assessments, and personalized recommendations. Our approach involved annotating 5,000 images with a Large Language Model (ChatGPT) and fine-tuning the BLIP (Bootstrapping Language-Image Pre-training) multimodal model. We performed fine-tuning in two steps: 1) we applied Low-Rank Adaptation (LoRA) to the image encoder to preserve its pre-trained features while adapting it to our task, and 2) we refined the text decoder to capture the language patterns needed for comprehensive assessments. The system takes children's artwork as input and uses multimodal image-to-text techniques to derive meaningful insights. Although these reports are initial evaluations rather than formal clinical assessments, they provide a valuable starting point for understanding children's emotional and psychological states. This tool can assist art therapists, educators, and parents in gaining a deeper understanding of children's inner worlds. Our research highlights the intersection of artificial intelligence and child psychology, showing how technology can complement human expertise in nurturing children's emotional well-being. By offering a structured, AI-driven analysis of children's drawings, this framework creates new opportunities for early intervention, personalized support, and enhanced communication between children and their caregivers. The impact of this work may extend beyond individual assessments, potentially informing broader strategies in child development, art therapy, and educational practices.
AB - Recent advancements in multimodal image-to-text models have greatly enhanced the interpretation of children's drawings for emotional understanding. This paper introduces a framework that analyzes these drawings and generates detailed reports fully automatically, covering art descriptions, emotional themes, assessments, and personalized recommendations. Our approach involved annotating 5,000 images with a Large Language Model (ChatGPT) and fine-tuning the BLIP (Bootstrapping Language-Image Pre-training) multimodal model. We performed fine-tuning in two steps: 1) we applied Low-Rank Adaptation (LoRA) to the image encoder to preserve its pre-trained features while adapting it to our task, and 2) we refined the text decoder to capture the language patterns needed for comprehensive assessments. The system takes children's artwork as input and uses multimodal image-to-text techniques to derive meaningful insights. Although these reports are initial evaluations rather than formal clinical assessments, they provide a valuable starting point for understanding children's emotional and psychological states. This tool can assist art therapists, educators, and parents in gaining a deeper understanding of children's inner worlds. Our research highlights the intersection of artificial intelligence and child psychology, showing how technology can complement human expertise in nurturing children's emotional well-being. By offering a structured, AI-driven analysis of children's drawings, this framework creates new opportunities for early intervention, personalized support, and enhanced communication between children and their caregivers. The impact of this work may extend beyond individual assessments, potentially informing broader strategies in child development, art therapy, and educational practices.
KW - Art Therapy
KW - Children's Drawings
KW - Emotional Assessment
KW - Image-to-Text Models
UR - https://www.scopus.com/pages/publications/105005823239
U2 - 10.3233/SHTI250471
DO - 10.3233/SHTI250471
M3 - Conference contribution
C2 - 40380579
AN - SCOPUS:105005823239
T3 - Studies in Health Technology and Informatics
SP - 808
EP - 812
BT - Intelligent Health Systems - From Technology to Data and Knowledge, Proceedings of MIE 2025
A2 - Andrikopoulou, Elisavet
A2 - Gallos, Parisis
A2 - Arvanitis, Theodoros N.
A2 - Austin, Rosalynn
A2 - Benis, Arriel
A2 - Cornet, Ronald
A2 - Chatzistergos, Panagiotis
A2 - Dejaco, Alexander
A2 - Dusseljee-Peute, Linda
A2 - Mohasseb, Alaa
A2 - Natsiavas, Pantelis
A2 - Nakkas, Haythem
A2 - Scott, Philip
PB - IOS Press BV
Y2 - 19 May 2025 through 21 May 2025
ER -