TY - JOUR
T1 - Dual-stage segmentation and classification framework for skin lesion analysis using deep neural network
AU - Manzoor, Khadija
AU - Gilal, Nauman U.
AU - Agus, Marco
AU - Schneider, Jens
N1 - Publisher Copyright:
© The Author(s) 2025. This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
PY - 2025/7/13
Y1 - 2025/7/13
AB - Objective: Skin diseases, caused by various pathogens including bacteria, viruses, and fungi, are prevalent globally and significantly affect patients’ physical, emotional, and social well-being. Early and accurate detection of such conditions is critical to prevent progression, especially in cases of malignant skin lesions. This study aims to develop a dual-stage deep learning framework for the segmentation and classification of skin lesions, addressing challenges such as imbalanced data, lesion variability, and low contrast. Methods: We propose a two-phase framework: (i) precise instance segmentation using a U-Net with a Visual Geometry Group (VGG16) encoder to isolate skin lesions and (ii) classification using EfficientFormer and SwiftFormer networks to evaluate performance on both balanced and imbalanced datasets. Experiments were conducted on three benchmark datasets: Human Against Machine with 10,000 training images (HAM10000), International Skin Imaging Collaboration (ISIC) 2018, and the newly released ISIC 2024 SLICE-3D dataset. For SLICE-3D, we evaluated both tabular-only and image + metadata fusion approaches, using an XGBoost classifier and a ResNet-based classifier, respectively. Results: On the balanced HAM10000 dataset, EfficientFormerV2 achieved 97.11% accuracy, a 97.14% F1-score, 96.85% sensitivity, and 96.70% specificity. On the ISIC 2018 dataset, the segmentation model achieved 97.59% accuracy, an 89.12% Jaccard index, and a 94.24% Dice similarity coefficient. For the ISIC 2024 SLICE-3D challenge, the tabular-only XGBoost classifier achieved a partial area under the receiver operating characteristic curve score of 0.16752, while the image + tabular fusion model achieved a score of 0.15792 using ResNet, demonstrating competitive performance in a highly imbalanced and clinically realistic setting. Conclusion: The proposed dual-stage deep learning framework demonstrates high accuracy and robustness across segmentation and classification tasks on diverse datasets. Its adaptability to large-scale, non-dermoscopic data such as SLICE-3D confirms its potential for deployment in real-world skin cancer triage and teledermatology applications.
KW - Deep learning
KW - Image augmentation
KW - Skin cancer
KW - Skin disease classification
KW - Skin lesion segmentation
UR - https://www.scopus.com/pages/publications/105012501581
U2 - 10.1177/20552076251351858
DO - 10.1177/20552076251351858
M3 - Article
AN - SCOPUS:105012501581
SN - 2055-2076
VL - 11
JO - Digital Health
JF - Digital Health
M1 - 20552076251351858
ER -