TY - GEN
T1 - Multi-scale Dynamic Network for Document Shadow Removal
AU - Li, Jiarui
AU - Ma, Jiaqi
AU - Xiao, Zeyu
AU - Zhuang, Ziyi
AU - Lu, Zhihe
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/12/8
Y1 - 2025/12/8
N2 - Document clarity is pivotal for reliable information transmission, yet shadows remain a pervasive degradation that harms visual quality and downstream processing efficiency. Existing approaches often falter under extreme illumination, struggle with multi-scale shadow patterns, and exhibit limited generalization across document types and capture conditions. We present an end-to-end multi-modal, multi-scale framework that integrates advanced architectures with dynamic feature fusion to adaptively suppress shadows of varying sizes while preserving fine textual details. Concretely, we introduce a Multi-scale Dynamic Network (MDN) that performs scale-aware gating and cross-branch aggregation, enabling the model to emphasize informative cues and attenuate shadow bias at each resolution level. The pipeline is streamlined to reduce manual pre-/post-processing, thereby improving both efficiency and effectiveness in practical document workflows. Experimental results show consistent gains over traditional and deep learning baselines on the Jung and Kligler datasets, confirming superior accuracy and robustness under challenging lighting.
KW - Deep Learning
KW - Document Shadow Removal
KW - Image Processing
KW - Image Shadow Removal
UR - https://www.scopus.com/pages/publications/105024963045
U2 - 10.1145/3769748.3773342
DO - 10.1145/3769748.3773342
M3 - Conference contribution
AN - SCOPUS:105024963045
T3 - Workshop Proceedings of the 7th ACM International Conference on Multimedia in Asia, MMAsia 2025 Workshops
BT - Workshop Proceedings of the 7th ACM International Conference on Multimedia in Asia, MMAsia 2025 Workshops
A2 - Chua, Tat-Seng
A2 - Wong, Lai-Kuan
A2 - Chan, Chee Seng
A2 - Tang, Jinhui
A2 - Ngo, Chong-Wah
A2 - Schoeffmann, Klaus
A2 - Liu, Jiaying
A2 - Ho, Yo-Sung
PB - Association for Computing Machinery, Inc
T2 - 7th ACM International Conference on Multimedia in Asia, MMAsia 2025 Workshops
Y2 - 9 December 2025 through 12 December 2025
ER -