TY - CHAP
T1 - Pruning and neural architectures redesigning for deep neural networks compression in mobiles
T2 - A review
AU - Ferwana, Ibtihal
AU - Chaffar, Soumaya
AU - Belhaouari, Samir Brahim
N1 - Publisher Copyright:
© 2025, IGI Global Scientific Publishing.
PY - 2024/12/13
Y1 - 2024/12/13
N2 - Mobile applications have become ubiquitous in our daily lives. Given the success of Deep Neural Networks (DNNs) in image recognition tasks, DNNs are widely deployed in mobile phone applications. Due to the limited memory and energy of mobile phones, DNN size and execution time remain roadblocks to efficient processing and instant inference. Many transformative efforts have compressed DNNs to the desired size for efficient speed, energy, and memory consumption. This chapter discusses two areas of compression: pruning and the redesign of efficient neural architectures. For each, recent advancements are reviewed, their strengths and limitations are highlighted, and the improvements brought by selected methods are shown and compared. Comparisons are based on compression rate, inference time, and accuracy. This chapter aims to help practitioners implementing DNN-based mobile applications choose a compression approach that satisfies their requirements.
AB - Mobile applications have become ubiquitous in our daily lives. Given the success of Deep Neural Networks (DNNs) in image recognition tasks, DNNs are widely deployed in mobile phone applications. Due to the limited memory and energy of mobile phones, DNN size and execution time remain roadblocks to efficient processing and instant inference. Many transformative efforts have compressed DNNs to the desired size for efficient speed, energy, and memory consumption. This chapter discusses two areas of compression: pruning and the redesign of efficient neural architectures. For each, recent advancements are reviewed, their strengths and limitations are highlighted, and the improvements brought by selected methods are shown and compared. Comparisons are based on compression rate, inference time, and accuracy. This chapter aims to help practitioners implementing DNN-based mobile applications choose a compression approach that satisfies their requirements.
UR - https://www.scopus.com/pages/publications/105009395093
U2 - 10.4018/978-1-6684-3795-7.ch005
DO - 10.4018/978-1-6684-3795-7.ch005
M3 - Chapter
AN - SCOPUS:105009395093
SN - 9781668437957
SP - 107
EP - 126
BT - Integrating Machine Learning Into HPC-Based Simulations and Analytics
PB - IGI Global
ER -