Pruning and neural architectures redesigning for deep neural networks compression in mobiles: A review

Ibtihal Ferwana*, Soumaya Chaffar, Samir Brahim Belhaouari

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

Mobile applications have become ubiquitous in daily life. Given the success of Deep Neural Networks (DNNs) in image recognition tasks, DNNs are widely deployed in mobile phone applications. However, because of the limited memory and energy of mobile phones, DNN size and execution time remain roadblocks to efficient processing and instant inference. Many efforts have succeeded in compressing DNNs to sizes that allow efficient speed, energy, and memory consumption. This chapter discusses two areas of compression: pruning and the redesign of efficient neural architectures. For each area, recent advancements are reviewed, their strengths and limitations are highlighted, and the improvements brought by the selected methods are shown and compared. Comparisons are based on compression rate, inference time, and accuracy. The chapter aims to help practitioners who implement DNN-based mobile applications choose a compression approach that satisfies their requirements.
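As an illustration of the pruning approach mentioned in the abstract, the following minimal sketch applies L1-magnitude (unstructured) pruning to a single layer using PyTorch's torch.nn.utils.prune and reports the resulting weight sparsity, one ingredient of a compression-rate comparison. The layer, pruning amount, and sparsity report are illustrative assumptions and are not taken from the chapter itself.

    # Minimal sketch: L1-magnitude pruning of one convolutional layer (illustrative only).
    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    # A small example layer; any nn.Module with a "weight" parameter would work.
    conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)

    # Zero out the 30% of weights with the smallest absolute value (unstructured pruning).
    prune.l1_unstructured(conv, name="weight", amount=0.3)

    # Make the pruning permanent by removing the re-parametrization hooks.
    prune.remove(conv, "weight")

    # Report sparsity, a simple proxy for the compression rate discussed above.
    sparsity = float(torch.sum(conv.weight == 0)) / conv.weight.nelement()
    print(f"Weight sparsity: {sparsity:.2%}")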

Original language: English
Title of host publication: Integrating Machine Learning Into HPC-Based Simulations and Analytics
Publisher: IGI Global
Pages: 107-126
Number of pages: 20
ISBN (Electronic): 9781668437964
ISBN (Print): 9781668437957
DOIs
Publication status: Published - 13 Dec 2024
