We introduce Model Surgery, a novel approach for optimizing Deep Neural Network (DNN) models for efficient inference on resource-constrained embedded processors. Model Surgery tackles the challenge of deploying complex DNN models on edge devices by selectively pruning or replacing computationally expensive layers with more efficient alternatives. We examine the removal or substitution of layers such as Squeeze-and-Excitation, SiLU, Swish, HSwish, GeLU, and the Focus layer to create lightweight ``lite'' models. These lite models are then trained using standard training scripts for optimal performance. The benefits of Model Surgery are showcased through the development of several lite models that execute efficiently on the hardware accelerators of commonly used embedded processors. To quantify the effectiveness of Model Surgery, we compare the accuracy and inference time of the original and lite models by training and evaluating them on the ImageNet-1K and COCO datasets. Our results suggest that Model Surgery can significantly enhance the applicability and efficiency of DNN models in edge-computing scenarios, paving the way for broader deployment on low-power devices. The source code for Model Surgery is publicly available as part of our model optimization toolkit at https://github.com/TexasInstruments/edgeai-modeloptimization/tree/main/torchmodelopt.
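To make the core idea concrete, the following is a minimal PyTorch sketch of the layer-replacement step, not the toolkit's actual API (see the linked repository for the real implementation). The function name replace_modules and the activation mapping are illustrative assumptions; the sketch walks a model's module tree and swaps expensive activations for ReLU, which is natively supported by most embedded hardware accelerators.

import torch.nn as nn
import torchvision

def replace_modules(model: nn.Module, mapping: dict) -> nn.Module:
    # Recursively walk the module tree and swap any module whose type
    # appears in `mapping` for a freshly constructed, cheaper alternative.
    for name, child in model.named_children():
        if type(child) in mapping:
            setattr(model, name, mapping[type(child)]())
        else:
            replace_modules(child, mapping)
    return model

# Hypothetical example: replace SiLU/GELU/Hardswish activations with ReLU.
activation_mapping = {nn.SiLU: nn.ReLU, nn.GELU: nn.ReLU, nn.Hardswish: nn.ReLU}

model = torchvision.models.mobilenet_v3_small(weights=None)
lite_model = replace_modules(model, activation_mapping)
# The lite model is then trained with the standard training recipe,
# as described above, to recover accuracy.

Because the substitution happens before training rather than on a trained checkpoint, the lite model needs no special accuracy-recovery procedure; it is simply trained with the same scripts as the original model.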
Kunal Ranjan Patel, Parakh Agarwal, Manu Mathew, Arthur Redfern, Debapriya Maji, Kumar Desappan, Pramod Swami, Do-Kyoung Kwon, "Model Surgery: Run any Neural Network on Embedded Processors," in Electronic Imaging, 2024, pp. 322-1 to 322-4, https://doi.org/10.2352/EI.2024.36.3.MOBMU-322