Deep Neural Networks Optimization for Embedded Devices


Details
In this deep learning meetup, we will review the evolution of DNN models from the pursuit of higher accuracy toward optimized inference, driven by the requirements of embedded devices.
We will discuss the various hardware accelerator options, the emergence of dedicated benchmarks, and the need for system performance analysis.
We will explore state-of-the-art approaches and principles for architecture optimization, along with various techniques to speed up inference on the target device, with concrete examples from a speech command application.
The presentation is based on material delivered at the Design Automation Conference 2018.
References:
[1] Vivienne S., Yu-Hsin C., Tien-Ju Y. & Joel E. Efficient Processing of Deep Neural Networks: A Tutorial and Survey, IEEE 2017, https://arxiv.org/abs/1703.09039
[2] Yu C., Duo W., Pan Z. & Tao Z. A Survey of Model Compression and Acceleration for Deep Neural Networks, IEEE 2017, https://arxiv.org/abs/1710.09282v5
[3] Barret Z., Vijay V., Jonathon S. & Quoc V.L. Learning Transferable Architectures for Scalable Image Recognition, 2018, https://arxiv.org/abs/1707.07012
[4] Esteban R., Sherry M., Andrew S., Saurabh S., Yutaka L. S., Jie T., Quoc L. & Alex K. Large-Scale Evolution of Image Classifiers, ICML 2017, https://arxiv.org/abs/1703.010
[5] Darryl D. L., Sachin S. T. & V. Sreekanth A., Fixed Point Quantization of Deep Convolutional Networks, ICML 2016, https://arxiv.org/abs/1511.06393
[6] Song H., Huizi M. & William J. D., Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR 2016, https://arxiv.org/abs/1510.00149
[7] Antonio P., Razvan P. & Dan A., Model compression via distillation and quantization, ICLR 2018, https://arxiv.org/abs/1802.05668
[8] Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi, XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, https://arxiv.org/abs/1603.05279
[9] Yundong Zhang, Naveen Suda, Liangzhen Lai, Vikas Chandra, Hello Edge: Keyword Spotting on Microcontrollers, 2018, https://arxiv.org/abs/1711.07128
[10] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen, MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, https://arxiv.org/abs/1801.04381
[11] Alexander Wong, Mohammad Javad Shafiee, Michael St. Jules, muNet: A Highly Compact Deep Convolutional Neural Network Architecture for Real-time Embedded Traffic Sign Classification, 2018, https://arxiv.org/abs/1804.00497
[12] Jonathan Huang, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, Kevin Murphy, Speed/accuracy trade-offs for modern convolutional object detectors, CVPR 2017, https://arxiv.org/abs/1611.10012
[13] Alexander Wong, Mohammad Javad Shafiee, Francis Li, Brendan Chwyl, Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection, 2018, https://arxiv.org/abs/1802.06488
