
Low rank approximations for neural networks

Shaposhnikova N.V.

UDC 004.89
DOI: 10.26102/2310-6018/2020.30.3.018

Abstract

Today, artificial neural networks (hereinafter ANNs) and deep learning have become almost indispensable in applications such as computer vision, machine translation, speech-to-text conversion, text categorization, and video processing. However, despite a number of classical theorems substantiating the approximating capabilities of neural network structures, current successes in the field of ANNs are in most cases tied to the heuristic construction of a network architecture applicable only to the specific problem at hand. Moreover, deep ANNs have millions of parameters and require powerful computing hardware, which limits their applicability, for example, on mobile devices. Significant progress on both problems can be achieved with modern, powerful algorithms for low-rank approximation of the parameters of ANN layers, which both simplifies the design of the network architecture and leads to substantial compression and faster training of deep ANNs. Treating, for example, the kernel of a convolutional layer as a four-dimensional array (tensor), one can construct a low-rank approximation for it together with an efficient implementation of its convolution with a vector (forward signal propagation when generating a prediction) and of differentiation with respect to the parameters (backward signal propagation during training). In this paper, we review the modern machine learning paradigm and low-rank tensor approximations, and we demonstrate the prospects of tensorizing deep ANNs on a model numerical example corresponding to the task of automatic recognition of handwritten digits.
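As a rough illustration of the idea outlined in the abstract, the sketch below shows one common way to impose a low-rank, CP-style structure on a convolutional kernel in PyTorch, in the spirit of the approach of Lebedev et al. [8] and the PARAFAC/CP model [14]. It is a minimal sketch under stated assumptions, not the implementation from the paper: the class name CPConv2d, the rank value, and the channel sizes are illustrative choices. The dense C_out × C_in × K × K kernel (a four-dimensional tensor) is replaced by a chain of four small convolutions, so both the forward pass and backpropagation touch only the low-rank factors.

```python
import torch
import torch.nn as nn


class CPConv2d(nn.Module):
    """Rank-R, CP-style factorization of a KxK convolution (cf. [8], [14]).

    The dense C_out x C_in x K x K kernel is replaced by four small
    convolutions, so the layer stores O(R*(C_in + C_out + 2K)) parameters
    instead of O(C_in * C_out * K^2).
    """

    def __init__(self, in_channels, out_channels, kernel_size, rank, padding=0):
        super().__init__()
        self.seq = nn.Sequential(
            # 1x1 convolution: project the input channels onto a rank-R subspace
            nn.Conv2d(in_channels, rank, kernel_size=1, bias=False),
            # Kx1 depthwise convolution along the vertical axis
            nn.Conv2d(rank, rank, kernel_size=(kernel_size, 1),
                      padding=(padding, 0), groups=rank, bias=False),
            # 1xK depthwise convolution along the horizontal axis
            nn.Conv2d(rank, rank, kernel_size=(1, kernel_size),
                      padding=(0, padding), groups=rank, bias=False),
            # 1x1 convolution: expand back to the output channels
            nn.Conv2d(rank, out_channels, kernel_size=1, bias=True),
        )

    def forward(self, x):
        return self.seq(x)


if __name__ == "__main__":
    # Hypothetical second layer of a small MNIST-style network: 32 -> 64 channels.
    low_rank = CPConv2d(32, 64, kernel_size=5, rank=8, padding=2)
    dense = nn.Conv2d(32, 64, kernel_size=5, padding=2)

    x = torch.randn(16, 32, 28, 28)
    print(low_rank(x).shape)  # torch.Size([16, 64, 28, 28]), same as dense(x).shape
    print(sum(p.numel() for p in low_rank.parameters()),   # on the order of 1e3 parameters
          sum(p.numel() for p in dense.parameters()))      # on the order of 5e4 parameters
```

Because the factor convolutions are ordinary PyTorch modules, gradients with respect to the small factors are obtained by standard backpropagation [12]; alternatively, a pretrained dense kernel can be decomposed (e.g., via the PARAFAC/CP model [14]) and then fine-tuned, as in [8].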

List of references

1. LeCun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521(7553):436-444.

2. Zhang C., Patras P., Haddadi H. Deep learning in mobile and wireless networking: A survey. IEEE Communications Surveys & Tutorials. 2019;21(3):2224-2287.

3. Zhao Z., Zheng P., Xu S. Object detection with deep learning: A review. IEEE Transactions on Neural Networks and Learning Systems. 2019;30(11):3212-3232.

4. Cybenko G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Systems. 1989;2(4):303–314.

5. Hornik K. Approximation capabilities of multilayer feedforward networks. Neural Networks. 1991;4(2):251–257.

6. Cohen N., Sharir O., Shashua A. On the expressive power of deep learning: a tensor analysis. arXiv preprint. 2015;arXiv:1509.05009.

7. Cichocki A. Tensor networks for dimensionality reduction and large-scale optimization. Foundations and Trends in Machine Learning. 2016;9(4-5).

8. Lebedev V. Speeding-up convolutional neural networks using fine-tuned CP-decomposition. arXiv preprint. 2014;arXiv:1412.6553.

9. Novikov A., Podoprikhin D., Osokin A., Vetrov D. Tensorizing neural networks. In: Advances in Neural Information Processing Systems. 2015;442-450.

10. Deng L. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine. 2012;29(6):141-142.

11. Bottou L. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT’2010. 2010;177–186.

12. Rumelhart D., Hinton G., Williams R. Learning representations by back-propagating errors. Nature. 1986;323(6088):533–538.

13. Grasedyck L., Kressner D., Tobler C. A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen. 2013;36(1):53-78.

14. Harshman R. Foundations of the PARAFAC procedure: Models and conditions for an explanatory multimodal factor analysis. UCLA Working Papers in Phonetics. 1970;1–84.

15. Tucker L. Some mathematical notes on three-mode factor analysis. Psychometrika. 1966;31:279–311.

16. PyTorch, machine learning framework [Electronic resource]. – Mode of access: https://pytorch.org – Date of access: 10.06.2020

17. Colab, interactive cloud framework [Electronic resource]. – Mode of access: https://colab.research.google.com – Date of access: 10.06.2020

About authors

Shaposhnikova Nina V.

Email: shapninel@gmail.com


Federal State Budgetary Educational Institution of Higher Education «Reshetnev Siberian State University of Science and Technology»

Krasnoyarsk, Russian Federation

Keywords: machine learning, neural network, deep convolutional network, low rank approximation

For citation: Shaposhnikova N.V. Low rank approximations for neural networks. Modeling, Optimization and Information Technology. 2020;8(3). Available from: https://moit.vivt.ru/wp-content/uploads/2020/08/Shaposhnikova_3_20_1.pdf DOI: 10.26102/2310-6018/2020.30.3.018 (In Russ.)
