Reducing the Memory Cost of Training Convolutional Neural Networks by CPU Offloading

PP. 307-320
DOI: 10.4236/jsea.2019.128019

ABSTRACT

In recent years, Convolutional Neural Networks (CNNs) have enabled unprecedented progress on a wide range of computer vision tasks. However, training large CNNs is a resource-intensive task that requires specialized Graphics Processing Units (GPUs) and highly optimized implementations to get optimal performance from the hardware. GPU memory is a major bottleneck of the CNN training procedure, limiting the size of both inputs and model architectures. In this paper, we propose to alleviate this memory bottleneck by leveraging an under-utilized resource of modern systems: the device-to-host bandwidth. Our method, termed CPU offloading, works by transferring hidden activations to the CPU upon computation, in order to free GPU memory for upstream layer computations during the forward pass. These activations are then transferred back to the GPU as needed by the gradient computations of the backward pass. The key challenge of our method is to efficiently overlap data transfers with computations in order to minimize the wall time overhead induced by the additional data transfers. On a typical workstation with a Nvidia Titan X GPU, we show that our method compares favorably to gradient checkpointing: we are able to reduce the memory consumption of training a VGG19 model by 35% with a minimal additional wall time overhead of 21%. Further experiments detail the impact of the different optimization tricks we propose. Our method is orthogonal to other memory reduction techniques such as quantization and sparsification, so they can easily be combined for further optimizations.
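The forward/backward offloading scheme described in the abstract can be sketched with PyTorch's saved-tensor hooks. This is a minimal illustration of the idea, not the authors' implementation: the paper overlaps host-device transfers with computation (e.g. via separate CUDA streams and pinned memory) to hide the transfer latency, which this sketch omits. The hook names below are our own.

```python
import torch

# pack_hook runs when an activation is saved for the backward pass:
# it moves the tensor to host (CPU) memory, freeing GPU memory.
# unpack_hook runs when the backward pass needs that activation:
# it transfers the tensor back to its original device.
def pack_hook(t):
    return t.device, t.to("cpu")  # offload activation, remember its device

def unpack_hook(packed):
    device, t = packed
    return t.to(device)  # restore activation for gradient computation

x = torch.randn(4, 8, requires_grad=True)
with torch.autograd.graph.saved_tensors_hooks(pack_hook, unpack_hook):
    y = (x.relu() ** 2).sum()  # forward pass: saved activations live on CPU
y.backward()                   # backward pass: activations fetched back
```

Run on a CPU-only machine the transfers are no-ops, but on a GPU the same hooks trade GPU memory for device-to-host bandwidth, which is the trade-off the paper quantifies.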

Share and Cite:

Hascoet, T. , Zhuang, W. , Febvre, Q. , Ariki, Y. and Takiguchi, T. (2019) Reducing the Memory Cost of Training Convolutional Neural Networks by CPU Offloading. Journal of Software Engineering and Applications, 12, 307-320. doi: 10.4236/jsea.2019.128019.

Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.