Researchers at the Massachusetts Institute of Technology, MIT-IBM Watson AI Lab, Underwood International College, and the University of Brasilia have found that we are approaching the computational limits of deep learning. The new study states that deep learning's progress has come with a “voracious appetite for computing power” and that continued development will require “dramatically” more computationally efficient methods.
“We show deep learning is not computationally expensive by accident, but by design. The same flexibility that makes it excellent at modeling diverse phenomena and outperforming expert models also makes it dramatically more computationally expensive,” the coauthors wrote.
The researchers analyzed 1,058 research papers found in the arXiv pre-print repository, as well as other benchmark sources, to understand how deep learning performance depends on computational power in the domains of image classification, object detection, question answering, named entity recognition, and machine translation. In order to understand why deep learning is so computationally expensive, the researchers further analyzed its statistical and computational scaling in theory.
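The relationship the study examines — how a model's performance improves as training compute grows — is often summarized as a power law, which appears as a straight line on a log-log plot. As a purely illustrative sketch (the data points and exponent below are synthetic, not figures from the paper), the scaling exponent can be recovered with a simple linear fit in log-log space:

```python
import math

# Synthetic (compute, error-rate) pairs following an assumed power law
# error ~ c * compute^(-alpha); these are NOT values from the study.
data = [(1e2, 0.4), (1e3, 0.2), (1e4, 0.1), (1e5, 0.05)]

# Take logs: a power law becomes a straight line, y = log(c) - alpha * x.
xs = [math.log10(compute) for compute, _ in data]
ys = [math.log10(error) for _, error in data]

# Closed-form least-squares slope; the scaling exponent is its negation.
n = len(data)
slope = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / (
    n * sum(x * x for x in xs) - sum(xs) ** 2
)
alpha = -slope
print(round(alpha, 3))  # → 0.301 (error halves per 10x compute in this toy data)
```

A small exponent like this is the crux of the study's concern: when error falls only as a weak power of compute, each additional increment of accuracy demands a multiplicative, not additive, increase in computing power.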
They did so by conducting two separate analyses of computational requirements: (1) Computation per network pass (the number of floating-point operations required for a single pass in the network), and (2) Hardware burden (the computational capability of the hardware used to train the model). The researchers found that just three years of algorithmic improvement was equivalent to a 10 times increase in computing power.
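The first metric, computation per network pass, can be made concrete with a back-of-the-envelope FLOP count. A minimal sketch for fully connected layers follows; the layer sizes are illustrative placeholders, not architectures from the study:

```python
def dense_layer_flops(n_in: int, n_out: int) -> int:
    """FLOPs for one dense layer: each output unit needs n_in
    multiply-adds (2 ops each) plus one bias addition."""
    return 2 * n_in * n_out + n_out

def forward_pass_flops(layer_sizes: list[int]) -> int:
    """Total FLOPs for a single forward pass through a stack of
    dense layers, e.g. [784, 128, 10] -> two layers."""
    return sum(
        dense_layer_flops(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# A hypothetical small MNIST-style classifier: 784 -> 128 -> 10
print(forward_pass_flops([784, 128, 10]))  # → 203402
```

Hardware burden, by contrast, measures the capability of the machines used for training, so it also captures inefficiencies such as low utilization and repeated experiment runs that a per-pass FLOP count misses.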
They concluded that if progress continues along the same lines, deep learning's computational requirements will quickly become technically, economically, and environmentally prohibitive. However, all is not lost.
“Despite this, we find that the actual computational burden of deep learning models is scaling more rapidly than (known) lower bounds from theory, suggesting that substantial improvements might be possible,” wrote the coauthors.
The researchers found that efficiency improvements are taking place all the time, both at the algorithmic level and in hardware, where accelerators such as field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs) are increasingly used. Time will tell whether deep learning will become more efficient or be replaced altogether.