Aiming for a mechanism closer to how the human brain processes language, a team of MIT researchers has developed a combined software-hardware system named SpAtten, according to a press release from the university.
The system is specialized to run the attention mechanism, which focuses on keywords rather than treating every word with equal importance. That matters because such mechanisms yield better results in natural language processing (NLP) tasks such as detecting positive or negative tone or predicting from context which words are likely to come next.
The system mimics how the human brain processes language and enables more streamlined NLP with less computing power, according to Hanrui Wang, the paper’s lead author and a Ph.D. student in the Department of Electrical Engineering and Computer Science. “We read very fast and just focus on keywords. That’s the idea with SpAtten.”
The attention mechanism explained
The attention mechanism was first introduced in 2015 and has drawn wide interest from the research community because of its selectivity. It lets a system infer which words or phrases in a sentence are most important by comparing them with word patterns the algorithm encountered during its training phase. That power comes at a cost, however: the mechanism has high memory demands.
What sets the MIT system apart is that it pairs specialized software with specialized hardware, which allowed the researchers to run the attention mechanism more efficiently.
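To make the idea of weighting some words more heavily concrete, here is a minimal sketch of generic scaled dot-product attention in plain Python. It is not SpAtten's implementation. The four-word sentence, the two-dimensional vectors, and the names `softmax` and `attention` are illustrative assumptions, not taken from the researchers' work.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Generic scaled dot-product attention for a single query vector."""
    d = len(query)
    # Similarity of the query to every word's key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(dim)]
    return weights, output

# Toy data (assumed for illustration only).
tokens = ["the", "movie", "was", "wonderful"]
keys = [[0.1, 0.0], [0.5, 0.2], [0.1, 0.1], [2.0, 1.5]]
values = keys
query = [1.0, 1.0]

weights, output = attention(query, keys, values)
most_attended = tokens[weights.index(max(weights))]
```

With these toy vectors the largest weight falls on "wonderful", the kind of keyword a tone-detection task would care about. Every word is still scored against every other word, which is one reason the mechanism is demanding on memory.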
One key advance that enables this is SpAtten’s use of "cascade pruning." After the attention mechanism helps pick out the keywords, SpAtten prunes the unimportant words (tokens) and eliminates the corresponding computations and data movements. Unimportant attention heads are also removed, which further reduces computational load and memory access.
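The pruning idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the researchers' algorithm: it assumes a word's importance is the attention it receives from the other surviving words, accumulated across layers, and that a fixed number of words is kept after each layer. The attention matrices, the per-layer keep counts, and the function names are hypothetical. Heads could be ranked and dropped in the same manner, which the sketch leaves out.

```python
def attention_received(attn, alive):
    """Sum the attention each surviving token receives from surviving tokens."""
    return {j: sum(attn[i][j] for i in alive) for j in alive}

def cascade_prune(layer_attn, keep_per_layer):
    """Prune tokens layer by layer; a pruned token never comes back."""
    alive = list(range(len(layer_attn[0])))
    importance = {j: 0.0 for j in alive}
    history = []
    for attn, keep in zip(layer_attn, keep_per_layer):
        # Accumulate importance over layers (an assumed scoring rule).
        for j, score in attention_received(attn, alive).items():
            importance[j] += score
        ranked = sorted(alive, key=lambda j: importance[j], reverse=True)
        # Later layers only see the survivors, so their work shrinks too.
        alive = sorted(ranked[:keep])
        history.append(alive)
    return history

# Toy attention matrices for two layers (rows: queries, columns: keys).
tokens = ["the", "movie", "was", "wonderful"]
layer1 = [[0.1, 0.2, 0.1, 0.6],
          [0.1, 0.3, 0.1, 0.5],
          [0.1, 0.2, 0.2, 0.5],
          [0.1, 0.3, 0.1, 0.5]]
layer2 = [[0.25, 0.25, 0.25, 0.25],
          [0.0, 0.4, 0.1, 0.5],
          [0.0, 0.3, 0.2, 0.5],
          [0.0, 0.5, 0.1, 0.4]]

history = cascade_prune([layer1, layer2], keep_per_layer=[3, 2])
survivors = [tokens[j] for j in history[-1]]
```

In this toy run "the" is dropped after the first layer and "was" after the second, leaving "movie" and "wonderful". The word "cascade" refers to the fact that a word removed early stays removed in every later layer, so the savings compound.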
In addition to the software advances, the researchers developed a hardware architecture specialized to run SpAtten and the attention mechanism. The design enables SpAtten to rank the importance of keywords and heads in a small number of computer clock cycles.
Although they have not yet fabricated a physical chip, the researchers were able to test the hardware design in simulation. SpAtten ran more than 100 times faster than its next best competitor, a TITAN Xp GPU, and was more than 1,000 times more energy-efficient than its competitors.
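The article does not say how the ranking is carried out in hardware. As a software analogy only, picking the k most important items does not require a full sort: a quickselect-style partition finds them in linear time on average. The sketch below is an assumption-laden illustration, and the function name `top_k` and the sample scores are hypothetical.

```python
import random

def top_k(scores, k, rng=None):
    """Return the indices of the k largest scores, in no particular order."""
    if k <= 0:
        return []
    candidates = list(range(len(scores)))
    if k >= len(candidates):
        return candidates
    rng = rng or random.Random(0)  # fixed seed keeps runs repeatable
    chosen = []
    while True:
        pivot = scores[rng.choice(candidates)]
        larger = [i for i in candidates if scores[i] > pivot]
        equal = [i for i in candidates if scores[i] == pivot]
        smaller = [i for i in candidates if scores[i] < pivot]
        need = k - len(chosen)
        if len(larger) > need:
            # Too many above the pivot: search only among them.
            candidates = larger
        elif len(larger) + len(equal) >= need:
            # The pivot group completes the selection.
            return chosen + larger + equal[:need - len(larger)]
        else:
            # Everything at or above the pivot is in; keep looking below it.
            chosen += larger + equal
            candidates = smaller

importance = [0.4, 2.2, 0.9, 3.5]  # toy importance scores
kept = sorted(top_k(importance, 2))
```

Each pass discards part of the candidate list, so the work shrinks quickly. Dedicated hardware can perform the comparisons within a pass in parallel, which is one plausible way a ranking could finish in few clock cycles.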
Companies that use NLP models for AI workloads could benefit from SpAtten, according to the researchers. "Our vision for the future is that new algorithms and hardware that remove the redundancy in languages will reduce cost and save on the power budget for data center NLP workloads," Wang said.
"We can improve the battery life for mobile phone or IoT devices. That’s especially important because, in the future, numerous IoT devices will interact with humans by voice and natural language, so NLP will be the first application we want to employ."