New Device Lets Personal Computers Process Massive Graphs Quickly and Easily
Computer science deals with many kinds of data relationships, and one of the most important is the graph, a structure of nodes and the lines connecting them. Graphs help Google rank its web pages, Facebook analyze its database for political trends, and researchers isolate and identify certain neuron structures in the brain.
Gathering and storing this much information, however, is no small feat. That's why a team from the Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) created a device to alleviate storage issues. The small flash storage system could eventually help offset the cost of dynamic random access memory (DRAM) spread across multiple power-draining servers.
While flash storage is slower than DRAM at processing graph data, the researchers put an innovative twist on the flash chip. They combined a flash chip array with an "accelerator" to help the flash storage reach DRAM-like speeds. An algorithm quickly sorts requests for graph data into an order that flash can access easily. The algorithm also merges requests to save time, memory, bandwidth and other resources.
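The sort-then-merge idea can be illustrated with a short Python sketch. This is not the MIT team's implementation, which runs in hardware next to the flash chips. It is a minimal software analogy under stated assumptions: each request is a hypothetical (node ID, value) pair, and the function name `sort_and_merge` and the `combine` parameter are invented here for illustration.

```python
def sort_and_merge(requests, combine):
    """Sort (node_id, value) requests by node ID and merge duplicates.

    Hypothetical illustration only: sorting turns scattered requests
    into one sequential pass over storage, and merging means each
    node is touched once instead of once per request.
    """
    merged = []
    for node_id, value in sorted(requests, key=lambda r: r[0]):
        if merged and merged[-1][0] == node_id:
            # Same node as the previous request: fold the values together.
            merged[-1] = (node_id, combine(merged[-1][1], value))
        else:
            merged.append((node_id, value))
    return merged


# Five requests arriving in arbitrary order, two nodes requested twice.
requests = [(7, 10), (2, 25), (7, 5), (2, 25), (5, 100)]
result = sort_and_merge(requests, lambda a, b: a + b)
print(result)  # [(2, 50), (5, 100), (7, 15)]
```

In this toy example, five scattered requests shrink to three that can be served in ascending order. Addition is used as the merge rule here only as an example; a real graph algorithm would supply whatever rule fits its updates.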
“The bottom line is that we can maintain performance with much smaller, fewer, and cooler — as in temperature and power consumption — machines,” said Sang-Woo Jun, a CSAIL graduate student and first author on a paper describing the device, which is being presented at the International Symposium on Computer Architecture (ISCA).
The flash tool could help save energy and money for servers dedicated to graph analytics, and the potential applications are extensive, according to the MIT team.
“Graph processing is such a general idea,” said co-author Arvind, the Johnson Professor in Computer Science Engineering. “What does page ranking have in common with gene detection? For us, it’s the same computation problem — just different graphs with different meanings. The type of application someone develops will determine the impact it has on society.”
One of those potential impacts could be in cancer research. The MIT team is currently working on a process that could quickly identify and sort through cancer-causing genes.
“Accelerators are supposed to help the host compute, but we’ve come so far [with the computations] that the host becomes unimportant,” Arvind said.
The work has drawn attention beyond MIT as well. Computer scientists elsewhere in the U.S. who were not involved in the flash chip's development have commented on its potential usefulness.
“The MIT work shows a new way to perform analytics on very large graphs: Their work exploits flash memory to store the graphs and exploits ‘field-programmable gate arrays’ [custom integrated circuits] in an ingenious way to perform both the analytics and the data processing required to use flash memory effectively,” said Keshav Pingali, a professor of computer science at the University of Texas at Austin. “In the long run, this may lead to systems that can process large amounts of data efficiently on laptops or desktops, which will revolutionize how we do big-data processing.”
According to Jun, a long-term goal of the project is to create a general-purpose platform and library that users can build on to develop their own graph analytics algorithms.
“You could plug this platform into a laptop, download [the software], and write simple programs to get server-class performance on your laptop,” Jun said.