July 10, 2015

Cutting cost and power consumption for big data




New network design exploits cheap, power-efficient flash memory without sacrificing speed.

Random-access memory, or RAM, is where computers like to store the data they’re working on. A processor can retrieve data from RAM tens of thousands of times more rapidly than it can from the computer’s disk drive.
But in the age of big data, data sets are often much too large to fit in a single computer’s RAM. The data describing a single human genome would take up the RAM of somewhere between 40 and 100 typical computers.
Flash memory — the type of memory used by most portable devices — could provide an alternative to conventional RAM for big-data applications. It’s about a tenth as expensive, and it consumes about a tenth as much power.
The problem is that it’s also a tenth as fast. But at the International Symposium on Computer Architecture in June, MIT researchers presented a new system that, for several common big-data applications, should make servers using flash memory as efficient as those using conventional RAM, while preserving their power and cost savings.

The researchers also presented experimental evidence showing that, if the servers executing a distributed computation have to go to disk for data even 5 percent of the time, their performance falls to a level that’s comparable with flash, anyway.

In other words, even without the researchers’ new techniques for accelerating data retrieval from flash memory, 40 servers with 10 terabytes’ worth of RAM couldn’t handle a 10.5-terabyte computation any better than 20 servers with 20 terabytes’ worth of flash memory, which would consume only a fraction as much power.
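
The arithmetic behind that comparison can be checked with a short sketch. It assumes, purely for illustration, that data accesses are spread evenly across the data set, so the share of data that overflows RAM approximates the share of accesses that go to disk. The function name and that uniform-access assumption are not from the researchers' work.

```python
def overflow_fraction(dataset_tb: float, ram_tb: float) -> float:
    """Fraction of a data set that does not fit in RAM and must sit on disk.

    Hypothetical helper: assumes accesses are uniform across the data set,
    so this fraction also approximates how often a server goes to disk.
    """
    if dataset_tb <= 0:
        raise ValueError("dataset_tb must be positive")
    return max(0.0, dataset_tb - ram_tb) / dataset_tb


# Figures from the article: a 10.5-terabyte computation on 10 terabytes of RAM.
frac = overflow_fraction(10.5, 10.0)
print(f"{frac:.1%} of the data set spills to disk")  # 4.8%
```

Under that assumption, the half terabyte that does not fit in RAM is a little under 5 percent of the data, which is roughly the threshold the researchers cite.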
