Last August, we discussed the launch of a new type of NAND memory design that eschewed PCIe slots and SATA
interfaces in favor of attaching flash directly to the DIMM channel. Developed
by Diablo Technologies, this new approach promised incredibly low latencies and
consistent performance that conventional PCIe architectures have difficulty
matching. Now, SanDisk has taken an interest in Diablo Technologies and has
partnered with the company to release a shipping product.
Dubbed ULLtra DIMM (Ultra Low Latency), the product already has a customer:
SanDisk has signed IBM to ship the new hardware in System x3850 and x3950 X6
servers, with up to 12.8TB of installed flash capacity. The reason IBM is
pushing ahead to adopt the modules, as Diablo Technologies indicated to us last
summer, is latency. Early benchmarks show eXFlash (IBM’s only-slightly-better
name for the tech) hitting a write latency of 5-10 microseconds, far lower than
anything else in the NAND industry. The listed performance, per DIMM, is
1GB/sec read and 750MB/sec write.
The market for these DIMMs is high-frequency stock trading (HFT). SanDisk
also mentions virtual desktop infrastructure, transaction processing, cloud
computing, and virtualization, but most of these workloads aren’t so latency critical as to
demand 5-10 microsecond response times. HFT, on the other hand, is a market
where microseconds in response time really can make the difference between
making and losing money on a trade. It makes sense to implement the tech in
those fields first and wait to see if demand scales up to justify production
for other kinds of servers.
Flash grows up
This is, in some sense, the natural progression for flash memory to take. Over
the years we’ve seen multiple attempts to reduce NAND latency by marrying the
NAND to SATA controllers, SAS, PCI-Express, and now the main memory channel.
The key thing to remember, before anyone starts jonesing for this product in
the consumer space, is that NAND is still orders of magnitude slower than
conventional DRAM. A latency of 5 microseconds is amazing for non-volatile
storage, but RAM write latency is measured in nanoseconds, which is hundreds to
thousands of times faster.
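To put the gap in rough numbers, here is a minimal sketch that divides the 5-10 microsecond flash write latency quoted above by a DRAM write latency. The DRAM values of 5 ns and 50 ns are illustrative assumptions, not measurements from this article, and the function name `latency_ratio` is made up for the example.

```python
def latency_ratio(flash_us: float, dram_ns: float) -> float:
    """Return how many times slower a flash write is than a DRAM write.

    flash_us: flash write latency in microseconds
    dram_ns:  DRAM write latency in nanoseconds
    """
    # Convert microseconds to nanoseconds so both values share a unit.
    return (flash_us * 1000.0) / dram_ns


# Flash figures come from the article; DRAM figures are assumptions.
FLASH_LATENCIES_US = (5.0, 10.0)
ASSUMED_DRAM_LATENCIES_NS = (5.0, 50.0)

for flash_us in FLASH_LATENCIES_US:
    for dram_ns in ASSUMED_DRAM_LATENCIES_NS:
        ratio = latency_ratio(flash_us, dram_ns)
        print(f"{flash_us:g} us flash vs {dram_ns:g} ns DRAM: {ratio:g}x slower")
```

The result swings by an order of magnitude depending on which DRAM latency you assume, so treat the output as a ballpark rather than a benchmark.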
Of course, keeping terabytes of database information sitting in local main
memory is prohibitively expensive, and capacity is capped as well: a
four-socket server based on Intel’s upcoming Ivy Bridge v2 Xeons with 24 DIMMs
per socket and 16GB DIMMs would “only” allow for about 1.536TB of local memory.
If you instead had 12.8TB of local NAND flash,
that might change things somewhat. What we’re seeing here, at the high end, is
NAND reaching up to reduce the impact of the access-time pyramid gap (shown
above).
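The capacity arithmetic above is easy to check. The sketch below multiplies out the sockets, DIMM slots, and DIMM size given in the text, then compares the result with the 12.8TB flash figure. It uses decimal units (1TB = 1000GB) to match the article's 1.536TB number; the function name is an assumption for illustration.

```python
def max_dram_tb(sockets: int, dimms_per_socket: int, dimm_gb: int) -> float:
    """Total DRAM capacity in decimal terabytes (1 TB = 1000 GB)."""
    return sockets * dimms_per_socket * dimm_gb / 1000


# Configuration described in the article.
dram_tb = max_dram_tb(sockets=4, dimms_per_socket=24, dimm_gb=16)
flash_tb = 12.8  # maximum installed flash quoted for the IBM servers

print(f"Max DRAM:  {dram_tb:.3f} TB")
print(f"Max flash: {flash_tb:.1f} TB ({flash_tb / dram_tb:.1f}x the DRAM capacity)")
```

Under these figures, the flash configuration holds a little over eight times as much data as a fully populated DRAM configuration.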
We’re unlikely to see this technology showing up in consumer hardware at any point in
the next few years, but it’s not impossible. PCIe-based NAND storage has been
inching towards the consumer market as conventional SSDs drop well below the
$1/GB mark.
Hadn't really considered how important this was until we started shooting uncompressed high-speed video (48 fps). I can see why these newer technologies are so important! Keep up the good work...