What Went Wrong With TLC NAND

When Samsung pushed the envelope and introduced its TLC NAND flash memory for general use, it had the makings of a landmark innovation. TLC (triple-level cell) NAND is cheaper to manufacture than either SLC or MLC NAND because it fits more data into the same NAND cell: three bits per cell, rather than the one bit or two bits that single-level cell (SLC) and multi-level cell (MLC) NAND can store in the same space. You'd think TLC NAND would take over the market in short order; there would be no reason to waste resources manufacturing more expensive SLC or MLC NAND.

When they were introduced, the new TLC NAND solid-state drives seemed to have conquered all the previous difficulties of TLC NAND with some state-of-the-art firmware. Read speeds looked great; the Samsung SSD 840's 500 MB/s is nothing to sneeze at, and reliability appeared to be a non-issue.

But mounting excitement over the potentially cost-effective storage innovation waned as performance problems came to light.

In fact, it wasn’t long before users began reporting a new and extremely debilitating problem. Those impressive read speeds and that near-100% reliability only held for new, freshly written data. Data that had been sitting on the drive for, say, all of eight weeks would have deteriorated to the point that it could only be read at much slower speeds.

By the time data had sat static on your drive for six months or a year, those previously high read speeds would have slowed to a snail's pace.

It turns out that this is a problem inherent in the TLC design. Voltage drift happens in every NAND cell over time, but in SLC and MLC NAND the drift is small, consistent, and can be accounted for in the read algorithms. When you pack three bits into a cell, though, the cell must distinguish eight voltage states instead of two or four, so the margins between states shrink and data deterioration accelerates immensely. What's worse, there's no longer a single generalized algorithm that can take all the shifting into account, so the old data is simply blurred.
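The margin problem can be illustrated with a toy simulation. The voltage numbers below are made up for demonstration, not real NAND parameters: cells are programmed to the center of one of N voltage bands in a fixed window, a uniform downward charge drift is applied, and we count how often a cell decodes to the wrong level. The same drift that SLC and MLC tolerate pushes most TLC states across a threshold.

```python
import random

VOLTAGE_WINDOW = 3.0  # total usable voltage range of a cell (hypothetical volts)

def decode(voltage, levels):
    """Map a cell voltage back to its nearest programmed level."""
    step = VOLTAGE_WINDOW / levels
    idx = int(voltage // step)
    return min(max(idx, 0), levels - 1)

def misread_rate(levels, drift, trials=10_000):
    """Program random levels, apply a downward voltage drift,
    and count how often the cell decodes to the wrong level."""
    step = VOLTAGE_WINDOW / levels
    rng = random.Random(42)
    errors = 0
    for _ in range(trials):
        level = rng.randrange(levels)
        # Program the cell to the center of its level's voltage band.
        voltage = (level + 0.5) * step
        # Charge leaks over time, shifting the stored voltage down.
        voltage -= drift
        if decode(voltage, levels) != level:
            errors += 1
    return errors / trials

for bits, levels in [(1, 2), (2, 4), (3, 8)]:
    rate = misread_rate(levels, drift=0.2)
    print(f"{bits} bit(s)/cell ({levels} levels): misread rate = {rate:.1%}")
```

With this (invented) 0.2 V drift, the half-band margin for SLC (0.75 V) and MLC (0.375 V) still absorbs the shift, but TLC's margin (about 0.19 V) does not, so most TLC cells misread. Real drives recover such cells with error correction and slow retry reads, which is exactly where the reported slowdowns come from.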

Samsung has introduced two firmware updates in an attempt to smooth over the problem. The first, a more sophisticated read algorithm meant to measure the voltage drift and factor it in where necessary, completely failed to solve the issue.

The second, while more successful, offers a somewhat unpleasant workaround: the drive is set to rewrite all data regularly, so no data is ever old. It does manage to get around the problem: if all data is new data, it will all be readable and quickly accessible. However, since every NAND SSD has a finite number of program/erase cycles, this isn't an ideal fix.
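The trade-off in that workaround can be sketched in a few lines. This is not Samsung's firmware, just an illustrative model under assumed numbers (a 30-day refresh threshold, monthly maintenance passes): any block whose data has gone stale is rewritten in place, which resets its age but burns one of its finite program/erase cycles.

```python
REFRESH_AGE = 30 * 24 * 3600  # rewrite anything older than ~30 days (assumed)

class Block:
    def __init__(self, data, written_at):
        self.data = data
        self.written_at = written_at
        self.program_erase_cycles = 0  # every rewrite wears the cells

    def rewrite(self, now):
        # Reprogramming the same data resets its "age"...
        self.written_at = now
        # ...but consumes one of the block's finite P/E cycles.
        self.program_erase_cycles += 1

def refresh_pass(blocks, now):
    """Rewrite every block whose data has gone stale."""
    refreshed = 0
    for block in blocks:
        if now - block.written_at >= REFRESH_AGE:
            block.rewrite(now)
            refreshed += 1
    return refreshed

# Simulate a year of monthly maintenance passes over data written once.
blocks = [Block(b"user data", written_at=0) for _ in range(4)]
for month in range(1, 13):
    now = month * 30 * 24 * 3600
    refresh_pass(blocks, now)

print("P/E cycles spent just keeping old data readable:",
      blocks[0].program_erase_cycles)
```

In this toy model, a year of idle storage costs twelve P/E cycles per block before the user writes a single new byte, which is why endurance-conscious buyers found the fix unsatisfying.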

What does this all boil down to?
Simply that TLC NAND is not the future of data storage, and it doesn't even have a firm footing in the present. If your data matters in the long term, you'll want to go with a higher-quality NAND: MLC NAND for your basic SSD needs, or SLC NAND for industrial use or super-sensitive data storage. There's no way around it.
