SSDs and Parallel Storage, Part II
November 6, 2012 - 6:05pm
This post is the second in a series of many on solid state storage and its use in parallel storage solutions.
How SSDs Work
Inside every SSD are two major components: a number of NAND flash memory chips and a controller. The number of memory chips determines the capacity of the drive. The controller, being the “brain” of the SSD, is responsible for making the collection of NAND flash chips look like a fast HDD to the host system.
This is not easy.
In order to accomplish its job, the SSD controller must perform the following tasks:
- Host interface protocol management - As a very fast analog to an HDD, an SSD must communicate with the host via a storage protocol such as SATA, SAS, or Fibre Channel. There are PCI Express-based SSDs on the market today. However, PCI Express is not a storage interface/protocol, yet.
- Bad block mapping - As with magnetic media, NAND flash blocks go bad from time to time. If this occurs during a write operation, the fix can be as simple as marking the block “bad” and remapping to a block from a pool of spares. If it occurs during a read operation, the SSD controller needs to attempt to recover the data, if possible, before remapping the block.
- Caching and power-fail protection - Using a small amount of DRAM to speed up reads and writes is a common practice. However, because DRAM is volatile, data waiting to be written to the NAND flash can be lost during an unexpected power outage or drive removal. SSDs therefore have a secondary power circuit, with either batteries or capacitors, that provides enough time to flush the data in the cache. In addition to running the caching policies, the SSD controller must also monitor the health of the secondary power circuit to ensure its ability to protect data in the cache.
- Data compression - Some SSD controllers implement data compression. The principal advantage is a possible improvement in the SSD's endurance (more on this later), provided the data is compressible. For the SSD controller, implementing compression means managing the statistics and tables for the compression engine.
- Data encryption - Similar to data compression, some SSD controllers implement data encryption. Unlike data compression, however, encryption has become more necessary as a method to prevent data theft. To the SSD controller, this means the need to manage a crypto engine for all legitimate data traffic in and out of the SSD.
- Wear leveling - NAND flash can wear out over time. The mechanism for write operations in NAND flash is different from that of magnetic media. On magnetic media, a write operation can target an area that has previously been written simply by writing over it. In NAND flash, a previously written area has to be erased before it can store new data; this is referred to as a Program/Erase (P/E) cycle. Each block in the NAND flash has a finite number of P/E cycles before it wears out. Thus, to prevent any single block from wearing out earlier than the rest, the SSD controller must maintain a history of how many times each block has been erased/programmed and spread the writes evenly across all available blocks.
- Garbage collection - Given that previously written blocks must be erased before they are able to receive data again, the SSD controller must, for performance, actively pre-erase blocks so that new write commands can always get an empty block. With operating system support for the TRIM (SATA) and UNMAP (SAS) commands, the SSD controller also needs to erase the blocks that the operating system has deemed to hold no valid data.
- Media scrubbing and error correction - One interesting quirk of NAND flash is the concept of Read Disturb and Write Disturb. As read or write operations progress, it is possible for blocks adjacent to the one being accessed to be “disturbed” such that bit flips occur. This is basically a form of silent data corruption. Thus, the SSD controller must proactively look for these bit flips and correct them before the number of bit flips grows beyond the ability of the SSD controller to correct them.
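To make the scrubbing item above a little more concrete, here is a minimal Python sketch. It uses a Hamming(7,4) code purely as a stand-in; real SSD controllers rely on much stronger codes (BCH or LDPC, for example), and the function names encode, scrub, and decode are made up for this illustration. The point is only to show why the controller has to find bit flips early: one flipped bit is repairable, two are not.

```python
# Toy media scrub. Hamming(7,4) is an assumption chosen for brevity;
# it can correct exactly one flipped bit per 7-bit codeword.

def encode(data_bits):
    """Encode 4 data bits as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def scrub(codeword):
    """Check a stored codeword and repair a single flipped bit.

    Returns (repaired_codeword, position), where position is the
    1-based position that was flipped back, or 0 if nothing was wrong.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome names the bad position
    if position:
        c[position - 1] ^= 1
    return c, position

def decode(codeword):
    """Extract the 4 data bits from a codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]

data = [1, 0, 1, 1]
stored = encode(data)

# One disturbed bit: the scrub finds and repairs it.
one_flip = list(stored)
one_flip[4] ^= 1
repaired, position = scrub(one_flip)
print("single flip repaired:", decode(repaired) == data)

# Two disturbed bits: beyond this code's ability, so the "repair" is wrong.
two_flips = list(stored)
two_flips[4] ^= 1
two_flips[6] ^= 1
bad_repair, _ = scrub(two_flips)
print("double flip repaired:", decode(bad_repair) == data)
```

The second case is the one the controller is racing against: once errors accumulate past what the code can handle, scrubbing can no longer bring the data back.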
And, finally, the controller must aggregate the performance of the flash chips in the SSD to achieve the desired performance for the end user, with no performance degradation from any of the aforementioned tasks.
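Two of the tasks listed above, wear leveling and garbage collection, are easier to picture with a small model. The following Python sketch is not how any particular controller works. The ToyFlash class and all of its names are hypothetical, and it assumes that one logical write fills one whole block, whereas real NAND is programmed in pages and erased in larger blocks, so a real garbage collector also has to relocate valid pages before erasing.

```python
class ToyFlash:
    """Hypothetical model of wear leveling and garbage collection."""

    def __init__(self, num_blocks, max_pe_cycles):
        self.erase_counts = [0] * num_blocks   # P/E history per block
        self.max_pe_cycles = max_pe_cycles     # assumed endurance limit
        self.free = set(range(num_blocks))     # erased, ready to program
        self.stale = set()                     # hold invalid data, need erase
        self.mapping = {}                      # logical address -> block
        self.data = {}                         # block -> payload

    def write(self, logical_addr, payload):
        # Flash cannot be overwritten in place: the old copy becomes stale
        # and the new data goes to a different, already erased block.
        old = self.mapping.pop(logical_addr, None)
        if old is not None:
            self.stale.add(old)
        if not self.free:
            self.garbage_collect()
        if not self.free:
            raise RuntimeError("no usable blocks left")
        # Wear leveling: pick the erased block with the fewest P/E cycles.
        block = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(block)
        self.mapping[logical_addr] = block
        self.data[block] = payload

    def trim(self, logical_addr):
        # Stand-in for TRIM/UNMAP: the host says this data is no longer valid.
        old = self.mapping.pop(logical_addr, None)
        if old is not None:
            self.stale.add(old)

    def garbage_collect(self):
        # Erase stale blocks so later writes find an empty block waiting.
        for block in sorted(self.stale):
            self.data.pop(block, None)
            self.erase_counts[block] += 1
            if self.erase_counts[block] < self.max_pe_cycles:
                self.free.add(block)
            # Otherwise the block is worn out and is never reused.
        self.stale.clear()

    def read(self, logical_addr):
        return self.data[self.mapping[logical_addr]]


flash = ToyFlash(num_blocks=4, max_pe_cycles=1000)
for i in range(100):
    flash.write(0, i)          # the host rewrites one address repeatedly
print(flash.read(0), flash.erase_counts)
```

Even though the host in this example hammers a single logical address, the erase counts stay level across all four blocks, which is the effect wear leveling is after. A real controller would run garbage collection in the background rather than only when it runs out of free blocks.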
In the next blog post on the subject, we will discuss usage and application for SSDs.
As always, we’d love to hear what you think. Please send us your thoughts or feedback at firstname.lastname@example.org.