SSDs and Parallel Storage, Part I
With the recent launch of the SSD-accelerated ActiveStor 14, we would like to share some of our thoughts on SSDs and their use in a parallel storage solution. This post is the first in a series on solid state storage.
What is an SSD?
Solid State Drive (SSD) is an umbrella term for a device that behaves like a traditional Hard Disk Drive (HDD), but records data using memory technology instead of a magnetic medium. This is not a new concept: SSD products have been around since the 1970s.
For the most part, SSDs are offered in the same form factors as traditional HDDs. This allows for easy, drop-in replacements into existing storage infrastructure.
Today, SSDs are available with either DRAM or NAND flash as the medium. Given the higher density of NAND flash compared to DRAM, flash-based SSDs are now more prevalent, despite the fact that DRAM-based SSDs offer more performance.
For the purpose of this blog post series on SSD, we will focus on NAND flash-based SSDs.
What’s All the Hype?
The game-changing event that made SSD such a hot topic in recent years is simply the reduction of cost. With the rise in use of USB memory sticks, smartphones and tablets, the price of NAND flash memory has come down significantly in the last 10 years. SSDs in the 1970s through the early 2000s were extremely expensive in terms of $/GB, and mainly seen in government and military applications. Now, a consumer grade SSD can be easily obtained in retail channels at less than a dollar per GB. This, of course, gives storage vendors expanded component options in order to create faster solutions at prices suitable for almost all pricing tiers.
Why is SSD Faster than HDD?
The statement “SSD is faster than HDD” is not entirely true. It depends on the workload.
Data on an HDD is stored in concentric tracks on platters (the recording media). To read or write the data, an actuator arm carrying a read/write head moves across the platter from track to track, much like the pickup in a DVD/Blu-ray drive.
The analogy to a DVD/Blu-ray drive does not stop there. A movie on a DVD or Blu-ray is really just a large file on the disc. Playing the movie from the disc or from an HDD is a large block, sequential read operation. As the movie plays, the actuator arm barely moves while the motor spins the platter. The mechanical design lends itself to this kind of streaming, sequential file access, which makes the HDD quite capable of delivering good sequential read or write performance.
Now, imagine multiple small files being accessed at the same time. To keep up with all of the read/write requests in this case, the actuator arm has to move away from the current file, travel across the platter to the correct track, wait for the correct block of the next file to rotate under the head, read or write the data, and then move on to the next file and repeat the process all over again.
An HDD is simply not designed to accommodate this type of small block, random workload. HDD manufacturers, over the years, have done as much as possible to increase the performance of HDDs in this type of workload, mainly by increasing the spin rate of the motor.
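To make the cost of that mechanical movement concrete, here is a rough back-of-the-envelope sketch in Python. It uses the standard approximation that each random request costs one average seek plus half a platter revolution. The function names and the seek times in the example are our own illustrative assumptions, not measurements of any particular drive.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: the time for half a revolution."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


def estimated_random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Rough ceiling on random IOPS for a single actuator arm.

    Ignores transfer time, caching and command queuing, so real drives
    will differ. It only shows how seek time and spin rate bound the result.
    """
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return 1000.0 / service_time_ms


if __name__ == "__main__":
    # Seek times below are assumed placeholder values for illustration only.
    for rpm, seek_ms in [(7200, 8.5), (10000, 4.5), (15000, 3.5)]:
        iops = estimated_random_iops(rpm, seek_ms)
        print(f"{rpm:>6} RPM, {seek_ms} ms seek: about {iops:.0f} random IOPS")
```

Even with a faster motor, the estimate stays in the low hundreds of operations per second, which is why raising the spin rate helps random workloads only so much.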
SSDs are not limited by the mechanical movements of an actuator arm. Instead, there are multiple channels inside the SSD, and each channel plays a role similar to an HDD's actuator. Because the channels operate independently, the SSD can access multiple files at the same time. An SSD is therefore better suited for small block, random workloads, where there are lots of concurrent requests for data. An HDD, with its single actuator arm, physically cannot do this.
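The effect of independent channels can be sketched with a deliberately simplified model: if requests are spread evenly and every request takes the same time, a batch completes in as many "rounds" as it takes to hand every request to a channel. The function name, the per-request latency and the channel counts below are hypothetical values chosen for illustration. Real SSD controllers are considerably more complex.

```python
import math


def batch_service_time_ms(num_requests: int, per_request_ms: float,
                          channels: int) -> float:
    """Time to finish a batch when `channels` units work in parallel.

    Simplifying assumptions: requests are distributed evenly across
    channels and each request takes the same fixed time.
    """
    if channels < 1:
        raise ValueError("channels must be at least 1")
    if num_requests < 0:
        raise ValueError("num_requests must not be negative")
    rounds = math.ceil(num_requests / channels)
    return rounds * per_request_ms


if __name__ == "__main__":
    requests = 32
    # Placeholder latency, assumed for illustration only.
    for channels in (1, 4, 8):
        total = batch_service_time_ms(requests, 0.1, channels)
        print(f"{channels} channel(s): {total:.1f} ms for {requests} requests")
```

In this model, a device with a single access path (like one actuator arm) must work through the requests one after another, while each added channel divides the number of rounds.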
Does this mean SSDs are going to replace HDDs? In a word: No.
HDD manufacturers have been quite good at optimizing their products for dollars per gigabyte. Compared to an SSD, an HDD simply provides more capacity at a lower price. SSD manufacturers, on the other hand, have so far focused on optimizing for dollars per random I/O operation per second (IOPS).
From the comparison chart above, it is clear that HDDs and SSDs have distinct uses in the big picture of any storage environment: HDDs for applications where sequential performance is important and the capacity requirement is large, and SSDs for applications where random performance is important and the capacity requirement is relatively small.
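The two metrics discussed above, dollars per gigabyte and dollars per IOPS, are simple ratios, and a short Python sketch shows how the same pair of drives can rank differently depending on which one you care about. The `Drive` class and every number in the example are hypothetical placeholders, not vendor pricing or benchmark results.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    price_usd: float
    capacity_gb: float
    random_iops: float

    def dollars_per_gb(self) -> float:
        """Cost of capacity: lower is better for bulk storage."""
        return self.price_usd / self.capacity_gb

    def dollars_per_iops(self) -> float:
        """Cost of random performance: lower is better for random I/O."""
        return self.price_usd / self.random_iops


if __name__ == "__main__":
    # All figures are made-up placeholders for illustration only.
    drives = [
        Drive("example HDD", price_usd=200.0, capacity_gb=2000.0, random_iops=100.0),
        Drive("example SSD", price_usd=200.0, capacity_gb=250.0, random_iops=20000.0),
    ]
    for d in drives:
        print(f"{d.name}: ${d.dollars_per_gb():.2f}/GB, "
              f"${d.dollars_per_iops():.4f}/IOPS")
```

With inputs of this shape, the HDD wins on cost per gigabyte and the SSD wins on cost per IOPS, which is the trade-off that drives the split in use cases.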
In the next blog post on the subject, we will dig a little deeper into the internals of SSDs and how they work.
We’d love to hear what you think. Please send us your thoughts or feedback at firstname.lastname@example.org
We will be at the Supercomputing show in Salt Lake City, booth #3421, so come see a demo of the SSD-accelerated ActiveStor 14 parallel storage system with Hadoop.