SSDs and Parallel Storage, Part III
December 4, 2012 - 12:06pm
Now that we’ve talked about Solid State Disk (SSD) technology and what it’s good and not good for, it’s worth talking about how and why SSD technology is used in our new flagship parallel storage solution, ActiveStor 14.
The heart of any scale-out storage system is ultimately the parallel file system that runs as part of its storage operating system. For Panasas ActiveStor, that operating system is called PanFS. Unlike most other storage systems, PanFS is an object storage system. Objects are best thought of as sitting at a level of abstraction halfway between block storage and file storage. By using objects, PanFS can be extremely smart about how it stores file data. Among other things, PanFS can detect small file reads and writes and differentiate them from large-file streaming throughput. It can also choose where to store the file system namespace and file attributes (also known as file system metadata).
As you probably already know (possibly from reading my previous blog posts, SSDs and Parallel Storage Part I and Part II), SSD technology vastly outperforms traditional spinning hard drives when it comes to small block, random I/O performance, especially reads. So as we started designing ActiveStor 14, it was important to figure out the best way to leverage SSDs in the PanFS file system, with the goal of transforming the product’s performance while keeping the final product cost effective. Fortunately, the PanFS object storage architecture provides the intelligence needed for ActiveStor 14 to leverage both SSD and enterprise SATA HDD technology in a compelling way: SSD accelerates access to the file system namespace (metadata) and to small files, while more economical enterprise SATA HDDs deliver large file performance.
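To make the tiering idea concrete, here is a minimal sketch of a size-based placement policy in the spirit of what is described above: metadata and small files go to SSD, large files to SATA HDD. The 64KB cutoff echoes the threshold used later in this post, but the function name and logic are illustrative assumptions, not PanFS internals.

```python
# Toy placement policy sketch. The 64KB threshold matches the small-file
# cutoff discussed in this series; everything else is an assumption for
# illustration, not actual PanFS behavior.

SMALL_FILE_THRESHOLD = 64 * 1024  # 64KB


def choose_tier(size_bytes: int, is_metadata: bool = False) -> str:
    """Route metadata and small objects to SSD; stream large objects from HDD."""
    if is_metadata or size_bytes < SMALL_FILE_THRESHOLD:
        return "ssd"
    return "sata_hdd"


print(choose_tier(4_096))                # small file -> ssd
print(choose_tier(10 * 2**30))           # 10GB file  -> sata_hdd
print(choose_tier(0, is_metadata=True))  # namespace entry -> ssd
```

The key point the sketch captures is that an object storage layer knows enough about each object (size, metadata vs. data) to make this routing decision per object, rather than per block.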
Understanding Big Data Workloads
An important piece of Panasas research involved determining how much SSD capacity customers would need and whether it would make a big enough difference in system performance to be worth the incremental cost of including SSD storage in the system. To do this, we extracted key data from a number of production file systems in the field to learn more about file size distributions and overall number of files stored. Customers for this study spanned multiple big data markets, including scientific research (government laboratories and universities), energy, manufacturing, life sciences, and financial services.
It may surprise you that typical HPC data sets involve a very large number of small files under 64KB. This remains true even for data sets associated with what are normally thought of as large-file throughput workloads. In the chart above, you can see that in most of these HPC file systems, small files account for approximately 70% of all files by count. Without any additional data, you might at this point conclude that an all-SSD storage solution is the right approach, even though it would be extremely costly on a per-TB basis.
Fortunately, the capacity picture tells the opposite story. Even though small files dominate by count, large files dominate in terms of capacity. These large files are the ones primarily accessed in streaming workloads, where SATA HDDs excel. You can see in the graph above that all of the small files under 64KB in size, taken as a group, typically consume less than 1% of file system capacity (not including file system RAID overhead or metadata consumption).
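The count-versus-capacity divergence is easy to see with a quick back-of-the-envelope calculation. The file counts and average sizes below are made-up round numbers for illustration, not the actual Panasas field data behind the charts:

```python
# Illustrative numbers only: 70% of files are small (avg 8KB), 30% are
# large (avg 200MB). Even so, small files occupy a tiny slice of capacity.

small_count, small_avg = 7_000_000, 8 * 1024       # 7M files averaging 8KB
large_count, large_avg = 3_000_000, 200 * 1024**2  # 3M files averaging 200MB

small_bytes = small_count * small_avg
large_bytes = large_count * large_avg
total_bytes = small_bytes + large_bytes

count_pct = 100 * small_count / (small_count + large_count)
cap_pct = 100 * small_bytes / total_bytes
print(f"small files: {count_pct:.0f}% by count, {cap_pct:.4f}% by capacity")
# prints: small files: 70% by count, 0.0091% by capacity
```

With these assumed averages, small files make up 70% of files but well under 1% of bytes, which is why a modest SSD allotment can hold all of them.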
ActiveStor 14 and SSD Sizing
Next, we looked at how much SSD capacity would actually be needed to store small files and all file system metadata for these workloads, assuming a Panasas storage blade with 8TB of enterprise SATA disk.
The result was three ActiveStor models, each with a successively larger amount of SSD capacity to address a wide variety of big data workloads. The 81TB ActiveStor 14 configuration uses ten storage blades, each with two 4TB SATA HDDs and one 120GB 1.8” SSD. Even with SSD representing only 1.5% of storage capacity, this model meets the needs of most HPC workloads and maximizes the $/TB of the overall system. The 83TB configuration steps up to a 300GB SSD per storage blade for workloads that are more metadata and/or small file oriented. Finally, the turbocharged 45TB ActiveStor 14T uses storage blades with two 2TB SATA HDDs and a 480GB SSD, putting a full 10.7% of capacity on SSD for workloads in financial services and other markets that are particularly skewed toward small file performance. ActiveStor 14T also doubles the memory per storage blade (16GB instead of 8GB) to maximize the amount of data that can be served out of RAM, contributing to its particularly high metadata and small file performance.
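The quoted totals and SSD percentages can be sanity-checked from the per-blade figures above. The calculation below assumes vendor units (1TB = 1000GB) and ten blades per configuration; the post states ten blades explicitly only for the 81TB model, so applying it to all three is an assumption:

```python
# Sanity check of the capacity figures quoted above.
# Assumptions: ten blades in every configuration, vendor TB (1TB = 1000GB).

configs = {
    # name: (blades, hdds_per_blade, hdd_tb, ssd_gb)
    "ActiveStor 14 (81TB)":  (10, 2, 4.0, 120),
    "ActiveStor 14 (83TB)":  (10, 2, 4.0, 300),
    "ActiveStor 14T (45TB)": (10, 2, 2.0, 480),
}

for name, (blades, hdds, hdd_tb, ssd_gb) in configs.items():
    hdd_total = blades * hdds * hdd_tb   # TB of SATA HDD
    ssd_total = blades * ssd_gb / 1000   # TB of SSD
    total = hdd_total + ssd_total
    print(f"{name}: {total:.1f}TB raw, SSD = {100 * ssd_total / total:.1f}%")
```

Under these assumptions the arithmetic reproduces the figures in the text: roughly 81.2TB with 1.5% SSD, 83TB with 3.6% SSD, and 44.8TB with 10.7% SSD for the 14T.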
ActiveStor 14: File System Responsiveness Delivered
We are particularly proud of ActiveStor 14 because it is fundamentally designed for large file throughput and small file IOPS workloads alike. We are confident that it is the first truly general purpose parallel file system in the HPC/big data space capable of handling mixed workloads with high performance. In contrast to storage systems that use SSD only as a cache, where you are not guaranteed to have the namespace and/or small files you need on the SSD tier when you need them, ActiveStor 14 makes accessing and managing your data fast and easy: one unified global namespace built on a unified, high performance SATA HDD/SSD tier.
When it comes to performance, the results are clear. Relative to our previous performance-leading product, ActiveStor 12, we have measured random small file reads performing up to 11x faster, with multiple-times speedups in a whole host of other small file and namespace access metrics as well, especially directory listing speed, file deletes, NFS v3 performance, and most other namespace-focused benchmarks. The end result is higher file system responsiveness across the board, with the HDD and SSD each doing what they do best, all at a price point that represents solid value.
We hope this series on SSD storage was interesting and informative. Please visit our ActiveStor 14 product web page for more information.