High-Performance Computing (HPC) used to be the sole domain of university research and government labs. While there were some commercial use cases, they were traditionally limited to specific fields like manufacturing and the energy sector. However, HPC is now making its way into commercial organizations of all types to serve a variety of use cases. But traditional enterprise storage infrastructures focus on the random IO common in databases and virtual infrastructures, not on HPC’s high-bandwidth transfers and mixed (large and small) IO.
There are a variety of commercial HPC use cases. Most organizations now have a large pool of unstructured data that needs sequential access for computational analysis. These active archives are active in every sense. They need to be rapidly accessible and fully responsive to the requesting application while also satisfying long-term retention requirements. Those requirements are not just to meet legal or regulatory compliance; they also reflect a corporate desire to reanalyze any data set at any point in time, potentially comparing it to a current data set.
Other use cases include Big Manufacturing, which is the design, simulation, assembly and testing of physically large and complex products like airliners. HPC use is also on the rise in the Media and Entertainment industry, not only because of the ubiquitous use of special effects, but also as projects like virtual reality and augmented reality become a necessity at more companies.
The growth of HPC in the commercial market is only beginning. Emerging use cases like precision medicine, deep learning, artificial intelligence and autonomous vehicles continue to drive HPC further into the enterprise and make its storage infrastructures even more critical.
The expanding number of organizations using HPC and the expanding range of use cases create the need for a turnkey HPC storage architecture. Traditionally, HPC storage architectures were a bit of a science project, cobbled together from open source software, white box servers and no-name network hardware. Given the initial customer set, universities, the goal was to spend extra time driving down the cost. Organizations could overlook complexity in setup and operation if it saved money.
Commercial IT doesn’t have time for the science project approach to HPC storage; it needs a more turnkey approach that includes optimized file systems, optimized hardware, and enterprise-class support.
Designing or finding a storage solution that can meet the challenges of commercial HPC is critical for these organizations to continue to advance their HPC efforts. The problem is that traditional enterprise solutions fall short and, as discussed, traditional HPC solutions are often not a good fit for the enterprise.
The first step in selecting a storage system for an HPC environment is to understand the HPC architecture itself. The first component of HPC is the compute cluster, which consists of dozens, if not hundreds, of servers acting as nodes. Typically, the cluster software assigns a processing request to a particular node or segments it across multiple nodes in the cluster.
The nodes with the assigned job then make requests of the HPC storage architecture, which is often also a scale-out cluster of servers acting as nodes managed as a single entity. Unlike HPC compute clusters though, most legacy enterprise and even HPC storage systems can’t target IO to a specific or most-available node. Instead, in most cases all IO flows through a single control node, which pulls the required data from the other nodes.
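To make that bottleneck concrete, here is a minimal, purely illustrative Python sketch of a gateway-style scale-out system in which every client read is funneled through one control node that pulls data from the other nodes. All class and method names are hypothetical and simply stand in for whatever internal protocol a given system uses.

```python
# Illustrative only: a scale-out storage cluster where all client IO passes
# through a single control node. Names are hypothetical, not a real product API.

class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}                      # block_id -> bytes stored here

    def read_block(self, block_id):
        return self.blocks[block_id]


class ControlNode:
    """Single entry point: clients never talk to the data nodes directly."""

    def __init__(self, block_map):
        self.block_map = block_map            # block_id -> owning DataNode

    def read_file(self, block_ids):
        # Every byte of every request is relayed through this one node, so its
        # CPU and network links cap the throughput of the whole cluster.
        return b"".join(self.block_map[b].read_block(b) for b in block_ids)
```

Adding data nodes to a design like this grows capacity, but aggregate throughput stays pinned to what the single control node can move.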
As a result, the storage infrastructure does not scale the way the compute cluster does. Its only real value in scaling is to keep up with capacity requirements. In most cases it can’t leverage the additional storage nodes to offload IO, and it doesn’t typically leverage all of the available compute resources in the storage infrastructure.
The inability of the storage infrastructure to take advantage of all the available compute and IO resources effectively is particularly problematic in the HPC use case. Ideally, an organization wants to centralize all HPC storage, across all use cases, into a single storage infrastructure.
The problem is that HPC IO patterns can vary greatly. While sequential access is more common than random access, file sizes, especially today, vary widely. Some workloads are similar to typical HPC workloads and deal with large-file IO. But new commercial HPC workloads are appearing that require the rapid sequential reading of a very large quantity (thousands or even millions) of small files. It is not uncommon to find a mixture of both IO patterns within commercial HPC.
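As a rough, hedged illustration of why those two patterns stress storage differently, the sketch below times one large sequential read against the same volume of data spread across many small files; the small-file case is typically dominated by metadata lookups and per-file open/close overhead rather than raw bandwidth. The paths are placeholders, not data from the source.

```python
# Rough illustration of the two commercial HPC IO patterns described above:
# one large sequential stream vs. the same data spread across many small files.
import os
import time


def read_large_file(path, chunk=8 * 1024 * 1024):
    """Stream one big file sequentially; cost is mostly raw bandwidth."""
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total


def read_small_files(directory):
    """Read many small files; cost is dominated by metadata and open/close."""
    total = 0
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name), "rb") as f:
            total += len(f.read())
    return total


if __name__ == "__main__":
    # Placeholder paths; point them at real test data before running.
    for label, fn, target in [("large-file", read_large_file, "big.dat"),
                              ("small-file", read_small_files, "small_files")]:
        start = time.perf_counter()
        size = fn(target)
        print(f"{label}: {size} bytes in {time.perf_counter() - start:.2f}s")
```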
The unique and mixed IO workloads common in HPC create a problem for traditional enterprise storage systems. These systems were, for the most part, designed for database IO workloads, with a finite number of files and relatively small capacities (hundreds of terabytes instead of dozens of petabytes). They also were designed for relatively small IO accesses rather than the sequential access of large files. Enterprise storage systems do offer the turnkey experience and high-quality support that organizations want but fall short in dealing with the unique IO complexities of HPC.
The alternative is for the enterprise to look at more traditional HPC architectures. As mentioned, these architectures have a science project feel to them, which deters many IT professionals.
A commercial HPC solution needs to take the right attributes from the enterprise data center and combine them with the right attributes of the traditional HPC environment to create a single system that can span the variety of HPC use cases. In HPC, both commercial and traditional, the file system is at the core of the storage infrastructure.
The HPC file system must leverage both the CPU and storage resources of the storage cluster as it scales through the addition of nodes. The HPC file system needs to process requests in parallel across the cluster, putting every node to work and ensuring that all resources are used to their full potential.
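A minimal sketch of that idea, assuming a hypothetical client library in which the client already knows a file's stripe layout: each stripe is fetched directly from the storage node that owns it, in parallel, so every node contributes CPU and bandwidth to the same request. The node objects and their read() method are assumptions, not a documented API.

```python
# Minimal sketch of a parallel read path: the client knows the stripe layout
# and pulls each stripe directly from its owning storage node, so all nodes
# work on the request at once. The per-node read() call is hypothetical.
from concurrent.futures import ThreadPoolExecutor


def fetch_stripe(node, file_id, stripe_index):
    # Placeholder for a direct client-to-node transfer (e.g. TCP or RDMA).
    return node.read(file_id, stripe_index)


def parallel_read(file_id, stripe_map):
    """stripe_map: list of (node, stripe_index) pairs in file order."""
    with ThreadPoolExecutor(max_workers=len(stripe_map)) as pool:
        futures = [pool.submit(fetch_stripe, node, file_id, index)
                   for node, index in stripe_map]
        # Reassemble the file in stripe order once every node has responded.
        return b"".join(future.result() for future in futures)
```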
Another responsibility of the file system is adequate protection of data from media failure. Most traditional HPC file systems use RAID for data protection, but the RAID is often hardware based and operates per node. The per-node RAID function becomes a bottleneck: it limits scalability, increases complexity, is less efficient across the cluster, and, when using high-capacity hard disk drives, is susceptible to long rebuild times.
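One way to picture the alternative is protection applied in software across the cluster rather than inside each node. The toy sketch below stripes data plus a single XOR parity shard across a set of nodes, so the contents of any one failed node can be rebuilt from the survivors; real parallel file systems use far more capable erasure codes, so treat this strictly as an illustration of the principle.

```python
# Toy illustration of cluster-wide protection: data and XOR parity are spread
# across nodes, so any single lost shard can be rebuilt from the rest. Real
# systems use stronger erasure codes; this only shows the principle.

def stripe_with_parity(data: bytes, num_data_nodes: int):
    """Split data into num_data_nodes shards plus one XOR parity shard."""
    shard_len = -(-len(data) // num_data_nodes)        # ceiling division
    shards = [data[i * shard_len:(i + 1) * shard_len].ljust(shard_len, b"\0")
              for i in range(num_data_nodes)]
    parity = bytearray(shard_len)
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return shards + [bytes(parity)]                    # one shard per node


def rebuild_missing(shards, missing_index):
    """Recover the shard on a failed node by XORing all surviving shards."""
    shard_len = len(next(s for s in shards if s is not None))
    recovered = bytearray(shard_len)
    for index, shard in enumerate(shards):
        if index == missing_index or shard is None:
            continue
        for i, byte in enumerate(shard):
            recovered[i] ^= byte
    return bytes(recovered)
```

Because the protection relationship spans the cluster, a rebuild can read from every surviving node in parallel instead of hammering one node's drives, which is what keeps rebuild times in check as drive capacities grow.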
Beyond the file system is the hardware. The majority of commercial customers prefer a turnkey solution. Today, given the advances in server and media technology, these turnkey solutions can leverage commercial off-the-shelf (COTS) hardware. That hardware, though, should still be delivered by the HPC storage vendor responsible for the software. A turnkey approach allows the commercial customer to see value in the investment sooner and saves them from having to vet all the components. Ensuring interoperability should be the responsibility of the HPC storage vendor.
Doing business with an established, well-vetted vendor is essential to the commercial HPC organization, certainly more so than for the traditional HPC account. These HPC architectures have the potential to store data for multiple decades, and it is critical that the vendor is built to last.
There is little doubt that commercial HPC is here to stay and on the rise. The problem is that existing storage systems, from both traditional enterprise vendors and traditional HPC vendors, fall well short. IT needs an HPC-class system that is also enterprise class. That system needs to deliver on the demands of the legacy HPC workload profile and also the newly emerging small-file IO profile.
Storage Switzerland is an analyst firm focused on the storage, virtualization and cloud marketplaces. Our goal is to educate IT professionals on the various technologies and techniques available to help their applications scale further, perform better and be better protected. The results of this research can be found in the articles, videos, webinars, product analyses and case studies on our website, storageswiss.com.
Panasas is a premier provider of high-performance storage solutions. Its ActiveStor® scale-out network-attached storage (NAS) supports industry and research innovation around the world, with the fastest plug-and-play parallel data storage system, optimized to accelerate workflows, simplify data management, and deploy easily as an appliance. Delivered as a fully integrated clustered appliance solution, ActiveStor incorporates flash and SATA storage nodes, a distributed performance-optimized file system, and client protocols. Using the Panasas PanFS® parallel file system and Panasas DirectFlow® parallel data access protocol, ActiveStor redefines what it means to scale. Performance, data protection, and manageability all grow as the solution scales. More information at www.panasas.com.