HPC Storage versus Legacy NAS: a Crazy Comparison
October 29, 2012 - 6:05pm
Browsing one of my favorite channel-focused magazines, CRN, I read with some amusement the comments of Jay Kidd, senior vice-president of product strategy and development at NetApp. Kidd was highlighting one of the major new features of ONTAP 8.1.1 – the nine-year-long integration of Spinnaker Systems’ clustering capabilities. “In the past we brought Ethernet storage…to the mainstream,” Kidd said. ”Now we are doing the same with clustering. We’re bringing it from the crazy world of high-performance computing to the world of commercial storage.”
As one of the HPC “crazies,” I can assure Mr. Kidd, there is nothing crazy about the world of high-performance computing, and as an industry we are indebted to the leadership that the HPC community has provided by funding the next-generation storage technologies required for enterprise big data. The truth is, distributed, scalable storage was developed to get around the technology limitations of NAS as pioneered by Kidd’s own company. Legacy scale-up NAS systems suffer from “islands of storage” residing on different physical nodes throughout the enterprise, making it hard to share data sets and limiting performance to the capabilities of individual nodes. To increase performance you have to purchase a bigger, beefier filer head.
By contrast, the HPC community has been instrumental in developing the computing, networking and storage architectures applicable to next generation big data workloads. It was the first to promote the use of commodity Intel architecture server platforms to emulate the functionality of expensive monolithic symmetric multi-processing (SMP) systems. As a result, Linux clusters are now the dominant computing architecture for big data applications.
In addition, file sharing is an essential feature required by high performance Linux clusters. Providing shared file access requires a central repository (a metadata server) that keeps track of each block of every file stored on disk. It also keeps intelligence about which compute node of the cluster is allowed to access that file. In contrast, legacy NAS architectures have a shared path for metadata and data, creating a major bottleneck for scaling both performance and capacity. A distributed file system with object storage manages the metadata services separate from the I/O path, providing very high performance and the ability to easily and massively scale.
Unfortunately for enterprise big data customers, NetApp has not delivered the performance or scale of HPC. Rather, the “islands of storage” have just gotten bigger while the underlying architectural issues remain.
Applications like financial modeling, computer aided design, genomic sequencing, semiconductor manufacturing, automotive and aeronautic design, oil exploration, satellite imaging, and Hadoop analytics are the value creators for companies. With its long tenure in HPC applications, Panasas has created a simple-to-use appliance that delivers the same ease of use that NetApp pioneered with legacy NAS, but with the performance and scale required for high performance computing and big data workloads. You might say our systems are insanely scalable, and if that makes us crazy, we’ll take it.