Panasas and IDC Scale-Out NAS
Panasas is at the forefront of scale-out NAS. This question-and-answer video explores the Panasas approach to data storage, its users, and how Panasas is preparing for the future.
Panasas has been in the marketplace for ten years now. What are some of the most important things you’ve seen happen in the HPC storage market? Panasas, of course, has been a part of all that and a driver.
For me, I think Panasas took a radical approach to storage from the start, from the first ideas behind the product through ten years in the field satisfying customers. We developed an architecture that was different and radical. It was based on an object model rather than on blocks: the objects were blobs of data that we partitioned across separate storage devices. We didn't protect them locally; we protected them as a distributed system with RAID from the beginning. We never used traditional hardware RAID. We were software RAID from the beginning, and in fact we were software RAID operating per file. A separate RAID equation and a separate reconstruction per file gave us parallel reconstruction early on, so we could speed up reconstruction and recovery from problems as the systems got bigger. That whole object RAID architecture we envisioned from the very beginning has really worked, and the latest versions, where we add RAID 6+ for very high availability even in the presence of what would otherwise be an unacceptable number of failures, have continued to prove the value of decomposing storage into a series of independent RAID equations per file and providing service to every file we can. I'm very pleased with that overall architecture; I think it has really succeeded. I also think the integration into an appliance model has come along nicely: we have tried very hard to let our customers specify what their goals are and allow us to manage the data, manage the layout, load-balance capacity, and balance the infrastructure across thousands of computers, all working together on the overall solution for the customer.
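The per-file "RAID equation" idea above can be illustrated with a toy sketch. This is not Panasas code; it is a minimal, assumed model in which each file gets its own XOR-parity stripes, so after a device failure every file can be rebuilt independently and therefore in parallel (function names such as `stripe` and `reconstruct` are hypothetical):

```python
# Toy sketch of per-file software RAID with XOR parity (illustration only).
# Each file is striped independently, so each file has its own "RAID
# equation" and files can be reconstructed in parallel after a failure.
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

CHUNK = 4  # bytes per chunk; tiny, just for illustration

def xor(blocks):
    """XOR equal-length byte blocks column by column (the parity equation)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def stripe(data, n_data):
    """Split one file into stripes of n_data chunks plus one parity chunk."""
    stripes = []
    for off in range(0, len(data), n_data * CHUNK):
        chunks = [data[off + i * CHUNK: off + (i + 1) * CHUNK].ljust(CHUNK, b"\0")
                  for i in range(n_data)]
        stripes.append(chunks + [xor(chunks)])
    return stripes

def reconstruct(stripes, lost):
    """Rebuild the chunk on a failed device from the survivors, per stripe."""
    return [xor([c for i, c in enumerate(s) if i != lost]) for s in stripes]

# Because every file is an independent equation, many files can be
# rebuilt concurrently rather than rebuilding one whole device serially:
files = {"a": b"hello world!", "b": b"parallel rebuild"}
striped = {name: stripe(d, 3) for name, d in files.items()}
with ThreadPoolExecutor() as ex:
    rebuilt = dict(zip(striped, ex.map(lambda s: reconstruct(s, 0), striped.values())))
```

The point of the sketch is the structure, not the math: losing one device costs a per-file rebuild that can fan out across the cluster, which is why recovery can speed up rather than slow down as systems grow.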
How closely do you work with your users to understand their requirements as they evolve?
Talking to your users and trying to understand what really matters to them is very important. Its value cannot be overstated. Finding the right match between technology and application needs is a big deal. Whether it's the emphasis on high availability and reliability in life sciences, the extremely high random access demands in media, or supercomputing's need for speed, these are the things that shape the way you design a system, absolutely.
In the ten years that Panasas has been out there, how have the main requirements changed for storage?
I think the key one, the biggest one people will recognize from the last decade, is the transition from storage performance depending on the most expensive, fastest disks to storage performance depending on the best integration of flash technology and disk technology. I think we started that trend about five years ago in our products. The market as a whole has had a lumpy set of choices, where you could, sort of, jump completely to another approach, and now we're starting to get flash better integrated into automated tiering within the same offerings. But we're nowhere near done; the radical effects flash is having on storage are just starting.
Flash, when it started, was deployed mainly in selective ways because of the expense: metadata management and so forth. Do you see its role broadening?
Well, yes. You pin some files to the technology that you know to be the right one. We can get smarter about detecting that automatically, as in tiering as opposed to straight caching, but the biggest advantage of flash is the really, really low seek time, if you will, for reads. That is dramatic, so metadata, and all things small, absolutely have to make their way into flash unless they really aren't going to be used; and then they should be packaged up into big bundles and pushed out. We are seeing a much richer strategy for using flash. The other thing is that disk has this wonderful advantage: you can write it as many times as you want, and that's not so true for flash and the other solid-state technologies. So when we optimize flash, we evolve in a different direction, where we'll pay extra reads to avoid extra writes, whereas with disk that was never a necessity.
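The "pay extra reads to avoid extra writes" asymmetry can be sketched as a promotion policy. This is a hypothetical illustration, not a vendor algorithm: a block is copied into flash only after several disk reads prove it is hot, trading a few slow reads for one fewer endurance-consuming flash write (the class name, threshold, and counters are all assumptions):

```python
# Hypothetical sketch of a flash-aware promotion policy (illustration only).
# Disk tolerates unlimited rewrites, so a disk cache can fill eagerly; flash
# write endurance is finite, so we promote a block only after it has been
# read several times, deliberately paying extra slow reads to avoid a
# potentially wasted flash write.
class FlashAwareCache:
    def __init__(self, promote_after=3):
        self.flash = {}           # block_id -> data (fast reads, costly writes)
        self.read_counts = {}     # block_id -> disk reads observed so far
        self.promote_after = promote_after
        self.flash_writes = 0     # endurance-consuming writes performed

    def read(self, block_id, backing_store):
        if block_id in self.flash:
            return self.flash[block_id]            # fast path: flash hit
        data = backing_store[block_id]             # slow path: disk read
        self.read_counts[block_id] = self.read_counts.get(block_id, 0) + 1
        if self.read_counts[block_id] >= self.promote_after:
            self.flash[block_id] = data            # promote proven-hot data
            self.flash_writes += 1
        return data
```

A disk-backed cache would simply insert on first miss; the threshold here exists only because each flash write has a cost that a disk write does not.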
One of the things we hear very often, the kind of thing that keeps users up at night, is just the sheer number of files users are creating: keeping track of the file names, and of files as they get moved to different parts of the system, that whole thing, which I guess falls into the general category of metadata management. Are you hearing the same thing?
Absolutely. As a practical issue, you just have to make your metadata systems fast, because you're going to accommodate all the new tools, and the old tools aren't going to offer much improvement. The new tools are going to offer search, but search is not going to replace the need for fast metadata; it is an implementation of fast metadata. As for the scalability of your metadata system, we've partitioned our metadata over many components: our big installations are thousands of machines and hundreds of metadata managers. Being able to do that well is essential. When you get into analytics and machine learning, which you mentioned a little while ago, it creates opportunities to actually innovate in what you understand about your data, up to the limit that your customers will let you.
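Partitioning metadata over hundreds of managers can be sketched with a simple deterministic mapping. This is an assumed illustration, not Panasas's actual placement scheme: hashing each path to one of N managers spreads lookup load so no single metadata server becomes the bottleneck (the function name and use of a path as the partition key are assumptions):

```python
# Illustrative sketch (not Panasas's actual scheme): spread metadata load
# across N managers by hashing the path, so any client can compute which
# manager owns a given file's metadata without a central lookup.
import hashlib

def metadata_manager_for(path: str, n_managers: int) -> int:
    """Deterministically map a path to one of n_managers metadata managers."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_managers
```

Because the mapping is deterministic, every node in a thousand-machine installation agrees on the owner without coordination; real systems typically layer rebalancing on top so managers can be added without remapping everything.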
Another question: it seems that storage is becoming, in a lot of instances, more and more distributed. We're even seeing things like some storage happening in the cloud and some storage happening on premises, hybrid kinds of arrangements. How challenging is that for a storage company to deal with?
You know, we've always been in hierarchies. We've always had to push really cold data to some less expensive, less accessible tier. I think it's not so much a technological challenge; it's much more a question of who's responsible for what when you start to do that. Products are being experimented with to decide what the customer will be comfortable with regarding the integrity of their overall solution. We're still a strong believer that we want to be the front end... we want to directly make promises to our customers, and we want to follow through and take care of their data.
One of the other things that has been cropping up more and more, and this is not exactly the business of a storage company, is the need for policy regarding what gets stored and what doesn't. Tom Lange of Procter & Gamble famously said that nobody in a business is going to store data that the business can't make money from. Do you see much happening there in the way of policy formation about what gets stored and what doesn't?
The storage management initiative has been very focused on this question of establishing a forum for the creation and regularization of policies on data, but I think the big trend is that companies are finding more and more ways to make value from their data. They can now afford to keep online any kind of information that characterizes their workload, their users, and their operations, and they are keeping that data: the Big Data trend. They're asking their teams to find new ways to make value out of it, and I think the price structure pretty much works for that now. So while I agree with Tom, I think the trend will be finding new ways to make value out of my data.