The amount of data processed in HPC and AI environments is increasing aggressively.1
HPC and AI/ML data drive key processes for critical workloads, so preventing data loss and corruption while ensuring uptime has become just as crucial as peak performance.
As data volumes expand, HPC storage systems must grow to keep pace. Scaling requires additional hardware, which adds potential points of failure, reducing system reliability and lengthening rebuild and restoration times.
AI projects can quickly scale from terabytes to petabytes, with capacities of 300-400 PB becoming increasingly common.2
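To illustrate why more hardware means more exposure, here is a minimal sketch under simplifying assumptions: components fail independently, and each has the same chance of failing in a given period. The 2% figure and the function name `prob_any_failure` are hypothetical and chosen for illustration only; they do not come from the cited sources.

```python
def prob_any_failure(n_components: int, p_single: float) -> float:
    """Probability that at least one of n independent components fails.

    Assumes every component fails independently with probability p_single
    over the same period (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_single) ** n_components


if __name__ == "__main__":
    # Hypothetical per-component failure probability for the period.
    p = 0.02
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} components: {prob_any_failure(n, p):.4f}")
```

Under these assumptions, the chance of seeing at least one failure rises quickly as the component count grows, which is why rebuild speed matters more at scale.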
HPC users experience approximately 10 storage system failures per year (nearly once a month) with an average recovery time of 1.7 days.3
As rebuild times increase, so does the chance of widespread, secondary failures requiring significant rebuild and restoration times. Effective recovery typically requires multiple storage experts whose services are costly and may be in short supply.
In addition to the loss of valuable data, failures can have serious implications for project completion and business continuity.
Typical HPC downtime costs approach $127,000 per day, including the lost productivity of an expensive compute cluster sitting idle.4
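Taken together, the figures above suggest a rough annual exposure. The sketch below multiplies them out, assuming that every failure incurs the average recovery time as full downtime and that the daily cost applies to each of those days. The sources do not state this directly, so treat the result as a back-of-the-envelope estimate. The function name is hypothetical.

```python
# Figures quoted in the text above (Hyperion Research study).
FAILURES_PER_YEAR = 10        # approximate storage system failures per year
AVG_RECOVERY_DAYS = 1.7       # average recovery time per failure, in days
COST_PER_DAY_USD = 127_000    # typical HPC downtime cost per day


def annual_downtime_cost(failures: int, recovery_days: float, cost_per_day: float):
    """Return (downtime days per year, estimated cost per year).

    Assumes each failure causes full downtime for the average recovery
    time, which is a simplification rather than a claim from the source.
    """
    downtime_days = failures * recovery_days
    return downtime_days, downtime_days * cost_per_day


if __name__ == "__main__":
    days, cost = annual_downtime_cost(
        FAILURES_PER_YEAR, AVG_RECOVERY_DAYS, COST_PER_DAY_USD
    )
    print(f"{days:.1f} days of downtime, about ${cost:,.0f} per year")
```

With these inputs the estimate comes to roughly 17 days of downtime and about $2.16 million per year.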
At Panasas, reliability is in our DNA. We develop HPC and AI/ML solutions that deliver the reliable storage and data mobility you need to support critical operational priorities. With our parallel file system, PanFS®, data reliability increases with scale, and your HPC system is protected in five ways:
1. Built-in reliability; no extras required
2. Eliminated tiering with Dynamic Data Acceleration
3. Patented per-file object erasure coding
4. Uniquely stable architecture
5. Data visibility and mobility
The system architecture in PanFS has been shown to reduce rebuild complexity, time, and expense.5
1 https://hyperionresearch.com/wp-content/uploads/2021/11/Hyperion-Research-SC21-Market-Update-Briefing_AI-HPDA-Growth_Norton.pdf
2 https://www.techtarget.com/searchstorage/opinion/Why-the-future-of-AI-storage-may-have-to-exclude-flash
3 https://www.panasas.com/wp-content/uploads/2020/04/Hyperion_Importance-of-TCO-for-HPC-Storage-Buyers_Q1-20_FINAL_2020-04-22.pdf
4 Ibid.
5 “Characterizing and Modeling Reliability of Declustered RAID for HPC Storage Systems,” Z. Qiao, S. Laing, S. Fu, H.-B. Chen, B. Settlemyer, 2019