For the past few years, the value of object storage in the HPC space has been to provide economical, metered storage with multi-protocol support and interfaces for both traditional and new RESTful applications. At the heart of the value proposition is the ability to offload primary storage, which in most cases means SSD-based devices running the parallel file systems needed for modeling and rendering. That said, we are starting to see emerging use cases in which object storage, for read-intensive workloads, performs better than parallel file systems.
The primary requirements driving these new use cases are scalable multi-protocol support, storage for variable file types and sizes, and access for various types of clients. I emphasize “scalable” here because these use cases are exposing the limitations both of some object storage solutions built on underlying Linux file systems and of file-system-based solutions that provide an S3 interface.
As with any technology, implementations vary. In other vendors’ object storage solutions, we often see the underlying file system and the reliance on beefy cache devices (to meet performance requirements) as the limiting factors for scalable, multi-protocol support. In the parallel file system world, we often see the object interface layer, usually some form of S3 interface, as the bottleneck.
This leads us to why Caringo Swarm’s design (as detailed in this whitepaper) is well suited to read-intensive HPC workloads. Operational benefits aside (like booting from bare metal and using any type or size of hardware), there are three high-level technical benefits to leveraging Swarm’s parallel architecture for HPC workloads:
- Optimized S3 protocol support with fast parallel uploading that leverages Swarm’s parallel architecture
- Pure object storage with no front-end caching; all nodes can handle all tasks
- True read/write/edit multi-protocol interoperability between NFS and S3
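To make the first benefit more concrete, here is a minimal sketch of the general idea behind parallel uploading over an S3-style interface: split an object into numbered parts and send them concurrently. This is not Swarm's implementation or any vendor's API. The names `plan_parts`, `parallel_upload`, and `fake_upload_part` are hypothetical, and `fake_upload_part` is an in-memory stand-in for a real part-upload request.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def plan_parts(total_size: int, part_size: int) -> List[Tuple[int, int, int]]:
    """Split an object into (part_number, offset, length) tuples.

    Part numbers start at 1; the last part may be shorter than part_size.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def parallel_upload(
    data: bytes,
    part_size: int,
    upload_part: Callable[[int, bytes], str],
    workers: int = 4,
) -> List[Tuple[int, str]]:
    """Upload every part concurrently.

    Returns (part_number, tag) pairs sorted by part number, which is the
    information a client would need to finalize a multipart upload.
    """
    parts = plan_parts(len(data), part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            (number, pool.submit(upload_part, number, data[offset:offset + length]))
            for number, offset, length in parts
        ]
        results = [(number, future.result()) for number, future in futures]
    return sorted(results)


def fake_upload_part(part_number: int, chunk: bytes) -> str:
    # Hypothetical stand-in for a network call that uploads one part
    # and returns an identifier (tag) for it.
    return f"part-{part_number}-{len(chunk)}"
```

In a real client, `upload_part` would wrap the storage endpoint's part-upload request. The achievable speedup depends on how many nodes can accept parts at once, which is where an architecture with no front-end bottleneck helps.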
When you add these three benefits of Swarm’s architecture to the super-fast, low-latency networks and custom hardware configurations most HPC organizations have access to, you begin to solve the primary challenge for enabling collaboration: supporting variable file types and sizes while scaling data sets well beyond hundreds of petabytes.
If you are interested in learning more, we have a webinar coming up with our partner Boston Limited, a systems integrator focused on the HPC market. The webinar, titled Object Storage for High Performance Computing (HPC), will feature Konstantinos Mouzakitis, Boston Limited Senior HPC Systems Engineer, and Alex Oldfield, Caringo Solutions Architect. This is an excellent chance for you to have access to highly experienced technical resources and to ask questions. Register now to watch live or on demand.