Valentine’s Day is a good day to reflect on the people and things that you love. And one of the things the Caringo Engineering team loves most is a good challenge. One problematic aspect of big data is handling individual objects that are very large. Case in point: one of our customers stores memory dumps from supercomputer runs that are multiple terabytes in size. Video streaming is another common application that can create files of arbitrarily large size.
Traditional file systems have problems with these applications because individual files are tied to specific disks, each of which has finite capacity. Swarm deals with large objects differently, internally breaking them up into pieces of manageable size and seamlessly assembling the pieces when they are needed. This means, for example, that a Swarm cluster can easily store objects that are much larger than any of the disks managed by the cluster. Swarm can even do this on streaming (HTTP chunked transfer encoded) writes without missing a beat. If you have lots of objects like this, it’s easy to just add more resources to a running cluster and the cluster will keep accepting more data, large or small.
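To make the idea concrete, here is a minimal Python sketch of segmenting a stream of unknown length across several small disks and reading it back in order. It is a toy model, not Swarm's internals: the `Disk` class, the `store_stream` and `read_object` functions, the segment size, and the "most free space" placement rule are all assumptions made for illustration.

```python
import io
from typing import BinaryIO, Dict, Iterator, List, Tuple


class Disk:
    """Hypothetical disk with a finite capacity in bytes."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        self.segments: Dict[int, bytes] = {}

    def has_room(self, size: int) -> bool:
        return self.used + size <= self.capacity

    def put(self, seg_id: int, data: bytes) -> None:
        self.segments[seg_id] = data
        self.used += len(data)


def store_stream(stream: BinaryIO, disks: List[Disk],
                 segment_size: int) -> List[Tuple[int, int]]:
    """Cut a stream into segments and spread them over the disks.

    The total length is never needed up front, which mirrors a chunked
    (streaming) write. Returns a manifest of (disk_index, segment_id).
    """
    manifest: List[Tuple[int, int]] = []
    seg_id = 0
    while True:
        data = stream.read(segment_size)
        if not data:
            break
        # Assumed placement rule: the disk with the most free space.
        candidates = [i for i, d in enumerate(disks) if d.has_room(len(data))]
        if not candidates:
            raise IOError("cluster is full; add more disks")
        target = max(candidates,
                     key=lambda i: disks[i].capacity - disks[i].used)
        disks[target].put(seg_id, data)
        manifest.append((target, seg_id))
        seg_id += 1
    return manifest


def read_object(manifest: List[Tuple[int, int]],
                disks: List[Disk]) -> Iterator[bytes]:
    """Yield the segments in their original order."""
    for disk_index, seg_id in manifest:
        yield disks[disk_index].segments[seg_id]


# A 100-byte object stored on four 40-byte disks.
payload = bytes(range(100))
disks = [Disk(capacity=40) for _ in range(4)]
manifest = store_stream(io.BytesIO(payload), disks, segment_size=16)
restored = b"".join(read_object(manifest, disks))
```

In this toy run the object is larger than any single disk, yet it reads back intact because the manifest records where each piece lives. Adding a fifth `Disk` to the list would raise the total capacity without touching the data already stored.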
By default, Swarm accepts objects of up to 4 terabytes, a configurable limit that is large enough for most applications. With some simple settings changes, however, Swarm can accept objects of 64 terabytes and beyond. Swarm protects these large objects just as well as smaller ones, and with Swarm’s elastic content protection, users can find the right balance of data footprint and protection.
Few people have the patience to write a single multi-terabyte object. Because of this, we developed a parallel upload capability, which automatically breaks up the object into parts and uploads those parts in parallel. Once the parts are in Swarm, a final Swarm operation stitches the parts together into the final object. So, a larger cluster not only helps to store more data, it also provides resources that allow more parallelism during ingest.
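The split, upload, and stitch flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `InMemoryStore` is a hypothetical stand-in for a storage cluster, and `upload_part`, `complete`, and `parallel_upload` are made-up names, not Swarm's actual API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class InMemoryStore:
    """Hypothetical stand-in for a storage cluster (not Swarm's API)."""

    def __init__(self) -> None:
        self._parts: Dict[int, bytes] = {}
        self._lock = threading.Lock()

    def upload_part(self, part_number: int, data: bytes) -> int:
        # Parts may arrive in any order, so each one carries its number.
        with self._lock:
            self._parts[part_number] = data
        return part_number

    def complete(self, part_numbers: List[int]) -> bytes:
        # Final step: stitch the parts together in numeric order.
        return b"".join(self._parts[n] for n in sorted(part_numbers))


def split_into_parts(data: bytes, part_size: int) -> List[bytes]:
    """Cut the object into fixed-size parts; the last may be shorter."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def parallel_upload(store: InMemoryStore, data: bytes,
                    part_size: int, workers: int = 4) -> bytes:
    """Upload all parts concurrently, then ask the store to assemble them."""
    parts = split_into_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(store.upload_part, number, part)
                   for number, part in enumerate(parts)]
        uploaded = [future.result() for future in futures]
    return store.complete(uploaded)


data = bytes(range(256)) * 10            # a 2,560-byte stand-in object
assembled = parallel_upload(InMemoryStore(), data, part_size=100)
```

The `workers` parameter is where cluster size would matter in a real deployment: the more nodes available to receive parts, the more uploads can usefully run at once.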
Thanks to this and other features that support extremely large objects, the flexibility and capability of Swarm really shine for easily storing, protecting, and providing access to data sets that are too cumbersome for traditional file systems.
Want to fall in love with Swarm object storage? Download our complimentary Swarm Developer Edition 10TB Cloud.