CAStor™ Content Storage
Access, store and distribute content from a single storage infrastructure.
- Hardware agnostic — Never get locked into expensive or dead end hardware
- Massively scalable — Add nodes to your cluster when and where you need them
- High performance — For primary storage duty and to support extreme scalability
- Guaranteed data integrity — For true long term viability, provable in court
- Self configuring — Easy to add nodes and tune for application requirements
- Self managing and healing — Reduce management time and expense
- Easy to interface to — No APIs, just a simple HTTP-based interface
How does it work?
Start with a few nodes and grow to petabytes
Plug the CAStor USB memory key into a server node, turn it on, and 60 seconds later the CAStor server is up and running. Repeat as often as necessary to build a content storage environment of the size you require. CAStor scales from a few server nodes and 1.5TB up to hundreds of nodes and over 1PB. CAStor also enables a cluster in a box by taking advantage of multi-core processors, with a virtual storage node running on each core.
No provisioning, configuring or impact to your applications
CAStor automatically configures and manages your cluster. When you need to upgrade hardware, simply power up a more powerful node and retire an older node while the system is up and running! There is no need to stop the system, migrate your data or do any conversions. This operational ease and scalability is in itself enough to make many users choose CAStor.
All nodes work together as a cohesive unit
All nodes in the cluster function as a cohesive unit delivering scale on two critical dimensions, performance and capacity.
Stored files are automatically replicated
All files are replicated by default in CAStor. If one disk goes down, there is always another copy. The system will automatically detect and re-replicate your file in the background. In fact, the larger the cluster, the faster the recovery will be. There is no longer any need to waste time and money on RAID configurations. CAStor clusters can recover significantly faster than RAID.
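To see why a larger cluster can recover faster, here is a minimal back-of-the-envelope sketch. It assumes that every surviving disk shares the re-replication work equally and in parallel. The function name `recovery_hours` and the throughput figures are illustrative assumptions, not measured CAStor numbers.

```python
def recovery_hours(failed_disk_tb: float, surviving_disks: int, per_disk_mb_s: float) -> float:
    """Estimate the hours needed to re-replicate the contents of one failed disk.

    Assumption (not from CAStor documentation): each surviving disk copies an
    equal share of the data in parallel at the same sustained rate.
    """
    if surviving_disks < 1:
        raise ValueError("need at least one surviving disk")
    total_mb = failed_disk_tb * 1_000_000  # decimal terabytes to megabytes
    seconds = total_mb / (surviving_disks * per_disk_mb_s)
    return seconds / 3600


# Illustrative figures only: a 1TB disk and 50 MB/s of spare throughput per disk.
for disks in (4, 16, 64):
    print(f"{disks} surviving disks: {recovery_hours(1.0, disks, 50.0):.2f} hours")
```

Under this simple model the estimate falls in proportion to the number of disks taking part, which is the intuition behind the claim above. Real recovery times also depend on network capacity and on the other work the cluster is doing.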
| Hardware agnostic |
Runs on standard hardware. Your choice. Storage clusters can be built on blades, towers, PCs and servers. Typically, any x86 architecture computer with a gigabyte or more of RAM, one or more hard drives and Gigabit Ethernet will do. Tuning the ratio of RAM to disk optimizes a node for the average file size in your applications.
Mix and match hardware. As new hardware becomes available, just plug it into your CAStor cluster. Over time, you can remove the older technology if you wish or just let it continue to work for you. CAStor organically optimizes utilization of nodes in the cluster so it will make the best use of all hardware available. Running a small cluster (let's say 10 Terabytes) in a department is as viable as running a larger cluster (let's say 100 Terabytes) in a data center. CAStor does not dictate the hardware you use. You do.
No specialized servers. Leverage consumer-level pricing and performance for enterprise applications. No specialized hardware means less hardware to purchase and manage in your data center. CAStor makes it happen. You'll never be locked into one vendor's hardware again.
| Massively scalable |
Just-in-time procurement. A system to store content does you no good if it doesn't scale as your business grows. There are vendors in the market today who actually require you to move to an entirely different machine once you run out of space. CAStor simply scales. Start small and pay for a small system. Grow as your business or application needs grow and pay as you go.
CAStor was architected from the ground up for scalability. Scale to as many nodes as you need seamlessly without unloading and reloading data. Your information integrity can be provably guaranteed, over the short, medium or even very long term.
Scalability in Caringo's world means you can mix and match hardware. You can transparently migrate huge data stores to new technology during normal operation, without ever taking your system down.
| High performance |
Scalability means nothing unless you have the performance to support it. Robustness in a storage system depends heavily on its capability to swiftly re-replicate data copies orphaned by a disk failure. In other systems, lackluster performance can keep that from happening before a second disk failure occurs, and then data loss is inevitable.
CAStor was designed for high performance from the ground up. Our world-class development team looked beyond the usual to maintain performance at near hardware speeds. CAStor's aggregate throughput increases with the number of nodes. Unlike RAID-based systems, CAStor actually recovers faster as cluster size grows.
High reliability with high performance. CAStor delivers the performance necessary for cloud storage and for file workloads up to HPC environments, as well as all the features and functionality you expect in archive storage for information governance and compliance. Caringo knows and respects that your time is money, and that's not just the time spent reading and writing to the cluster. Your users expect high reliability coupled with high-performance access. In traditional storage, a lot of time is spent (or wasted!) installing software, interfacing applications, adding and installing nodes, migrating data, and managing and maintaining the system. Caringo pays attention to the holistic experience of developing and deploying a massive content storage system. Performance and respect for the user's time are always on our mind.
| Guaranteed data integrity |
Early hashing algorithms. Paul Carpentier, Caringo CTO, invented and patented the concepts that define CAS (Content Addressed Storage). These patents, together with FilePool, the company he co-founded, were purchased by EMC in 2001. Part of the initial CAS concept involved running a hashing algorithm over an object and using the resulting hash as the unique fingerprint/identifier for that object. This technique offers a natural method to guarantee that the content was never changed, since re-running the same hashing algorithm should yield the same hash fingerprint if the content were untouched. Unfortunately that approach only remains valid as long as the hashing algorithms being used are beyond suspicion, i.e. no one can "reverse-engineer" different content having the same fingerprint.
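The general content-addressing idea described above can be sketched in a few lines. This is a minimal illustration and not Caringo's or FilePool's implementation. The `ContentStore` class, its in-memory dictionary and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: an object's hash is its identifier."""

    def __init__(self, algorithm: str = "sha256"):
        self.algorithm = algorithm  # any name accepted by hashlib.new()
        self._objects = {}          # fingerprint -> bytes (in memory only)

    def _fingerprint(self, data: bytes) -> str:
        return hashlib.new(self.algorithm, data).hexdigest()

    def put(self, data: bytes) -> str:
        """Store the content and return its fingerprint as the address."""
        key = self._fingerprint(data)
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Return the content, re-running the hash to confirm it is unchanged."""
        data = self._objects[key]
        if self._fingerprint(data) != key:
            raise ValueError("content does not match its fingerprint")
        return data


store = ContentStore()
address = store.put(b"quarterly report, final version")
assert store.get(address) == b"quarterly report, final version"
```

The check in `get` is only as trustworthy as the hash function. If an attacker can construct different content with the same fingerprint, the comparison passes even though the content has changed.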
The need for an upgradable hashing technique. In 2005, the MD5 hashing technique employed by many CAS vendors was successfully attacked by Chinese academic cryptographers, thereby starting the decay of integrity guarantees on petabytes of information secured using these algorithms. It is a fact of life and mathematics that hashing algorithms naturally go stale as computing power increases and cryptographers engage that compute power to successfully attack yet another hash. It became clear that the CAS market was in dire need of a method to transparently upgrade the hashing algorithms without needing to unload and reload the data or modify applications.
Caringo's Content Integrity Seal can last a lifetime. Caringo has invented a breakthrough process (patent pending) for transparent and uncompromising field upgrading of the hashing algorithm used in CAStor, effectively scaling the strength against readily available computer power. The result is that your hash never will go stale. Caringo's Content Integrity Seal provides the first industry example of guaranteed data integrity that can literally last the lifetime of your information.
True long term viability, provable in court. Caringo provides varying levels of information integrity based on your application, regulatory and compliance needs. For compliance and regulatory applications, our patent pending integrity seals are externalized and logged so that their validity can be confirmed in court by independent experts, without ever referencing any Caringo software in the process.
| Self configuring |
Easy to implement. Plug a Caringo USB key into a computer, and the key will load all operating system and application software needed to turn the system into a CAStor node. Add another machine and it automatically configures itself, joins the network and takes part in the activities of the other CAStor nodes in the cluster. This CAStor self-configuration is fast and automatic. After all, in a highly scalable system like CAStor, it is a requirement, not an option.
| Self managing and self healing |
Simple to administer. A CAStor cluster manages itself. That is, processes running in real time organically balance the storage and CPU loads and check object replicas to be sure they match the policy set for that object. Should a stray cosmic ray flip a bit somewhere, the system automatically recovers. If a disk goes down, all other disks on all other nodes participate to recover any data that was on the bad disk. The larger the cluster, the faster the recovery time.
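The replica check described above can be pictured with a small sketch. It is only an illustration of comparing live replicas against a per-object policy. The function `under_replicated`, the data layout and the disk names are hypothetical and say nothing about how CAStor tracks replicas internally.

```python
def under_replicated(
    placements: dict[str, set[str]],
    policy: dict[str, int],
    failed_disks: set[str],
) -> dict[str, int]:
    """Return {object_id: missing_replica_count} once failed disks are excluded.

    placements maps each object to the disks holding a replica of it;
    policy maps each object to the number of replicas it should have.
    """
    missing = {}
    for object_id, disks in placements.items():
        live_replicas = disks - failed_disks
        shortfall = policy[object_id] - len(live_replicas)
        if shortfall > 0:
            missing[object_id] = shortfall
    return missing


# Hypothetical cluster state: three objects spread over four disks.
placements = {
    "invoice": {"disk1", "disk2"},
    "scan": {"disk2", "disk3"},
    "video": {"disk1", "disk3", "disk4"},
}
policy = {"invoice": 2, "scan": 2, "video": 3}

# With disk2 down, "invoice" and "scan" each need one new replica.
print(under_replicated(placements, policy, failed_disks={"disk2"}))
```

In this picture the surviving disks can each create a share of the missing replicas, so the repair work is spread across the cluster.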
Content protection. Your content is automatically replicated and distributed for protection and preservation as standard procedure. The health of your storage is continuously monitored and any issues are automatically addressed. Recovery works down to the disk level, so a node can keep operating even when it has a bad disk. In fact, if your hardware allows hot swapping of disks, you can swap in a new disk while the system is up and running. Otherwise, wait until the system has systematically transferred all node contents to spare capacity, then retire, refurbish and recycle the node. While you run your business, CAStor will run the cluster.
Lower total cost of ownership. Reduce your management costs with self-managing software. Lower your hardware requirements by using standard x86 servers running CAStor. Purchase fewer expensive, specialized servers for your data center.
| Easy to interface to |
APIs tend to be complex and must be ported to each new environment. This means many test beds for a vendor and many ports. For a customer, it means waiting for a port to your environment, complex and convoluted methods of interfacing to programs, training and maintaining platform specialists, and long development cycles.
CAStor takes a different route with no APIs. Applications interface to CAStor using industry-standard HTTP/1.1. No APIs. No porting. Quick development and the ability to utilize existing HTTP client libraries in use throughout the industry.
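As a rough illustration of what working through a stock HTTP client library looks like, the sketch below stores and retrieves content using only Python's standard library. The in-process `StandInNode` server, its URL layout, its response codes and the `round_trip` helper are hypothetical stand-ins that make the example runnable. They do not describe CAStor's actual request format.

```python
import http.client
import threading
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = {}  # in-memory stand-in for a storage node's contents


class StandInNode(BaseHTTPRequestHandler):
    """Hypothetical stand-in server; not CAStor's real URL scheme."""

    protocol_version = "HTTP/1.1"

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        key = uuid.uuid4().hex  # made-up identifier for the stored content
        STORE[key] = self.rfile.read(length)
        self.send_response(201)
        self.send_header("Location", "/" + key)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = STORE.get(self.path.lstrip("/"))
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def round_trip(payload: bytes) -> bytes:
    """Store a payload with POST, then read it back with GET."""
    server = HTTPServer(("127.0.0.1", 0), StandInNode)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    try:
        conn.request("POST", "/", body=payload)
        response = conn.getresponse()
        response.read()  # drain the empty body before reusing the connection
        location = response.getheader("Location")
        conn.request("GET", location)
        return conn.getresponse().read()
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(round_trip(b"hello, content storage"))
```

The client side uses only `http.client`, which ships with Python. The same two requests could be issued from any language with an HTTP/1.1 client library, with nothing vendor-specific to port.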

