
The growth rate of corporate unstructured data, combined with the cost and complexity of current backup and archiving solutions, is driving many organizations to roll the dice and eliminate continuous protection of their data. As data sets grow, backup windows stretch from hours to days and sometimes even weeks, making continuous backups unrealistic. Storing data on tape makes it impossible to guarantee integrity without significant effort and leaves data inaccessible, increasing the risk of a lapse in operations in the event of a disaster. Caringo object storage software powered by CAStor addresses these issues by offering a ubiquitously accessible storage system with built-in replication that enables you to meet your recovery point objectives (RPOs) and recovery time objectives (RTOs).
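To make the backup-window point concrete, here is a rough back-of-the-envelope sketch. The data set sizes and the 200 MB/s sustained throughput are illustrative assumptions, not figures from Caringo or any customer, and the function name is hypothetical.

```python
def backup_window_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Estimate hours needed for one full backup pass.

    Uses decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes) and
    assumes the throughput is sustained for the whole run.
    """
    seconds = (data_tb * 1e12) / (throughput_mb_s * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    # Assumed sizes; the throughput is held constant while the data grows.
    for size_tb in (10, 100, 500):
        hours = backup_window_hours(size_tb, throughput_mb_s=200)
        print(f"{size_tb} TB: {hours:.1f} h ({hours / 24:.1f} days)")
```

Under these assumptions, a full pass takes roughly 14 hours at 10 TB, about 6 days at 100 TB, and about 4 weeks at 500 TB, which is the hours-to-days-to-weeks progression described above.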

Intelligent Automation & Responsive Access

The intelligent automation built into our platform covers storage management, system optimization, and continuous data protection. Granular control of each object is based on metadata that you specify, including how many replicas you want, where they are located, whether the object can be modified, and when it should be deleted. Ongoing system optimization, plug-and-play expansion, and instant provisioning make adding storage and locations easy, enabling you to maintain the responsiveness you need to meet your recovery and business continuity objectives.

  • Customizable metadata — Metadata is the DNA of digital content. It describes where an object came from, what's in it, who can access it, how it's protected, replicated, and distributed, and how long it is destined to live. CAStor manages object life cycles from creation to expiration, all without intervention.
  • Instant object identification — Universally Unique Identifiers (UUIDs) are assigned to each object for life. Because the UUIDs are location independent, applications do not need to know where a specific piece of content is, yet they can find it anywhere in a node, cluster or remote cluster – instantly.
  • Self-managing clusters — Symmetric cluster architecture evenly distributes processing across all nodes. Operation requests go to the nodes best able to handle them to optimize performance. CAStor organically balances storage and CPU loads.
  • Self-healing clusters — Policies set for each object are continuously checked and enforced. Should something fail, the system automatically recovers. If a disk goes down, all disks participate in the recovery effort.
  • Continuous integrity checks — CAStor's Health Processor continuously monitors data integrity and cardinality (number of replicas) and will automatically heal any degradation or non-conformity. Metadata life point rules are also checked and enforced.
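The ideas in the list above (per-object metadata, location-independent identifiers, and automatic repair of the replica count) can be illustrated with a small in-memory model. This is a conceptual sketch only: `Policy`, `ToyCluster`, and every method name are hypothetical, and nothing here reflects CAStor's actual API or internals.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Per-object metadata (hypothetical field name, not CAStor's)."""
    replicas: int = 2  # desired number of copies


@dataclass
class ToyCluster:
    """In-memory stand-in for a cluster: node name -> {object id: bytes}."""
    nodes: dict = field(default_factory=dict)
    policies: dict = field(default_factory=dict)

    def add_node(self, name):
        self.nodes[name] = {}

    def _least_loaded(self, exclude):
        # Prefer nodes holding the fewest objects, skipping excluded nodes.
        candidates = [n for n in self.nodes if n not in exclude]
        return sorted(candidates, key=lambda n: (len(self.nodes[n]), n))

    def holders(self, oid):
        return [n for n, objs in self.nodes.items() if oid in objs]

    def store(self, data, policy):
        oid = str(uuid.uuid4())  # identifier carries no location information
        self.policies[oid] = policy
        for node in self._least_loaded(exclude=[])[:policy.replicas]:
            self.nodes[node][oid] = data
        return oid

    def read(self, oid):
        # The caller supplies only the identifier, never a node name.
        for objs in self.nodes.values():
            if oid in objs:
                return objs[oid]
        raise KeyError(oid)

    def fail_node(self, name):
        del self.nodes[name]

    def heal(self):
        """Restore each object's desired replica count; return copies made."""
        created = 0
        for oid, policy in self.policies.items():
            current = self.holders(oid)
            if not current:
                continue  # no surviving copy to replicate from
            missing = max(0, policy.replicas - len(current))
            data = self.nodes[current[0]][oid]
            for node in self._least_loaded(exclude=current)[:missing]:
                self.nodes[node][oid] = data
                created += 1
        return created


if __name__ == "__main__":
    cluster = ToyCluster()
    for name in ("node-a", "node-b", "node-c"):
        cluster.add_node(name)
    oid = cluster.store(b"scan-image", Policy(replicas=2))
    cluster.fail_node(cluster.holders(oid)[0])  # lose one replica
    cluster.heal()                              # policy is re-enforced
    print(len(cluster.holders(oid)), cluster.read(oid))
```

In this toy model the application only ever holds the identifier returned by `store`. After a node failure, a single `heal` pass brings the object back to the replica count named in its policy, provided at least one copy and one spare node survive.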

Johns Hopkins University CIDR protects its genotyping and statistical genetics data.

The Johns Hopkins University Center for Inherited Disease Research (CIDR) is a centralized facility providing genotyping and statistical genetics services for investigators seeking to identify genes that contribute to human disease. The CIDR is a high-throughput genotyping lab for projects funded by the National Institutes of Health (NIH) and can generate thousands of gigabytes of new data daily while performing genetic analyses.

The Challenge

  • Huge storage capacity needs on a shoestring budget
  • NAS systems could not deliver the throughput needed
  • Needed to protect legacy hardware investment while progressing to object storage
  • Required integration with the industry-standard CIFS file system

The Solution

  • Mix and match old and new hardware in the cluster for massive CapEx savings
  • Start small and scale seamlessly: the configuration has grown 70-fold since initial deployment
  • Multi-protocol support of CIFS and HTTP to integrate scan images and workstations
  • Expand capacity and server nodes at lower TCO each and every year, guaranteed
  • Storage expansion with no provisioning, ever, in contrast to SAN and NAS complexity

The Results

  • Investment protection and future-proofing with standard storage servers
  • Self-managing and self-healing cluster hits the bottom line with compelling OpEx savings
  • Exceed expectations for the most stringent throughput and retention SLAs
  • Dynamic configuration management equates to continuous data availability and no downtime

The Configuration

  • A CAStor storage solution for genome research applications that uses multiple protocols to ingest terabytes per day of real-time scanned images into the object storage repository, where they are accessed by a network of remote research workstations.
CIDR uses Caringo CAStor object storage