Caringo: Fixed Content Storage

CAStor Content Storage Software
CAStor in Publishing  
When Johannes Gutenberg invented his printing press in the 1400s, he couldn't have envisioned the impact it would have on communication throughout the world. From the rapid exchange of thoughts and ideas to mass production of books and periodicals, publishing has become one of the most important advancements in mankind's history.

While the basic tenets of printing and publishing have remained the same through the years, one thing that has changed is how text actually makes it to the page. Technology has dramatically changed the business to the point where having the most cost-effective technology can be a competitive advantage. Storage was never an issue at the publishing house - unless you call storing racks of leaden letters to format typefaces storage. Today, storage in printing houses and newsrooms is digital in nature and requires massive amounts of computing equipment to manage workflows and to safeguard the words and pictures produced.

The history of data storage is much shorter than that of publishing itself, and changes in the industry occur at a much more frenetic pace. While few publishing houses format and print manuscripts as Gutenberg originally did, many of them still store their information using decades-old file systems. These complex systems, with their various tiers of on-line, near-line and off-line storage, force companies to choose between the economics of long-term archiving and ready access to their content.

Digital content in publishing includes any number of files in their final format, including articles, photos, illustrations, charts, graphs, maps, typefaces, drawings, templates, advertisements, PDFs and other corporate documents. This covers nearly everything that finds its way to the printed pages of books, magazines and newspapers, and even to online media. It does not cover database files, such as subscription information, that are subject to frequent changes and updating. Analysts predict that more than half of all stored information will be content of this kind, which can be kept in an infrastructure at a fraction of the cost of other storage systems.

An Example of How Content Addressed Storage (CAS) Works
The Library of Congress is one of the largest libraries in the world, spread over three buildings in Washington, D.C. It holds about 130 million items, including 29 million books. To help identify the items in its collection, the Library of Congress catalogues each book with a serially based numbering system called the Library of Congress Control Number (LCCN). This number becomes the unique identifier for the book and ensures that one title is not confused with another.

Think of the CAS storage infrastructure like the Library of Congress for data objects. Your data object is submitted for storage and is assigned a control number that serves as a universally unique identifier for that object. Just as you don't need to know that a book is being stored on the third shelf in the tenth row of the second floor of the Thomas Jefferson Building or that it has been moved to a special collections room for the summer, neither do you need to know where your data object is physically being stored. Simply ask for the object using its control number and it will be delivered to you when needed.
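
The store-and-retrieve cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not CAStor's actual interface: the ContentStore class, its put and get methods, and the choice of a SHA-256 digest as the control number are all assumptions made for the example.

```python
import hashlib


class ContentStore:
    """Toy in-memory content store (hypothetical; not CAStor's API)."""

    def __init__(self):
        # Flat mapping from identifier to bytes: no directories, no paths.
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Assumption: the control number is the SHA-256 digest of the content.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        # The caller supplies only the identifier, never a physical location.
        return self._objects[object_id]


store = ContentStore()
article_id = store.put(b"Front page, final edition")
assert store.get(article_id) == b"Front page, final edition"
```

The application keeps only article_id. Where the bytes actually live is the store's concern, just as the shelf location of a book is the library's.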

The Benefits of CAS in Publishing Environments
Like many organizations, publishers should concentrate their time, effort and money on core business activities rather than on discovering the best ways to store their mission-critical data. Creating and distributing novels, articles, photo essays and other creative works should be the primary focus of publishing and media companies. Even those with fully dedicated IT departments can better focus their energies on other projects that may improve overall productivity. CAS technology makes this possible through a number of groundbreaking improvements.

    Scalability — CAS solves one of a file system's biggest problems by seamlessly growing a flat address space that never needs reconfiguration when disk drives are added or removed. Hardware can expand the storage space with no effect on applications or users. Applications are free to access information anywhere, anytime. All they need is the universally unique ID.

    Performance — Lots of small files can bring a file system to its knees. Not so in a properly designed CAS system. The CAS concept lends itself to a fully distributed implementation without bottleneck concentrators, mount points, file hierarchies and other collateral complexity that limits scalability and kills economics.

    Hardware Independence — Well-designed CAS systems are able to run on inexpensive commodity hardware architectures linked by standard Gigabit Ethernet to reduce overall costs significantly. Publishers can leverage the fastest and most cost-effective offerings available for maximum flexibility. There's no need to ever be locked into expensive proprietary systems again.

    Data Integrity — Much as the Library of Congress assigns an LCCN to each book, CAS creates a unique fingerprint, or identifier, for a data object by running a hashing algorithm over it. This offers a natural way to verify that the content has not changed: re-running the same hashing algorithm yields the same fingerprint if the content is untouched and, in practice, a different one if it has been altered.

    Simplicity — A proper CAS system stores, retrieves and protects all reference information (aka content) on a fast-access, scalable single tier of storage that is less expensive than tape and less complex than traditional file systems. Using this real-time reference information, data is available to users when they need it.
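
The integrity check described under Data Integrity can be sketched as follows. The example assumes SHA-256 as the hashing algorithm, which the text above does not specify, and the function names fingerprint and is_intact are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    # Assumption: SHA-256 stands in for whatever hash a real CAS system uses.
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected: str) -> bool:
    # Re-run the hash and compare it with the fingerprint recorded at storage time.
    return fingerprint(data) == expected


original = b"Chapter 1: It was a dark and stormy night."
stored_fingerprint = fingerprint(original)

# Simulate a silent change to the stored bytes.
tampered = original.replace(b"dark", b"calm")

print(is_intact(original, stored_fingerprint))  # True
print(is_intact(tampered, stored_fingerprint))  # False
```

Because the fingerprint is derived from the content itself, no separate audit log is needed to detect a change; recomputing the hash is enough.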

Caringo's CAStor Provides State-of-the-art Technology
Although CAS is a surprisingly simple and straightforward concept, not all CAS architectures are created equal. CAStor was designed to overcome some of the shortcomings found in other vendors' CAS offerings. Caringo's CAStor software is so simple and compact that it fits on a USB memory key. Because it configures itself, users can plug that key into the commodity hardware of their choice, power it up, and have a working CAStor node 60 seconds later.