The world of storage has changed radically. The transition of fixed content from shared network drives and home folders is overshadowed by a massive increase in machine- and user-generated data (audio, photos and video). Expensive high-performance block or file storage can no longer economically hold this volume of data, nor is it advantageous to do so. This shift in datasets is bringing with it new requirements for accessibility and distribution—flipping the storage paradigm upside down. If the definition of Tier 1 is based on importance or value to the organization, then for many data-driven organizations, object storage will become the new Tier 1.
Databases or analytical environments (Spark, HDFS, etc.) are considered transient destinations for running analysis, and cloud compute is ideal for processing data sets. However, once results are obtained, data sets are often deleted because of ongoing storage costs. In addition, web applications now need to serve millions of distributed customers. The impact on the infrastructure? Expensive file or block storage, built for high-performance single-file or block writes, has become the temporary target for analyzing data, while object storage, designed to handle massive throughput, has become the more permanent environment that supports storage and management of ever-expanding and distributed data sets.
To remain competitive and manage storage costs, you need the only infrastructure that can absorb the massive influx of data: object or scale-out storage. Object storage is capable of handling many connections, writing in parallel, providing massive throughput, consolidating data from distributed sources over the web, and handling many protocol inputs (for seamless integration with analysis and cloud platforms). This provides the most cost-effective platform for data-driven organizations.
The management of data should be thought of in terms of “gathering,” “cataloging,” “analyzing,” “annotating” and “distributing.” A proper object storage platform with built-in metadata management leveraging NoSQL infrastructure can serve four of these five functions. With the many parallel options for inputting and outputting data, object storage can readily place data into temporary compute space for analysis. The output can be stored with rich metadata, and that metadata can be annotated as necessary to constantly improve the organization of your corporate data resource.
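To make the idea of metadata-driven management concrete, here is a minimal sketch in Python. It is a toy in-memory stand-in, not any vendor's API: the class `ToyObjectStore`, its methods, and the object keys and metadata fields are all hypothetical names chosen for illustration. It assumes the four functions served by the store are gathering, cataloging, annotating and distributing, with analysis left to external compute.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object is its raw bytes plus free-form metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store with searchable metadata."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # Gathering and cataloging: store the bytes together with their metadata.
        self._objects[key] = StoredObject(data, dict(metadata))

    def annotate(self, key, **metadata):
        # Annotating: add or update metadata without rewriting the data.
        self._objects[key].metadata.update(metadata)

    def search(self, **criteria):
        # Return the keys whose metadata matches every given field exactly.
        return sorted(
            key
            for key, obj in self._objects.items()
            if all(obj.metadata.get(name) == value for name, value in criteria.items())
        )

    def get(self, key):
        # Distributing: hand the bytes to a consumer, such as a temporary compute job.
        return self._objects[key].data


store = ToyObjectStore()
store.put("cam01/clip-001.mp4", b"video bytes", source="camera-01", kind="video")
store.put("cam02/clip-002.mp4", b"video bytes", source="camera-02", kind="video")

# After an external analysis job has run, record its outcome as metadata.
store.annotate("cam01/clip-001.mp4", reviewed="yes")

matches = store.search(kind="video", reviewed="yes")
```

In this sketch the analysis step never touches the store's internals: a job would call `get`, do its work elsewhere, and write its findings back with `annotate`, which is the division of labor the paragraph above describes.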
Object storage optimizes your compute infrastructure and costs by allowing the sharing of these resources or leveraging the temporary compute infrastructure that the cloud provides. So, if object storage provides an instantly accessible platform for your data that reduces your current storage TCO while enabling analysis and distribution in an elastic fashion that you can quickly scale up and scale down, then I argue that object storage should be the new Tier 1 for any organization that is serious about extracting value from continued access to its unstructured data sets.