By Paul Speciale, CMO, Scality
IT professionals are looking for scalable, flexible, affordable storage platforms for their mushrooming volumes of unstructured data. Gartner forecasts that large enterprises will triple their unstructured data capacity stored as file or object storage on-premises, at the edge, or in the public cloud by 2026 as compared to 2021.
It’s important, then, to be clear about what object storage is and how it relates to unstructured data so you can make the smart storage decisions that will best serve your organization. To that end, here’s a basic primer on object storage, also known as object-based storage.
Object storage and unstructured data
To understand the need for object storage, we first need to understand the challenge of unstructured data, which is data that doesn’t come neatly organized — in a spreadsheet, for instance — or adhere to a traditional database format. Common examples include emails, images, videos, audio files and IoT sensor readings.
More and more data is now unstructured. In fact, analysts at IDC expect that unstructured data will account for 80-90% of enterprise data by 2025.
What is object storage?
Put simply, object storage is a type of data storage that manages data as objects, each consisting of the data itself plus the object’s descriptive attributes (metadata). This is unlike other storage architectures that manage data as a file hierarchy (file systems) or as blocks within sectors and tracks (block storage). I’ll get into more of the differences later on.
Object storage offers a simple way to store data. It organizes information into distinct containers of flexible sizes and uses keys to retrieve the specific data you’re looking for. A key is a number or an ID that uniquely identifies each object, much as your Social Security number identifies you. That key is used to reference the value, which is the actual data, be it a Word file, an image or something else.
The simplicity of object storage is a big deal compared to a file system, for instance. In that scenario, you need to know which directory and sub-directory your file is in. Think about Google Drive. You have to go from your drive to the marketing folder, to the shared folder, etc.
With object storage, you don’t need to know where the data you want is located; you just provide the key. Then the system fetches your object for you.
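The difference between the two lookups can be sketched in a few lines of Python. This is a toy illustration using in-memory dictionaries, not how any real storage product is implemented; the folder names, the key "7f3a9c" and both helper functions are hypothetical.

```python
# Toy illustration only: nested dicts stand in for a file hierarchy,
# and a single flat dict stands in for an object store.

# File-system style: the caller must know every folder on the path.
drive = {
    "marketing": {
        "shared": {
            "logo.png": b"<image bytes>",
        },
    },
}

def read_by_path(root, path):
    """Walk the hierarchy one folder at a time."""
    node = root
    for part in path.split("/"):
        node = node[part]  # raises KeyError if any folder name is wrong
    return node

# Object-storage style: one flat namespace, one lookup by key.
objects = {
    "7f3a9c": b"<image bytes>",  # hypothetical key
}

def read_by_key(store, key):
    """Fetch the object directly; the caller needs no location knowledge."""
    return store[key]

from_path = read_by_path(drive, "marketing/shared/logo.png")
from_key = read_by_key(objects, "7f3a9c")
```

Both calls return the same bytes, but only the first one requires the caller to know where the data lives.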
What are the three object storage elements?
There are really just three elements in object storage:
- The previously mentioned key (the unique identifier)
- The value (the data itself)
- The metadata, which describes properties of the object (such as when it was created, how big it is and who owns it) and specifies how the object should be handled when it’s accessed.
As for metadata, custom attributes (called “extended attributes”) can be added to object storage systems to handle extra file-related information. In a traditional storage system, a separate application and database would be needed to manage that metadata. Once again, object storage wins at simplicity.
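As a rough sketch of how the three elements fit together, the Python below models an object as a key, a value and a metadata dictionary that accepts custom attributes. The class names, the metadata field names and the use of a random UUID as the key are assumptions made for illustration; real object stores define their own key and metadata conventions.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class StoredObject:
    key: str                                       # the unique identifier
    value: bytes                                   # the data itself
    metadata: dict = field(default_factory=dict)   # descriptive properties

class ToyObjectStore:
    """Hypothetical in-memory store: one flat dict from key to object."""

    def __init__(self):
        self._objects = {}

    def put(self, value, owner, **custom_attributes):
        key = uuid.uuid4().hex  # assumption: the store generates the key
        metadata = {
            "created": datetime.now(timezone.utc).isoformat(),
            "size": len(value),
            "owner": owner,
        }
        # Custom attributes are stored alongside the built-in ones,
        # with no separate application or database involved.
        metadata.update(custom_attributes)
        self._objects[key] = StoredObject(key, value, metadata)
        return key

    def get(self, key):
        return self._objects[key]

store = ToyObjectStore()
key = store.put(b"hello", owner="alice", department="marketing")
obj = store.get(key)
```

Here `department` is a custom attribute that travels with the object, next to the size and owner recorded by the store itself.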
Object storage = simplicity and scalability
Simplicity facilitates seamless scaling. Keeping track of folders and of file locations within folders requires significant background bookkeeping in the processes that manage a file system. This overhead limits the scalability of file systems and is why they struggle to store billions of files or more. Since object storage has no such folder hierarchy, and instead manages objects in a simple flat model, it enables efficient storage, management and access to data at petabyte scale and beyond. Object storage does keep objects grouped into logical containers (often called buckets), but this is still much simpler, with lower overhead, than a file system. It also leads to much greater scalability.
Object storage is designed to be massively scalable, which makes it an ideal fit for today’s digital enterprises and their growing volumes of data.
What is object storage vs. block or file storage?
All of the above makes object storage fundamentally different from traditional block or file storage systems. Every object includes the data itself, as well as its related metadata, and has a globally unique key (instead of a file name and a file path). The keys are arranged in a flat address space, which eliminates the complexity and scalability challenges of a hierarchical file system based on complex file paths.
Block storage is just the raw disk capacity. For instance, back when people regularly bought disk drives from, say, a big box store, you’d get a disk that simply had storage capacity with no formatting. There’s no structure to it. It’s just a set of fixed-size “blocks” of binary data, usually 4 or 8 kilobytes each.
To make meaning out of block storage, you put a file system on top of it. A file system is just a way to organize files. Think of block storage as an empty parking lot; file storage paints the little parking spaces within that lot. To take the parking analogy a step further, object storage is the parking valet that fetches your car for you based on your ticket (the object’s key).
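The layering of a file system on top of raw blocks can also be sketched in Python. This is a deliberately simplified model under stated assumptions: the class names are hypothetical, the block size is fixed at 4 KB, blocks are handed out sequentially and never reclaimed, and there are no directories or permissions.

```python
BLOCK_SIZE = 4096  # 4 KB blocks, one of the common sizes

class ToyBlockDevice:
    """Raw capacity: numbered fixed-size blocks and nothing else."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every block stays fixed-size.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index):
        return self.blocks[index]

class ToyFileSystem:
    """Adds meaning on top of the blocks: file names mapped to block lists."""

    def __init__(self, device):
        self.device = device
        self.table = {}     # file name -> (list of block indexes, file length)
        self.next_free = 0  # naive allocator: next unused block index

    def write_file(self, name, data):
        indexes = []
        for offset in range(0, len(data), BLOCK_SIZE):
            chunk = data[offset:offset + BLOCK_SIZE]
            self.device.write_block(self.next_free, chunk)
            indexes.append(self.next_free)
            self.next_free += 1
        self.table[name] = (indexes, len(data))

    def read_file(self, name):
        indexes, length = self.table[name]
        raw = b"".join(self.device.read_block(i) for i in indexes)
        return raw[:length]  # drop the zero padding from the last block

device = ToyBlockDevice(num_blocks=8)
fs = ToyFileSystem(device)
payload = b"x" * 5000  # larger than one block, so it spans two
fs.write_file("report.txt", payload)
restored = fs.read_file("report.txt")
```

The block device knows nothing about files; all of the bookkeeping lives in the file system's table, which is the overhead that grows as the number of files grows.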
Scale out seamlessly with modern object storage
That’s object storage in a nutshell. Its simplicity and scalability make it ideal for organizations that have a high volume of unstructured data to store. And these days, that’s just about everyone.
How does object storage help overcome the growing security risks posed by unstructured data? Read more on that here.
If object storage sounds like it could be a good fit for your needs, you can learn about the Scality RING and ARTESCA here.