Object storage systems are enjoying increasing popularity, and not without reason. An S3-compatible API is a particularly valued feature: the Simple Storage Service, S3 for short, lives up to its name by making application development much easier. Object storage is ideal for storing large volumes of data thanks to its enormous scalability, attractive price per GB and high availability. Manufacturers and cloud providers often state availability as a percentage representing the system’s uptime. An availability of 99.9%, for example, permits a maximum downtime of roughly 8 hours and 46 minutes per year.
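The downtime figure follows directly from the availability percentage. As a minimal sketch, assuming a year of 365.25 days (the function name and that default are illustrative choices, not taken from any vendor's SLA definition):

```python
from datetime import timedelta


def max_downtime_per_year(availability_percent: float,
                          days_per_year: float = 365.25) -> timedelta:
    """Return the downtime per year that a given availability still allows."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return timedelta(days=days_per_year * unavailable_fraction)


# Compare a few commonly quoted availability levels.
for percent in (99.0, 99.9, 99.99):
    print(f"{percent}% availability -> {max_downtime_per_year(percent)} per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, but the figure says nothing about whether data can still be lost.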
These systems achieve such high availability through the use of redundant components, erasure coding and replication. But despite these impressive statistics, high availability should not be confused with data security. There are a number of dangers lurking just out of sight that even object storage cannot stop.
This raises two questions: What are the threats, and how secure is object storage really?
1. Hard drive failure
Depending on how far they have been expanded, object storage systems can contain hundreds or even thousands of hard drives. Hard drive failure is therefore a day-to-day occurrence in such systems. Erasure coding is used to defend against potential data loss, and it keeps restoration times significantly lower than with classical RAID systems. The number of hard drives that can fail at the same time depends on the configured erasure coding scheme, that is, on how many redundancy fragments are stored alongside the data fragments. Ultimately, the more simultaneous failures the scheme tolerates, the higher the storage overhead.
As an alternative to erasure coding, companies can use replication, storing one or more additional copies of an object on separate nodes.
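The trade-off between fault tolerance and overhead can be made concrete with a little arithmetic. The sketch below assumes an idealised scheme in which an object is split into k data fragments plus m redundancy fragments, each on a separate drive or node, and any m of them may be lost. Replication is modelled as the special case k = 1. The class and helper names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionScheme:
    name: str
    data_fragments: int        # k: fragments needed to rebuild an object
    redundancy_fragments: int  # m: extra fragments that may be lost

    @property
    def tolerated_failures(self) -> int:
        """Number of fragments (drives or nodes) that may fail simultaneously."""
        return self.redundancy_fragments

    @property
    def raw_per_usable(self) -> float:
        """Raw capacity consumed per unit of usable capacity."""
        total = self.data_fragments + self.redundancy_fragments
        return total / self.data_fragments


def erasure_coding(k: int, m: int) -> ProtectionScheme:
    return ProtectionScheme(f"EC {k}+{m}", k, m)


def replication(copies: int) -> ProtectionScheme:
    # n full copies behave like one data fragment plus n-1 redundancy fragments.
    return ProtectionScheme(f"{copies} copies", 1, copies - 1)


for scheme in (erasure_coding(4, 2), erasure_coding(8, 4), replication(3)):
    print(f"{scheme.name}: tolerates {scheme.tolerated_failures} failures, "
          f"{scheme.raw_per_usable:.2f}x raw capacity")
```

Under these assumptions, an EC 4+2 layout and three full copies both survive two simultaneous failures, but replication consumes twice as much raw capacity.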
2. Node failure
The causes of node failure are many and varied. For example, nodes can fail as a result of an interrupted network connection, a disruption in the power supply, or a hardware fault bringing the whole node down. Thanks to erasure coding and replication, normal operations can still continue in such cases. However, smaller installations with just a few nodes have a particularly low tolerance for such outages. If the failure is irreparable, and a hard drive in another node fails – or if a second entire node goes down – this can lead to data loss. The low minimum node requirements of object storage are therefore a luxury that businesses should enjoy responsibly.
3. Rack failure
Depending on the object storage system used, data centres can be designed so that the failure of one rack does not prevent other operations from continuing. The use of separate fire zones can reduce the risk of a total site-wide outage. Erasure coding and replication can again be useful here.
4. Site failure
Many businesses have two data centres which they operate on an active/active or active/passive basis, in order to keep their IT systems operating in case one site goes down entirely. With two sites, replication can ensure data is stored redundantly, but with three or more sites, erasure coding can bring its strengths to bear and keep storage overheads low.
5. Software errors
Even given maximum availability, software errors can still lead to data loss within object storage systems. Businesses can protect themselves by keeping copies of data on a separate, independent system, ideally using a different storage technology from the primary system. Tape technology can be a particularly cost-effective option.
The PoINT Archival Gateway is a solution offering every advantage of S3 object storage, but holding data on tape instead of hard drives or SSDs. This makes it an outstanding replication target or backup storage location for your primary object storage system. This also achieves the popular goal of maintaining an air gap between media, while still preserving the format of data. Objects are stored as objects and can be accessed via S3.
6. Accidental or malicious deletion
A single S3 operation can delete up to 1,000 objects at once, provided the user in question has been given the necessary permissions or obtained them surreptitiously. In such cases, neither erasure coding nor mirroring data across multiple locations will help. Versioning can act as a hurdle to this kind of action, provided that the object storage system in question offers this option. Not every S3-compatible system does; older releases of MinIO, for example, did not.
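To illustrate how little effort a mass deletion takes, the sketch below groups keys into batches of 1,000 and issues one request per batch. The `delete_objects` call mirrors the shape of the boto3 S3 client method of the same name. The `RecordingClient` class is a hypothetical stand-in so that the example runs without network access or credentials.

```python
from typing import Iterable, Iterator, List

MAX_KEYS_PER_REQUEST = 1000  # upper limit of one multi-object delete request


def batched(keys: Iterable[str], size: int = MAX_KEYS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` keys."""
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def delete_keys(client, bucket: str, keys: Iterable[str]) -> int:
    """Delete all keys and return the number of requests that were needed."""
    requests = 0
    for batch in batched(keys):
        client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch], "Quiet": True},
        )
        requests += 1
    return requests


class RecordingClient:
    """Hypothetical stand-in for an S3 client; it only records the calls."""

    def __init__(self) -> None:
        self.calls: List[dict] = []

    def delete_objects(self, **kwargs) -> None:
        self.calls.append(kwargs)


client = RecordingClient()
needed = delete_keys(client, "demo-bucket", (f"object-{i}" for i in range(2500)))
print(f"{needed} requests were enough to delete 2500 objects")
```

A few requests are enough to empty a sizeable bucket, which is why permissions on delete operations deserve particular attention.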
If versioning is activated for the bucket, a simple delete request does not actually remove the object. Instead, a delete marker becomes the current version of the object. This is visible in the response to the deletion command, which will contain the tag <DeleteMarker>true</DeleteMarker>. If the delete marker is itself deleted, the original object becomes visible again.
However, versioning cannot prevent the deletion of a specific object version: if the delete request specifies a version ID, that version is permanently removed.
The same applies to entire buckets. In line with the S3 API reference, a bucket can only be deleted if it is empty, but a user with sufficient permissions can still request a list of all object versions, delete each version of each object one by one, and finally delete the empty bucket. Versioning therefore only provides conditional protection.
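The behaviour of delete markers is easier to follow with a small model. The following is a deliberately simplified in-memory toy, not an implementation of S3: the class, its method names and the version ID format are all invented for illustration.

```python
import itertools
from typing import Dict, List, Optional, Tuple

DELETE_MARKER = object()  # sentinel payload representing a delete marker


class VersionedBucket:
    """Toy model of a bucket with versioning enabled."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, object]]] = {}
        self._ids = itertools.count(1)

    def _push(self, key: str, payload: object) -> str:
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, payload))
        return version_id

    def put(self, key: str, data: bytes) -> str:
        """Store a new version and return its version ID."""
        return self._push(key, data)

    def delete(self, key: str, version_id: Optional[str] = None) -> Optional[str]:
        if version_id is None:
            # Plain delete: nothing is removed, a delete marker is added on top.
            return self._push(key, DELETE_MARKER)
        # Versioned delete: the named version (or marker) is removed for good.
        self._versions[key] = [
            (vid, payload) for vid, payload in self._versions.get(key, [])
            if vid != version_id
        ]
        return None

    def get(self, key: str) -> Optional[bytes]:
        """Return the current version, or None if it is missing or hidden."""
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is DELETE_MARKER:
            return None
        return versions[-1][1]


bucket = VersionedBucket()
first = bucket.put("report.pdf", b"draft")
marker = bucket.delete("report.pdf")            # object is hidden, not removed
print(bucket.get("report.pdf"))
bucket.delete("report.pdf", version_id=marker)  # removing the marker restores it
print(bucket.get("report.pdf"))
bucket.delete("report.pdf", version_id=first)   # this deletion is permanent
print(bucket.get("report.pdf"))
```

The last call is the one versioning cannot stop: once a caller names a version ID explicitly, the data behind it is gone.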
Practical as versioning is, it can rapidly increase S3 storage costs. Each object version takes up storage space and therefore creates associated costs. Businesses that use versioning should consider moving older object versions to more cost-effective storage systems. For example, NetApp StorageGRID and Cloudian HyperStore include integrated ILM and storage tiering that allow you to create a rule for “non-current” object versions. In combination with the PoINT Archival Gateway, old versions of objects can therefore be transferred to tape.
With AWS, multi-factor authentication (MFA) can make it harder for malicious actors to delete data. When MFA Delete is enabled on a bucket, object versions cannot be permanently deleted and the versioning status of the bucket cannot be changed without supplying a separately generated authentication code. Whether MFA can be integrated into the workflow depends heavily, of course, on the specific use case.
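As a rough sketch of what enabling this looks like, the helper below assembles the arguments for the boto3 S3 client method `put_bucket_versioning`, which takes the MFA device serial number and the current code as a single space-separated string. The bucket name, device ARN and code are placeholders, and the helper itself is hypothetical. It only builds the request and does not contact AWS.

```python
def mfa_delete_request(bucket: str, mfa_serial: str, mfa_code: str) -> dict:
    """Build keyword arguments that enable versioning together with MFA Delete."""
    return {
        "Bucket": bucket,
        # Serial number (or ARN) of the MFA device, a space, then the current code.
        "MFA": f"{mfa_serial} {mfa_code}",
        "VersioningConfiguration": {
            "Status": "Enabled",
            "MFADelete": "Enabled",
        },
    }


request = mfa_delete_request(
    bucket="demo-bucket",                                  # placeholder
    mfa_serial="arn:aws:iam::123456789012:mfa/demo-user",  # placeholder
    mfa_code="123456",                                     # placeholder
)
print(request)

# With real credentials, the request could then be sent like this:
#   import boto3
#   boto3.client("s3").put_bucket_versioning(**request)
```

Because the code changes every few seconds and must come from a separate device, this setting is awkward to automate, which is exactly what makes it a hurdle for attackers and for routine workflows alike.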
A different option for making it harder to delete objects is to use WORM (write once, read many). But the same applies here: it depends on the use case. Although WORM makes sense for long-term archiving, it can prove a hindrance in cases where changes to data are common.
As well as the dangers that come with the S3 API, the GUI or CLI used to manage an object storage system also needs to be considered. Anyone with root access to this interface will find that nothing stands in their way.
If data is deleted from an object storage system in spite of all safety precautions, the only thing that can help is to restore this data from a backup, provided that a backup exists. Unfortunately, backups are exactly the thing many businesses neglect. Many object storage systems allow so-called cross-region replication (CRR), which lets you specify an external third-party system such as the PoINT Archival Gateway. As soon as the primary object storage system receives a new object, this triggers a replication process via S3 to the PoINT Archival Gateway, which writes the copy to tape. Deletion processes are not replicated; instead, a delete marker is simply attached to the object.
7. Ransomware
Ransomware has been plying its terrible trade and wreaking havoc on systems for several years now. This harmful software encrypts data on infected computers. The perpetrators then demand money from the system owners in order to decrypt the data. Although ransomware has historically targeted file systems, object storage is not immune to such an attack. Rhino Security Labs, a penetration testing provider, has sketched out what an attack of this kind might look like against an object storage system, in this case AWS.
AWS offers a so-called Key Management Service (KMS) designed to make managing cryptographic keys easier. This means, for example, that KMS can be used to encrypt S3 objects. An interesting fact to note here is that keys can be used across multiple AWS accounts.
The proof of concept envisioned by Rhino Security Labs anticipates the following steps:
- The attacker creates a KMS key and makes it available to others for encryption, but not decryption.
- The attacker gains write access to one of the victim’s buckets via a hole in the victim’s security.
- The attacker checks whether the bucket is configured for S3 versioning and MFA.
- The attacker uses the S3 API to replace each object with a copy of the object. However, each copy is now encrypted using the attacker’s KMS key.
- The attacker sets a deadline by which the KMS key will be deleted.
- The attacker uploads a set of unencrypted instructions for the victim.
In this way, Rhino Security Labs was able to encrypt a 100 GB dataset containing approximately 2,000 objects in 1 minute and 47 seconds. At that rate, if just 10 minutes go by before a logging alarm is triggered and countermeasures take effect, more than 500 GB will already have been lost.
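The 500 GB estimate is a simple extrapolation of the measured rate. A minimal sketch, assuming the attacker sustains a constant throughput (real throughput depends on object sizes and API limits):

```python
def data_encrypted_gb(elapsed_seconds: float,
                      sample_gb: float = 100.0,
                      sample_seconds: float = 107.0) -> float:
    """Extrapolate the encrypted volume from the measured 100 GB in 1 min 47 s."""
    rate_gb_per_second = sample_gb / sample_seconds
    return rate_gb_per_second * elapsed_seconds


# Volume affected if detection and response take 1, 5 or 10 minutes.
for minutes in (1, 5, 10):
    print(f"after {minutes:2d} min: {data_encrypted_gb(minutes * 60):.0f} GB encrypted")
```

The same function can be used the other way round, to judge how quickly an alarm would have to fire to keep the damage below a given volume.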
In the end, the same rule applies for both file systems and object storage alike: The most important measure in the fight against ransomware is to back up data.
8. The human factor
Humans make errors, and IT administrators are only human too. Incorrectly configured S3 buckets can again lead to breaches in data security. Here are just a few examples:
- “Data on 123 Million US Households Exposed Due to Misconfigured AWS S3 Bucket”
- “2 Misconfigured Databases Breach Sensitive Data of Nearly 90K Patients”
- “119,000 Passports and Photo IDs of FedEx Customers Found on Unsecured Amazon Server”
- “Accenture left a huge trove of highly sensitive data on exposed servers”
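Many such incidents come down to a bucket policy that grants access to everyone. As a hedged illustration, the helper below flags statements in an AWS-style bucket policy whose principal is the wildcard `*`. It is a deliberately narrow check with a hypothetical function name: it ignores conditions, ACLs and account-level public access settings, so it cannot replace a proper audit.

```python
import json
from typing import List


def public_statements(policy_json: str) -> List[dict]:
    """Return the Allow statements whose principal is the wildcard '*'."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if principal == "*":
            flagged.append(statement)
        elif isinstance(principal, dict):
            aws = principal.get("AWS")
            values = aws if isinstance(aws, list) else [aws]
            if "*" in values:
                flagged.append(statement)
    return flagged


example_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::demo-bucket/*"},
        {"Sid": "TeamWrite", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::123456789012:user/demo"},
         "Action": "s3:PutObject", "Resource": "arn:aws:s3:::demo-bucket/*"},
    ],
})

for statement in public_statements(example_policy):
    print("publicly accessible:", statement["Sid"], statement["Action"])
```

Running a check of this kind regularly, rather than once at setup, helps catch configuration drift before someone else finds it.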
Encryption likewise needs to be treated with care. Encryption improves security, but if the key is lost, everyone is locked out.
Meanwhile, maintenance work is always necessary, including for object storage systems, and errors can easily creep in when applying patches or updating a software platform.
In conclusion: Secure your object storage systems
Object storage systems offer high availability, without a doubt. Thanks to erasure coding, object storage can withstand the failure of hard drives, nodes, racks and entire data centres.
Versioning offers only very conditional protection against accidental or malicious deletion. WORM and MFA make such actions much more difficult, provided these options are feasible for the use case in question.
Object storage systems are not immune to ransomware or human error.
In one respect, data on object storage systems is no different from data on NAS or SAN infrastructure: it needs to be backed up. The immense data volumes do present a challenge, however, and the backup solution must also make it possible to quickly restore individual objects.
With the PoINT Archival Gateway, PoINT Software & Systems offers a highly available tape-based object storage system for backing up primary S3 object storage. Customers can use cross-region replication (CRR) to set up an automatic asynchronous replication process that sends all new objects from their hard drive-based object storage system to the PoINT Archival Gateway, which stores the data on tape. Thanks to its native S3 API, accessing objects backed up in this way is exceptionally easy.
Additional options are presented by the PoINT Archival Gateway’s “pull” approach, which works incrementally at the object level and independently of any vendor. Because the primary object storage system does not need write access to the target system, security increases and the PoINT Archival Gateway can run in a highly isolated environment.