When talking to SMB customers, I find that most of them don’t want to talk about their backup strategy. It’s a paradox: they know that data loss can ruin their business, but they don’t want to invest in a fully tested recovery concept (I try to avoid the term “backup concept”; recovery is the key). Because of tight budgets and a lack of knowledge, many customers use traditional concepts in a virtualized world. This often results in traditional backup applications with agents deployed in the guest OS, and backups that are written to tape (or worse, to USB disks). If you ask a customer “Why do you store your data on tape?”, only a few argue with cost per GB or performance. Most customers argue with something like:
- “We’ve been doing this for years, so why should we change it?”
- “We have to store our tapes offsite”
- “There is a corporate policy that forces us to store our backups on tape”
In most cases, the attempt to sell a backup-to-disk appliance (like an HP StoreOnce backup system) dies with these last arguments. Customers tend not to trust designs in which they don’t have a backup on tape. Some customers have a strong desire to have a tape labeled “MONDAY” or “FRIDAY FULL”. To be honest, I usually see this behaviour only with SMB customers. Backup-to-disk appliances are often described as
- complex, and
None of these applies to an HP StoreOnce backup system. It isn’t even expensive if you look beyond CAPEX.
Please allow me to write some sentences about HP StoreOnce.
An HP StoreOnce backup system is available as a physical or virtual appliance. HP offers a broad range of physical appliances that can store between 5.5 TB and 1,728 TB BEFORE deduplication. The virtual StoreOnce VSA is available with a capacity of 4 TB, 10 TB or 50 TB before deduplication. And don’t forget the free 1 TB StoreOnce VSA! All HP StoreOnce backup systems, whether physical appliance or VSA, share the same StoreOnce deduplication technology, as well as the same replication and security features. In fact, the StoreOnce VSA runs the same (Linux-based) software as the physical appliances and vice versa. You can add features by adding software options:
- HP StoreOnce Catalyst
- HP StoreOnce Replication
- HP StoreOnce Security Pack
- HP StoreOnce Enterprise Manager
HP StoreOnce Catalyst allows the seamless movement of deduplicated data across StoreOnce-capable devices. This means that an HP Data Protector media agent can deduplicate data during a backup and write the data to an HP StoreOnce backup system, and the data can then be replicated to another HP StoreOnce backup system. All of this works without the need to rehydrate the data on the source and deduplicate it again on the destination. The StoreOnce VSA includes an HP StoreOnce Catalyst license!
HP StoreOnce Replication enables an appliance or a VSA to act as a target in a replication relationship. Only the target needs to be licensed. Fan-in describes the number of possible source appliances.
Even the StoreOnce VSA can be used as a target for up to 8 source appliances. Replication is a licensable feature, except for the StoreOnce VSA. The StoreOnce VSA includes the replication license!
HP StoreOnce Enterprise Manager can be obtained for free and allows you to monitor up to 400 physical appliances or StoreOnce VSAs. It provides monitoring, reporting, trend analysis and forecasting. It integrates with the StoreOnce GUI for single-pane-of-glass management of physical appliances and VSAs.
HP StoreOnce Security Pack enables data-at-rest and data-in-flight encryption (using IPsec and only for StoreOnce Catalyst), as well as secure data deletion. The same applies here as for the HP StoreOnce Catalyst and Replication licenses: the StoreOnce VSA already includes this license.
HP StoreOnce Deduplication
Deduplication is nothing really new. In simple terms, it’s a technique to reduce the amount of stored data by removing redundancies. Data that is detected as redundant isn’t stored on disk again. Only a pointer to the already stored data is set. This carries the risk of data loss. What if the original block gets corrupted? Grist to the mill of the tape lovers (Tapes never fail… for sure…).
Don’t worry. I won’t bore you with stuff about a dead (or nearly dead) CPU architecture. Integrity Plus is HP’s approach to an end-to-end verification process. Let’s take a look at how data gets into a StoreOnce backup system. From a client perspective, you can choose between Virtual Tape Library (VTL), NAS emulation (CIFS or NFS) and StoreOnce Catalyst.
When data is written to a VTL, a CRC is computed for each block and stored on disk together with the data block. During a restore, a CRC is computed for every block that is read from disk and compared to the initially stored CRC. If it differs, a SCSI check condition is reported. Because NAS emulation and StoreOnce Catalyst don’t use the SCSI protocol, no CRC is computed and stored to disk. The integrity of the written data is guaranteed in other ways.
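The write-and-verify cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea: the function names, the in-memory `store` dictionary and the use of CRC-32 from `zlib` are my assumptions, not details of the StoreOnce implementation.

```python
import zlib

def write_block(store: dict, block_id: int, data: bytes) -> None:
    """Store a block together with the CRC computed at write time."""
    store[block_id] = (data, zlib.crc32(data))

def read_block(store: dict, block_id: int) -> bytes:
    """Recompute the CRC on read and compare it with the stored one."""
    data, stored_crc = store[block_id]
    if zlib.crc32(data) != stored_crc:
        # A VTL would report a SCSI check condition at this point;
        # this sketch simply raises an exception instead.
        raise IOError(f"CRC mismatch on block {block_id}")
    return data
```

A block that was silently altered on disk no longer matches its stored CRC, so the restore fails loudly instead of returning bad data.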
At the beginning of the deduplication process, the incoming data is divided into chunks. HP uses a variable length for each data chunk, but on average a data chunk is 4 KB. A smaller chunk size leads to better deduplication results. A SHA-1 (AFAIK 160 bit) hash is computed for each data chunk. This chunk hash is used to identify duplicate data by comparing it to other chunk hashes. At this point, a sparse index is used to find possible candidates for redundant data chunks. Instead of holding all chunk hashes in memory, only a few hashes are kept in RAM. The remaining chunk hashes are stored as metadata on disk. The container index contains a list of chunk hashes and a pointer to the data container where each data chunk is stored.
Before data chunks are stored on disk, multiple chunks are compressed (using LZO) and a SHA-1 checksum is computed for the compressed chunks. This checksum is stored on disk. When the compressed data is decompressed, a new checksum is computed and compared to the stored SHA-1 checksum. Metadata and container index files are protected with MD5 checksums. In addition, a transaction log file is maintained for the whole process and the sparse index is frequently flushed to disk.
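To make the steps above more tangible, here is a minimal sketch of variable-length chunking, SHA-1 based duplicate detection and a checksum over the compressed data. Everything specific in it is an assumption made for illustration: the simple rolling value used to find chunk boundaries, the `MIN_CHUNK`/`MAX_CHUNK` bounds, the `DedupStore` class, and `zlib` as a stand-in for LZO (which is not part of the Python standard library). It also compresses each chunk on its own, while the text describes compressing multiple chunks together.

```python
import hashlib
import zlib

MASK = 0xFFF                        # boundary when the low 12 bits are zero
MIN_CHUNK, MAX_CHUNK = 1024, 16384  # illustrative bounds, not HP's values

def chunk(data: bytes):
    """Split data into variable-length chunks at content-defined boundaries."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF  # simple rolling value over recent bytes
        size = i - start + 1
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]

class DedupStore:
    def __init__(self):
        # SHA-1 chunk hash -> (compressed chunk, SHA-1 of the compressed bytes)
        self.chunks = {}

    def backup(self, data: bytes) -> list:
        """Store only unseen chunks; return the chunk hashes (the 'pointers')."""
        recipe = []
        for c in chunk(data):
            digest = hashlib.sha1(c).digest()
            if digest not in self.chunks:
                packed = zlib.compress(c)  # stand-in for LZO
                self.chunks[digest] = (packed, hashlib.sha1(packed).digest())
            recipe.append(digest)
        return recipe

    def restore(self, recipe: list) -> bytes:
        """Rebuild the data, verifying each stored checksum on the way."""
        out = bytearray()
        for digest in recipe:
            packed, checksum = self.chunks[digest]
            if hashlib.sha1(packed).digest() != checksum:
                raise IOError("stored chunk failed its checksum")
            out += zlib.decompress(packed)
        return bytes(out)
```

Backing up the same data a second time returns the same list of chunk hashes and adds nothing to `chunks`, which is exactly the space saving deduplication is after.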
When data comes into the StoreOnce backup system, a match with a chunk hash in memory can lead the system (using the sparse index, metadata and container index files) to the containers with the associated data chunks (e.g. data chunks that represent a backed-up VM). And if a data chunk of the incoming data is a duplicate, it is very likely that many of the following data chunks are also duplicates.
All physical appliances use RAID 6 to protect data in case of disk failures. Only the HP StoreOnce 2700 uses RAID 5, because the appliance can hold only 4 SAS-NL disks. When using the StoreOnce VSA, you can use any RAID level for the underlying storage. But you should use something above RAID 0…
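The lookup described above can be sketched roughly as follows. The sampling rule (keep a hash in RAM if its top four bits are zero), the `SparseIndex` class and the in-memory `containers` dictionary that stands in for the on-disk container index are all assumptions for illustration; the real data structures are surely more involved.

```python
SAMPLE_BITS = 4  # keep about 1 in 16 hashes in RAM; an illustrative rate

def is_sampled(digest: bytes) -> bool:
    """Decide whether a chunk hash belongs in the in-memory sparse index."""
    return digest[0] >> (8 - SAMPLE_BITS) == 0

class SparseIndex:
    def __init__(self):
        self.ram = {}         # sampled chunk hash -> container id
        self.containers = {}  # container id -> all chunk hashes ("on disk")

    def add_container(self, cid, digests):
        """Record a container and keep only the sampled hashes in RAM."""
        self.containers[cid] = set(digests)
        for d in digests:
            if is_sampled(d):
                self.ram[d] = cid

    def find_duplicates(self, digests):
        """Return the incoming hashes that are already stored."""
        # A hit in RAM points to a candidate container ...
        candidates = {self.ram[d] for d in digests if d in self.ram}
        known = set()
        for cid in candidates:
            # ... whose full container index is then loaded and compared.
            known |= self.containers[cid]
        return [d for d in digests if d in known]
```

A single sampled hit is enough to pull in the whole container index, so neighbouring chunks that were never held in RAM are still recognized as duplicates. The trade-off is that a duplicate whose container is never hit by a sampled hash goes unnoticed in this sketch and would simply be stored again.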
- Supercapacitors on RAID controllers to protect write cache in case of power loss
- ECC memory
- Integrity Plus to protect the data within the StoreOnce backup system
- StoreOnce Replication to replicate data to another HP StoreOnce backup system
- data-at-rest, data-in-flight encryption and secure deletion with StoreOnce Security Pack
Sounds very safe to me. Tape isn’t dead. Tape has its right to exist. But a backup to tape isn’t safer than a backup to a StoreOnce backup system. The latter can offer you faster backups AND restores, as well as new backup and recovery options (e.g. backups in RoBo offices that are replicated to the central datacenter). Think about the requirements for storing tapes (temperature, humidity, physical access), regular recovery tests, copying tapes to newer tapes, etc. Don’t consider only CAPEX. Remember OPEX as well.
An HP StoreOnce backup system is perfect for SMBs. It simplifies backup and recovery, and it can offer new opportunities. Test-drive it using the free 1 TB StoreOnce VSA! Remember: the StoreOnce VSA includes StoreOnce Replication, Catalyst and the Security Pack! Even the free 1 TB StoreOnce VSA.