Category Archives: Backup

Veeam backups fail because of time differences

Last week I had an interesting incident at a customer. The customer reported that one of several Veeam backup jobs constantly failed.

The backup job included two VMs, and the backup of one of these VMs failed with this error:

I verified the credentials used for that job, but re-entering the password did not solve the issue. I then checked the Veeam backup logs located under %ProgramData%\Veeam\Backup (look for the Agent.Job_Name.Source.VM_Name.vmdk.log) and found VDDK Error 3014.
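Searching the agent logs for this error can be scripted. The following is only a minimal sketch: the function name and the regular expression are assumptions, since the exact wording of the log line is not reproduced in this post, so adjust the pattern to what your logs actually contain.

```python
import re
from pathlib import Path

# Assumption: this pattern is a guess at how the error might be written in
# the agent log; the exact log wording is not shown in the post.
VDDK_ERROR = re.compile(r"VDDK\s+error:?\s*\(?3014\)?", re.IGNORECASE)


def find_vddk_errors(log_dir, pattern=VDDK_ERROR):
    """Return (file name, line number, line) for every matching log line."""
    hits = []
    # Agent logs follow the Agent.<job>.Source.<vm>.vmdk.log naming scheme,
    # so every file matching Agent.*.log below log_dir is scanned.
    for log_file in sorted(Path(log_dir).rglob("Agent.*.log")):
        with open(log_file, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if pattern.search(line):
                    hits.append((log_file.name, number, line.strip()))
    return hits
```

On the backup server, the directory to pass in would be the expanded form of %ProgramData%\Veeam\Backup.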

The user that was used to connect to the vCenter was an Active Directory account. The account was granted administrator privileges at the root of the vCenter. Switching from the AD account to Administrator@vsphere.local solved the issue. Next stop: vmware-sts-idmd.log on the vCenter Server appliance. The error found in this log confirmed my theory that there was an issue with the authentication itself, not an issue with the AD account.

To make a long story short: Time differences. The vCenter, the ESXi hosts and some servers had the wrong time. vCenter and ESXi hosts were using the Domain Controllers as time source.

This is the ntpq output of the vCenter. Note the jitter values on the right side, both given in milliseconds.

After some investigation, the root cause seemed to be a bad DCF77 receiver, which was connected to the domain controller that was hosting the PDC Emulator role. The DCF77 receiver was connected using a USB-2-LAN converter. Instead of using a DCF77 receiver, the customer and I implemented an NTP hierarchy using a valid NTP source on the internet (pool.ntp.org).
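A check like this can be automated by parsing the peer list that ntpq -p prints. The sketch below is illustrative only: the sample output, the host names and both thresholds are invented for the example and are not the values from this incident.

```python
# Invented sample in the usual `ntpq -p` column layout (not the real output).
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*dc01.example    .LOCL.           1 u   33   64  377    0.412  812.104 240.512
+dc02.example    10.0.0.10        2 u   12   64  377    0.398    1.250   0.731
"""


def parse_ntpq_peers(output):
    """Parse `ntpq -p` text into a list of peer dicts (times in ms)."""
    peers = []
    for line in output.splitlines():
        # The first column of each peer line is the one-character tally code.
        fields = line[1:].split()
        # Skip blank lines, the column header and the "====" separator.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        peers.append({
            "tally": line[0],  # "*" marks the currently selected peer
            "remote": fields[0],
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


def suspicious_peers(peers, max_offset_ms=100.0, max_jitter_ms=50.0):
    """Names of peers whose offset or jitter exceeds the (assumed) limits."""
    return [p["remote"] for p in peers
            if abs(p["offset_ms"]) > max_offset_ms
            or p["jitter_ms"] > max_jitter_ms]
```

Calling suspicious_peers(parse_ntpq_peers(SAMPLE)) flags only the first peer of the invented sample; suitable limits depend on the environment.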

HPE Data Protector 9.08 is available

Three days ago, on 13th October 2016, HPE released patch bundle 9.08 for Data Protector 9. A patch bundle isn’t a directly installable version. Instead, it’s a bundle of patches and enhancements for a specific version of Data Protector, in this case Data Protector 9.

Besides fixes for discovered problems, a patch bundle also includes enhancements. Some enhancements in this patch bundle particularly caught my attention.

QCCR2A64053: Support for object copy of file system data to Microsoft Azure. Data Protector now supports the creation of a special backup device, which can be used together with Data Protector object copies, to copy Data Protector file system backups to Azure Backup Vaults. This is an easy way to create copies of important data on Microsoft Azure.

Along with the announcement of Data Protector 9.08, I got an e-mail from HPE informing me that one of my change requests has made it into the latest patch bundle:

QCCR2A68100: VMWARE GRE stays in debug mode. I have observed this behaviour in different Data Protector installations: If debugging isn’t explicitly disabled (OB2DBG=0 in the omnirc), the VMware GRE always writes debug logs, regardless of whether debugging is enabled or disabled in the GRE configuration.

Because of some security related changes and fixes in Data Protector 9.08, HPE has marked this patch bundle as critical.

Download Data Protector patch bundle 9.08:

Data Protector 9.08 for Windows

Data Protector 9.08 for HP-UX/IA

Data Protector 9.08 for Linux/64

Data Protector: Copy sessions to encrypted devices fail after update to 9.07

Recently, a customer informed me that copy sessions to encrypted devices failed after he had updated to Data Protector 9.07. The copy sessions failed with this error:

The customer uses tape encryption. The destination for the backups is an HPE StoreOnce, and a post-backup copy creates a copy of the data on tape. Backup to disk was running fine, but the copy to tape failed immediately.

The customer opened a ticket with HPE support and instantly got a hotfix to resolve this issue. HPE has documented this error in QCCR2A69192. If you run into the same issue, please request hotfix QCCR2A69802. This hotfix consolidates QCCR2A69192 and QCCR2A69318 (The BMA ends abnormally during backup/copy to tape).

Thanks to Stefan for the hint!

End of support for HPE Data Protector 7.0x & 8.0x

Today I got an email from HPE, which informed me of the imminent end of support for HPE Data Protector 7.0x and 8.0x. As of June 30, 2016, HPE will offer no new updates or patches for Data Protector 7.0x and 8.0x. This means that

  • telephone and email support,
  • new security updates, and
  • new product updates

will be phased out. The self-help support will be continued until June 30, 2018. Self-help includes access to the knowledge base, current patches and access to known problems.

Data Protector 8.1x will be under support until June 30, 2017. The self-help support for Data Protector 8.1x will be continued until June 30, 2019.

Please note that you need new license keys if you want to update Data Protector 7.0x or 8.0x to Data Protector 9. To obtain new license keys, you need an active support contract. If you have valid Data Protector 8.1 license keys, you don’t need new license keys.

Don’t hesitate to leave a comment if you need further information.

HPE Data Protector 9.05: SAN backups falling back to NBDSSL

 

Last year in December, I updated the first customer from HPE Data Protector 9.04 to 9.05. Immediately after the first tests, I noticed that backups were made using the NBDSSL transport. I expected that the SAN transport would be used, because the prerequisites were met and it had worked until the update. I opened a case with HPE support and was advised to install the hotfix QCIM2A65619. With this hotfix, several files were replaced:

x8664\A.09.00\VEPA\DP_HOME_DIR\bin\components\DpSessionLogger.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\ViAPI.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\vCloudAPI.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\DPComServer.exe
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\components\vepalib_vmware.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\vepa_util.exe
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\vepa_bar.exe
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\components\vepalib_vcd.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\components\DPHostingEnvironmentComponent.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\components\CDpDataMoverComponent.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\components\vepalib_hyperv.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\bin\components\DpBackendService.dll
x8664\A.09.00\VEPA\DP_HOME_DIR\lib\vddk

The hotfix solved the issue. And to be honest: I didn’t care why it worked after applying the hotfix. I had the same issue at multiple customers, and applying the hotfix solved the issue in each case.

Today, I was reading through the HPE Data Protector 9.06 Integration Guide and the HPE Data Protector 9.0x Virtualization Support Matrix and I stumbled over this table:

| Data Protector versions | VMware VDDK component | Supported backup / mount proxy operating systems |
|---|---|---|
| 9.00, 9.01 | VDDK 5.5.0 | Windows Server 2003 R2 (x64); Windows Server 2008, 2008 R2 (x64); Windows Server 2012 (x64); RHEL 5.9 (x64); RHEL 6.2, 6.3 (x64); SLES 10.4 (x64); SLES 11 (x64) |
| 9.02, 9.03 | VDDK 5.5.3 | Windows Server 2003 R2 (x64); Windows Server 2008, 2008 R2 (x64); Windows Server 2012 (x64); RHEL 5.9 (x64); RHEL 6.2, 6.3, 6.4 (x64); SLES 10.4 (x64); SLES 11 (x64) |
| 9.04 | VDDK 6.0 | Windows Server 2008 R2 (x64); Windows Server 2012, 2012 R2 (x64); RHEL 6.6, 7.0 (x64); SLES 11, 12 (x64) |
| 9.05 | VDDK 6.0 U1 | Windows Server 2008 R2 (x64); Windows Server 2012, 2012 R2 (x64); RHEL 6.6, 7.0 (x64); SLES 11, 12 (x64) |
| 9.06 | VDDK 6.0 U2 | Windows Server 2008 R2 (x64); Windows Server 2012, 2012 R2 (x64); RHEL 6.6, 7.0 (x64); SLES 11, 12 (x64) |

There was a footnote for VDDK 6.0 U1.

The VM backups does not use SAN transport mode on vSphere 5.1, 5.5 (and its updates) environment and falls back to NBDSSL/NBD. This is because of VDDK 6.0 U1 issue. For more information, see VMware Knowledge Base.

Oops… that’s my issue! The footnote included a link to VMware KB2135621 (Virtual Disk Development Kit 6.0 U1 Backup and Restore commands fail using SAN transport mode on ESXi 5.5.x hosts on both Windows and Linux proxies). Described symptoms:

  • Virtual Disk Development Kit 6.0 Update 1 backup and restore commands fail using SAN transport mode on ESXi 5.5.x hosts.
  • This issue occurs on both Windows and Linux proxies.

Yep, that’s my issue. The customers that were observing this issue were running vSphere 5.5, not 6.0. With this knowledge, I checked the version of the vixDiskLib.dll on one of the patched Data Protector hosts. And there it was:

vixDiskLib

The vixDiskLib.dll had the build version 6.0.0 build-2498720, which is the build version of the Virtual Disk Development Kit 6.0. So it seems that the Data Protector hotfix QCIM2A65619 downgrades the VDDK that is used by Data Protector.

KB2135621 states that this issue is resolved in VMware vCenter Server 6.0 Update 2. This also implies that it is fixed in VDDK 6.0 U2 and therefore in Data Protector 9.06.
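The support matrix and its footnote can be turned into a quick lookup. This is a rough sketch, not an official compatibility check: the function and constant names are made up, and the logic only encodes the table and the footnote quoted above.

```python
# VDDK release used by each Data Protector 9.0x version (from the table above).
VDDK_BY_DP_VERSION = {
    "9.00": "5.5.0", "9.01": "5.5.0",
    "9.02": "5.5.3", "9.03": "5.5.3",
    "9.04": "6.0",
    "9.05": "6.0 U1",
    "9.06": "6.0 U2",
}

# vSphere releases named in the footnote for VDDK 6.0 U1.
AFFECTED_VSPHERE = ("5.1", "5.5")


def san_fallback_expected(dp_version, vsphere_version):
    """True if SAN transport is expected to fall back to NBDSSL/NBD."""
    vddk = VDDK_BY_DP_VERSION[dp_version]
    # str.startswith accepts a tuple, so "5.5 U3" also counts as affected.
    return vddk == "6.0 U1" and vsphere_version.startswith(AFFECTED_VSPHERE)
```

For example, san_fallback_expected("9.05", "5.5") matches the situation described in this post, while Data Protector 9.06 on the same vSphere release is not flagged.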

I’m sorry Data Protector. It was not your fault!

HPE Data Protector VE Integration/ VMware best practice

The Virtual Environment Integration (VE Integration) provides protection of VMs in virtual server environments. It is used to integrate HPE Data Protector with various virtualization environments, currently VMware vSphere and Microsoft Hyper-V. For Citrix XenServer, a script solution is available. I will focus on VMware vSphere.

What is possible?

I took this table from the “HPE Data Protector 9.00 Integration Guide for Virtualization”.

| Feature | VE Integration |
|---|---|
| Online backup | |
| Crash-consistent backup | |
| Application-consistent backup | |
| Granularity | vmdk, vmx |
| Full/ Incremental/ Differential | ✓/ ✓/ ✓ |
| Support for changed block tracking (CBT) | |
| Where does the Data Protector component need to be installed? | backup host |
| Extra licenses needed | 1x On-Line Extension per ESXi host |

As you can see, Data Protector offers all you need to create a crash-consistent backup of your VMs. HPE Data Protector relies on the VMware vSphere Storage APIs – Data Protection (formerly known as VMware vStorage APIs for Data Protection or VADP). Data Protector has to use the same API as Veeam, CommVault Simpana or any other product that can be used to back up VMs in a VMware vSphere environment. Therefore, most software products offer the same features.

How does it work?

HPE Data Protector uses the vStorage Image backup method to create a crash-consistent backup of your VMs. With this method, a backup host is used to create a backup of VMs hosted on a single or multiple ESXi hosts. The backup host can be a dedicated physical host, a virtual machine, or the Cell Manager (CM) itself (physical or virtual). All you need to make sure is that the Data Protector Virtual Environment Integration component (VEAgent) is installed. During a vStorage Image backup, the VEAgent

  1. establishes a connection between the backup host and the ESXi or vCenter server (depending on whether it’s a standalone host or a vCenter environment)
  2. locks the VM, so that it can’t be migrated off the host by VMware vMotion
  3. requests a snapshot of the VM
  4. reads the VM data across LAN or SAN
  5. initializes the Media Agent (MA) and controls the transfer of the data to the backup device

After finishing the backup of the VM, the snapshot is released and the VM is unlocked. I took this picture from the “HPE Data Protector 9.00 Integration Guide for Virtualization” to illustrate the data flow and what components interact with each other.

hpe_dp_vepa

If Data Protector requests the creation of a snapshot, the snapshot is always named “_DP_VEPA_SNAP_”. I often use this simple PowerCLI one-liner to search orphaned VEAgent snapshots:

To be honest: Orphaned snapshots only occur if a VEAgent backup fails before Data Protector can delete the snapshot. So an orphaned snapshot indicates some kind of failure during the backup. The number of snapshots that remain in the snapshot chain after a backup depends on three factors:

  • Whether CBT is used or not
  • Selected snapshot handling mode
  • Backup type specified

The snapshots that remain in a snapshot chain play an important role for incremental and differential VM backups. Data Protector can detect changes at

  • file level, or
  • block level

Without CBT, Data Protector uses snapshots to identify changes on file level. With CBT, Data Protector identifies changes on block level. With CBT, the number of snapshots remaining after a backup is always 0. Without CBT, Data Protector keeps up to 2 snapshots (mixed snapshot handling). You must not delete these snapshots. Otherwise a full backup of a VM is necessary to create a new, valid backup chain.

Even if CBT is enabled, Data Protector requests the creation of a snapshot to get a consistent state of the VM. Because of this, a VM backup requires sufficient free disk space on the datastore where the VMDKs of the VM reside. The longer a backup takes, and the more changes are made, the bigger the snapshot gets. This is where the free space required option comes into play. You can specify the amount of free disk space that must be available at the start of the backup, e.g. 10% or 20%. The required free space is calculated based on the size of the VMDKs of a VM just before the snapshot is created. Data Protector checks all datastores where the virtual machine disks reside. If a VM has a 100 GB VMDK and you set the free space required option to 10%, at least 10 GB of free disk space is required in each datastore where the VM has VMDKs located. The check is per VM!
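The free space check can be illustrated with a few lines of Python. This is a sketch of one reading of the description, not Data Protector's actual code: it assumes the required amount is the given percentage of the VM's total VMDK size and that every datastore holding one of its disks must offer at least that much. The function name and the data layout are invented.

```python
def datastores_short_on_space(vmdks, free_gb, required_percent):
    """Return the datastores that fail the free space check for one VM.

    vmdks:            list of (datastore name, VMDK size in GB) for the VM
    free_gb:          dict mapping datastore name to its free space in GB
    required_percent: value of the "free space required" option, e.g. 10
    """
    # Assumption: the requirement is based on the summed size of all VMDKs.
    required_gb = sum(size for _, size in vmdks) * required_percent / 100
    datastores = {datastore for datastore, _ in vmdks}
    return sorted(ds for ds in datastores if free_gb[ds] < required_gb)
```

With a single 100 GB VMDK and 10%, a datastore with 12 GB free passes and one with 8 GB free is reported, which matches the example in the text.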

By default, VMs are backed up in parallel. This greatly improves the overall backup performance. But in rare cases it can lead to problems. You can disable parallel backups with the OB2_VEAGENT_THREADED_BACKUP variable in the omnirc on the VEAgent backup host.

By default, a maximum of 10 concurrent threads are executed when backing up VMs using the VEAgent integration. This is good for the backup performance, but it also places load on the infrastructure. You can change this by adding the OB2_VEAGENT_VCENTER_CONNECTION_LIMIT variable to the omnirc on the VEAgent backup host.

I had several cases where VEAgent backups failed because the VEAgent (vepa_bar.exe) or the Backup Media Agent (bma.exe) failed with a memory dump during the backup, or during the initial environment discovery. In all cases, the VEAgent, the MA and the CM were located on a single physical host. Data Protector support strongly advises against this setup. A possible solution is to deploy a Windows Server VM and push the VEAgent onto it. You can use this VM as VEAgent backup host, and the physical host acts only as MA and CM.

With the OB2_VEAGENT_BACKUP_DISK_BUFFER_SIZE option, you can modify the buffer size used during the backup. The SAN and the HotAdd transport modes support disk buffer sizes from 1 MB to 256 MB. By default, they use 8 MB disk buffers. The NBD and NBDSSL transport modes always use 1 MB. Using bigger disk buffer sizes can improve the backup performance, but it also increases the memory consumption.
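The buffer size rules fit into a small helper. This is a minimal sketch with a made-up function name that only models the limits stated above, with sizes in MB. It does not say how the value has to be written in the omnirc.

```python
def effective_disk_buffer_mb(transport, requested_mb=None):
    """Return the disk buffer size (in MB) a backup would use."""
    transport = transport.lower()
    if transport in ("nbd", "nbdssl"):
        # The network transports always use 1 MB, whatever is requested.
        return 1
    if transport in ("san", "hotadd"):
        if requested_mb is None:
            return 8  # default for SAN and HotAdd
        if not 1 <= requested_mb <= 256:
            raise ValueError("buffer size must be between 1 and 256 MB")
        return requested_mb
    raise ValueError(f"unknown transport mode: {transport}")
```

A request for 64 MB is honoured for SAN and HotAdd, while the same request with NBDSSL still results in 1 MB.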

On Windows VMs it is possible to use Volume Shadow Copy Service (VSS) to quiesce the states of the applications running within a virtual machine before a snapshot is created. A ZIP archive is created that contains all the BCD and writer manifests. Please note that quiescence can slow down the performance of a backup session considerably.

TL;DR

During my last projects, I collected a number of common or best practices. I provide this “AS IS” with no warranties! Thanks to the HPE Data Protector support team for helping me during several support cases. Special thanks to Dimitar, Jose, Zhulien and Stephen!

Use multiple, smaller jobs instead of a few, bigger jobs

You should use jobs with a maximum of 30 VMs. Try to keep the size of a backup equal, but don’t add more than 30 VMs into a single job. If a job fails, you have to restart the job for 30 VMs, not for 200 or more VMs. With more jobs, you can execute jobs in parallel.
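Splitting a VM list into jobs of at most 30 VMs is easy to script. The sketch below balances the jobs by VM count only, which is a simplification: it knows nothing about the size of the VMs, so jobs with an equal count can still differ in backup size. The function name is invented.

```python
import math


def split_into_jobs(vm_names, max_vms_per_job=30):
    """Split VMs into the fewest jobs possible, with nearly equal VM counts."""
    if not vm_names:
        return []
    job_count = math.ceil(len(vm_names) / max_vms_per_job)
    # Spread the VMs evenly; the first `extra` jobs get one VM more.
    base, extra = divmod(len(vm_names), job_count)
    jobs, start = [], 0
    for index in range(job_count):
        size = base + (1 if index < extra else 0)
        jobs.append(vm_names[start:start + size])
        start += size
    return jobs
```

For 200 VMs this yields seven jobs with 28 or 29 VMs each, instead of six full jobs and one small one.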

Use different hosts as Cell Manager, Media Agent and VEAgent

You shouldn’t combine CM, MA and VEAgent on a single physical or virtual server. Try to separate at least the VEAgent backup host. You can use a VM for this.

If you have to pack all services on a single server, reduce the load

Use OB2_VEAGENT_THREADED_BACKUP, or OB2_VEAGENT_VCENTER_CONNECTION_LIMIT, and/or reduce the number of running MAs.

Always try to utilize CBT

Whenever possible, use CBT instead of single or mixed snapshot handling.

Use SAN Transport

Whenever possible, use SAN transport. If you can’t utilize SAN transport, try to use a virtual VEAgent backup host. In this case Data Protector will use HotAdd transport mode.

In case of StoreOnce: Single Object per Store Media

If you use a StoreOnce appliance (or a StoreOnce Software store), make sure that you have enabled “Single Object per Store Media”. I wrote a blog post about it: HPE Data Protector & StoreOnce Catalyst: Single Object per Store Media

Data Protector: Exchange backup fails because of database lock

Today I had a customer call, where an Exchange 2010 backup repeatedly failed. HPE Data Protector was unable to create a differential or incremental backup. For each database, the following error was logged:

Interestingly, there was no other backup session running. But the night before, the backup jobs failed because of a network failure.

The solution is easy. This error is caused by incorrect information in the Data Protector database. To remove it, open an administrative CMD on the Data Protector Cell Manager and run this omnidbutil command:

This command will free up the locked resources in the Data Protector database. Then, run the job again.

HPE Data Protector & StoreOnce Catalyst: Single Object per Store Media

HPE Data Protector stores multiple backup objects on a single Catalyst store item. A backup object can be a volume, a mount point, a database or a virtual machine. You can have multiple backup objects per backup client. If your filesystem backup job has four backup clients, and each client has two volumes, the backup job will contain 8 backup objects. Another example is a single database of a Microsoft SQL or Oracle database server (instance).

A Catalyst store item is an object of a StoreOnce Catalyst store and stores the data of a specific backup job. If you back up multiple VMs in a single VE Integration job, the Catalyst store item will include all VMs from that specific job. Or if you back up an Exchange server with three databases, the Catalyst store item is used to store these three databases. Due to this behavior, a single Catalyst store item can reach enormous sizes. Usually this is not a problem. But if you have to copy backup objects to other media (e.g. tape), Data Protector has to read the store medium for each backup object. As the name says: The copy operation in Data Protector is based on backup objects. If there are multiple backup objects on a Catalyst store item, a backup object copy can take some time.

Since HPE Data Protector 8.1, Data Protector offers an option to store a single backup object per Catalyst store item. You can enable this option in the properties of a StoreOnce D2D device (“Settings” tab).

 single_object_store_media

With this option, Data Protector will create a single Catalyst store item for each backup object. This option can significantly speed up object copy operations. You should consider this option in the following cases:

  • You have to speed up object copy operations
  • You have multiple large backup objects per backup client (e.g. Exchange with multiple large databases, Microsoft SQL server with multiple large databases, VMware/ Hyper-V backups, file servers with large volumes etc.)

A possible disadvantage is the increasing number of Catalyst store items, especially if you have a large number of backup clients with many small backup objects. HPE Data Protector and StoreOnce have a limit with regard to the maximum number of Catalyst store items (which isn’t publicly documented …).
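How quickly the number of Catalyst store items grows can be estimated with a simplified model. The sketch assumes exactly one store item per job without the option and one per backup object with it, as described above. Real item counts may differ, and the function name and data layout are invented.

```python
def catalyst_store_items(objects_per_client, single_object_per_medium):
    """Estimate the store items written by one run of a backup job.

    objects_per_client: dict mapping client name to its number of backup
                        objects (volumes, databases, VMs, ...)
    """
    total_objects = sum(objects_per_client.values())
    if single_object_per_medium:
        return total_objects  # one store item per backup object
    return 1 if total_objects else 0  # all objects share one store item


# The file system job from the example: four clients with two volumes each.
job = {"client1": 2, "client2": 2, "client3": 2, "client4": 2}
```

For this job the model gives one store item per run without the option and eight with it, so the total grows with every client and every retained session.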

Consider the Veeam Network transport mode if you use NFS datastores

I’m using Veeam Backup & Replication (currently 8.0 Update 3) in my lab environment to back up some of my VMs to a HP StoreOnce VSA. The VMs reside in an NFS datastore on a Synology DS414slim NAS, the StoreOnce VSA is located in a local datastore (RAID 5 with SAS disks) on one of my ESXi hosts. The Veeam backup server is a VM and it’s also the Veeam Backup Proxy. The transport mode selection is set to “Automatic selection”.

Veeam Backup & Replication offers three different backup proxy transport modes:

  • Direct SAN Access
  • Virtual Appliance
  • Network

The Direct SAN Access transport mode is the recommended mode if the VMs are located in shared datastores (connected via FC or iSCSI). The Veeam Backup Proxy needs access to the LUNs, so the Veeam Backup Proxy is mostly a physical machine. The data is directly read by the backup proxy from the LUNs. The Virtual Appliance mode uses the SCSI hot-add feature, which allows the attachment of disks to a running VM. In this case, the data is read by the backup proxy VM from the directly attached SCSI disk. In contrast to the Direct SAN Access mode, the Virtual Appliance mode can only be used if the backup proxy is a VM. The third transport mode is the Network transport mode. It can be used in any setup, regardless of whether the backup proxy is a VM or a physical machine. In this mode, the data is retrieved via the ESXi management network and travels over the network using the Network Block Device protocol (NBD or NBDSSL, the latter is encrypted). This is a screenshot of the transport mode selection dialog of the backup proxy configuration.

veeam_transport_mode_selection

As you can see, the transport mode selection will happen automatically if you don’t select a specific transport mode. The selection will occur in the following order: Direct SAN Access > Virtual Appliance > Network. So if you have a physical backup proxy without direct access to the VMFS datastore LUNs, Veeam Backup & Replication will use the Network transport mode. A virtual backup proxy will use the Virtual Appliance transport. This explains why Veeam uses the Virtual Appliance transport mode in my lab environment.
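The selection order can be written down as a tiny decision function. This is a simplified model of the order described above, not Veeam's actual selection logic. The function and parameter names are invented.

```python
def select_transport_mode(proxy_is_vm, proxy_sees_luns, hotadd_possible=True):
    """Pick a transport mode in the order SAN > Virtual Appliance > Network."""
    if proxy_sees_luns:
        # The proxy can read the datastore LUNs directly (FC or iSCSI).
        return "Direct SAN Access"
    if proxy_is_vm and hotadd_possible:
        # A virtual proxy can hot-add the VM disks.
        return "Virtual Appliance"
    # Works in any setup: data travels over the ESXi management network.
    return "Network"
```

A virtual proxy in front of NFS datastores has no LUN access, so the model returns "Virtual Appliance", which matches the behaviour seen in the lab.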

Some days ago, I configured e-mail notifications for some vCenter alarms. During the last nights I got alarm messages: A host has been disconnected from the vCenter. But the host reconnected some seconds later. Another observation was that a running vSphere Client lost the connection to the vCenter Update Manager during the night. After some troubleshooting, I found indications that some of my VMs became unresponsive. With this information, I quickly found the VMware KB article “Virtual machines residing on NFS storage become unresponsive during a snapshot removal operation (2010953)”. Therefore I switched the transport from Virtual Appliance to Network.

I recommend using Network transport mode instead of Virtual Appliance transport mode if you have a virtual Veeam Backup Proxy and NFS datastores. I really can’t say that it’s running slower than the Virtual Appliance transport mode. It just works.

Important note for PernixData FVP customers

Remember to exclude the Veeam Backup Proxy VM from acceleration if you use Virtual Appliance or NBD transport mode. If you use datastore policies, blacklist the VM or configure it as VADP appliance. If you use VM policies, simply don’t configure a policy for the Veeam Backup Proxy VM. If you use Direct SAN access, you need a pre- and a post-backup script to suspend the cache population during the backup. Check Frank Denneman’s blog post about “PernixData FVP I/O Profiling PowerCLI commands”.

Using HP StoreOnce as target for Windows Server Backup (WSB)

Some days ago, I blogged about the new HP StoreOnce software release 3.13.0. This release included several fixes. I didn’t mention one fix, although it’s interesting.

  • Fixed issue where Windows 2012 R2 built-in native backup was not supported with 3.12.x software (BZ 61232)

Windows Server Backup (WSB) has been part of Windows Server since Windows Server 2008. WSB can create bare metal backups and recover those backups. The same applies to system state backups, file level backups, Hyper-V VMs, Exchange etc. Very handy for small environments. Backups can be stored on disk or on a file share. With Server 2012, the file share must be SMB3 capable. So if it’s not a Windows file server, the NAS that offers the file share has to be SMB3 capable. This doesn’t apply to Windows Server 2008 (R2).

With StoreOnce 3.13.0, HP has fixed this. Starting with 3.13.0, you can use a CIFS share on a StoreOnce appliance as a target for Windows Server Backup. This allows you to take advantage of the benefits of StoreOnce, like industry-leading deduplication and replication technology.

I was able to test this new feature with StoreOnce VSA appliances in my lab, as well as with a customer’s StoreOnce 4700 appliance.

Download your free copy of the HP StoreOnce Free 1 TB VSA today and give it a try!