Tag Archives: storage

Vembu BDR Suite v4.0 is now generally available

Vembu Technologies was founded in 2002, and with 60.000 customers and more than 4000 partners, Vembu is a leading provider with a comprehensive portfolio of software products and cloud services to small and medium businesses.

Last week, Vembu has announced the availability of Vembu BDR Suite v4.0! Vembu’s new release is all about maintaining business continuity and ensuring high availability. Apart from new features, this release features significant enhancements and bug fixes that are geared towards performance improvement.

Vembu Technologies/ Vembu BDR Essentials/ Copyright by Vembu Technologies

The Vembu BDR Suite

The Vembu BDR Suite is an one stop solution to all your backup and disaster recovery needs. That is what Vembu says about their own product. The BDR Suite covers

  • Backup and replication of VMs running on VMware vSphere and Microsoft Hyper-V
  • Backup and bare-metal recovery for physical servers and workstations (Windows Server and Desktop)
  • File and application backups of Microsoft Exchange, Microsoft SQL Server, Microsoft SharePoint, Microsoft Active Directory, Microsoft Outlook, and MySQL
  • Creating of backup copies and transfer of them to a DR site

More blog posts about Vembu:

Vembu BDR Essentials – affordable backup for SMB customers
The one stop solution for backup and DR: Vembu BDR Suite

What’s new in 4.0?

Vembu BDR Suite v4.0 has got some pretty nice new features. IMHO, there are four highlights:

  • Hyper-V Failover Cluster Support for Backup & Recovery
  • Shared VHDX Backup
  • Hyper-V Checksum Based Incremental, and the
  • Credential Manager

There is a significat chance that you use a Hyper-V Failover Cluster if you have more than one Hyper-V host. With v4.0 Vembu added support for backup and recovery for the VMs residing in a Hyper-V Failover Cluster. Even if the VMs running on Hyper-V cluster move from one host to another, the backups will continue to run without any interruption.

A feature, that I’m really missing in VMware and Veeam, is the support for the backup shared VHDX files. v4.0 added support for this.

Vembu BDR Suite v4.0 also added support bot performing incremental backups with Hyper-V. They call it Checksum based incremental method, but it is in fact Change Block Tracking. An important feature for Hyper-V customers!

The Vembu Credential Manager allows you to store the necessary credentials at one place, use it everywhere inside the Vembu BDR Suite v4.0.

But there are also other, very nice enhancements.

  • Handling new disk addition for VMware ESXi and Hyper-V, which allows the backup of newly added disks at the next backup. In prioir releases, newly added disks were only backuped during the next full backup.
  • Reconnection for VMware ESXi and Hyper-V jobs in case of a dropped network connection
  • Application-wware processing for Hyper-V VMs can now enabled on a per-VM basis
  • API for VM list with Storage utilization report which allows you to generate detailed reports whenever you need one

Interested in trying Vembu BDR suite?, Try a 30-day free trial now! For any questions, simply send an e-mail to vembu-support@vembu.com or follow them on Twitter.

Vembu VMBackup Deployment Scenarios

Vembu was founded in 2002 and has over 60,000 customers worldwide. One of their core products is the Vembu BDR Suite, which is an one stop solution to all your Backup and DR needs. I wrote a longer blog post about the Vembu BDR Suite.

One part of this suite is Vembu VMBackup, which is a data protection solution that is designed to backup VMware and Microsoft Hyper-V virtual machines secure and simple way. The offered features are compareable to Veeam Backup & Replication.

The core component of Vembu VMBackup is the Vembu BDR Backup server, which can be deployed in two ways:

  • On-premises Deployment
  • Hybrid Deployment

virnuls/ pixabay.com/ Creative Commons CC0

On-premises Deployment

In this deployment setup, customers deploy the product in their local environment. I think this is the most typical deployment type, where you install VMBackup on a physical server, in a VM or deployed as virtual appliance. Backup data is transferred  over LAN or SAN, and is written to the storage repositories. The Vembu BDR server acts as a centralized management point, where user can configure and manage backup and replication jobs.

In a simple deployment, the Vembu BDR Backup Server will act as backup proxy and management server instance. It is perfect for a small number of VMs with less simultaneous backup traffic and for VMBackup evaluation. The typical SMB environment.

If you seperate the management server from the backup proxy, the deployment changes to a distributed deployment. If necessary, multiple backup proxies can be deployed on physical hosts or in virtual machines. Customers can also deploy multiple BDR backups servers, which allows load balancing across a cluster of BDR backup servers. Pretty cool for bigger and/ or distributed environments. It allows customers to scale their backup solution over time.

On-Premises Deployment/ Vembu Technologies/ Copyright by Vembu Technologies

Hybrid Deployment

Backup is good, but having a backup copy offsite is better. Vembu OffsiteDR allows customers to create a copy of their backup data and transfer it to a DR location over LAN/ WAN. OffsiteDR instantly transfers backup data from a BDR Backup Server to an OffsiteDR server. Customers can restore failed VMs or missing files and application data in their DR site, or they can rebuild a failed BDR Backup Server from an OffsiteDR server.

Vembu Technologies/ OffsiteDR/ Copyright by Vembu Technologies

If customers don’t have a DR site, they can use Vembu CloudDR push a backup copy to the Vembu cloud. The data stored in the Vembu Cloud can easily be restored at anytime and to any location. Vembu uses AWS across all continents to asure the availability of their cloud services.

Vembu Technologies/ CloudDR/ Copyright by Vembu Technologies

Customers have the choice

It is obvious that customers have the freedom of choice how they deploy Vembu VMBackup.I like the virtual appliance approach, which eliminates the need for additional Windows Server licenses. More and more vendors tend to offer appliances for their products, just think about VMware vCenter Server Appliance, vRealize Orchestrator etc. So why not offer a backup server appliance? I wish other vendors would adopt this…

Another nice feature is the scale-out capability of Vembu. Start small and grow over time. Perfect for SMBs that want to start small and grow over time.

Powering on a VM with shared VMDK fails after extending a EagerZeroedThick VMDK

I hope that you are not reading this blog post while searching for a solution for a failed cluster. If so, feel free to leave a comment if this blog post saved your evening or weekend. :)

Last friday, a change at one of my customers went horribly wrong. I was not onsite, but they contacted me during the night from friday to saturday, because their most important Windows Server Failover Cluster was unable to start after extending a shared VMDK.

cripi/ pixabay.com/ Creative Commons CC0

They tried something pretty simple: Extending an virtual disk of a VM. That is something most of us doing pretty often. The customer did this also pretty often. It was a well known task… Except the fact, that the VM was part of a Windows Server Failover Cluster. With shared VMDKs. And the disks were EagerZeroedThick, because this is a requirement for shared VMDKs.

They extended the disk using the vSphere Web Client. And at this point, the change was doomed to fail. They tried to power-on the VMs, but all they got was this error:

VMware ESX cannot open the virtual disk, “/vmfs/volumes/4c549ecd-66066010-e610-002354a2261b/VMNAME/VMDKNAME.vmdk” for clustering. Please verify that the virtual disk was created using the ‘thick’ option.

A shared VMDK is a VMDK in multiwriter mode. This VMDK has to be created as Thick Provision Eager Zeroed. And if you wish to extend this VMDK, you must use  vmkfstools  with the option -d eagerzeroedthick. If you extend the VMDK using the Web Client, the extended portion of the disk will become LazyZeroed!

VMware has described this behaviour in the KB1033570 (Powering on the virtual machine fails with the error: Thin/TBZ disks cannot be opened in multiwriter mode). There is also a blog post by Cormac Hogan at VMware, who has described this behaviour.

That’s a screenshot from the failed cluster. Check out the type of the disk (Thick-Provision Lazy-Zeroed).

Patrick Terlisten/ vcloudnine.de/ Creative Commons CC0

You must use vmkfstools  to extend a shared VMDK – but vmkfstools is also the solution, if you have trapped into this pitfall. Clone the VMDK with option -d eagerzeroedthick.

Another solution, which was new to me, is to use Storage vMotion. You can migrate the “broken” VMDK to another datastore and change the the disk format during Storage vMotion. This solution is described in the “Notes” section of KB1033570.

Both ways will fix the problem. The result will be a Thick Provision Eager Zeroed VMDK, which will allow the VMs to be successfully powered on.

Vembu BDR Essentials – affordable backup for SMB customers

It is common that vendors offer their products in special editions for SMB customers. VMware offers VMware vSphere Essentials and Essentials Plus, Veeam offers Veeam Backup Essentials, and now Vembu has published Vembu BDR Essentials.

Vembu Technologies/ Vembu BDR Essentials/ Copyright by Vembu Technologies

Backup is important. There is no reason to have no backup. According to an infographic published by Clutch Research at the World Backup Day 2017, 60% of all SMBs that lost all their data will shutdown within 6 months after the data loss. Pretty bad, isn’t it?

When I talk to SMB customers, most of them complain about the costs of backups. You need software, you need the hardware, and depending on the type of used hardware, you need media. And you should have a second copy of your data. In my opinion, tape is dead for SMB customers. HPE for example, offers pretty smart disk-based backup solutions, like the HPE StoreOnce. But hardware is nothing without software. And at this point, Vembu BDR Essentials comes into play.

Affordable backup for SMB customers

Most SMB virtualization deployments consists of two or three hosts, which makes 4 or 6 used CPU sockets. Because of this, Vembu BDR Essentials supportes up to 6 sockets or 50 VMs. But why does Vembu limit the number of sockets and VMs? You might missed the OR. Customers have to choice which limit they want to accept. Customers are limited at the host-level (max 6 sockets), but not limited in the amount of VMs, or they can use more than 6 sockets, but then they are limited to 50 VMs.

Feature Highlights

Vembu BDR Essentials support all important features:

  • Agentless VMBackup to backup VMs
  • Continuous Data Protection with support for RPOs of less than 15 minutes
  • Quick VM Recovery to get failed VMs up and running in minutes
  • Vembu Universal Explorer to restore individual items from Microsoft applications like Exchange, SharePoint, SQL and Active Directory
  • Replication of VMs Vembu OffsiteDR and Vembu CloudDR

Needless to say that Vembu BDR Essentials support VMware vSphere and Microsoft Hyper-V. If necessary, customer can upgrade to the Standard or Enterprise  edition.

To get more information about the different Vembu BDR parts, take a look at my last Vembu blog post: The one stop solution for backup and DR: Vembu BDR Suite

The pricing

Now the fun part – the pricing. Customers can save up to 50% compared to the Vembzu BDR Suite.

Vembu Technologies/ Vembu BDR Essentials Pricing/ Copyright by Vembu Technologies

The licenses for Vembu BDR Essentials are available in two models:

  • Subscription, and
  • Perpetual

Subscription licenses are available for 1, 2, 3 and 5 years. The perpetual licenses is valid for 10 years from the date of purchase. The subscription licensing has the benefit, that it included 24×7 technical support. If you purchase the perpetual  license, the Annual Maintenance Cost (AMC) for first year is free. From the second year, it is 20% of the license cost, and it is available for 1, 2 or 3 years.

There is no excuse for not having a backup

With Vembu BDR Essentials, there is no more excuse for not having a competitive backup protecting your business! The pricing fits any SMB customer, regardless of their size or business. The rich feature set is competitive to other vendors, and both leading hypervisors are supported.

A pretty nice product. Try it for free! Vembu also offers a free edition that might fit small environments. The free edition let you choose between unlimited VMs, that are covered with limited functionality, or unlimited functionality for up to 3 VMs. Check out this comparison of free, standard and enterprise edition.

The one stop solution for backup and DR: Vembu BDR Suite

I have worked with a lot of backup software products during my career, but for the last years I have primarily worked with MicroFocus Data Protector (former HP OmniBack, HP Data Protector, or HPE Data Protector), and Veeam Backup & Replication. Data Protector was a great solution for traditional server environments, or when UNIX (HP-UX, AIX, Solaris etc.) compatibility was required. Features like Zero Downtime Backups, LAN-free or Direct SAN backups were available for many years. But their code quality has suffered severely in the recent years. The product no longer seemed like a one-stop shop. Some months ago, HPE sold its software division to MicroFocus and started to sell Veeam Backup & Replication through its channel. Some months prior selling the complete software division, HPE acquired Trilead, which is many of us well known because of their VM Explorer. Sad but true: Data Protector is dead to me.

I think I don’t have to say much about Veeam. Veeam is unbeaten when it comes down to virtualized server environments, and they constantly add features and extend their product portfolio. Think about their solutions Office 365, or Veeam Agent for Windows and Linux.

Why Vembu?

It is always good to have more than product in the portfolio, just because to give customers the choice between different products. If your only tool is a hammer, everthing looks like a nail. That is why I took a closer look at Vembu. I became aware of Vembu, because they asked to place an ad on vcloudnine. This was a year ago. So it was obvious to take a look at their products. Furthermore, Vembu and its products were mentioned many times in my Twitter timeline. Two good reasons to take a look at them!

Vembu Technologies was founded in 2002, and with 60.000 customers and more than 4000 partners, Vembu is a leading provider with a comprehensive portfolio of software products and cloud services to small and medium businesses. We are not talking about a newcomer!

The Vembu BDR Suite

The Vembu BDR Suite is an one stop solution to all your backup and disaster recovery needs. That is what Vembu says about their own product. The BDR Suite covers

  • Backup and replication of VMs running on VMware vSphere and Microsoft Hyper-V
  • Backup and bare-metal recovery for physical servers and workstations (Windows Server and Desktop)
  • File and application backups of Microsoft Exchange, Microsoft SQL Server, Microsoft SharePoint, Microsoft Active Directory, Microsoft Outlook, and MySQL
  • Creating of backup copies and transfer of them to a DR site

Let’s have a more detailed look at the Vembu BDR Suite. This is a picture of the overall architecture.

Vembu Technologies/ Vembu BDR Suite architecture/ Copyright by Vembu Technologies

VMBackup

VMBackup provides fast, efficient and agentless backup for VMs hosted on VMware ESXi and on Microsoft Hyper-V. It also provides the capability to replicate virtual machines from one ESXi host to another ESXi (VMreplication). You might guess it – This feature is only available for VMware ESXi. In case of Microsoft Hyper-V, you have to use the built-in Hyper-V replication. The failover and failback of replicated VMs is managed by the BDR Backup Server. VMBackup offers instant VM recovery, recovery of single files and folder from image-level backups, and recovery of application items from Microsoft Exchange, Microsoft SQL Server, Microsoft SharePoint, and Microsoft Active Directory. The functionality is similar to what you know from other products, like Veeam Backup & Replication, or MicroFocus Data Protector. VMBackup is licensed per socket, and it is available in a Standard (~ 150 $ per socket) and an Enterprise (~ 250 $ per socket) edition.

ImageBackup

ImageBackup addresses something, that might be extinct for some of us: Physical servers, like physical database or file servers. It can take image backups of Windows servers and workstations. This allows customers to restore the entire server or workstation from scratch to the same, or to new hardware. ImageBackup utilizes the Volume Shadow Copy Service (VSS) to create a consistent backup of a physical machine. Customers can restore a backup to the bare-metal, restore single files and folders, as well as application items from Microsoft Exchange, Microsoft SQL Server, Microsoft SharePoint, and Microsoft Active Directory. If necessary, the can be restored to a supported hypervisor. With other words: P2V migration. ImageBackup is licensed per host, or per application server if you wish to take backups of applications like Microsoft Exchange or SQL server. ImageBackup for servers costs ~ 150 $, and it is free for workstations.

NetworkBackup

NetworkBackup addresses the backup of files, folders and application data from Windows, Mac and Linux clients. It is designed to protect business data across file servers, application servers, workstations and other endpoints. It does not take an image backup, but full and incremental backups. The feature set and use case of NetworkBackup is similar to “traditional” backup software like MicroFocus Data Protector or ARCServe. NetworkBackup offers intelligent scheduling policies, bandwidth management and flexible retention polices. Clients are not always onsite, to address this, NetworkBackup can store its data in the Vembu Cloud (Vembu Cloud Services). NetworkBackup is licensed per file server (~ 60 $ per server), application server (~ 150 $), or workstation (free).

OffsiteDR

OffsiteDR creates and transfers backup copies to a DR site. Data is immediately transferred when it arrives at the backup server. The Data is encrypted in-flight using industry-standard AES 256 encryption. WAN optimization is included, which means that data is compressed, encrypted and deduplicated before being replicated to the OffsiteDR server. The recovery of VMs and files can directly be done from the OffsiteDR server. So there is no need to setup a new backup server in case of a disaster recovery. OffsiteDR covers different recovery screnarios, like instantly recover machines directly from the Vembu OffsiteDR server, bare-metal restore using the Vembu Recovery CD, or restore the virtual machine as on a VMware ESXi or Microsoft Hyper-V server directly from the Vembu OffsiteDR server. OffsiteDR is an add-on to VMBackup, and it is licensed per CPU socket (~ 90 $).

Universal Explorer

The Universal Explorer is used to restore items from various Microsoft applications, like Microsoft Exchange, SQL Server, SharePoint, or Active Directory. An item can be an email, a mailbox, complete databases, user or group objects etc. These items are sourced from image-level backups of physical and virtual machines. You might see some similarities to Veeam Explorer. Both products are comparable.

Recovery CD

The Vembu Recovery CD can be used to recover physical or virtual maschines. Drivers for the target platform will be injected during the restore. This is pretty handy in case of P2P and V2P migrations.

Licensing & Editions

Vembu offers a subscription and a perpetual license model. The subscription model can be purchased on a monthly or yearly basis, such as 1, 2, 3 or 5 years. It includes 24/ 7 standard technical support, updates and upgrades throughout the licensed period. The perpetual licensing model allows you to purchase and use the Vembu BDR suite by paying a single fee. This includes free maintenance and support for the first year.

Visit the pricing page for more detailed information. A Vembu BDR Suite edition comparison is also available.

Final thoughts

With 60.000 customers and 4000 partners, Vembu is not a newcomer in the backup business. The product portfolio is quite comprehensive. The Vembu BDR Suite offers industry standard features to a very sweet price. I can’t see any feature, that a SMB customer might require, which is not available. In sum, the Vembu BDR suite seems to me to be a very good alternative to the top dogs in the backup business, especially if we are talkin about SMB customers.

Backup from a secondary HPE 3PAR StoreServ array with Veeam Backup & Replication

When taking a backup with Veeam Backup & Replication, a VM snapshot is created to get a consistent state of the VM. The snapshot is taken prior the backup, and it is removed after the successful backup of the VM. The snapshot grows during its lifetime, and you should keep in mind, that you need some free space in the datastore for snapshots. This can be a problem, especially in case of multiple VM backups at a time, and if the VMs share the same datastore.

Benefit of storage snapshots

If your underlying storage supports the creation of storage snapshots, Veeam offers an additional way to create a consistent state of the VMs. In this case, a storage snapshot is taken, which is presented to the backup proxy, and is then used to backup the data. As you can see: No VM snapshot is taken.

Now one more thing: If you have a replication or synchronous mirror between two storage systems, Veeam can do this operation on the secondary array. This is pretty cool, because it takes load from you primary storage!

Backup from a secondary HPE 3PAR StoreServ array

Last week I was able to try something new: Backup from a secondary HPE 3PAR StoreServ array. A customer has two HPE 3PAR StoreServ 8200 in a Peer Persistence setup, a HPE StoreOnce, and a physical Veeam backup server, which also acts as Veeam proxy. Everything is attached to a pretty nice 16 Gb dual Fabric SAN. The customer uses Veeam Backup & Replication 9.5 U3a. The data was taken from the secondary 3PAR StoreServ and it was pushed via FC into a Catalyst Store on a StoreOnce. Using the Catalyst API allows my customer to use Synthetic Full backups, because the creation is offloaded to StoreOnce. This setup is dramatically faster and better than the prior solution based on MicroFocus Data Protector. Okay, this last backup solution was designed to another time with other priorities and requirements. it was a perfect fit at the time it was designed.

This blog post from Veeam pointed me to this new feature: Backup from a secondary HPE 3PAR StoreServ array. Until I found this post, it was planned to use “traditional” storage snapshots, taken from the primary 3PAR StoreServ.

With this feature enabled, Veeam takes the snapshot on the 3PAR StoreServ, that is hosting the synchronous mirrored virtual volume. This graphic was created by Veeam and shows the backup workflow.

Veeam/ Backup process/ Copyright by Veeam

My tests showed, that it’s blazing fast, pretty easy to setup, and it takes unnecessary load from the primary storage.

In essence, there are only three steps to do:

  • add both 3PARs to Veeam
  • add the registry value and restart the Veeam Backup Server Service
  • enable the usage of storage snapshots in the backup job

How to enable this feature?

To enable this feature, you have to add a single registry value on the Veeam backup server, and afterwards restart the Veeam Backup Server service.

  • Location: HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication\
  • Name: Hp3PARPeerPersistentUseSecondary
  • Type: REG_DWORD (0 False, 1 True)
  • Default value: 0 (disabled)

Thanks to Pierre-Francois from Veeam for sharing his knowledge with the community. Read his blog post Backup from a secondary HPE 3PAR StoreServ array for additional information.

Meltdown & Spectre: What about HPE Storage and Citrix NetScaler?

In addition to my shortcut blog post about Meltdown and Spectre with regard of Microsoft Windows, VMware ESXi and vCenter, and HPE ProLiant, I would like to add some additional information about HPE Storage and Citrix NetScaler.

When we talk about Meltdown and Spectre, we are talking about three different vulnerabilities:

  • CVE-2017-5715 (branch target injection)
  • CVE-2017-5753 (bounds check bypass)
  • CVE-2017-5754 (rogue data cache load)

CVE-2017-5715 and CVE-2017-5753 are known as “Spectre”, CVE-2017-5754 is known as “Meltdown”. If you want to read more about these vulnerabilities, please visit meltdownattack.com.

Due to the fact that different CPU platforms are affected, one might can guess that also  other devices, like storage systems or load balancers, are affected. Because of my focus, this blog post will focus on HPE Storage and Citrix NetScaler.

HPE Storage

HPE has published a searchable and continously updated list with products, that might be affected (Side Channel Analysis Method allows information disclosure in Microprocessors). Interesting is, that a product can be affected, but not vulnerable.

ProductImpactedComment
Nimble StorageYesFix under investigation
StoreOnceYESNot vulnerable – Product doesn’t allow arbitrary code execution.
3PAR StoreServYESNot vulnerable – Product doesn’t allow arbitrary code execution.
3PAR Service ProcessorYESNot vulnerable – Product doesn’t allow arbitrary code execution.
3PAR File ControllerYESVulnerable- further information forthcoming.
MSAYESNot vulnerable – Product doesn’t allow arbitrary code execution.
StoreVirtualYESNot vulnerable – Product doesn’t allow arbitrary code execution.
StoreVirtual File ControllerYESVulnerable- further information forthcoming.

The File Controller are vulnerable, because they are based on Windows Server.

So if you are running 3PAR StoreServ, MSA, StoreOnce or StoreVirtual: Relax! If you are running Nimble Storage, wait for a fix.

Citrix NetScaler

Citrix has also published an article with information about their products (Citrix Security Updates for CVE-2017-5715, CVE-2017-5753, CVE-2017-5754).

The article is a bit spongy in its statements:

Citrix NetScaler (MPX/VPX): Citrix believes that currently supported versions of Citrix NetScaler MPX and VPX are not impacted by the presently known variants of these issues.

Citrix believes… So nothing to do yet, if you are running MPX or VPX appliances. But future updates might come.

The case is a bit different, when it comes to the NetScaler SDX appliances.

Citrix NetScaler SDX: Citrix believes that currently supported versions of Citrix NetScaler SDX are not at risk from malicious network traffic. However, in light of these issues, Citrix strongly recommends that customers only deploy NetScaler instances on Citrix NetScaler SDX where the NetScaler admins are trusted.

No fix so far, only a recommendation to check your processes and admins.

Checking the 3PAR Quorum Witness appliance

Two 3PAR StoreServs running in a Peer Persistence setup lost the connection to the Quorum Witness appliance. The appliance is an important part of a 3PAR Peer Persistence setup, because it acts as a tie-breaker in a split-brain scenario.

While analyzing this issue, I saw this message in the 3PAR Management Console:

3PAR Quorum Witness Status

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

In addition to that, the customer got e-mails that the 3PAR StoreServ arrays lost the connection to the Quorum Witness appliance. In my case, the CouchDB process died. A restart of the appliance brought it back online.

How to check the Quorum Witness appliance?

You can check the status of the appliance with a simple web request. The documentation shows a simple test based on curl. You can run this direct from the BASH of the appliance.

But you can also use the PowerShell cmdlet Invoke-WebRequest.

If you add /witness to the URL, you can test the access to the database, which is used for Peer Persistence.

If you get a connection error, check if the beam process is running.

If not, reboot the appliance. This can be done without downtime. The appliance comes only into play, if a failover occurs.

Why I moved from NFS to vSAN… and why it went wrong

I wanted to retire my Synology DS414slim, and switch completely to vSAN. Okay, no big deal. Many folks use vSAN in their lab. But I’d like to explain why I moved to vSAN and why this move failed. I think some of my thoughts are also applicable for customer environments.

So far, I used a Synology DS414slim with three Crucial M550 480 GB SSDs (RAID 5) as my main lab storage. The Synology was connected with two 1 GbE uplinks (LAG) to my  network, and each host was connected with 4x 1 GbE uplinks (single distributed vSwitch). The Synology was okay from the capacity perspective, but the performance was horrible. RAID 5, SSDs and NFS were not the best team, or to be precise, the  CPU of the Synology was the main bottleneck.

nas_ds414slim

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

1,2 GHz is not enough, if you want to use NFS or iSCSI. I never got more than 60 MB/s (sequential). The random IO performance was okay, but as soon as the IO increased, the latencies went through the roof. Not because the SSDs were to slow, but because the CPU of the Synology was not powerful enough to handle the NFS requests.

Workaround: Add more flash storage

The workaround for the poor random IO performance was adding more flash storage. This time, the flash storage was added to the hosts. I used PernixData FVP to boost my lab. FVP was a quite cool product (unfortunately it was a cool product.) PernixData granted me, as a PernixPro, some licenses for my lab.

End of an era

The acquisition of PernixData by Nutanix, the missing support für vSphere 6.5, and the end of availability of all PernixData products led to the decision to remove PernixData FVP from my lab. Without PernixData FVP, my lab was again a slow train crawling up a hill. Four HPE ProLiant, with enough CPU (40 cores) and memory resources (384 GB RAM) were tied down by slow IO.

Redistribution of resources

I had

  • three 480 GB SSDs, and
  • three 40 GB SSDs

in stock. The 40 GB SSDs were to small and slow, so I replaced them with 120 GB SSDs. I was able to equip three of my four hosts with SSDs. Three hosts with flash storage were enough to try VMware vSAN.

Fortunately, not all hosts have to add capacity to a vSAN cluster. Hosts can also only consume storage from a vSAN cluster. With this in mind, vSAN appeared to be a way out of my IO dilemma. In addition, using the 480 GB SSDs as capacity tier, a vSAN all-flash config was possible.

Migration

It took me a little time to move around VMs to temporary locations, while keeping my DC and my VCSA available. I had to remove my datastore on the Synology to free up the 480 GB SSDs. The necessary vSAN licenses were granted by VMware (vExpert licenses).

The creation of the vSAN cluster itself was easy. Fortunately, wiping partitions from disks is easy. You can use the vSphere Web Client to do this.

vsan_wipe_partitions

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

The initial performance was quite good, much better than expected and much better than the NFS performance of the old Synology NAS. I enabled deduplication and compression, but as soon as I moved VMs to the vSAN datastore, the throughput dropped and latencies went through the roof. It was totally unusable. Furthermore, I got health alarms:

vsan_congestion_error

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

As the load increased, the errors became more severe.

vsan_performance_error

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0


I was able to solve this with a blog post of Cormac Hogan (VSAN 6.1 New Feature – Handling of Problematic Disks). Even without compression and deduplication, the performance was not as expected and most times to low to work with. At this point, I got an idea what was causing my vSAN problems.

Do not use consumer-grade hardware with vSAN

To be honest: The budget is the problem. I had to take consumer-grade SSDs.

This is a screenshot from the vSAN Observer. esx1 to esx3 are equipped with SSDs, esx4 is only consuming storage from the vSAN cluster.

vsan_observer_perf

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Red is not the color to highlight good things…

An explanation attempt

This blog post of Duncan Epping (Why Queue Depth matters!) is a bit older, but still valid in my case. The controller I use  (HPE Smart Array P410i) has a a deep queue (1011), the RAID device has a queue length of 1024, but the SATA SSDs have only a queue length of 32. Here’s the disk adapter and disk device view of ESXTOP.

The consumer-grade SSDs drowned in IOs, unable to handle parallel read and write operations. There’s nothing much that I can do. Currently there are two options:

  • Replacing the SSDs with devices, that have a deeper queue depth
  • Replace the Synology NAS with a more powerful NAS and move back to NFS

I don’t know which way I will go. To get this clear:

  • This is my lab, not a customer environment
  • It is not a vSAN related problem
  • It is because of consumer-grade hardware

Do not try this at production kids. Go vSAN, but please use the right hardware.

HPE 3PAR OS updates that fix VMware VAAI ATS Heartbeat issue

Customers that use HPE 3PAR StoreServs with 3PAR OS 3.2.1 or 3.2.2 and VMware ESXi 5.5 U2 or later, might notice one or more of the following symptoms:

  • hosts lose connectivity to a VMFS5 datastore
  • hosts disconnect from the vCenter
  • VMs hang during I/O operations
  • you see the messages like these in the vobd.log or vCenter Events tab

  • you see the following messages in the vmkernel.log

Interestingly, not only HPE is affected by this. Multiple vendors have the same issue. VMware described this issue in KB2113956. HPE has published a customer advisory about this.

Workaround

If you have trouble and you can update, you can use this workaround. Disable ATS heartbeat for VMFS5 datastores. VMFS3 datastores are not affected by this issue. To disable ATS heartbeat, you can use this PowerCLI one-liner:

Solution

But there is also a solution. Most vendors have published firwmare updates for their products. HPE has released

  • 3PAR OS 3.2.2 MU3
  • 3PAR OS 3.2.2 EMU2 P33, and
  • 3PAR OS 3.2.1 EMU3 P45

All three releases of 3PAR OS include enhancements to improve ATS heartbeat. Because 3PAR OS 3.2.2 has also some nice enhancements for Adaptive Optimization, I recommend to update to 3PAR OS 3.2.2.