Tag Archives: vExpert

Backup from a secondary HPE 3PAR StoreServ array with Veeam Backup & Replication

When taking a backup with Veeam Backup & Replication, a VM snapshot is created to get a consistent state of the VM. The snapshot is taken prior the backup, and it is removed after the successful backup of the VM. The snapshot grows during its lifetime, and you should keep in mind, that you need some free space in the datastore for snapshots. This can be a problem, especially in case of multiple VM backups at a time, and if the VMs share the same datastore.

Benefit of storage snapshots

If your underlying storage supports the creation of storage snapshots, Veeam offers an additional way to create a consistent state of the VMs. In this case, a storage snapshot is taken, which is presented to the backup proxy, and is then used to backup the data. As you can see: No VM snapshot is taken.

Now one more thing: If you have a replication or synchronous mirror between two storage systems, Veeam can do this operation on the secondary array. This is pretty cool, because it takes load from you primary storage!

Backup from a secondary HPE 3PAR StoreServ array

Last week I was able to try something new: Backup from a secondary HPE 3PAR StoreServ array. A customer has two HPE 3PAR StoreServ 8200 in a Peer Persistence setup, a HPE StoreOnce, and a physical Veeam backup server, which also acts as Veeam proxy. Everything is attached to a pretty nice 16 Gb dual Fabric SAN. The customer uses Veeam Backup & Replication 9.5 U3a. The data was taken from the secondary 3PAR StoreServ and it was pushed via FC into a Catalyst Store on a StoreOnce. Using the Catalyst API allows my customer to use Synthetic Full backups, because the creation is offloaded to StoreOnce. This setup is dramatically faster and better than the prior solution based on MicroFocus Data Protector. Okay, this last backup solution was designed to another time with other priorities and requirements. it was a perfect fit at the time it was designed.

This blog post from Veeam pointed me to this new feature: Backup from a secondary HPE 3PAR StoreServ array. Until I found this post, it was planned to use “traditional” storage snapshots, taken from the primary 3PAR StoreServ.

With this feature enabled, Veeam takes the snapshot on the 3PAR StoreServ, that is hosting the synchronous mirrored virtual volume. This graphic was created by Veeam and shows the backup workflow.

Veeam/ Backup from secondary array/ Copyright by Veeam

My tests showed, that it’s blazing fast, pretty easy to setup, and it takes unnecessary load from the primary storage.

In essence, there are only three steps to do:

  • add both 3PARs to Veeam
  • add the registry value and restart the Veeam Backup Server Service
  • enable the usage of storage snapshots in the backup job

How to enable this feature?

To enable this feature, you have to add a single registry value on the Veeam backup server, and afterwards restart the Veeam Backup Server service.

  • Location: HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication\
  • Name: Hp3PARPeerPersistentUseSecondary
  • Type: REG_DWORD (0 False, 1 True)
  • Default value: 0 (disabled)

Thanks to Pierre-Francois from Veeam for sharing his knowledge with the community. Read his blog post Backup from a secondary HPE 3PAR StoreServ array for additional information.

Veeam backups fails because of time differences

Last week I had an interesting incident at a customer. The customer reported that one of multiple Veeam backup jobs jobs constantly failed.

jarmoluk/ pixabay.com/ Creative Commons CC0

The backup job included two VMs, and the backup of one of these VMs failed with this error:

The verified the used credentials for that job, but re-entering the password does not solved the issue. I then checked the Veeam backup logs located under %ProgramData%\Veeam\Backup (look for the Agent.Job_Name.Source.VM_Name.vmdk.log) and found VDDK Error 3014:

The user, that was used to connect to the vCenter, was an Active Directory located account. The account were granted administrator privileges root of the vCenter. Switching from an AD located account to Administrator@vsphere.local solved the issue. Next stop: vmware-sts-idmd.log on the vCenter Server appliance. The error found in this log confirmed my theory, that there was an issue with the authentication itself, not an issue with the AD located account.

To make a long story short: Time differences. The vCenter, the ESXi hosts and some servers had the wrong time. vCenter and ESXi hosts were using the Domain Controllers as time source.

This is the ntpq  output of the vCenter. You might notice the jitter values on the right side, both noted in milliseconds.

After some investigation, the root cause seemed to be a bad DCF77 receiver, which was connected to the domain controller that was hosting the PDC Emulator role. The DCF77 receiver was connected using an USB-2-LAN converter. Instead of using a DCF77 receiver, the customer and I implemented a NTP hierarchy using a valid NTP source on the internet (pool.ntp.org).

vSphere Distributed Switch health check fails on HPE Comware switches

During the replacement of some VMware ESXi hosts at a customer, I discovered a recurrent failure of the vSphere Distributed Switch health checks. A VLAN and MTU mismatch was reported. On the physical side, the ESXi hosts were connected to two HPE 5820 switches, that were configured as an IRF stack. Inside the VMware bubble, the hosts were sharing a vSphere Distributed Switch.

cre8tive / pixelio.de

The switch ports of the old ESXi hosts were configured as Hybrid ports. The switch ports of the new hosts were configured as Trunk ports, to streamline the switch and port configuration.

Some words about port types

Comware knows three different port types:

  • Access
  • Hybrid
  • Trunk

If you were familiar with Cisco, you will know Access and Trunk ports. If you were familiar with HPE ProCurve or Alcatel-Lucent Enterprise, these two port types refer to untagged and tagged ports.

So what is a Hybrid port? A Hybrid port can belong to multiple VLANs where they can be untagged and tagged. Yes, multiple untagged VLANs on a port are possible, but the switch will need additional information to bridge the traffic into correct untagged VLANs. This additional information can be  MAC addresses, IP addresses, LLDP-MED etc. Typically, hybrid ports are used for in VoIP deployments.

The benefit of a Hybrid port is, that I can put the native VLAN of a specific port, which is often referred as Port VLAN identifier (PVID), as a tagged VLAN on that port. This configuration allows, that all dvPortGroups have a VLAN tag assigned, even if the VLAN tag represents the native VLAN of a switch port.

Failing health checks

A failed health check rises a vCenter alarm. In my case, a VLAN and MTU alarm was reported. In both cases, VLAN 1 was causing the error. According to VMware, the three main causes for failed health checks are:

  • Mismatched VLAN trunks between a vSphere distributed switch and physical switch
  • Mismatched MTU settings between physical network adapters, distributed switches, and physical switch ports
  • Mismatched virtual switch teaming policies for the physical switch port-channel settings.

Let’s take a look at the port configuration on the Comware switch:

As you can see, this is a normal trunk port. All VLANs will be passed to the host. This is an except from the  display interface Ten-GigabitEthernet1/0/9  output:

The native VLAN is 1, this is the default configuration. Traffic, that is received and sent from a trunk port, is always tagged with a VLAN id of the originating VLAN – except traffic from the default (native) VLAN! This traffic is sent without a VLAN tag, and if frames were received with a VLAN tag, this frames will be dropped!

If you have a dvPortGroup for the default (native) VLAN, and this dvPortGroup is sending tagged frames, the frames will be dropped if you use a “standard” trunk port. And this is why the health check fails!

Ways to resolve this issue

In my case, the dvPortGroup was configured for VLAN 1, which is the default (native) VLAN on the switch ports.

There are two ways to solve this issue:

  • Remove the VLAN tag from the dvPortGroup configuration
  • Change the PVID for the trunk port

To change the PVID for a trunk port, you have to enter the following command in the interface context:

You have to change the PVID on all ESXi facing switch ports. You can use a non-existing VLAN ID for this.

vSphere Distributed Switch health check will switch to green for VLAN and MTU immediately.

Please note, that this is not the solution for all VLAN-related problems. You should make sure that you are not getting any side effects.

Unsupported hardware family ‘vmx-06’

A customer of mine got an appliance from a software vendor. The appliance was delivered as ZIP file with a VMDK, a MF, and an OVF file. Unfortunately, the appliance was created with VMware Workstation 6.0 with virtual machine hardware version 6, which is incompatible with VMware ESXi (Virtual machine hardware versions). During deployment, my customer got this error:

The OVF file includes a line with the VM hardware version.

If you change this line from vmx-06 to vmx-07, the hash of the OVF changes, and you will get an error during the deployment of the appliance because of the wrong file hash.

Solution

You have to change the SHA256 hash of the OVF, which is included in the MF file.

To create the new SHA256 hash, you can use the PowerShell cmdlet Get-FileHash .

Replace the hash and save the MF file. Then re-deploy the appliance.

Andreas Lesslhumer wrote a similar blog post in 2015:
“Unsupported hardware family vmx-10” during OVF import

Workaround for broken Windows 10 Start Menus with floating desktops

Last month, I wrote about a very annoying issue, that I discovered during a Windows 10 VDI deployment: Roaming of the AppData\Local folder breaks the Start Menu of Windows 10 Enterprise (Roaming of AppData\Local breaks Windows 10 Start Menu). During research, I stumbled over dozens of threads about this issue.

Today, after hours and hours of testing, troubleshooting and reading, I might have found a solution.

The environment

Currently I don’t know if this is a workaround, a weird hack, or no solution at all. Maybe it was luck that none of my 2074203423 logins at different linked-clones resulted in a broken start menu. The customer is running:

  • Horizon View 7.1
  • Windows 10 Enterprise N LTSB 2016 (1607)
  • View Agent 7.1 with enabled Persona Management

Searching for a solution

During my tests, I tried to discover WHY the TileDataLayer breaks. As I wrote in my earlier blog post, it is sufficient to delete the TileDataLayer folder. The folder will be recreated during the next logon, and the start menu is working again. Today, I added path for path to “Files and folders excluded from roamin” GPO setting, and at some point I had a working start menu. With this in mind, I did some research and stumbled over a VMware Communities thread (Vmware Horizon View 7.0.3 – Linked clone – Persistent mode – Persona management – Windows 10 (1607) – -> Windows 10 Start Menu doesn’t work)

User oliober did the same: He roamed only a couple of folders, one of them is the TileDataLayer folder, but not the whole Appdata\Local folder.

The “solution”

To make a long story short: You have to enable the roaming of AppData\Local, but then you exclude AppData\Local, and add only necessary folders to the exclusion list of the exclusion. Sounds funny, but it seems to work.

Feedback is welcome!

I am very interested in feedback. It would be great if you have the chance to verify this behaviour. Please leave a comment with your results.

As I already said: I don’t know if this is a workaround, a hack, a solution, or no solution at all. But for now, it seems to work. Microsoft deprecated TileDataLayer in Windows 10 1703. So for this new Windows 10 build, we have to find another working solution. The above described “solution” only works for 1607. But if you are using the Long Term Service Branch, this solution will work for the next 10 years. ;)

Some thoughts about using Windows Server 2012 R2 instead of Windows 10 for VDI

Disclaimer: The information from this blog post is provided on an “AS IS” basis, without warranties, both express and implied.

Last week, I had an interesting discussion with a customer. Some months back, the customer has decided to kick-off a PoC for a VMware Horizon View based virtual desktop infrastructure (VDI). He is currently using fat-clients with Windows 8.1, and the new environment should run on Windows 10 Enterprise. Last week, we discussed the idea of using Windows Server 2012 R2 as desktop OS.

Horizon View with Windows Server as desktop OS?

My customer has planned to use VMware Horizon View. The latest release is VMware Horizon View 7.2. VMware KB article 2150295 (Supported Guest Operating Systems for Horizon Agent and Remote Experience) lists all supported (non-Windows 10) Microsoft operating systems for different Horizon VIew releases. This article  shows, that Windows Server 2012 R2 (Standard and Datacenter) are both supported with all Horizon View releases, starting with Horizon View 7.0. The installation of a View Agent is supported, and you can create full- and linked-clone desktop pools. But there is also another important KB article: 2150305 (Feature Support Matrix for Horizon Agent). This article lists all available features, and whether they are compatible with a specific OS or not. According to this artice, the

  • Windows Media MMR,
  • VMware Client IP Transparency, and the
  • Horizon Virtualization Pack for Skype for Business

are not supported with Windows Server 2012 R2 and 2016.

From the support perspective, it’s safe to use Windows Server 2012 R2, or 2016, as desktop OS for a VMware Horizon View based virtual desktop infrastructure.

Licensing

Licensing Microsoft Windows for VDI is PITA. It’s all about the virtual desktop access rights, that can be acquired on two different ways:

  • Software Assurance (SA), or
  • Windows Virtual Desktop Access (VDA)

SA and VDA are available per-user and per-device.

Source: Microsoft

You need a SA or a VDA for each accessing device or user. There is no need for additional licenses for your virtual desktops! You will get the right to install Windows 10 Enterprise on your virtual desktops. This includes the  LTSB (Long Term Servicing Branch). LTSP offers updates without delivery of new features for the duration of mainstream support (5 years), and extended support (5 years). Another side effect is, that LTSB does not include most of the annoying Windows apps.

Do yourself a favor, and do not try to setup a VDI with Windows 10 Professional…

Service providers, that offer Desktop-as-a-Service (DaaS), are explicitly excluded from this licensing! They must license their stuff according to Microsofts Services Provider Licensing Agreement (SPLA).

How do I have to license Windows Server 2012 R2, if I want to use Windows Server as desktop OS? Windows Server datacenter licensing allows you to run an unlimited number of server VMs on your licensed hardware. To be clear: Windows Server is licensed per physical server, and there is nothing like license mobility! To license the access to the server, your need two different licenses:

  • Windows Server CAL (device or user), and
  • Remote Desktop Services (RDS) CAL (device or user)

The Windows Server CAL is needed for any access to a Microsoft Windows Server from a client, regardless what service is used (even for DHCP). The RDS CAL must be asssigned to any user or device, that is directly or indirectly interacting with the Windows Server desktop, or using a remote desktop access technology (RDS, PCoIP, Blast Extreme etc.) to access the Windows Server desktop.

With this license setup, you have licensed the Windows Server VM itself, and also the access to this VM. There is no need to purchase a SA or VDA.

Do the math

With this in mind, you have to do the math. Compare the licensing costs for Windows 10 and Windows Server 2012 R2/ 2016 in your specific situation. Setup a PoC to verify your requirements, and the support of your software on Windows Server.

Windows Server can be an interesting alternative compared to Windows 10. Maybe some of you, that already use it with Horizon View, have time to add some comments to this blog post. It would be nice to get some feedback about this topic.

Wrong iovDisableIR setting on ProLiant Gen8 might cause a PSOD

TL;DR: There’s a script at the bottom of the page that fixes the issue.

Some days ago, this HPE customer advisory caught my attention:

Advisory: (Revision) VMware – HPE ProLiant Gen8 Servers running VMware ESXi 5.5 Patch 10, VMware ESXi 6.0 Patch 4, Or VMware ESXi 6.5 May Experience Purple Screen Of Death (PSOD): LINT1 Motherboard Interrupt

And there is also a corrosponding VMware KB article:

ESXi host fails with intermittent NMI PSOD on HP ProLiant Gen8 servers

It isn’t clear WHY this setting was changed, but in VMware ESXi 5.5 patch 10, 6.0  patch 4, 6.0 U3 and, 6.5 the Intel IOMMU’s interrupt remapper functionality was disabled. So if you are running these ESXi versions on a HPE ProLiant Gen8, you might want to check if you are affected.

To make it clear again, only HPE ProLiant Gen8 models are affected. No newer (Gen9) or older (G6, G7) models.

Currently there is no resolution, only a workaround. The iovDisableIR setting must set to FALSE. If it’s set to TRUE, the Intel IOMMU’s interrupt remapper functionality is disabled.

To check this setting, you have to SSH to each host, and use esxcli  to check the current setting:

I have written a small PowerCLI script that uses the Get-EsxCli cmdlet to check all hosts in a cluster. The script only checks the setting, it doesn’t change the iovDisableIR setting.

Here’s another script, that analyzes and fixes the issue.

Creating console screenshots with Get-ScreenshotFromVM.ps1

Today, I had a very interesting discussion. As part of an ongoing troubleshooting process, console screenshots of virtual machines should be created.

The colleagues, who were working on the problem, already found a PowerCLI script that was able to create screenshots using the Managed Object Reference (MoRef). But unfortunately all they got were black screens and/ or login prompts. Latter were the reason why they were unable to run the script unattended. They used the Get-VMScreenshot script, which was written by Martin Pugh.

I had some time to take a look at his script and I created my own script, which is based on his idea and some parts of his code.

This file is also available on GitHub.

One important note: If you want to take console screenshots of VMs, please make sure that display power saving settings are disabled! Windows VMs are showing a black screen after some minutes. Please disable this using the energy options, or better using a GPO. Otherwise you will capture a black screen!

vExpert 2017 – My 2 cents about the increasing number of vExperts

Last Wednesday, VMware has published a list with the vExperts for 2017.

I’m on this list. I’m on this list for the fourth time, which makes me very happy and proud. I was surprised that I’m on this list. I have written only a few blog posts last year. I sometimes tweet about VMware, and I am active in some forums. The focus of this blog has shifted.

Are there too many vExperts?

Eric Siebert from vsphere-land.com wrote a blog post about the vExpert announcement (vExpert 2017 announced and there are still too many vExperts & vExpert class of 2016 announced – are there too many vExperts). Eric thinks that VMware makes it to easy to be a vExpert. There is no definition what it means to significantly contribute to the community.

Yes, Eric is right. The criteria are very “spongy”. And that is a problem. But it is not the problem. When the VMware community grows, the number of vExperts increase automatically.

Betteridge’s law of headlines – the answer is always no

Look at other community programs (Veeam Vanguard, Cisco Champion, PernixPro, Microsoft MVP etc.). These community programs were designed to reward individuals that have highly contributed to the community. These awards motivate individuals to contribute to the community. And if individuals contribute significantly, they are awarded for this. The increasing number of awarded individuals will motivate more community members to contribute. More individuals will be awarded. It’s a self-sustaining process. A process that will help your community to grow.

When you design such a community program, you don’t want to have a small elite group of individuals. You want that your community grows. But you must use the right criterias, and you must held the level high enough. You must not reward anyone, that missed the criteria. Otherwise the title becomes inflationary and loses its importance and reputation.

VMware should work on these criterias. They should rise the bar. But they should make the criterias and the election process transparent for everyone.

Horizon View: Server certificate does not match the external url

Certificates are always fun… or should I say PITA?  Whatever… During a small Horizon View PoC, I noticed an error message for the View Connection Server.

That’s right, Mr. Connection Server. The certificate subject name does not match the servers external URL, as this screenshot clearly shows.

But both settings are unused, because a VMware Access Point appliance is in place. If I remove the certificate, that was issued from a public certificate authority, I get an error message because of an invalid, self signed certificate.

I want to use the certificate on the Horizon View Connection Server, but I also want to get rid of the error message, caused by the wrong subject name. The customer uses split DNS, so he is using the same URL internally and externally, and the certificate uses the external URL as subject name.

Change the URLs

The solution is easy:

  1. Enable the checkboxes for the Secure Tunnel connection and the Blast Secure Gateway
  2. Change the hostname to the name, that matches the subject name of the certificate
  3. Uncheck the checkboxes again, and apply the settings

After a couple of secons, and a refresh of the dashboard, the error for the Connection Server should be gone.