Tag Archives: vExpert

Veeam backups fails because of time differences

This posting is ~2 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Last week I had an interesting incident at a customer. The customer reported that one of multiple Veeam backup jobs jobs constantly failed.

jarmoluk/ pixabay.com/ Creative Commons CC0

The backup job included two VMs, and the backup of one of these VMs failed with this error:

The verified the used credentials for that job, but re-entering the password does not solved the issue. I then checked the Veeam backup logs located under %ProgramData%\Veeam\Backup (look for the Agent.Job_Name.Source.VM_Name.vmdk.log) and found VDDK Error 3014:

The user, that was used to connect to the vCenter, was an Active Directory located account. The account were granted administrator privileges root of the vCenter. Switching from an AD located account to Administrator@vsphere.local solved the issue. Next stop: vmware-sts-idmd.log on the vCenter Server appliance. The error found in this log confirmed my theory, that there was an issue with the authentication itself, not an issue with the AD located account.

To make a long story short: Time differences. The vCenter, the ESXi hosts and some servers had the wrong time. vCenter and ESXi hosts were using the Domain Controllers as time source.

This is the ntpq  output of the vCenter. You might notice the jitter values on the right side, both noted in milliseconds.

After some investigation, the root cause seemed to be a bad DCF77 receiver, which was connected to the domain controller that was hosting the PDC Emulator role. The DCF77 receiver was connected using an USB-2-LAN converter. Instead of using a DCF77 receiver, the customer and I implemented a NTP hierarchy using a valid NTP source on the internet (pool.ntp.org).

vSphere Distributed Switch health check fails on HPE Comware switches

This posting is ~2 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

During the replacement of some VMware ESXi hosts at a customer, I discovered a recurrent failure of the vSphere Distributed Switch health checks. A VLAN and MTU mismatch was reported. On the physical side, the ESXi hosts were connected to two HPE 5820 switches, that were configured as an IRF stack. Inside the VMware bubble, the hosts were sharing a vSphere Distributed Switch.

cre8tive / pixelio.de

The switch ports of the old ESXi hosts were configured as Hybrid ports. The switch ports of the new hosts were configured as Trunk ports, to streamline the switch and port configuration.

Some words about port types

Comware knows three different port types:

  • Access
  • Hybrid
  • Trunk

If you were familiar with Cisco, you will know Access and Trunk ports. If you were familiar with HPE ProCurve or Alcatel-Lucent Enterprise, these two port types refer to untagged and tagged ports.

So what is a Hybrid port? A Hybrid port can belong to multiple VLANs where they can be untagged and tagged. Yes, multiple untagged VLANs on a port are possible, but the switch will need additional information to bridge the traffic into correct untagged VLANs. This additional information can be  MAC addresses, IP addresses, LLDP-MED etc. Typically, hybrid ports are used for in VoIP deployments.

The benefit of a Hybrid port is, that I can put the native VLAN of a specific port, which is often referred as Port VLAN identifier (PVID), as a tagged VLAN on that port. This configuration allows, that all dvPortGroups have a VLAN tag assigned, even if the VLAN tag represents the native VLAN of a switch port.

Failing health checks

A failed health check rises a vCenter alarm. In my case, a VLAN and MTU alarm was reported. In both cases, VLAN 1 was causing the error. According to VMware, the three main causes for failed health checks are:

  • Mismatched VLAN trunks between a vSphere distributed switch and physical switch
  • Mismatched MTU settings between physical network adapters, distributed switches, and physical switch ports
  • Mismatched virtual switch teaming policies for the physical switch port-channel settings.

Let’s take a look at the port configuration on the Comware switch:

As you can see, this is a normal trunk port. All VLANs will be passed to the host. This is an except from the  display interface Ten-GigabitEthernet1/0/9  output:

The native VLAN is 1, this is the default configuration. Traffic, that is received and sent from a trunk port, is always tagged with a VLAN id of the originating VLAN – except traffic from the default (native) VLAN! This traffic is sent without a VLAN tag, and if frames were received with a VLAN tag, this frames will be dropped!

If you have a dvPortGroup for the default (native) VLAN, and this dvPortGroup is sending tagged frames, the frames will be dropped if you use a “standard” trunk port. And this is why the health check fails!

Ways to resolve this issue

In my case, the dvPortGroup was configured for VLAN 1, which is the default (native) VLAN on the switch ports.

There are two ways to solve this issue:

  • Remove the VLAN tag from the dvPortGroup configuration
  • Change the PVID for the trunk port

To change the PVID for a trunk port, you have to enter the following command in the interface context:

You have to change the PVID on all ESXi facing switch ports. You can use a non-existing VLAN ID for this.

vSphere Distributed Switch health check will switch to green for VLAN and MTU immediately.

Please note, that this is not the solution for all VLAN-related problems. You should make sure that you are not getting any side effects.

Unsupported hardware family ‘vmx-06’

This posting is ~2 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

A customer of mine got an appliance from a software vendor. The appliance was delivered as ZIP file with a VMDK, a MF, and an OVF file. Unfortunately, the appliance was created with VMware Workstation 6.0 with virtual machine hardware version 6, which is incompatible with VMware ESXi (Virtual machine hardware versions). During deployment, my customer got this error:

The OVF file includes a line with the VM hardware version.

If you change this line from vmx-06 to vmx-07, the hash of the OVF changes, and you will get an error during the deployment of the appliance because of the wrong file hash.

Solution

You have to change the SHA256 hash of the OVF, which is included in the MF file.

To create the new SHA256 hash, you can use the PowerShell cmdlet Get-FileHash .

Replace the hash and save the MF file. Then re-deploy the appliance.

Andreas Lesslhumer wrote a similar blog post in 2015:
“Unsupported hardware family vmx-10” during OVF import

Workaround for broken Windows 10 Start Menus with floating desktops

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Last month, I wrote about a very annoying issue, that I discovered during a Windows 10 VDI deployment: Roaming of the AppData\Local folder breaks the Start Menu of Windows 10 Enterprise (Roaming of AppData\Local breaks Windows 10 Start Menu). During research, I stumbled over dozens of threads about this issue.

Today, after hours and hours of testing, troubleshooting and reading, I might have found a solution.

The environment

Currently I don’t know if this is a workaround, a weird hack, or no solution at all. Maybe it was luck that none of my 2074203423 logins at different linked-clones resulted in a broken start menu. The customer is running:

  • Horizon View 7.1
  • Windows 10 Enterprise N LTSB 2016 (1607)
  • View Agent 7.1 with enabled Persona Management

Searching for a solution

During my tests, I tried to discover WHY the TileDataLayer breaks. As I wrote in my earlier blog post, it is sufficient to delete the TileDataLayer folder. The folder will be recreated during the next logon, and the start menu is working again. Today, I added path for path to “Files and folders excluded from roamin” GPO setting, and at some point I had a working start menu. With this in mind, I did some research and stumbled over a VMware Communities thread (Vmware Horizon View 7.0.3 – Linked clone – Persistent mode – Persona management – Windows 10 (1607) – -> Windows 10 Start Menu doesn’t work)

User oliober did the same: He roamed only a couple of folders, one of them is the TileDataLayer folder, but not the whole Appdata\Local folder.

The “solution”

To make a long story short: You have to enable the roaming of AppData\Local, but then you exclude AppData\Local, and add only necessary folders to the exclusion list of the exclusion. Sounds funny, but it seems to work.

Horizon View GPO AppData Roaming

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Feedback is welcome!

I am very interested in feedback. It would be great if you have the chance to verify this behaviour. Please leave a comment with your results.

As I already said: I don’t know if this is a workaround, a hack, a solution, or no solution at all. But for now, it seems to work. Microsoft deprecated TileDataLayer in Windows 10 1703. So for this new Windows 10 build, we have to find another working solution. The above described “solution” only works for 1607. But if you are using the Long Term Service Branch, this solution will work for the next 10 years. ;)

Some thoughts about using Windows Server 2012 R2 instead of Windows 10 for VDI

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.
Disclaimer: The information from this blog post is provided on an “AS IS” basis, without warranties, both express and implied.

Last week, I had an interesting discussion with a customer. Some months back, the customer has decided to kick-off a PoC for a VMware Horizon View based virtual desktop infrastructure (VDI). He is currently using fat-clients with Windows 8.1, and the new environment should run on Windows 10 Enterprise. Last week, we discussed the idea of using Windows Server 2012 R2 as desktop OS.

Horizon View with Windows Server as desktop OS?

My customer has planned to use VMware Horizon View. The latest release is VMware Horizon View 7.2. VMware KB article 2150295 (Supported Guest Operating Systems for Horizon Agent and Remote Experience) lists all supported (non-Windows 10) Microsoft operating systems for different Horizon VIew releases. This article  shows, that Windows Server 2012 R2 (Standard and Datacenter) are both supported with all Horizon View releases, starting with Horizon View 7.0. The installation of a View Agent is supported, and you can create full- and linked-clone desktop pools. But there is also another important KB article: 2150305 (Feature Support Matrix for Horizon Agent). This article lists all available features, and whether they are compatible with a specific OS or not. According to this artice, the

  • Windows Media MMR,
  • VMware Client IP Transparency, and the
  • Horizon Virtualization Pack for Skype for Business

are not supported with Windows Server 2012 R2 and 2016.

From the support perspective, it’s safe to use Windows Server 2012 R2, or 2016, as desktop OS for a VMware Horizon View based virtual desktop infrastructure.

Licensing

Licensing Microsoft Windows for VDI is PITA. It’s all about the virtual desktop access rights, that can be acquired on two different ways:

  • Software Assurance (SA), or
  • Windows Virtual Desktop Access (VDA)

SA and VDA are available per-user and per-device.

Windows VDI Licensing

Microsoft/ microsoft.com

You need a SA or a VDA for each accessing device or user. There is no need for additional licenses for your virtual desktops! You will get the right to install Windows 10 Enterprise on your virtual desktops. This includes the  LTSB (Long Term Servicing Branch). LTSP offers updates without delivery of new features for the duration of mainstream support (5 years), and extended support (5 years). Another side effect is, that LTSB does not include most of the annoying Windows apps.

Do yourself a favor, and do not try to setup a VDI with Windows 10 Professional…

Service providers, that offer Desktop-as-a-Service (DaaS), are explicitly excluded from this licensing! They must license their stuff according to Microsofts Services Provider Licensing Agreement (SPLA).

How do I have to license Windows Server 2012 R2, if I want to use Windows Server as desktop OS? Windows Server datacenter licensing allows you to run an unlimited number of server VMs on your licensed hardware. To be clear: Windows Server is licensed per physical server, and there is nothing like license mobility! To license the access to the server, your need two different licenses:

  • Windows Server CAL (device or user), and
  • Remote Desktop Services (RDS) CAL (device or user)

The Windows Server CAL is needed for any access to a Microsoft Windows Server from a client, regardless what service is used (even for DHCP). The RDS CAL must be asssigned to any user or device, that is directly or indirectly interacting with the Windows Server desktop, or using a remote desktop access technology (RDS, PCoIP, Blast Extreme etc.) to access the Windows Server desktop.

With this license setup, you have licensed the Windows Server VM itself, and also the access to this VM. There is no need to purchase a SA or VDA.

Do the math

With this in mind, you have to do the math. Compare the licensing costs for Windows 10 and Windows Server 2012 R2/ 2016 in your specific situation. Setup a PoC to verify your requirements, and the support of your software on Windows Server.

Windows Server can be an interesting alternative compared to Windows 10. Maybe some of you, that already use it with Horizon View, have time to add some comments to this blog post. It would be nice to get some feedback about this topic.

Wrong iovDisableIR setting on ProLiant Gen8 might cause a PSOD

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

TL;DR: There’s a script at the bottom of the page that fixes the issue.

Some days ago, this HPE customer advisory caught my attention:

Advisory: (Revision) VMware – HPE ProLiant Gen8 Servers running VMware ESXi 5.5 Patch 10, VMware ESXi 6.0 Patch 4, Or VMware ESXi 6.5 May Experience Purple Screen Of Death (PSOD): LINT1 Motherboard Interrupt

And there is also a corrosponding VMware KB article:

ESXi host fails with intermittent NMI PSOD on HP ProLiant Gen8 servers

It isn’t clear WHY this setting was changed, but in VMware ESXi 5.5 patch 10, 6.0  patch 4, 6.0 U3 and, 6.5 the Intel IOMMU’s interrupt remapper functionality was disabled. So if you are running these ESXi versions on a HPE ProLiant Gen8, you might want to check if you are affected.

To make it clear again, only HPE ProLiant Gen8 models are affected. No newer (Gen9) or older (G6, G7) models.

Currently there is no resolution, only a workaround. The iovDisableIR setting must set to FALSE. If it’s set to TRUE, the Intel IOMMU’s interrupt remapper functionality is disabled.

To check this setting, you have to SSH to each host, and use esxcli  to check the current setting:

I have written a small PowerCLI script that uses the Get-EsxCli cmdlet to check all hosts in a cluster. The script only checks the setting, it doesn’t change the iovDisableIR setting.

Here’s another script, that analyzes and fixes the issue.

Creating console screenshots with Get-ScreenshotFromVM.ps1

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Today, I had a very interesting discussion. As part of an ongoing troubleshooting process, console screenshots of virtual machines should be created.

The colleagues, who were working on the problem, already found a PowerCLI script that was able to create screenshots using the Managed Object Reference (MoRef). But unfortunately all they got were black screens and/ or login prompts. Latter were the reason why they were unable to run the script unattended. They used the Get-VMScreenshot script, which was written by Martin Pugh.

I had some time to take a look at his script and I created my own script, which is based on his idea and some parts of his code.

This file is also available on GitHub.

One important note: If you want to take console screenshots of VMs, please make sure that display power saving settings are disabled! Windows VMs are showing a black screen after some minutes. Please disable this using the energy options, or better using a GPO. Otherwise you will capture a black screen!

vExpert 2017 – My 2 cents about the increasing number of vExperts

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Last Wednesday, VMware has published a list with the vExperts for 2017.

I’m on this list. I’m on this list for the fourth time, which makes me very happy and proud. I was surprised that I’m on this list. I have written only a few blog posts last year. I sometimes tweet about VMware, and I am active in some forums. The focus of this blog has shifted.

Are there too many vExperts?

Eric Siebert from vsphere-land.com wrote a blog post about the vExpert announcement (vExpert 2017 announced and there are still too many vExperts & vExpert class of 2016 announced – are there too many vExperts). Eric thinks that VMware makes it to easy to be a vExpert. There is no definition what it means to significantly contribute to the community.

Yes, Eric is right. The criteria are very “spongy”. And that is a problem. But it is not the problem. When the VMware community grows, the number of vExperts increase automatically.

Betteridge’s law of headlines – the answer is always no

Look at other community programs (Veeam Vanguard, Cisco Champion, PernixPro, Microsoft MVP etc.). These community programs were designed to reward individuals that have highly contributed to the community. These awards motivate individuals to contribute to the community. And if individuals contribute significantly, they are awarded for this. The increasing number of awarded individuals will motivate more community members to contribute. More individuals will be awarded. It’s a self-sustaining process. A process that will help your community to grow.

When you design such a community program, you don’t want to have a small elite group of individuals. You want that your community grows. But you must use the right criterias, and you must held the level high enough. You must not reward anyone, that missed the criteria. Otherwise the title becomes inflationary and loses its importance and reputation.

VMware should work on these criterias. They should rise the bar. But they should make the criterias and the election process transparent for everyone.

Horizon View: Server certificate does not match the external url

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.
Horizon View Certificate Error

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Certificates are always fun… or should I say PITA?  Whatever… During a small Horizon View PoC, I noticed an error message for the View Connection Server.

That’s right, Mr. Connection Server. The certificate subject name does not match the servers external URL, as this screenshot clearly shows.

Horizon View Secure Tunnel Settings

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

But both settings are unused, because a VMware Access Point appliance is in place. If I remove the certificate, that was issued from a public certificate authority, I get an error message because of an invalid, self signed certificate.

I want to use the certificate on the Horizon View Connection Server, but I also want to get rid of the error message, caused by the wrong subject name. The customer uses split DNS, so he is using the same URL internally and externally, and the certificate uses the external URL as subject name.

Change the URLs

The solution is easy:

  1. Enable the checkboxes for the Secure Tunnel connection and the Blast Secure Gateway
  2. Change the hostname to the name, that matches the subject name of the certificate
  3. Uncheck the checkboxes again, and apply the settings
Horizon View Secure Tunnel Settings

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

After a couple of secons, and a refresh of the dashboard, the error for the Connection Server should be gone.

Horizon View Connection Server Details

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

VMware EUC Access Point appliance – Name resolution not working after deployment

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

As part of a project, I had to deploy a VMware EUC Access Point appliance. Nothing fancy, because the awesome VMware Access Point Deployment Utility makes it easy to deploy.

Unfortunately, the deployed Access Point appliance was not working as expected. When I tried to access my Horizon View infrastructure behind the Access Point appliance, I got a HTTP 504 error. The REST API interface was working. I was able to exclude invalid certificates, routing, or firewall policies. I re-deployed the appliance using the the IP address of the connection server, instead of the FQDN. And this worked… I checked the name resolution with nslookup and the name resolution failed. So that was probably the problem.

One per line

To make a long story short: The DNS server, I entered in the VMware Access Point Deployment Utility, were added in a single line to the /etc/resolv.conf

This is wrong, even if the VMware Access Point Deployment Utility claims something different.

euc_deployment_dns

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

There must be a single “nameserver” entry for each DNS server.

You can easily change this after the deployment. Add only one DNS server during the deployment, and then add the second DNS server after the deployment.

I would like to highlight, that Chris Halstead mentioned this behaviour a year ago in his blog post “VMware Access Point Deployment Utility“. Chris is the author of the Deployment Utility.