Tag Archives: bug

Wrong iovDisableIR setting on ProLiant Gen8 might cause a PSOD

TL;DR: There’s a script at the bottom of the page that fixes the issue.

Some days ago, this HPE customer advisory caught my attention:

Advisory: (Revision) VMware – HPE ProLiant Gen8 Servers running VMware ESXi 5.5 Patch 10, VMware ESXi 6.0 Patch 4, Or VMware ESXi 6.5 May Experience Purple Screen Of Death (PSOD): LINT1 Motherboard Interrupt

And there is also a corresponding VMware KB article:

ESXi host fails with intermittent NMI PSOD on HP ProLiant Gen8 servers

It isn’t clear WHY this setting was changed, but in VMware ESXi 5.5 patch 10, ESXi 6.0 patch 4, ESXi 6.0 U3, and ESXi 6.5, the Intel IOMMU’s interrupt remapper functionality was disabled. So if you are running one of these ESXi versions on an HPE ProLiant Gen8, you might want to check if you are affected.

To make it clear again: only HPE ProLiant Gen8 models are affected, no newer (Gen9) or older (G6, G7) models.

Currently there is no resolution, only a workaround: the iovDisableIR setting must be set to FALSE. If it’s set to TRUE, the Intel IOMMU’s interrupt remapper functionality is disabled.

To check this setting, you have to SSH to each host and query it with esxcli:
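The check itself is a single command (the option name is the one from the advisory):

  esxcli system settings kernel list -o iovDisableIR

The output lists the configured, runtime, and default values. A configured value of TRUE means the interrupt remapper is disabled.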

I have written a small PowerCLI script that uses the Get-EsxCli cmdlet to check all hosts in a cluster. The script only checks the setting; it doesn’t change the iovDisableIR setting.
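A minimal sketch of such a check (hypothetical; it assumes an existing Connect-VIServer session and a cluster named “Cluster01”):

  # Check the iovDisableIR kernel setting on every host in the cluster
  foreach ($VMHost in Get-Cluster -Name "Cluster01" | Get-VMHost) {
      $esxcli = Get-EsxCli -VMHost $VMHost -V2
      $setting = $esxcli.system.settings.kernel.list.Invoke(@{option = "iovDisableIR"})
      "{0}: Configured = {1}, Runtime = {2}" -f $VMHost.Name, $setting.Configured, $setting.Runtime
  }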

Here’s another script that analyzes and fixes the issue.
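At its core, the fix is a single esxcli command per host:

  esxcli system settings kernel set --setting=iovDisableIR --value=FALSE

A reboot of the host is required for the new setting to take effect.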

HPE 3PAR OS updates that fix VMware VAAI ATS Heartbeat issue

Customers that use HPE 3PAR StoreServs with 3PAR OS 3.2.1 or 3.2.2 and VMware ESXi 5.5 U2 or later might notice one or more of the following symptoms:

  • hosts lose connectivity to a VMFS5 datastore
  • hosts disconnect from the vCenter
  • VMs hang during I/O operations
  • you see messages like these in the vobd.log or the vCenter Events tab

  • you see the following messages in the vmkernel.log

Interestingly, HPE is not the only vendor affected; multiple vendors have the same issue. VMware describes this issue in KB2113956, and HPE has published a customer advisory about it.

Workaround

If you are affected and can’t update yet, you can use this workaround: disable the ATS heartbeat for VMFS5 datastores (VMFS3 datastores are not affected by this issue). To disable the ATS heartbeat, you can use this PowerCLI one-liner:
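A sketch of the one-liner, based on the host advanced setting named in VMware KB2113956 (VMFS3.UseATSForHBOnVMFS5, where a value of 0 disables the ATS heartbeat):

  Get-VMHost | Get-AdvancedSetting -Name "VMFS3.UseATSForHBOnVMFS5" | Set-AdvancedSetting -Value 0 -Confirm:$false

After updating to a fixed 3PAR OS release, set the value back to 1 to re-enable the ATS heartbeat.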

Solution

But there is also a solution: most vendors have published firmware updates for their products. HPE has released

  • 3PAR OS 3.2.2 MU3
  • 3PAR OS 3.2.2 EMU2 P33, and
  • 3PAR OS 3.2.1 EMU3 P45

All three releases of 3PAR OS include enhancements to improve the ATS heartbeat. Because 3PAR OS 3.2.2 also includes some nice enhancements for Adaptive Optimization, I recommend updating to 3PAR OS 3.2.2.

Data Protector: Copy sessions to encrypted devices fail after update to 9.07

Recently, a customer informed me that copy sessions to encrypted devices failed after an update to Data Protector 9.07. The copy sessions failed with this error:

The customer uses tape encryption. The destination for the backups is an HPE StoreOnce, and a post-backup copy creates a copy of the data on tape. Backup to disk was running fine, but the copy to tape failed immediately.

The customer opened a ticket with HPE support and instantly got a hotfix to resolve this issue. HPE has documented this error in QCCR2A69192. If you run into the same issue, please request hotfix QCCR2A69802. This hotfix consolidates QCCR2A69192 and QCCR2A69318 (the BMA ends abnormally during backup/copy to tape).

Thanks to Stefan for the hint!

Receive Connector role not selectable in Exchange 2016 CU2

Another bug in Exchange 2016 CU2: the role of a new receive connector is greyed out, so you can only select “Front-End-Transport”. This is a screenshot from a German Exchange 2016 CU2.

[Screenshot: the receive connector role is greyed out]

Solution

Use the Exchange Management Shell to create a new receive connector. Afterwards, you can modify it with the Exchange Control Panel (ECP).
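For example (the connector name, server, bindings, and remote IP ranges are hypothetical; adjust them to your environment):

  New-ReceiveConnector -Name "Relay EX01" -Server "EX01" -TransportRole FrontendTransport -Usage Custom -Bindings 0.0.0.0:25 -RemoteIPRanges 192.168.1.0/24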

Microsoft has confirmed that this is a bug in Exchange 2016 CU2.

WSUS on Windows 2012 (R2) and KB3159706 – WSUS console fails to connect

Like any other environment, my lab needs some maintenance from time to time. I use a Windows 2012 R2 VM with the Windows Server Update Services (WSUS) role to keep my Windows VMs up to date. Like many others, I was surprised by KB3148812 (Update enables ESD decryption provision in WSUS in Windows Server 2012 and Windows Server 2012 R2), which broke my WSUS. But the fix was easy: uninstall KB3148812 and reboot the server. The WSUS product team published an article about this known issue in their blog: Known Issues with KB3148812. In the meantime, Microsoft has published a new update, which supersedes KB3148812: KB3159706.

WSUS dead again?

Today I wanted to check the update status of my VMs. Unfortunately, the WSUS console was unable to connect to the WSUS server.

[Screenshot: the WSUS console fails to connect to the WSUS server]

I checked the status of the service and found the WSUS service stopped. But even after I had started the service, the WSUS console was unable to connect to the server. I found an error in the event logs (ID 507, source Windows Server Update Services), but the message “Update Services failed its initialization and stopped” wasn’t helpful. More helpful was a log entry:

After some searching and examination of the recently installed updates, I came across KB3159706.

Manual steps required to complete the installation of KB3159706

Open an elevated CMD and run this command:
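This is the manual post-installation step documented in KB3159706 (the path assumes a default WSUS installation directory):

  "C:\Program Files\Update Services\Tools\wsusutil.exe" postinstall /servicing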

The output should confirm that the post install has completed successfully.

Then you have to install the “HTTP Activation” feature under “.NET Framework 4.5” features.

[Screenshot: the “HTTP Activation” feature under “.NET Framework 4.5”]
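If you prefer PowerShell over the Server Manager GUI, the same feature can be installed with a one-liner (NET-WCF-HTTP-Activation45 is the feature name on Windows Server 2012 R2):

  Install-WindowsFeature -Name NET-WCF-HTTP-Activation45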
After a restart of the WSUS service, WSUS should work again.

Summary

The installation of KB3148812 on a WSUS server will break the WSUS installation. Because of this, Microsoft has published KB3159706. If you install this update (in my case, it was installed automatically via WSUS…), you have to execute some manual steps to ensure that WSUS works as expected. The WSUS product team is aware of this and has pointed it out in the blog article “The long-term fix for KB3148812 issues” (you will find a hint directly at the top of the blog article).

Guest customization fails after upgrade to VMware vSphere 6

VMware vSphere 6 is now a year old, and it was time to update my lab to vSphere 6. The update went smoothly, and everything worked as expected. A few days later, I updated the master VM of a small automated desktop pool. I’m using VMware Horizon 6.2.1 in my lab to deploy a small number of Windows 8.1 VMs for tests, administration etc. The recompose of the pool failed during the guest customization.

[Screenshot: guest customization error in Horizon View]

I checked the customization specification immediately and got an error in the vSphere C# client.

[Screenshot: customization specification error in the vSphere C# client]

Interestingly, I got no error in the vSphere Web Client:

[Screenshot: the customization specification in the vSphere Web Client, without errors]

After re-entering the Administrator password, the customization specification was usable again. No errors so far.

A quick search in the VMware KB led me to the article “Virtual machines with customizations fail to deploy when using Custom SSL Certificates (1019893)”. But this article doesn’t apply to vCenter 6.0. For the record: I’m using CA-signed certificates in my environment. It seems to be a good idea to re-enter the passwords in customization specifications after a vCenter migration/upgrade (5.x to 6.x or from VCSA 5.x to 6.x).

Screen resolution scaling has stopped working after Horizon View agent update

Another inconvenience I noticed during the update process from VMware Horizon View 6.1.1 to 6.2 was that the automatic screen resizing stopped working. When I connected to a desktop pool with the VMware Horizon client, I only got the screen resolution of the VM (the resolution that is used when connecting to the VM with the vSphere console), not 1920×1200 as expected. This issue only occurred with PCoIP, not with RDP. I had this issue with a static desktop and a dynamic desktop pool, and it occurred after updating the Horizon View agent. The resolution scaling worked with a Windows 2012 R2 RDS host when I connected to the RDS host with PCoIP.

VMware KB1018158 (Configuring PCoIP for use with View Manager) did not solve the problem. I checked the VMX version, the video RAM config etc. Nothing had changed, everything was configured as expected. At this point it was clear to me that this must be an issue with the Horizon View agent. I took some snapshots and tried to reinstall the Horizon View agent. I removed the Horizon View agent and the VMware Tools from one of my static desktops. After a reboot, I installed the VMware Tools and then the Horizon agent. To my surprise, this first attempt solved the problem. I tried the same with my second static desktop VM and with the master VM of my dynamic desktop pool (don’t forget to recompose the VMs…). This workaround fixed the problem in each case.

I don’t know if this is a bug. I haven’t found any hints in the VMware Community forum or blogs. Maybe someone knows the answer.

VMware Horizon View agent update on RDS host fails with “Internal Error 25030”

I’m running a small VMware Horizon View environment in my lab. Nothing fancy, but all you need to show what Horizon View can do for you. This environment includes a Windows Server 2012 R2 RDS host. During the update process from Horizon View 6.1.1 to 6.2, I had to update the View agent on this RDS host. This update installation failed with an “Internal Error 25030”, followed by a rollback. Fortunately I had a snapshot, so I went back to the previous state and tried the update again. This attempt also went awry.

To make a long story short: Read the fscking release notes! This quote is taken from the Horizon View 6.2 release notes:

When you upgrade View Agent 6.1.1 to View Agent 6.2 on an RDS host running on Windows Server 2012 or 2012 R2, the upgrade fails with an “Internal Error 25030” message.
Workaround: Uninstall View Agent 6.1.1, restart the RDS host, and install View Agent 6.2.

And this is not the first time that this error has occurred. I found this quote in the Horizon View 6.1.1 release notes:

When you upgrade View Agent 6.1 to View Agent 6.1.1 on an RDS host running on Windows Server 2012 or 2012 R2, the upgrade fails with an “Internal Error 25030” message.
Workaround: Uninstall View Agent 6.1, restart the RDS host, and install View Agent 6.1.1.

If you take a closer look at these two statements, you might notice some similarities… But I do not want to be spiteful. The workaround did the trick: simply uninstall the View agent (if it’s still installed after the rollback, which was not the case for me), reboot, and reinstall the View agent.

Error 1325: VBRCatalog is not a valid short file name

While upgrading a rather old (but very stable) Veeam Backup & Replication 6.1 installation to 8.0 Update 3 (with an intermediate step to 6.5), I ran into a curious error. Right after the welcome screen, this error message

[Screenshot: Error 1325. VBRCatalog is not a valid short file name]

appeared. A closer look into the BackupSetup.log (you can find this log in the %temp% dir. Just enter %temp% into the Explorer address bar) resulted in this very interesting log entry:

First, the VBRCatalog folder was located under D:\Veeam, so why the hell was the CATALOGPATH property changed to E:\VBRCatalog? I searched the registry for E:\VBRCatalog and found multiple entries for it. One of the entries was located under “HKLM\SOFTWARE\Wow6432Node\Veeam\Veeam Backup Catalog”. The entry under “HKLM\SOFTWARE\Veeam\Veeam Backup Catalog” pointed to the correct path. I found some other entries, e.g. in connection with the Windows Installer.

After changing all found entries to the correct path, the update went smoothly. The reason for this error was that the VBRCatalog had been moved after the installation. I did this more than 3 years ago and followed Veeam KB1453. But this article only describes the change of the CatalogPath entry under “HKLM\SOFTWARE\Veeam\Veeam Backup Catalog”. You have to change all references to the old VBRCatalog path! Otherwise you will run into the same error as I did.
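A quick, hypothetical way to compare the CatalogPath value in both registry views (the key and value names are the ones mentioned above; other references, e.g. in the Windows Installer data, still have to be found with a manual registry search):

  # Show the CatalogPath value in the 64-bit and the Wow6432Node registry view
  $keys = "HKLM:\SOFTWARE\Veeam\Veeam Backup Catalog",
          "HKLM:\SOFTWARE\Wow6432Node\Veeam\Veeam Backup Catalog"
  foreach ($key in $keys) {
      if (Test-Path $key) {
          "{0}: {1}" -f $key, (Get-ItemProperty -Path $key).CatalogPath
      }
  }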


Microsoft Exchange 2013 shows blank ECP & OWA after changes to SSL certificates

EDIT
This issue is described in KB2971270 and is fixed in CU6.

I have run into this error a couple of times. After applying changes to SSL certificates (adding, replacing, or deleting an SSL certificate) and rebooting the server, the event log is flooded with events from source “HttpEvent” and event ID 15021. The message says:

If you try to access the Exchange Control Panel (ECP) or Outlook Web Access (OWA), you will get a blank website. To solve this issue, open up an elevated command prompt on your Exchange 2013 server.
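The current bindings can be listed with standard netsh syntax (nothing Exchange-specific here):

  netsh http show sslcert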

Check the certificate hash and application ID for 0.0.0.0:443, 0.0.0.0:444 and 127.0.0.1:443. You will notice that the application ID for these three entries is the same, but the certificate hash for 0.0.0.0:444 differs from the other two entries. And that’s the point. Remove the certificate for 0.0.0.0:444.
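The corresponding command is:

  netsh http delete sslcert ipport=0.0.0.0:444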

Now add it again with the correct certificate hash and application ID.
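Substitute the placeholders with the certificate hash and application ID shown for the 0.0.0.0:443 binding:

  netsh http add sslcert ipport=0.0.0.0:444 certhash=<certhash of 0.0.0.0:443> appid=<appid of 0.0.0.0:443>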

That’s it. Reboot the Exchange 2013 server and everything should be up and running again.