Replacing an expired lookup service SSL certificate on a vSphere PSC

A few days ago, I ran into a very nasty problem. Fortunately, it was in my lab. Some months ago, I replaced the certificates of my vCenter Server Appliance (VCSA) and chose to use the VMware Certificate Authority (VMCA) as a subordinate (intermediate) CA of my AD-based enterprise CA. The certificates were replaced using the vSphere 6.0 Certificate Manager (/usr/lib/vmware-vmca/bin/certificate-manager), and I followed the instructions of KB2112016 (Configuring VMware vSphere 6.0 VMware Certificate Authority as a subordinate Certificate Authority).

The VCSA was migrated from vSphere 5.5, where I was also using custom certificates. These certificates were also issued by my AD-based enterprise CA, and they were carried over during the vSphere 5.5 > 6.0 migration. So in the end, I replaced custom certificates with certificates issued by the VMCA (as an intermediate CA).

Everything was fine until a power outage. After powering on my VMs, I noticed several errors. After logging into the vSphere Web Client, I got an error message at the top of the page:

Error occurred while processing request. Check vSphere WebClient logs for details.

While searching for the cause, I checked the URL of the Platform Services Controller (https://vcsa1.lab.local/psc/login) and got this:

Patrick Terlisten/ vcloudnine.de/ Creative Commons CC0

HTTP Status 400 - An error occurred while sending an authentication request to the PSC Single Sign-On server - null

type Status report
message An error occurred while sending an authentication request to the PSC Single Sign-On server - null
description The request sent by the client was syntactically incorrect.

This error led me to KB2144086 (Updating certificates using certificate manager on vCenter Server or PSC 6.0 Update 1b fails), but I was able to prove that I had used different subject names for the different solution user certificates.

While digging in the PSC logs, I found this error in the /var/log/vmware/psc-client/psc-client.log:

Caused by: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint doesn't match
        at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:217)
        at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(Unknown Source)

        ... 71 more

Finally, I found Aaron Smith's blog post “Troubleshooting Expired PSC Certificates with vSphere 6”. He had the same problem. I checked the certificate of the Lookup Service and there it was:

Patrick Terlisten/ vcloudnine.de/ Creative Commons CC0

This was the original custom certificate, issued by my AD-based enterprise CA, and installed on my vSphere 5.5 VCSA.

Aaron also offered the solution by referencing KB2118939 (Replacing the Lookup Service SSL certificate on a Platform Services Controller 6.0). I followed the instructions in KB2118939 and replaced the certificate of the Lookup Service with a certificate of the VMCA.

Take care of your certificates

With vSphere 6.0, the Lookup Service should be accessed through the HTTP Reverse Proxy. This proxy uses the machine certificate, so an expired Lookup Service certificate is not obvious. If you connect directly to the Lookup Service using port 7444, you will see the expired certificate. The Lookup Service certificate is not replaced with a custom certificate when you replace the solution user certificates.
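If you prefer to script the check instead of opening port 7444 in a browser, something like the following Python sketch might help. It is only an illustration and not part of any VMware KB: the function names are mine, it assumes the openssl command line tool is installed on the machine running the script, and it assumes your Python build can still negotiate TLS with the Lookup Service. The certificate is fetched without validation, which is what you want here, because an expired certificate would otherwise abort the handshake.

```python
import ssl
import subprocess
import time

LOOKUP_SERVICE_PORT = 7444  # direct Lookup Service port mentioned above


def days_until_expiry(not_after, now=None):
    """Return the days left until an OpenSSL-style notAfter timestamp.

    The result is negative if the certificate has already expired.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host, port=LOOKUP_SERVICE_PORT):
    """Fetch the server certificate (unvalidated) and return its notAfter."""
    pem = ssl.get_server_certificate((host, port))
    # Assumption: the "openssl" binary is available in PATH.
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate"],
        input=pem, capture_output=True, text=True, check=True,
    )
    # openssl prints a line of the form "notAfter=<timestamp> GMT"
    return result.stdout.strip().split("=", 1)[1]


# Example usage against my lab PSC (hostname taken from the URL above):
#   not_after = fetch_not_after("vcsa1.lab.local")
#   print(not_after, days_until_expiry(not_after))
```

A negative value from days_until_expiry means the Lookup Service certificate has expired, even if the machine certificate presented by the reverse proxy is still valid.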

If you have a vSphere 6.0 VCSA that was migrated from vSphere 5.5, and you replaced the certificates on that vSphere 5.5 VCSA with custom certificates, you should check your Lookup Service certificate immediately! Follow KB2118939 for further instructions.

Credit to Aaron Smith for this blog post. Thank you!