Tag Archives: vsphere

Powering on a VM with shared VMDK fails after extending a EagerZeroedThick VMDK

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

I hope that you are not reading this blog post while searching for a solution for a failed cluster. If so, feel free to leave a comment if this blog post saved your evening or weekend. :)

Last friday, a change at one of my customers went horribly wrong. I was not onsite, but they contacted me during the night from friday to saturday, because their most important Windows Server Failover Cluster was unable to start after extending a shared VMDK.

cripi/ pixabay.com/ Creative Commons CC0

They tried something pretty simple: Extending an virtual disk of a VM. That is something most of us doing pretty often. The customer did this also pretty often. It was a well known task… Except the fact, that the VM was part of a Windows Server Failover Cluster. With shared VMDKs. And the disks were EagerZeroedThick, because this is a requirement for shared VMDKs.

They extended the disk using the vSphere Web Client. And at this point, the change was doomed to fail. They tried to power-on the VMs, but all they got was this error:

VMware ESX cannot open the virtual disk, “/vmfs/volumes/4c549ecd-66066010-e610-002354a2261b/VMNAME/VMDKNAME.vmdk” for clustering. Please verify that the virtual disk was created using the ‘thick’ option.

A shared VMDK is a VMDK in multiwriter mode. This VMDK has to be created as Thick Provision Eager Zeroed. And if you wish to extend this VMDK, you must use vmkfstools  with the option -d eagerzeroedthick. If you extend the VMDK using the Web Client, the extended portion of the disk will become LazyZeroed!

VMware has described this behaviour in the KB1033570 (Powering on the virtual machine fails with the error: Thin/TBZ disks cannot be opened in multiwriter mode). There is also a blog post by Cormac Hogan at VMware, who has described this behaviour.

That’s a screenshot from the failed cluster. Check out the type of the disk (Thick-Provision Lazy-Zeroed).

Patrick Terlisten/ vcloudnine.de/ Creative Commons CC0

You must use vmkfstools  to extend a shared VMDK – but vmkfstools is also the solution, if you have trapped into this pitfall. Clone the VMDK with option -d eagerzeroedthick.

vmkfstools -i old.vmdk new.vmdk -d eagerzeroedthick

Another solution, which was new to me, is to use Storage vMotion. You can migrate the “broken” VMDK to another datastore and change the the disk format during Storage vMotion. This solution is described in the “Notes” section of KB1033570.

Both ways will fix the problem. The result will be a Thick Provision Eager Zeroed VMDK, which will allow the VMs to be successfully powered on.

Veeam backups fails because of time differences

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Last week I had an interesting incident at a customer. The customer reported that one of multiple Veeam backup jobs jobs constantly failed.

jarmoluk/ pixabay.com/ Creative Commons CC0

The backup job included two VMs, and the backup of one of these VMs failed with this error:

Error: Failed to open VDDK disk [[VMDS-SAS-01] VMDC1/VMDC1_1.vmdk] ( is read-only mode - [true] ) 
Failed to open virtual disk Logon attempt with parameters [VC/ESX: [vcenter.domain.tld];Port: 443;Login: 
[AD\Administrator];VMX Spec: [moref=vm-59];Snapshot mor: [snapshot-20226];Transports: [san];Read Only: [true]]
failed because of the following errors: Failed to open virtual disk Logon attempt with parameters 
VC/ESX: [vcenter.domain.tld];Port: 443;Login: [AD\Administrator

The verified the used credentials for that job, but re-entering the password does not solved the issue. I then checked the Veeam backup logs located under %ProgramData%\Veeam\Backup (look for the Agent.Job_Name.Source.VM_Name.vmdk.log) and found VDDK Error 3014:

Insufficient permissions in the host operating system

The user, that was used to connect to the vCenter, was an Active Directory located account. The account were granted administrator privileges root of the vCenter. Switching from an AD located account to Administrator@vsphere.local solved the issue. Next stop: vmware-sts-idmd.log on the vCenter Server appliance. The error found in this log confirmed my theory, that there was an issue with the authentication itself, not an issue with the AD located account.

[2018-07-04T11:59:49.848+02:00 vsphere.local        142f5216-8316-4752-b02c-e02be4154816 INFO ] [VmEventAppender] EventLog: source=[VMware Identity Server], tenant=[vsphere.local], eventid=[USER_NAME_PWD_AUTH_FAILED], level=[ERROR], category=[VMEVENT_CATEGORY_IDM], text=[Failed to authenticate principal [AD\Administrator]. Native platform error [code: 851968][null][null]], detailText=[com.vmware.identity.interop.idm.IdmNativeException: Native platform error [code: 851968][null][null]
[2018-07-04T11:59:49.848+02:00 vsphere.local        142f5216-8316-4752-b02c-e02be4154816 ERROR] [IdentityManager] Failed to authenticate principal [AD\Administrator]. Native platform error [code: 851968][null][null]
com.vmware.identity.interop.idm.IdmNativeException: Native platform error [code: 851968][null][null]

[2018-07-04T12:10:41.603+02:00 vsphere.local        64051ea1-0d7f-453d-8e34-92f0c8c37e77 INFO ] [IdentityManager] Authentication succeeded for user [AD\Administrator] in tenant [vsphere.local] in [37] milliseconds with provider [ad.domain.tld] of type [com.vmware.identity.idm.server.provider.activedirectory.ActiveDirectoryProvider]

To make a long story short: Time differences. The vCenter, the ESXi hosts and some servers had the wrong time. vCenter and ESXi hosts were using the Domain Controllers as time source.

This is the ntpq  output of the vCenter. You might notice the jitter values on the right side, both noted in milliseconds.

vcenter:/storage/log/vmware/sso # ntpq
ntpq> peer
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
vmdc2.ad.       192.168.16.11    2 u   53   64  363    0.532  207.553 152007.
vmdc1.ad.       .LOCL.           1 u    2   64  377    0.257  204.559 161964.

After some investigation, the root cause seemed to be a bad DCF77 receiver, which was connected to the domain controller that was hosting the PDC Emulator role. The DCF77 receiver was connected using an USB-2-LAN converter. Instead of using a DCF77 receiver, the customer and I implemented a NTP hierarchy using a valid NTP source on the internet (pool.ntp.org).

Unsupported hardware family ‘vmx-06’

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

A customer of mine got an appliance from a software vendor. The appliance was delivered as ZIP file with a VMDK, a MF, and an OVF file. Unfortunately, the appliance was created with VMware Workstation 6.0 with virtual machine hardware version 6, which is incompatible with VMware ESXi (Virtual machine hardware versions). During deployment, my customer got this error:

unsupported hardware family 'vmx-06'

The OVF file includes a line with the VM hardware version.

<vssd:VirtualSystemType>vmx-06</vssd:VirtualSystemType>

If you change this line from vmx-06 to vmx-07, the hash of the OVF changes, and you will get an error during the deployment of the appliance because of the wrong file hash.

Solution

You have to change the SHA256 hash of the OVF, which is included in the MF file.

SHA256(appliance-d9-64-bit.ovf)= 46b84a48a03a8b183ff88168ad43e56bd95ffaa78d0a5a8b2b8cf87e0d45f36a
SHA256(appliance-d9-64-bit-file1.iso)= ec78bc48b48d676775b60eda41528ec33c151c2ce7414a12b13d9b73d34de544
SHA256(appliance-d9-64-bit-disk1.vmdk)= 6b4c1bea5706ce554630b5b5407ac31434e4b41a81930e8cc46f36511085fcd9

To create the new SHA256 hash, you can use the PowerShell cmdlet Get-FileHash .

PS C:\Users\p.terlisten\Downloads> Get-FileHash -Algorithm SHA256 .\appliance-d9-64-bit.ovf

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          46B84A48A03A8B183FF88168AD43E56BD95FFAA78D0A5A8B2B8CF87E0D45F36A       C:\Users\p.terlisten\Download...


PS C:\Users\p.terlisten\Downloads> Get-FileHash -Algorithm SHA256 .\endos-d9-64-bit.ovf

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          C14954237907AB45F75C669B5AD2B0A8159096D8526064AC5646F71066DE5C94       C:\Users\p.terlisten\Download...

Replace the hash and save the MF file. Then re-deploy the appliance.

Andreas Lesslhumer wrote a similar blog post in 2015:
“Unsupported hardware family vmx-10” during OVF import

Hell freezes over – VMware virtualization on Microsoft Azure

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.
Update

On November 22, 2017, Ajay Patel (Senior Vice President, Product Development, Cloud Services, VMware) published a blog post in reaction to Microsofts announcement (VMware – The Platform of Choice in the Cloud). Especially these statements are interesting:

No VMware-certified partner names have been mentioned nor have any partners collaborated with VMware in engineering this offering. This offering has been developed independent of VMware, and is neither certified nor supported by VMware.

and

Microsoft recognizing the leadership position of VMware’s offering and exploring support for VMware on Azure as a superior and necessary solution for customers over Hyper-V or native Azure Stack environments is understandable but, we do not believe this approach will offer customers a good solution to their hybrid or multi-cloud future.

Looks like VMware is not happy about Microsofts annoucement. And this blog post clearly states, that VMware will not partner with VMware to bringt VMware virtualization stack on Azure.

I don’t know if this is a wise decision of VMware. The hypervisor, their core product, is a commodity nowadays. We are taking about a bare-metal solution, so it’s not different from what VMware build with AWS. It’s more about how it is embedded in the cloud services and cloud control plane. If you use VMware vSphere, Horizon and O365, the step to move virtualization workloads to VMware on Azure is smaller, than move it to AWS.

On November 23, 2017, the register published this interesting analysis: VMware refuses to support its wares running in Azure.

Original post

Yesterday, Microsoft announced new services to ease the migration from VMware to Microsoft Azure. Corey Sanders (Director of Compute, Azure) posted a blog post (Transforming your VMware environment with Microsoft Azure) and introduced three new Azure services.

Microsoft Azure

Microsoft/ microsoft.com

Azure Migrate

The free Azure Migrate service does not focus on single server workloads. It is focused on multi-server application and will help customers through the three staged

  • Discovery and assessment
  • Migration, and
  • Resource & Cost Optimization

Azure Migrate can discover your VMware-hosted applications on-premises, it can visualize dependencies between them, and it will help customers to create a suitable sizing for the Azure hosted VMs. Azure Site Recovery (ASR) is used for the migration of workloads from the on-premises VMware infrastructure to Microsoft Azure. At the end, when your applications are running on Microsoft Azure, the free Azure Cost Management service helps you to forecast, track, and optimize your spendings.

Integrate VMware workloads with Azure services

Many of the current available Azure services can be used with your on-premises VMware infrastructure, without the need to migrate workloads to Microsoft Azure. This includes Azure Backup, Azure Site Recovery, Azure Log Analytics or managing Microsoft Azure resources with VMware vRealize Automation.

But the real game-changer seesm to bis this:

Host VMware infrastructure with VMware virtualization on Azure

Bam! Microsoft announces the preview of VMware vSphere on Microsoft Azure. It will run on bare-metal on Azure hardware, beside other Azure services. The general availability is expected in 2018.

My two cents

This is the second big announcement about VMware stuff on Azure (don’t forget VMware Horizon Cloud on Microsoft Azure). And although I believe, that this is something that Microsoft wants to offer to get more customers on Azure, this can be a great chance for VMware. VMware customers don’t have to go to Amazon, when they want to host VMware at a major public cloud provider, especially if they are already Microsoft Azure/ O365 customers. This is a pretty bold move from Microsoft and similar to VMware Cloud on AWS. I’m curious to get more details about this.

Creating console screenshots with Get-ScreenshotFromVM.ps1

This posting is ~4 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Today, I had a very interesting discussion. As part of an ongoing troubleshooting process, console screenshots of virtual machines should be created.

The colleagues, who were working on the problem, already found a PowerCLI script that was able to create screenshots using the Managed Object Reference (MoRef). But unfortunately all they got were black screens and/ or login prompts. Latter were the reason why they were unable to run the script unattended. They used the Get-VMScreenshot script, which was written by Martin Pugh.

I had some time to take a look at his script and I created my own script, which is based on his idea and some parts of his code.

This file is also available on GitHub.

One important note: If you want to take console screenshots of VMs, please make sure that display power saving settings are disabled! Windows VMs are showing a black screen after some minutes. Please disable this using the energy options, or better using a GPO. Otherwise you will capture a black screen!

Why I moved from NFS to vSAN… and why it went wrong

This posting is ~4 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

I wanted to retire my Synology DS414slim, and switch completely to vSAN. Okay, no big deal. Many folks use vSAN in their lab. But I’d like to explain why I moved to vSAN and why this move failed. I think some of my thoughts are also applicable for customer environments.

So far, I used a Synology DS414slim with three Crucial M550 480 GB SSDs (RAID 5) as my main lab storage. The Synology was connected with two 1 GbE uplinks (LAG) to my  network, and each host was connected with 4x 1 GbE uplinks (single distributed vSwitch). The Synology was okay from the capacity perspective, but the performance was horrible. RAID 5, SSDs and NFS were not the best team, or to be precise, the  CPU of the Synology was the main bottleneck.

nas_ds414slim

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

1,2 GHz is not enough, if you want to use NFS or iSCSI. I never got more than 60 MB/s (sequential). The random IO performance was okay, but as soon as the IO increased, the latencies went through the roof. Not because the SSDs were to slow, but because the CPU of the Synology was not powerful enough to handle the NFS requests.

Workaround: Add more flash storage

The workaround for the poor random IO performance was adding more flash storage. This time, the flash storage was added to the hosts. I used PernixData FVP to boost my lab. FVP was a quite cool product (unfortunately it was a cool product.) PernixData granted me, as a PernixPro, some licenses for my lab.

End of an era

The acquisition of PernixData by Nutanix, the missing support für vSphere 6.5, and the end of availability of all PernixData products led to the decision to remove PernixData FVP from my lab. Without PernixData FVP, my lab was again a slow train crawling up a hill. Four HPE ProLiant, with enough CPU (40 cores) and memory resources (384 GB RAM) were tied down by slow IO.

Redistribution of resources

I had

  • three 480 GB SSDs, and
  • three 40 GB SSDs

in stock. The 40 GB SSDs were to small and slow, so I replaced them with 120 GB SSDs. I was able to equip three of my four hosts with SSDs. Three hosts with flash storage were enough to try VMware vSAN.

Fortunately, not all hosts have to add capacity to a vSAN cluster. Hosts can also only consume storage from a vSAN cluster. With this in mind, vSAN appeared to be a way out of my IO dilemma. In addition, using the 480 GB SSDs as capacity tier, a vSAN all-flash config was possible.

Migration

It took me a little time to move around VMs to temporary locations, while keeping my DC and my VCSA available. I had to remove my datastore on the Synology to free up the 480 GB SSDs. The necessary vSAN licenses were granted by VMware (vExpert licenses).

The creation of the vSAN cluster itself was easy. Fortunately, wiping partitions from disks is easy. You can use the vSphere Web Client to do this.

vsan_wipe_partitions

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

The initial performance was quite good, much better than expected and much better than the NFS performance of the old Synology NAS. I enabled deduplication and compression, but as soon as I moved VMs to the vSAN datastore, the throughput dropped and latencies went through the roof. It was totally unusable. Furthermore, I got health alarms:

vsan_congestion_error

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

As the load increased, the errors became more severe.

vsan_performance_error

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

WARNING – WRITE Average latency on VSAN device %s is %d ms an higher than threshold value %d ms

WARNING – WRITE Average Latency on VSAN device %s has exceeded threshold value %d ms %d times.

I was able to solve this with a blog post of Cormac Hogan (VSAN 6.1 New Feature – Handling of Problematic Disks). Even without compression and deduplication, the performance was not as expected and most times to low to work with. At this point, I got an idea what was causing my vSAN problems.

Do not use consumer-grade hardware with vSAN

To be honest: The budget is the problem. I had to take consumer-grade SSDs.

This is a screenshot from the vSAN Observer. esx1 to esx3 are equipped with SSDs, esx4 is only consuming storage from the vSAN cluster.

vsan_observer_perf

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Red is not the color to highlight good things…

An explanation attempt

This blog post of Duncan Epping (Why Queue Depth matters!) is a bit older, but still valid in my case. The controller I use  (HPE Smart Array P410i) has a a deep queue (1011), the RAID device has a queue length of 1024, but the SATA SSDs have only a queue length of 32. Here’s the disk adapter and disk device view of ESXTOP.

 7:32:24am up 2 days 15:56, 777 worlds, 2 VMs, 3 vCPUs; CPU load average: 0.01, 0.01, 0.01

 ADAPTR PATH                 NPTH AQLEN   CMDS/s  READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
 vmhba0 -                       3  1011  4264.32   846.35  3417.97    44.31    61.81     5.06     0.01     5.06     0.00
 7:31:51am up 2 days 15:56, 777 worlds, 2 VMs, 3 vCPUs; CPU load average: 0.01, 0.01, 0.01

DEVICE                                PATH/WORLD/PARTITION DQLEN WQLEN ACTV QUED %USD  LOAD   CMDS/s  READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
mpx.vmhba32:C0:T0:L0                           -               1     -    0    0    0  0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
naa.600508b1001cb6a46f895b4d05ebb2a3           -            1024     -    0    0    0  0.00   150.30     2.29   148.01     0.01     9.25     0.39     0.00     0.39     0.00
naa.600508b1001cfa6875ba9d07c38b1eb2           -            1024     -    0    0    0  0.00   437.16   413.32    23.84    10.22     0.10     0.89     0.00     0.90     0.00

The consumer-grade SSDs drowned in IOs, unable to handle parallel read and write operations. There’s nothing much that I can do. Currently there are two options:

  • Replacing the SSDs with devices, that have a deeper queue depth
  • Replace the Synology NAS with a more powerful NAS and move back to NFS

I don’t know which way I will go. To get this clear:

  • This is my lab, not a customer environment
  • It is not a vSAN related problem
  • It is because of consumer-grade hardware

Do not try this at production kids. Go vSAN, but please use the right hardware.

Replacing an expired lookup service SSL certificate on a vSphere PSC

This posting is ~4 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

A few days ago, I ran into a very nasty problem. Fortunately, it was in my lab. Some months ago, I replaced the certificates of my vCenter Server Appliance (VCSA), and I’ve chosen to use the VMware Certificate Authority (VMCA) as a subordinate of my AD-based enterprise CA. The VMCA was used as intermediate CA. The certificates were replaced using the  vSphere 6.0 Certificate Manager (/usr/lib/vmware-vmca/bin/certificate-manager), and I followed the instructions of KB2112016 (Configuring VMware vSphere 6.0 VMware Certificate Authority as a subordinate Certificate Authority).

The VCSA was migrated from vSphere 5.5, and with vSphere 5.5 I was also using custom certificates. These certificates were also issued by my AD-based enterprise CA, and these certificates were migration during the vSphere 5.5 > 6.0 migration. So at the end, I replaced custom certificates with VMCA (as an intermediate CA) certificates.

Everything was fine, until a power outage. After powering-on my VMs, I noticed several errors. After logging into the vSphere Web Client, I got an error message at the top of the page:

Error occurred while processing request. Check vSphere WebClient logs for details.

While searching for the cause, I checked the URL of the Platform Services Controller (https://vcsa1.lab.local/psc/login) and got this:

psc_error_1

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

HTTP Status 400 - An error occurred while sending an authentication request to the PSC Single Sign-On server - null

type Status report
message An error occurred while sending an authentication request to the PSC Single Sign-On server - null
description The request sent by the client was syntactically incorrect.

This error led me to KB2144086 (Updating certificates using certificate manager on vCenter Server or PSC 6.0 Update 1b fails), but was able to proof, that I have used different subject names for the different solution user certificates.

While digging in the PSC logs, I found this error in the /var/log/vmware/psc-client/psc-client.log:

Caused by: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint doesn't match
        at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:217)
        at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(Unknown Source)

        ... 71 more

Finally, I found Aaron Smiths blog post “Troubleshooting Expired PSC Certificates with vSphere 6“, who had the same problem. I checked the certificate of the Lookup Service and there it was:

psc_error_2

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

This was the original custom certificate, issued by my AD-based enterprise CA, and installed on my vSphere 5.5 VCSA.

Aaron also offered the solution by referencing KB2118939 (Replacing the Lookup Service SSL certificate on a Platform Services Controller 6.0). I followed the instructions in KB2118939 and replaced the certificate of the Lookup Service with a certificate of the VMCA.

Take care of your certificates

With vSphere 6.0, the Lookup Service should be accessed through the HTTP Reverse Proxy. This proxy uses the machine certificate. Therefore, an expired Lookup Certificate is not obvious. If you connect directly to the Lookup Service using port 7444, you will see the expired certificate. The Lookup Service certificate is not replaced with a custom certificate, if you replace the different solution user certificates.

If you have a vSphere 6.0 VCSA, which was migrated from vSphere 5.5, and you have replaced the certificates on that vSphere 5.5 VCSA with custom certificates, you should check your Lookup Service certificate immidiately! Follow KB2118939 for further instructions.

Credit to Aaron Smith for this blog post. Thank you!

Monitoring hardware status with Python and vSphere API calls

This posting is ~5 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Apparently it’s “how to monitor hardware status” week on vcloudnine.de. Some days ago, I wrote an article about using SNMP for hardware monitoring. You can also use the vSphere Web Client to get the status of the host hardware. A third way is through the vSphere API. I just want to share a short example how to use vSphere API calls and pyVmomi. pyVmomi is the Python SDK for the VMware vSphere API.

Get hardware status with vSphere API calls

I just want to share a small example, that shows the basic principle. The script gathers the temperature sensor data of a ProLiant DL360 G7 running ESXi 6.0 U2 using vSphere API calls.

The output of the script looks like this:

Valid certificate

Hostname: esx1.lab.local
Type: ProLiant DL360 G7
Getting temperature sensor data...

Other 1 Temp 28 --- Normal 65.0 Degrees C
Drive Backplane 1 Temp 27 --- Normal 35.0 Degrees C
Peripheral Bay 8 Temp 26 --- Normal 42.0 Degrees C
Peripheral Bay 7 Temp 25 --- Normal 38.0 Degrees C
Peripheral Bay 6 Temp 24 --- Normal 45.0 Degrees C
Peripheral Bay 5 Temp 23 --- Normal 39.0 Degrees C
Peripheral Bay 4 Temp 22 --- Normal 46.0 Degrees C
Peripheral Bay 3 Temp 21 --- Normal 44.0 Degrees C
Peripheral Bay 2 Temp 20 --- Normal 39.0 Degrees C
Peripheral Bay 1 Temp 19 --- Normal 37.0 Degrees C
Other 5 Temp 18 --- Normal 39.0 Degrees C
Memory Module 10 Temp 17 --- Normal 36.0 Degrees C
Other 4 Temp 16 --- Normal 34.0 Degrees C
Other 3 Temp 15 --- Normal 35.0 Degrees C
Memory Module 9 Temp 14 --- Normal 34.0 Degrees C
Power Supply 5 Temp 13 --- Normal 45.0 Degrees C
Power Supply 4 Temp 12 --- Normal 36.0 Degrees C
Memory Module 8 Temp 11 --- Normal 36.0 Degrees C
Memory Module 7 Temp 10 --- Normal 38.0 Degrees C
Memory Module 6 Temp 9 --- Normal 36.0 Degrees C
Memory Module 5 Temp 8 --- Normal 39.0 Degrees C
Memory Module 4 Temp 7 --- Normal 35.0 Degrees C
Memory Module 3 Temp 6 --- Normal 37.0 Degrees C
Memory Module 2 Temp 5 --- Normal 36.0 Degrees C
Memory Module 1 Temp 4 --- Normal 38.0 Degrees C
Other 2 Temp 3 --- Normal 40.0 Degrees C
Other 1 Temp 2 --- Normal 40.0 Degrees C
Other 1 Temp 1 --- Normal 27.0 Degrees C
>>>

Nothing fancy. You can easily loop through numericSensorInfo to gather data from other sensors. Use the Managed Object Browser (MOB) to navigate through the API. This is handy if you search for specific sensors. If you need accurate data, the vSphere API is the way to go. If you focus on something lightweight, try SNMP.

Missing hardware status tab in the vSphere Client

This posting is ~5 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

I thought, everyone knows it, but I’m always being asked “Where’s the hardware status tab?” after an update from vSphere 5.x to 6. Many customers still use the vSphere Client (C # client), and steer clear of the vSphere Web Client. To be honest: Me too. I often use the C# client, especially if I do mass operations, or for a quick look at something.

This is really nothing new, the answer is clear. But I think it’s a good idea to write it down. At least for myself. As a reminder to use the vSphere Web Client.

The hardware status tab

Many customers used the hardware status tab to get a quick overview about the health of the ESXi host hardware.

vsphere_client_hw_status_tab

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

But after an update to vSphere 6 the hardware status tab is missing in the vSphere Client. This is an expected behaviour! VMware has published an knowledge base article about this (The Hardware Status Tab is no longer available in the vSphere Client after upgrading to vCenter Server 6.0). The only solution is to use the vSphere Web Client.

Use the vSphere Web Client

Meanwhile, the old vSphere Client has many downsides. All features introduced in vSphere 5.5 and later are only available through the vSphere Web Client. This also applies to the hardware status tab.

vsphere_web_client_hw_status_tab

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

You find the hardware status on the “Monitor” tab of a host. It offers the same information as the legacy hardware status tab from the vSphere Client.

Do yourself a favor and use the vSphere Web Client. Always, any time.

First steps with Python and pyVmomi (vSphere SDK for Python)

This posting is ~5 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

In December 2013, VMware made an christmas gift to the community by releasing pyVmomi. pyVmomi is a SDK that allows you to manage VMware ESXi and vCenter using Python and the VMware vSphere API. Nearly 18 months are past since then and pyVmomi has developed over time.

I’ve started to play around with Python, and I’ve written about the reasons in one of my last blog posts (Hey infrastructure guy, you should learn Python!).

How to get pyVmomi?

You can install the official release of pyVmomi using pip (pip installs packages, a recursive acronym).

pip3 install --upgrade pyvmomi

The latest version is available on GitHub. To get the latest version, use

python setup.py develop

or

python setup.py install

That you can fetch the latest version from GitHub is pretty cool and shows a big benefit: The community can contribute to pyVmomi and it’s more frequently updated. A huge benefit in regard of code quality and features.

What Python releases are support?

The latest information about supported Python releases can be found on the GitHub page of the project.

  • pyVmomi 6.0.0.2016.4 and later support 2.7, 3.3 and 3.4
  • pyVmomi 6.0.0 and later support 2.7, 3.3 and 3.4
  • pyVmomi 5.5.0-2014.1 and 5.5.0-2014.1.1 support Python 2.6, 2.7, 3.3 and 3.4
  • pyVmomi 5.5.0 and below support Python 2.6 and 2.7

Interesting fact: pyVmomi version numbers correlate with vSphere releases. pyVmomi 6.0.0 was released with the GA of VMware vSphere 6. pyVmomi supports the corresponding vSphere release and the previous four vSphere releases.

I’m using Python 3 for my examples. I wouldn’t recommend to start with Python 2 these days.

First steps

pyVmomi allows you to manage VMware ESXi and vCenter using Python and the VMware vSphere API. Because of this, the VMware vSphere API Reference Documentation will be your best friend.

First of all, you need a connection to the API. To connect to the vSphere API, we have to import and use the module pyVim, more precise, the pyVim.connect module and the SmartConnect function. pyVim.connect is used for the connection handling (creation, deletion…) to the Virtualization Management Object Management Infrastructure (VMOMI). pyVim is part of pyVmomi and it’s installed automatically.

from pyVim.connect import SmartConnect


c = SmartConnect(host="192.168.20.1", user="root", pwd='Passw0rd')

SmartConnect accepts various parameters, but for the beginning it’s sufficient to use three of them: host, user and pwd. You can use “help(SmartConnect)” to get information about the SmartConnect function. “c” is the object (pyVmomi.VmomiSupport.vim.ServiceInstance) which we will use later.

A connection itself is useless. But how can we explore the API? Python doesn’t support typing, so it can be difficult to “explore” an API. That’s why the VMware vSphere API Reference Documentation and the Managed Object Browser (MOB) will be your best friends. The MOB is a web-based interface and represents the vSphere API. It allows you to navigate through the API. Any changes you make through the MOB, by invoking methods, take effect and change the config or will give you an output.

Important note: If you are using VMware vSphere 6 (ESXi 6.0 and vCenter 6.0), you have to enable the MOB. The MOB is disabled by default. Check VMware KB2108405 (The Managed Object Browser is disabled by default in vSphere 6.0) for more details.

Open a browser and open https://ip-or-fqdn/mob. You can use the IP address or the FQDN of an ESXi host or a vCenter Server (Appliance). I use a standalone ESXi 5.5 host in this example.

python_esxi_mob_1

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Our first code

Let’s try something easy. I’ve framed a method in the screenshot above. We will use this method now.

from pyVim.connect import SmartConnect


c = SmartConnect(host="192.168.20.1", user="root", pwd='Passw0rd')

print(c.CurrentTime())

This code will connect to the vSphere API, invoke the method “CurrentTime()” and prints the result. What happens if we execute our first lines of Python code? We will get an error…

self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)

Python checks SSL certificates in strict mode. Because of this, untrusted certificates will cause trouble. This applies to Python 3, as well as to Python >= 2.7.9 (PEP 0466). Most people use untrusted certificates. To deal with this, we have to create a context for our HTTP connection. This context can be used by the SmartConnect function. To create a context, we have to import the ssl module of Python.

from pyVim.connect import SmartConnect
import ssl


s = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
s.verify_mode = ssl.CERT_NONE

c = SmartConnect(host="192.168.20.1", user="root", pwd='Passw0rd', sslContext=s)

print(c.CurrentTime())

“s” is the new object (ssl.SSLContext) we will use and the parameter “sslContext=s” will told SmartConnect to use this object.

Save this code into a file (I called it pyvmomitest.py in my example). Navigate to the folder, open a Python REPL and import the file you’ve saved (module) a moment ago.

C:\Users\p.terlisten\Documents\Development\Python\Playground>python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyvmomitest
2016-05-20 12:29:23.837049+00:00
>>>

Hurray! We used the vSphere API to get the current date and time (CET).

But what if we have deployed valid certificates? And what about housekeeping? We have connected, but we haven’t disconnected from the API? We can use a try-except block to handle this. And because we are nice, we import also the function “Disconnect” from pyVim.connect to disconnect from the vSphere API at the end.

from pyVim.connect import SmartConnect, Disconnect
import ssl

s = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
s.verify_mode = ssl.CERT_NONE

try:
    c = SmartConnect(host="192.168.20.1", user="root", pwd='Passw0rd')
    print('Valid certificate')
except:
    c = SmartConnect(host="192.168.20.1", user="root", pwd='Passw0rd', sslContext=s)
    print('Invalid or untrusted certificate')

print(c.CurrentTime())

Disconnect(c)

With this code, we should get the following output.

C:\Users\p.terlisten\Documents\Development\Python\Playground>python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:38:48) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyvmomitest
Invalid or untrusted certificate
2016-05-20 12:36:28.707876+00:00
>>>

Okay, the vSphere API wasn’t designed to retrieve the current date and time. Let’s look at something more useful. This script will give us the names of all VMs in the datacenter.

from pyVim.connect import SmartConnect, Disconnect
import ssl

s = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
s.verify_mode = ssl.CERT_NONE

try:
    c = SmartConnect(host="192.168.20.1", user="root", pwd='Passw0rd')
    print('Valid certificate')
except:
    c = SmartConnect(host="192.168.20.1", user="root", pwd='Passw0rd', sslContext=s)
    print('Invalid or untrusted certificate')

datacenter = c.content.rootFolder.childEntity[0]
vms = datacenter.vmFolder.childEntity

for i in vms:
    print(i.name)

Disconnect(c)

Let’s take this statement and look at everything after the “c”. We will use the MOB to navigate through the API. This will help you to understand, how the Python code and the structure of the vSphere API correlate.

datacenter = c.content.rootFolder.childEntity[0]

Open the MOB. You will easily find  the property “content”.

python_esxi_mob_2

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Click on “content” and search for the property “rootFolder”.

python_esxi_mob_3

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Click on the value “ha-folder-root.” The property “childEntity” is an ManagedObjectReference (MOR) and references to all datacenters (the counting starts at 0) known to the ESXi or vCenter. The value “childEntity[0]” will give us the first datacenter.

python_esxi_mob_4

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

If we have the datacenter, the way to get the names of the VMs is the same. You can use the MOB, to verify this.

vms = datacenter.vmFolder.childEntity

Click on the value “ha-datacenter”. At the bottom of the list, you will find the property “vmFolder”.

python_esxi_mob_5

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Click on the value “ha-folder-vm”.

python_esxi_mob_6

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

The MOR “childEntity” references to two VMs. Click on one of the IDs.

python_esxi_mob_7

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

The property “name” includes the name of the VM. Because of this, we can use a simple

for i in vms:
    print(i.name)

to get the name for each VM.

Summary

This was only a short introduction into pyVmomi. You should be now able to install pyVmomi, make a connection to the vSphere API and retrieve some basic stuff.

Every day I discover something new. It’s important to understand how the vSphere API works. Play with pyVmomi and with the vSphere API. It looks harder as it is.

btw: There is a Hands-on-Lab available “HOL-SDC-1622 VMware Development Tools and SDKs“. Check it out!