Tag Archives: esxi

Upgrade to ESXi 7.0: Missing dependencies VIBs Error

This posting is ~2 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

This error gets me from time to time, regardless of the server vendor, mostly on hosts that have been upgraded a couple of times. In this case it was an ESXi host running a pretty old build of ESXi 6.7 U3, and my job was to upgrade it to 7.0 Update 3c.

If you attach an upgrade baseline to the cluster or host and try to remediate the host, the task fails with a dependency error. When you take a closer look at the task details, you are told that the task failed because of a failed dependency, but not which VIB caused it.

You can find the name of the offending VIB on the Update Manager tab of the host that you tried to update. The status of the baseline is “incompatible”, and not “non-compliant”.

To resolve this issue you have to remove the offending VIB. This is no big deal and can be done with esxcli. Enable SSH and open an SSH connection to the host. Then remove the VIB.

[root@esx-01:~] esxcli software vib list | grep -i ssacli
ssacli                         4.17.6.0-6.7.0.7535516.hpe          HPE        PartnerSupported  2020-06-18
[root@esx-01:~] esxcli software vib remove -n ssacli
Removal Result
   Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
   Reboot Required: true
   VIBs Installed:
   VIBs Removed: HPE_bootbank_ssacli_4.17.6.0-6.7.0.7535516.hpe
   VIBs Skipped:
[root@esx-01:~]

You need to reboot the host after removing the VIB. Then you can proceed with the update. The status of the upgrade baseline should now be “non-compliant”.
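If you have to check several hosts, it can help to look for the VIB in saved `esxcli software vib list` output instead of running grep on every host by hand. The following is only a minimal sketch: the function name `find_vibs` is hypothetical, the five-column row layout is an assumption based on the single row shown above, and the second sample row is made up for illustration.

```python
def find_vibs(vib_list_output, keyword):
    """Return rows of `esxcli software vib list` output whose VIB name contains keyword."""
    matches = []
    for line in vib_list_output.splitlines():
        fields = line.split()
        # Assumption: a data row has exactly five whitespace-separated columns
        # (name, version, vendor, acceptance level, install date).
        if len(fields) != 5:
            continue
        name, version, vendor, acceptance, installed = fields
        if keyword.lower() in name.lower():
            matches.append({
                "name": name,
                "version": version,
                "vendor": vendor,
                "acceptance": acceptance,
                "installed": installed,
            })
    return matches


# The ssacli row is copied from the session above; the second row is made up.
SAMPLE = """\
ssacli                         4.17.6.0-6.7.0.7535516.hpe          HPE        PartnerSupported  2020-06-18
example-vib                    1.0.0-1                             ACME       CommunitySupported  2020-01-01
"""

for vib in find_vibs(SAMPLE, "SSACLI"):
    print(vib["name"], vib["version"], vib["vendor"])
```

The match is case-insensitive, like `grep -i` in the session above. The removal itself still has to be done with esxcli on the host.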

VMware ESXi 6.7 memory health warnings after ProLiant SPP

This posting is ~3 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

During the deployment of a vSAN cluster consisting of multiple HPE ProLiant DL380 Gen10 hosts, I noticed a memory health warning after updating the firmware using the Support Pack for ProLiant. The error was definitely not shown before the update, so it was clear that this was not a real issue with the hardware. Furthermore, all hosts showed this error.

Memory health status after SPP

The same day, a customer called and asked me about a strange memory health error after he had updated all of his hosts with the latest SPP…

My first guess, that this was not caused by a hardware malfunction, was correct. HPE published an advisory about this issue:

The Memory Sensor Status Reported in the vSphere Web Client Is Not Accurate For HPE ProLiant Gen10 and Gen10 Plus Servers Running VMware ESXi 6.5/6.7/7.0 With HPE Integrated Lights-Out 5 (iLO 5) Firmware Version 2.30

To fix this issue, you have to update the iLO 5 firmware to version 2.31. You can do this manually using the iLO 5 web interface, or you can add the file to the SPP. I’ve added the BIN file to the USB stick with the latest SPP.

If you want to update the firmware manually, simply upload the BIN file using the built-in firmware update function.

  1. Navigate to Firmware & OS Software in the navigation tree, and then click Update Firmware
  2. Select the Local file option and browse to the BIN file
  3. To save a copy of the component to the iLO Repository, select the Also store in iLO Repository check box
  4. To start the update process, click Flash

You can download the latest iLO 5 2.31 from HPE using this link. After the firmware update, the error will resolve itself.

Only ESXi 6.7 is affected, and only ESXi 6.7 running on HPE ProLiant hosts, regardless of whether they are ML, DL or BL series.

Update Manager fails with unknown error during host remediation

This posting is ~3 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

During a vSphere 6.5 to 6.7 update, a host was failing continuously at remediation with an “unknown error”. The host was updated from ESXi 6.5 to 6.7 using an upgrade baseline. Other hosts were updated to 6.7 and the latest patches without any issues. Something strange was going on…

The esxupdate.log and the vua.log on the host itself showed nothing special. So I checked the vmware-vum-server-log4cpp.log which was much more informative!

[2020-07-19 13:03:25:217 'SingleHostScanTask.SingleHostScanTask{262}' 139762329831168 ERROR] [singleHostScanTask, 693] caught an odbc error: "ODBC error: (23505) - ERROR: duplicate key value violates unique constraint "pk_vci_scanresults"; Error while executing the query" is returned when executing SQL statement "INSERT INTO VCI_SCANRESULTS(endtime, id, scan_status, scan_type, starttime, target_component, target_uid) VALUES (?, ?, ?, ?, ?, ?, ?)"
[2020-07-19 13:03:25:219 'SingleHostScanTask.SingleHostScanTask{262}' 139762329831168 ERROR] [singleHostScanTask, 404] SingleHostScan caught exception: caught an odbc error: "ODBC error: (23505) - ERROR: duplicate key value violates unique constraint "pk_vci_scanresults"; Error while executing the query" is returned when executing SQL statement "INSERT INTO VCI_SCANRESULTS(endtime, id, scan_status, scan_type, starttime, target_component, target_uid) VALUES (?, ?, ?, ?, ?, ?, ?)" with code: -1
[2020-07-19 13:03:25:223 'SingleHostScanTask.SingleHostScanTask{262}' 139762329831168 ERROR] [vciTaskBase, 568] Task execution has failed: caught an odbc error: "ODBC error: (23505) - ERROR: duplicate key value violates unique constraint "pk_vci_scanresults"; Error while executing the query" is returned when executing SQL statement "INSERT INTO VCI_SCANRESULTS(endtime, id, scan_status, scan_type, starttime, target_component, target_uid) VALUES (?

Well… ERROR: duplicate key value violates unique constraint “pk_vci_scanresults” is not what I expected, but it is an error, and it occurred every time I tried to remediate the host.
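The code in parentheses, 23505, is the PostgreSQL SQLSTATE for a unique constraint violation. To see quickly which error codes and constraints show up in such a log, you could pull them out with a regular expression. This is just a sketch under assumptions: the function name `summarize_odbc_errors` is hypothetical, the pattern is derived only from the three lines above, and the sample lines are shortened excerpts plus one made-up line.

```python
import re
from collections import Counter

# Matches: ODBC error: (<code>) - ERROR: <message> "<object name>"
ODBC_ERROR = re.compile(
    r'ODBC error: \((?P<code>\w+)\) - ERROR: (?P<message>[^"]+)"(?P<object>[^"]+)"'
)


def summarize_odbc_errors(log_text):
    """Count (error code, object name) pairs found in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ODBC_ERROR.search(line)
        if match:
            counts[(match.group("code"), match.group("object"))] += 1
    return counts


# Shortened excerpts of the log lines above; the INFO line is made up.
SAMPLE_LOG = '''\
[2020-07-19 13:03:25:217 ERROR] [singleHostScanTask, 693] caught an odbc error: "ODBC error: (23505) - ERROR: duplicate key value violates unique constraint "pk_vci_scanresults"; Error while executing the query"
[2020-07-19 13:03:25:219 ERROR] [singleHostScanTask, 404] SingleHostScan caught exception: caught an odbc error: "ODBC error: (23505) - ERROR: duplicate key value violates unique constraint "pk_vci_scanresults"; Error while executing the query"
[2020-07-19 13:03:26:000 INFO] some unrelated line
'''

for (code, name), count in summarize_odbc_errors(SAMPLE_LOG).items():
    print(code, name, count)
```

This only tells you which constraint is violated and how often. It does not explain why the duplicate row exists.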

Google found nothing about this error, so I decided to reset the VUM database. Please don’t try this at your customer! Log a call at VMware.

To reset the VUM database:

  1. Connect to vCenter Server Appliance via SSH
  2. Switch to the Bash shell
  3. Stop the VMware Update Manager service with this command

    service-control --stop vmware-updatemgr

  4. Reset the VMware Update Manager database (applies only to VCSA 6.7 and 7.0!)

    /usr/lib/vmware-updatemgr/bin/updatemgr-utility.py reset-db

  5. Delete the contents of the VMware Update Manager patch store

    rm -rf /storage/updatemgr/patch-store/*

  6. Start the VMware Update Manager service again

    service-control --start vmware-updatemgr

You will lose all your baselines, so you have to configure them again. And you need to download all patches again.

For vSAN environments this procedure will also remove the vSAN default baselines, but they will be recreated automatically when there is a configuration change to vSAN or an update to the HCL DB. Again: Don’t do this at home!

Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7

This posting is ~3 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

I’ve got several mails and comments about this topic. It looks like the latest ESXi 6.7 updates are causing some trouble on HPE ProLiant Gen10 servers.

I blogged about recurring host hardware sensor state alarm messages some weeks ago. A customer noticed them after an update. Last week, I got the first comments under this blog post about fan failure messages after applying the latest ESXi 6.7 updates. Then more and more customers asked me about this, because they got these messages in their environments too after applying the latest updates.

Last Saturday I tweeted my blog post to give a hint to my followers who may be experiencing the same problem.

Fortunately one of my followers (Thanks Markus!) pointed me to a VMware KB article with a workaround: Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7 (78989).

This is NOT a solution, but a workaround. Keep that in mind.

Thanks again to Markus. Make sure to visit his awesome blog (MY CLOUD-(R)EVOLUTION), especially if you are interested in vSphere, Veeam and automation!

VMware ESXi 6.7: Recurring host hardware sensor state alarm

This posting is ~4 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

If you found this blog post because you are searching for a solution for a FAN FAILURE on your ProLiant Gen10 hardware after applying the latest ESXi 6.7 patches, then use this shortcut for the workaround: Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7


I had a really annoying problem at one of my customers. After deploying new VMware ESXi hosts (HPE ProLiant DL380 Gen10) along with an upgrade of the vCenter Server Appliance to 6.7 U2, the customer reported recurring host hardware sensor state alarm messages in the vCenter for all hosts.

After acknowledging the alarm, it recurred after a couple of minutes or hours. The hardware was fine; no errors or warnings were logged in the iLO Management Log. But vCenter periodically reported a Sensor -1 type error in the Events window. The /var/log/syslog.log contained messages like this:

2019-11-29T04:39:48Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
2019-11-29T04:39:49Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
2019-11-29T04:39:50Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
2019-11-29T04:39:51Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
2019-11-29T04:39:52Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1

Sure, you can ignore this. But you shouldn’t ignore this, because these events can result in the vCenter database increasing in size. vCenter can crash once the SEAT partition size goes above the 95% threshold. So you better fix this!

Long story short: This bug is fixed with the latest November updates for ESXi 6.7 U3. A workaround is to disable the WBEM service. The WBEM service might be re-enabled after a reboot. In this case you have to disable the sfcbd-watchdog service as well.

But the best way to solve this is to install the latest patches (VMware ESXi 6.7, Patch Release ESXi670-201911001).

“Cannot execute upgrade script on host” during ESXi 6.5 upgrade

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

I was onsite at one of my customers to update a small VMware vSphere 6.0 U3 environment to 6.5 U2c. The environment consists of three hosts. Two hosts in a cluster, and a third host is only used to run a HPE StoreVirtual Failover Manager.

The update of the first host, using Update Manager and an HPE custom ESXi 6.5 image, was pretty flawless. But the update of the second host failed with “Cannot execute upgrade script on host”.

I checked the host and found it with ESXi 6.5 installed. But I was missing one of the five iSCSI datastores. Then I tried to patch the host with the latest patches and hit “Remediate”. The task failed with “Cannot execute upgrade script on host”. So I did a rollback to ESXi 6.0 and tried the update again, but this time using iLO and the HPE custom ISO. But the result was the same: The host was running ESXi 6.5 after the update, but the upgrade failed with the “Upgrade Script” error. After this attempt, the host was unable to mount any of the iSCSI datastores. This was because the datastores were mounted ATS-only on the other host, and the failed host was unable to mount the datastores in this mode. Very strange…

I checked the vua.log and found this error message:

2018-11-05T16:35:56.614Z info vua[A3CAB70] [Originator@6876 sub=VUA] Command '/tmp/vuaScript-xMVUfb/precheck.py --ip=172.19.0.14' finished with exit status 1
--> stderr: --------
--> INFO:root:Running esxcfg-info
--> Traceback (most recent call last):
-->   File "/build/mts/release/bora-9298722/bora/build/esx/release/vmvisor/sys-boot/lib64/python3.5/subprocess.py", line 385, in run
-->   File "/build/mts/release/bora-9298722/bora/build/esx/release/vmvisor/sys-boot/lib64/python3.5/subprocess.py", line 788, in communicate
-->   File "/build/mts/release/bora-9298722/bora/build/esx/release/vmvisor/sys-boot/lib64/python3.5/encodings/ascii.py", line 26, in decode
--> UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 1272423: ordinal not in range(128)

Focus on this part of the error message:

--> UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 1272423: ordinal not in range(128)

The upgrade script failed due to an illegal character in the output of esxcfg-info. First of all, I had to find out what this 0x80 character is. I checked UTF-8 and the Windows-1252 encoding, and found out that 0x80 is the € (Euro) symbol in Windows-1252. I searched the output of esxcfg-info for the € symbol – and found it.

            \==+Heap : 
               |----Name............................................€A
               |----Growable........................................true
               |----Max Size........................................41848 bytes
               |----Max Available...................................40816 bytes
               |----Current Size....................................29560 bytes
               |----Current Size....................................29560 bytes
               |----Current Allocation..............................1032 bytes
               |----Current Available...............................1032 bytes
               |----Current Releasable..............................20400 bytes
               |----Percent Free of Current.........................96 
               |----Percent Free of Max.............................97 
               |----Percent Releasable..............................69
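The failure is easy to reproduce outside of ESXi. The sketch below is not part of the upgrade script: the byte string is a made-up stand-in for the esxcfg-info output, and the helper name `find_non_ascii` is hypothetical. It shows why the ASCII decode fails and how the offending byte can be located and interpreted as Windows-1252.

```python
import unicodedata


def find_non_ascii(data):
    """Return (offset, byte value, Windows-1252 character) for every non-ASCII byte."""
    hits = []
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            # errors="replace" because a few byte values are undefined in cp1252.
            char = bytes([byte]).decode("cp1252", errors="replace")
            hits.append((offset, byte, char))
    return hits


# Made-up stand-in for the esxcfg-info output: ASCII text with one 0x80 byte.
raw = b"Name....\x80A"

try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    # Same exception type as in the vua.log traceback above.
    print("ascii decode failed at offset", exc.start)

for offset, byte, char in find_non_ascii(raw):
    print(offset, hex(byte), unicodedata.name(char, "UNKNOWN"))
```

The offset reported by the exception is the position of the bad byte in the output, just like the position in the vua.log message. With the real output you can use it to jump straight to the affected section.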

But how to get rid of it? Where does it hide in the ESXi config? I scrolled a bit up and down around the € symbol. A bit above, I found a reference to HPE_SATP_LH. This immediately caught my attention, because the customer is using StoreVirtual VSA and StoreVirtual hardware appliances.

Now, my second educated guess of the day came into play. I checked the installed VIBs, and found the StoreVirtual Multipathing Extension installed on the failed host – but not on the host where the ESXi 6.5 update was successful.

I removed the VIB from the buggy host, did a reboot, tried to update the host with the latest patches – with success! A cross-check showed that the € symbol was missing in the esxcfg-info output of the host that was upgraded first. I don’t have a clue why the StoreVirtual Multipathing Extension caused this error. The customer and I decided not to install the StoreVirtual Multipathing Extension again.

High CPU usage on Citrix ADC VPX

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

While building a small Citrix NetScaler… ehm… ADC VPX (I really hate this name…) lab environment, I noticed that the fan of my Lenovo T480s was spinning up. I was wondering why, because the VPX VM had been running for just a couple of minutes – without any load. But the Task Manager told me that the VMware Workstation process was consuming 25% CPU (I have an Intel i5 quad-core CPU). So VMware Workstation was eating a whole CPU core without doing anything. I would not care, but the fan… And it reminded me that I’ve seen similar behaviour in various VPX deployments on VMware ESXi.

A quick search led me to this Citrix Support Knowledge Center article: High CPU Usage on NetScaler VPX Reported on VMware ESXi Version 6.0. That’s exactly what I’ve observed.

The solution is to set the parameter cpuyield to YES.

> set ns vpxparam -cpuyield YES
 Done
> show ns runningConfig | grep "cpuyield"
set ns vpxparam -cpuyield YES
>

The VPX does not need a reboot. Shortly after setting the parameter, the fan stopped spinning. Have I mentioned how I love silence on my desk? I’m pretty happy that my T480s is a really quiet laptop.

But what is this parameter used for? In pretty simple words: it lets the VPX release CPU cycles that it has been allocated but does not use, so that other VMs can use them. Until ADC VPX 11.1, the VPX shared CPU with other VMs. This changed with ADC VPX 12.0. Since this release, the VPX has been like a child that plays with its favorite toy just to make sure that no other child can play with it. Not very polite…

This is a quote from the Support Knowledge Center article:

Set ns vpxparam parameters:
-cpuyield: Release or do not release of allocated but unused CPU resources.

YES: Allow allocated but unused CPU resources to be used by another VM.

NO: Reserve all CPU resources for the VM to which they have been allocated. This option shows higher percentage in hypervisor for VPX CPU usage.
DEFAULT: NO

I don’t think that I would change this in production. But for lab environments, especially if you run this on VMware Workstation, I would set -cpuyield to YES.

Powering on a VM with shared VMDK fails after extending a EagerZeroedThick VMDK

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

I hope that you are not reading this blog post while searching for a solution for a failed cluster. If so, feel free to leave a comment if this blog post saved your evening or weekend. :)

Last Friday, a change at one of my customers went horribly wrong. I was not onsite, but they contacted me during the night from Friday to Saturday, because their most important Windows Server Failover Cluster was unable to start after extending a shared VMDK.

They tried something pretty simple: Extending a virtual disk of a VM. That is something most of us do pretty often. The customer also did this pretty often. It was a well-known task… Except for the fact that the VM was part of a Windows Server Failover Cluster. With shared VMDKs. And the disks were EagerZeroedThick, because this is a requirement for shared VMDKs.

They extended the disk using the vSphere Web Client. And at this point, the change was doomed to fail. They tried to power on the VMs, but all they got was this error:

VMware ESX cannot open the virtual disk, “/vmfs/volumes/4c549ecd-66066010-e610-002354a2261b/VMNAME/VMDKNAME.vmdk” for clustering. Please verify that the virtual disk was created using the ‘thick’ option.

A shared VMDK is a VMDK in multiwriter mode. This VMDK has to be created as Thick Provision Eager Zeroed. And if you wish to extend this VMDK, you must use vmkfstools with the option -d eagerzeroedthick. If you extend the VMDK using the Web Client, the extended portion of the disk will become LazyZeroed!

VMware has described this behaviour in the KB1033570 (Powering on the virtual machine fails with the error: Thin/TBZ disks cannot be opened in multiwriter mode). There is also a blog post by Cormac Hogan at VMware, who has described this behaviour.

That’s a screenshot from the failed cluster. Check out the type of the disk (Thick-Provision Lazy-Zeroed).

Patrick Terlisten/ vcloudnine.de/ Creative Commons CC0

You must use vmkfstools to extend a shared VMDK – but vmkfstools is also the solution if you have fallen into this trap. Clone the VMDK with the option -d eagerzeroedthick.

vmkfstools -i old.vmdk new.vmdk -d eagerzeroedthick

Another solution, which was new to me, is to use Storage vMotion. You can migrate the “broken” VMDK to another datastore and change the disk format during Storage vMotion. This solution is described in the “Notes” section of KB1033570.

Both ways will fix the problem. The result will be a Thick Provision Eager Zeroed VMDK, which will allow the VMs to be successfully powered on.

Vembu BDR Essentials – affordable backup for SMB customers

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

It is common that vendors offer their products in special editions for SMB customers. VMware offers VMware vSphere Essentials and Essentials Plus, Veeam offers Veeam Backup Essentials, and now Vembu has published Vembu BDR Essentials.

Vembu Technologies/ Vembu BDR Essentials/ Copyright by Vembu Technologies

Backup is important. There is no reason to have no backup. According to an infographic published by Clutch Research on World Backup Day 2017, 60% of all SMBs that lost all their data will shut down within 6 months after the data loss. Pretty bad, isn’t it?

When I talk to SMB customers, most of them complain about the costs of backups. You need software, you need the hardware, and depending on the type of used hardware, you need media. And you should have a second copy of your data. In my opinion, tape is dead for SMB customers. HPE for example, offers pretty smart disk-based backup solutions, like the HPE StoreOnce. But hardware is nothing without software. And at this point, Vembu BDR Essentials comes into play.

Affordable backup for SMB customers

Most SMB virtualization deployments consist of two or three hosts, which makes 4 or 6 used CPU sockets. Because of this, Vembu BDR Essentials supports up to 6 sockets or 50 VMs. But why does Vembu limit the number of sockets and VMs? You might have missed the OR. Customers have to choose which limit they want to accept. Either they are limited at the host level (max. 6 sockets) but not in the number of VMs, or they can use more than 6 sockets but are then limited to 50 VMs.

Feature Highlights

Vembu BDR Essentials supports all important features:

  • Agentless VMBackup to backup VMs
  • Continuous Data Protection with support for RPOs of less than 15 minutes
  • Quick VM Recovery to get failed VMs up and running in minutes
  • Vembu Universal Explorer to restore individual items from Microsoft applications like Exchange, SharePoint, SQL and Active Directory
  • Replication of VMs with Vembu OffsiteDR and Vembu CloudDR

Needless to say, Vembu BDR Essentials supports VMware vSphere and Microsoft Hyper-V. If necessary, customers can upgrade to the Standard or Enterprise edition.

To get more information about the different Vembu BDR parts, take a look at my last Vembu blog post: The one stop solution for backup and DR: Vembu BDR Suite

The pricing

Now the fun part – the pricing. Customers can save up to 50% compared to the Vembu BDR Suite.

Vembu Technologies/ Vembu BDR Essentials Pricing/ Copyright by Vembu Technologies

The licenses for Vembu BDR Essentials are available in two models:

  • Subscription, and
  • Perpetual

Subscription licenses are available for 1, 2, 3 and 5 years. The perpetual license is valid for 10 years from the date of purchase. The subscription licensing has the benefit that it includes 24×7 technical support. If you purchase the perpetual license, the Annual Maintenance Cost (AMC) for the first year is free. From the second year on, it is 20% of the license cost, and it is available for 1, 2 or 3 years.

There is no excuse for not having a backup

With Vembu BDR Essentials, there is no more excuse for not having a competitive backup protecting your business! The pricing fits any SMB customer, regardless of their size or business. The rich feature set is competitive to other vendors, and both leading hypervisors are supported.

A pretty nice product. Try it for free! Vembu also offers a free edition that might fit small environments. The free edition lets you choose between unlimited VMs that are covered with limited functionality, or unlimited functionality for up to 3 VMs. Check out this comparison of the free, standard and enterprise editions.

How to monitor ESXi host hardware with SNMP

This posting is ~7 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

The Simple Network Management Protocol (SNMP) is a protocol for monitoring and configuration of network-attached devices. SNMP exposes data in the form of variables and values. These variables can then be queried or set. A query retrieves the value of a variable, a set operation assigns a value to a variable. The variables are organized in a hierarchy and each variable is identified by an object identifier (OID). The management information base (MIB) describes this hierarchy. MIB files (simple text files) contain metadata for each OID. These are necessary for the translation of a numeric OID into a human-readable format. SNMP knows two device types:

  • the managed device which runs the SNMP agent
  • the network management station (NMS) which runs the management software

The NMS queries the SNMP agent with GET requests. Configuration changes are made using SET requests. The SNMP agent can inform the NMS about state changes using an SNMP trap message. The easiest way for authentication is the SNMP community string.

SNMP is pretty handy and it’s still used, especially for monitoring and managing networking components. SNMP has the benefit that it’s very lightweight. Monitoring a system with WBEM or using an API can cause slightly more load, compared to SNMP. Furthermore, SNMP is an Internet protocol standard. Nearly every device supports SNMP.

Monitoring host hardware with SNMP

Why should I monitor my ESXi host hardware with SNMP? The vCenter Server can trigger an alarm and most customers use applications like VMware vRealize Operations, Microsoft System Center Operations Manager, or HPE Systems Insight Manager (SIM). There are better ways to monitor the overall health of an ESXi host. But sometimes you want to get some stats about the network interfaces (throughput), or you have a script that should do something, if a NIC goes down or something else happens. Again, SNMP is very resource-friendly and widely supported.

Configure SNMP on ESXi

I focus on ESXi 5.1 and beyond. The ESXi host is called “the SNMP Agent”. We don’t configure traps or trap destinations. We just want to poll the SNMP agent using SNMP GET requests. The configuration is done using esxcli. First of all, we need to set a community string and enable SNMP.

[root@esx1:~] esxcli system snmp set -c public -e true
[root@esx1:~] esxcli system snmp get
   Authentication:
   Communities: public
   Enable: true
   Engineid: 00000063000000a100000000
   Hwsrc: indications
   Largestorage: true
   Loglevel: info
   Notraps:
   Port: 161
   Privacy:
   Remoteusers:
   Syscontact:
   Syslocation:
   Targets:
   Users:
   V3targets:

That’s it! The necessary firewall ports and services are opened and started automatically.

Querying the SNMP agent

I use a CentOS VM to show you some queries. The Net-SNMP package contains the tools snmpwalk and snmpget. To install the Net-SNMP utils, simply use yum.

[root@web ~]# yum install net-snmp-utils.x86_64

Download the VMware SNMP MIB files, extract the ZIP file, and copy the content to /usr/share/snmp/mibs.

[root@web mibs]# ls -lt
total 3852
-rw-r--r--. 1 root root  50968 Jun  3 17:05 BRIDGE-MIB.mib
-rw-r--r--. 1 root root  59268 Jun  3 17:05 ENTITY-MIB.mib
-rw-r--r--. 1 root root  52586 Jun  3 17:05 HOST-RESOURCES-MIB.mib
-rw-r--r--. 1 root root  10583 Jun  3 17:05 HOST-RESOURCES-TYPES.mib
-rw-r--r--. 1 root root   7309 Jun  3 17:05 IANA-ADDRESS-FAMILY-NUMBERS-MIB.mib
-rw-r--r--. 1 root root  33324 Jun  3 17:05 IANAifType-MIB.mib
-rw-r--r--. 1 root root   3890 Jun  3 17:05 IANA-RTPROTO-MIB.mib
-rw-r--r--. 1 root root  76268 Jun  3 17:05 IEEE8021-BRIDGE-MIB.mib
-rw-r--r--. 1 root root  89275 Jun  3 17:05 IEEE8021-Q-BRIDGE-MIB.mib
-rw-r--r--. 1 root root  16082 Jun  3 17:05 IEEE8021-TC-MIB.mib
-rw-r--r--. 1 root root  44543 Jun  3 17:05 IEEE8023-LAG-MIB.mib
-rw-r--r--. 1 root root  71747 Jun  3 17:05 IF-MIB.mib
-rw-r--r--. 1 root root  16782 Jun  3 17:05 INET-ADDRESS-MIB.mib
-rw-r--r--. 1 root root  46405 Jun  3 17:05 IP-FORWARD-MIB.mib
-rw-r--r--. 1 root root 185967 Jun  3 17:05 IP-MIB.mib
-rw-r--r--. 1 root root    229 Jun  3 17:05 list-ids-diagnostics.txt
-rw-r--r--. 1 root root  77406 Jun  3 17:05 LLDP-V2-MIB.mib
-rw-r--r--. 1 root root  16108 Jun  3 17:05 LLDP-V2-TC-MIB.mib
-rw-r--r--. 1 root root  23777 Jun  3 17:05 notifications.txt
-rw-r--r--. 1 root root  39918 Jun  3 17:05 P-BRIDGE-MIB.mib
-rw-r--r--. 1 root root  84172 Jun  3 17:05 Q-BRIDGE-MIB.mib
-rw-r--r--. 1 root root   1465 Jun  3 17:05 README
-rw-r--r--. 1 root root 223872 Jun  3 17:05 RMON2-MIB.mib
-rw-r--r--. 1 root root 148032 Jun  3 17:05 RMON-MIB.mib
-rw-r--r--. 1 root root  22342 Jun  3 17:05 SNMP-FRAMEWORK-MIB.mib
-rw-r--r--. 1 root root   5543 Jun  3 17:05 SNMP-MPD-MIB.mib
-rw-r--r--. 1 root root   8259 Jun  3 17:05 SNMPv2-CONF.mib
-rw-r--r--. 1 root root  31588 Jun  3 17:05 SNMPv2-MIB.mib
-rw-r--r--. 1 root root   8932 Jun  3 17:05 SNMPv2-SMI.mib
-rw-r--r--. 1 root root  38048 Jun  3 17:05 SNMPv2-TC.mib
-rw-r--r--. 1 root root  28647 Jun  3 17:05 TCP-MIB.mib
-rw-r--r--. 1 root root  93608 Jun  3 17:05 TOKEN-RING-RMON-MIB.mib
-rw-r--r--. 1 root root  20951 Jun  3 17:05 UDP-MIB.mib
-rw-r--r--. 1 root root   3175 Jun  3 17:05 UUID-TC-MIB.mib
-rw-r--r--. 1 root root   2326 Jun  3 17:05 VMWARE-CIMOM-MIB.mib
-rw-r--r--. 1 root root  22411 Jun  3 17:05 VMWARE-ENV-MIB.mib
-rw-r--r--. 1 root root  53480 Jun  3 17:05 VMWARE-ESX-AGENTCAP-MIB.mib
-rw-r--r--. 1 root root   2328 Jun  3 17:05 VMWARE-HEARTBEAT-MIB.mib
-rw-r--r--. 1 root root   1699 Jun  3 17:05 VMWARE-NSX-MANAGER-AGENTCAP-MIB.mib
-rw-r--r--. 1 root root 146953 Jun  3 17:05 VMWARE-NSX-MANAGER-MIB.mib
-rw-r--r--. 1 root root  15641 Jun  3 17:05 VMWARE-OBSOLETE-MIB.mib
-rw-r--r--. 1 root root   2173 Jun  3 17:05 VMWARE-PRODUCTS-MIB.mib
-rw-r--r--. 1 root root   8305 Jun  3 17:05 VMWARE-RESOURCES-MIB.mib
-rw-r--r--. 1 root root   3736 Jun  3 17:05 VMWARE-ROOT-MIB.mib
-rw-r--r--. 1 root root  11142 Jun  3 17:05 VMWARE-SRM-EVENT-MIB.mib
-rw-r--r--. 1 root root   3872 Jun  3 17:05 VMWARE-SYSTEM-MIB.mib
-rw-r--r--. 1 root root   7017 Jun  3 17:05 VMWARE-TC-MIB.mib
-rw-r--r--. 1 root root   7611 Jun  3 17:05 VMWARE-VA-AGENTCAP-MIB.mib
-rw-r--r--. 1 root root   8777 Jun  3 17:05 VMWARE-VC-EVENT-MIB.mib
-rw-r--r--. 1 root root  38576 Jun  3 17:05 VMWARE-VCOPS-EVENT-MIB.mib
-rw-r--r--. 1 root root  26952 Jun  3 17:05 VMWARE-VMINFO-MIB.mib

Now we can use snmpwalk to “walk down the hierarchy”. This is only a small part of the complete output. The complete snmpwalk output has more than 4000 lines!

[root@web mibs]# snmpwalk -m ALL -c public -v 2c esx1.lab.local
SNMPv2-MIB::sysDescr.0 = STRING: VMware ESXi 6.0.0 build-3825889 VMware, Inc. x86_64
SNMPv2-MIB::sysObjectID.0 = OID: VMWARE-PRODUCTS-MIB::vmwESX
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (402700) 1:07:07.00
SNMPv2-MIB::sysContact.0 = STRING:
SNMPv2-MIB::sysName.0 = STRING: esx1

Now we can search for interesting parts. If you want to monitor the link status of the NICs, try this:

[root@web mibs]# snmpwalk -m ALL -c public -v 2c esx1.lab.local IF-MIB::ifDescr
IF-MIB::ifDescr.1 = STRING: Device vmnic0 at 03:00.0 bnx2
IF-MIB::ifDescr.2 = STRING: Device vmnic1 at 03:00.1 bnx2
IF-MIB::ifDescr.3 = STRING: Device vmnic2 at 04:00.0 bnx2
IF-MIB::ifDescr.4 = STRING: Device vmnic3 at 04:00.1 bnx2
IF-MIB::ifDescr.5 = STRING: Device vmnic4 at 06:00.0 bnx2
IF-MIB::ifDescr.6 = STRING: Device vmnic5 at 06:00.1 bnx2
IF-MIB::ifDescr.7 = STRING: Distributed Virtual VMware switch: DvsPortset-0
IF-MIB::ifDescr.8 = STRING: Virtual interface: vmk0 on port 33554442 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
IF-MIB::ifDescr.9 = STRING: Virtual interface: vmk1 on port 33554443 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
IF-MIB::ifDescr.10 = STRING: Virtual interface: vmk2 on port 33554444 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
IF-MIB::ifDescr.11 = STRING: Virtual interface: vmk3 on port 33554445 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27

As you can see, I used a subtree of the whole hierarchy (IF-MIB::ifDescr). This is the “translated” OID. To get the numeric OID, you have to add the option -O fn to snmpwalk.

[root@web mibs]# snmpwalk -O fn -m ALL -c public -v 2c esx1.lab.local IF-MIB::ifDescr
.1.3.6.1.2.1.2.2.1.2.1 = STRING: Device vmnic0 at 03:00.0 bnx2
.1.3.6.1.2.1.2.2.1.2.2 = STRING: Device vmnic1 at 03:00.1 bnx2
.1.3.6.1.2.1.2.2.1.2.3 = STRING: Device vmnic2 at 04:00.0 bnx2
.1.3.6.1.2.1.2.2.1.2.4 = STRING: Device vmnic3 at 04:00.1 bnx2
.1.3.6.1.2.1.2.2.1.2.5 = STRING: Device vmnic4 at 06:00.0 bnx2
.1.3.6.1.2.1.2.2.1.2.6 = STRING: Device vmnic5 at 06:00.1 bnx2
.1.3.6.1.2.1.2.2.1.2.7 = STRING: Distributed Virtual VMware switch: DvsPortset-0
.1.3.6.1.2.1.2.2.1.2.8 = STRING: Virtual interface: vmk0 on port 33554442 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
.1.3.6.1.2.1.2.2.1.2.9 = STRING: Virtual interface: vmk1 on port 33554443 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
.1.3.6.1.2.1.2.2.1.2.10 = STRING: Virtual interface: vmk2 on port 33554444 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
.1.3.6.1.2.1.2.2.1.2.11 = STRING: Virtual interface: vmk3 on port 33554445 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27

You can use snmptranslate to translate an OID.

[root@web mibs]# snmptranslate .1.3.6.1.2.1.2.2.1.2
IF-MIB::ifDescr
[root@web mibs]# snmptranslate -O fn IF-MIB::ifDescr
.1.3.6.1.2.1.2.2.1.2
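To make the idea of a subtree a bit more tangible: a numeric OID is just a path of integers, and an OID belongs to a subtree if it starts with the subtree's OID. The following is a conceptual sketch only, not how Net-SNMP works internally. The helper names `parse_oid` and `in_subtree` are made up for this example.

```python
def parse_oid(text):
    """Turn a dotted OID string like '.1.3.6.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


def in_subtree(oid, root):
    """Return True if oid lies at or below root in the OID hierarchy."""
    oid_path, root_path = parse_oid(oid), parse_oid(root)
    return oid_path[:len(root_path)] == root_path


IF_DESCR = ".1.3.6.1.2.1.2.2.1.2"  # IF-MIB::ifDescr, see the snmptranslate output

print(in_subtree(".1.3.6.1.2.1.2.2.1.2.10", IF_DESCR))  # ifDescr.10
print(in_subtree(".1.3.6.1.2.1.2.2.1.8.1", IF_DESCR))   # ifOperStatus.1, another column

# Comparing integer tuples keeps instance .9 before .10, unlike plain string sorting.
oids = [".1.3.6.1.2.1.2.2.1.2.10", ".1.3.6.1.2.1.2.2.1.2.9"]
print(sorted(oids, key=parse_oid))
```

The last component of each OID in the snmpwalk output is the instance index. It is what ties values from different columns to the same interface.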

So far, we have only the description of the interfaces. With a little searching, we find the status of the interfaces (I stripped the output).

IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: down(2)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
IF-MIB::ifOperStatus.5 = INTEGER: up(1)
IF-MIB::ifOperStatus.6 = INTEGER: up(1)

ifOperStatus.1 corresponds with ifDescr.1, ifOperStatus.2 corresponds with ifDescr.2 and so on. The ifOperStatus corresponds with the status of the NICs in the vSphere Web Client.

NIC status in the vSphere Web Client
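If a script should react to a NIC going down, it has to pair ifDescr and ifOperStatus by their index. Here is a minimal sketch of that pairing, working on saved snmpwalk output. The function names `parse_walk` and `interface_status` are hypothetical, the line pattern is an assumption based on the output shown above, and the sample is the first three interfaces from that output.

```python
import re

# Assumed line format: <MIB>::<column>.<index> = <TYPE>: <value>
ROW = re.compile(
    r"^(?P<mib>[\w-]+)::(?P<column>\w+)\.(?P<index>\d+) = (?P<type>[\w-]+):\s*(?P<value>.*)$"
)


def parse_walk(output):
    """Group snmpwalk lines into {column: {index: value}}."""
    table = {}
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:
            column = table.setdefault(match.group("column"), {})
            column[int(match.group("index"))] = match.group("value")
    return table


def interface_status(output):
    """Pair ifDescr.N with ifOperStatus.N for every index N."""
    table = parse_walk(output)
    descr = table.get("ifDescr", {})
    status = table.get("ifOperStatus", {})
    return {index: (descr[index], status.get(index, "unknown")) for index in sorted(descr)}


SAMPLE = """\
IF-MIB::ifDescr.1 = STRING: Device vmnic0 at 03:00.0 bnx2
IF-MIB::ifDescr.2 = STRING: Device vmnic1 at 03:00.1 bnx2
IF-MIB::ifDescr.3 = STRING: Device vmnic2 at 04:00.0 bnx2
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: down(2)
"""

for index, (name, state) in interface_status(SAMPLE).items():
    if state != "up(1)":
        print(index, name, state)
```

The same index-based pairing should work for other tables whose columns share an index, for example hrDeviceDescr and hrDeviceStatus.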

If you want to monitor the fans or power supplies, use these OIDs.

HOST-RESOURCES-MIB::hrDeviceDescr.35 = STRING: POWER Power Supply 1
HOST-RESOURCES-MIB::hrDeviceDescr.36 = STRING: POWER Power Supply 2
HOST-RESOURCES-MIB::hrDeviceDescr.37 = STRING: FAN Fan Block 1
HOST-RESOURCES-MIB::hrDeviceDescr.38 = STRING: FAN Fan Block 2
HOST-RESOURCES-MIB::hrDeviceDescr.39 = STRING: FAN Fan Block 3
HOST-RESOURCES-MIB::hrDeviceDescr.40 = STRING: FAN Fan Block 4

HOST-RESOURCES-MIB::hrDeviceStatus.35 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.36 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.37 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.38 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.39 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.40 = INTEGER: running(2)

Many possibilities

SNMP offers a simple and lightweight way to monitor a managed device. It’s not a replacement for vCenter, vROps or SCOM. But it can be an addition, especially because SNMP is an internet-protocol standard.