Tag Archives: server

Virtually reseated: Reset blade in a HPE C7000 enclosure

This posting is ~3 years old. You should keep this in mind. IT is a fast-moving business. This information might be outdated.

After a reboot, a VMware ESXi 6.7 U3 host told me that it had no compatible NICs. Fun fact: right before the reboot, everything was fine.

The iLO also showed no NICs. Unfortunately, I wasn’t onsite to pull the blade server and put it back in. But there is a way to do this “virtually”.

You have to connect to the IP address of the Onboard Administrator via SSH. Then issue the reset server command with the number of the device bay that holds the server you want to reset, and confirm the prompt.

OA1-C7000> reset server 13

WARNING: Resetting the server trips its E-Fuse. This causes all power to be momentarily removed from the server. This command should only be used when physical access to the server is unavailable, and the server must be removed and
reinserted.

Any disk operations on direct attached storage devices will be affected. I/O
will be interrupted on any direct attached I/O devices.

Entering anything other than 'YES' will result in the command not executing.

Do you want to continue ? yes

Successfully reset the E-Fuse for device bay 13.

The server will power up automatically. Please note that the Onboard Administrator is unable to display certain information right after this operation. It will take a couple of minutes until all information, like the serial number or the device bay name, is visible again.

Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7

This posting is ~3 years old. You should keep this in mind. IT is a fast-moving business. This information might be outdated.

I’ve got several mails and comments about this topic. It looks like the latest ESXi 6.7 updates are causing some trouble on HPE ProLiant Gen10 servers.

I’ve blogged about recurring host hardware sensor state alarm messages some weeks ago. A customer noticed them after an update. Last week, I got the first comments under this blog post about fan failure messages after applying the latest ESXi 6.7 updates. Then more and more customers asked me about this, because they were seeing these messages in their environments after applying the latest updates.

Last Saturday I tweeted my blog post to give a hint to my followers who may be experiencing the same problem.

Fortunately one of my followers (Thanks Markus!) pointed me to a VMware KB article with a workaround: Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7 (78989).

This is NOT a solution, but a workaround. Keep that in mind.

Thanks again to Markus. Make sure to visit his awesome blog (MY CLOUD-(R)EVOLUTION), especially if you are interested in vSphere, Veeam and automation!

Using WP fail2ban with the CloudFlare API to protect your website

This posting is ~6 years old. You should keep this in mind. IT is a fast-moving business. This information might be outdated.

The downside of using WordPress is that many people use it. That makes WordPress a perfect target for attacks. I have had some trouble with attacks, and one of the consequences is that my web server crashes under load. The easiest way to solve this issue is to ban the offending IP addresses. I already use Fail2ban to protect some other services, so the idea of using Fail2ban to ban IP addresses that are used for attacks was obvious.

From the Fail2ban wiki:

Fail2ban scans log files (e.g. /var/log/apache/error_log) and bans IPs that show the malicious signs — too many password failures, seeking for exploits, etc. Generally Fail2Ban is then used to update firewall rules to reject the IP addresses for a specified amount of time, although any arbitrary other action (e.g. sending an email) could also be configured. Out of the box Fail2Ban comes with filters for various services (apache, courier, ssh, etc).

That works very well for services like IMAP. Unfortunately, it does not work out of the box for WordPress. But the WordPress plugin WP fail2ban brings us closer to a solution. For performance and security reasons, vcloudnine.de can only be accessed through a content delivery network (CDN), in this case CloudFlare. Because CloudFlare acts as a reverse proxy, I cannot see “the real” IP addresses. Furthermore, I cannot log the IP addresses because of German data protection law. This makes Fail2ban and the WP fail2ban plugin nearly useless, because all I would ban with iptables are the CloudFlare CDN IP ranges. But CloudFlare offers a firewall service, and CloudFlare is the right place to block IP addresses.

So, how can I glue Fail2ban, the WP fail2ban plugin and CloudFlare’s firewall service together?

APIs FTW!

APIs are the solution for nearly every problem. Like others, CloudFlare offers an API that can be used to automate tasks. In this case, I use the API to add entries to the CloudFlare firewall. Or honestly: someone wrote a Fail2ban action that does this for me.

First of all, you have to install the WP fail2ban plugin. That is easy: simply install the plugin. Then copy wordpress-hard.conf from the plugin’s filters.d directory to the filter.d directory of Fail2ban.

[root@webserver filters.d]# cp wordpress-hard.conf /etc/fail2ban/filter.d/

Then edit /etc/fail2ban/jail.conf and add the necessary entries for WordPress.

[wordpress-hard]

enabled  = true
filter   = wordpress-hard
logpath  = /var/log/messages
action   = cloudflare
maxretry = 3
bantime  = 604800

Please note that in my case, the plugin logs to /var/log/messages. The action is “cloudflare”. To allow Fail2ban to work with the CloudFlare API, you need the CloudFlare API key. This key is unique for every CloudFlare account. You can get it from your CloudFlare user profile: go to the user settings and scroll down.

Cloudflare Global API Key

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Open the /etc/fail2ban/action.d/cloudflare.conf and scroll to the end of the file. Add the token and your CloudFlare login name (e-mail address) to the file.

# Default Cloudflare API token
cftoken = 1234567890abcdefghijklmopqrstuvwxyz99

cfuser = user@domain.tld
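If you want to verify the API access by hand, you can rebuild the request the stock cloudflare action fires (you will see it later in the error logs if something goes wrong). This is only a sketch for troubleshooting: the token, e-mail address and IP address below are placeholders, and the script prints the curl command instead of executing it.

```shell
# Sketch: rebuild the ban request of the "cloudflare" Fail2ban action
# by hand, e.g. for troubleshooting. Token, e-mail and IP are placeholders.
CFTOKEN='1234567890abcdefghijklmopqrstuvwxyz99'
CFUSER='user@domain.tld'
BAD_IP='198.51.100.23'

build_ban_request() {
  printf "curl -s -o /dev/null https://www.cloudflare.com/api_json.html -d 'a=ban' -d 'tkn=%s' -d 'email=%s' -d 'key=%s'\n" \
    "$CFTOKEN" "$CFUSER" "$BAD_IP"
}

# Print the command; pipe the output to sh to actually fire the request.
build_ban_request
```

If the printed command, executed manually, does not produce a firewall entry in CloudFlare, the token or e-mail address is probably wrong.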

The last step is to tell the WP fail2ban plugin which IPs should be trusted. We have to add the subnets of the CloudFlare CDN. Edit your wp-config.php and add this line at the end:

/** CloudFlare IP Ranges */
define('WP_FAIL2BAN_PROXIES','103.21.244.0/22,103.22.200.0/22,103.31.4.0/22,104.16.0.0/12,108.162.192.0/18,131.0.72.0/22,141.101.64.0/18,162.158.0.0/15,172.64.0.0/13,173.245.48.0/20,188.114.96.0/20,190.93.240.0/20,197.234.240.0/22,198.41.128.0/17,199.27.128.0/21,2400:cb00::/32,2405:8100::/32,2405:b500::/32,2606:4700::/32,2803:f800::/32,2c0f:f248::/32,2a06:98c0::/29');

The reason for this can be found in the FAQ of the WP Fail2ban plugin. The IP ranges used by CloudFlare can be found at CloudFlare.

Does it work?

Seems so… This is an example from /var/log/messages.

Jan 15 20:01:46 webserver wordpress(www.vcloudnine.de)[4312]: Authentication attempt for unknown user vcloudnine from 195.154.183.xxx
Jan 15 20:01:46 webserver fail2ban.filter[4393]: INFO [wordpress-hard] Found 195.154.183.xxx

And this is a screenshot from the CloudFlare firewall section.

Cloudflare Firewall Blocked Websites

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Another short test with curl also worked. I will keep an eye on the firewall section of CloudFlare. Let’s see who’s added next…

Important note for those who use SELinux: make sure that you install the policycoreutils-python package and create a custom policy for Fail2ban!

[root@webserver ~]# grep fail2ban /var/log/audit/audit.log | audit2allow -M myfail2banpolicy
******************** IMPORTANT ***********************
To make this policy package active, execute:

semodule -i myfail2banpolicy.pp

Errors like these in /var/log/messages are a strong indicator:

Jan 22 12:06:03 webserver fail2ban.actions[16399]: NOTICE [wordpress-hard] Ban xx.xx.xx.xx
Jan 22 12:06:03 webserver fail2ban.action[16399]: ERROR curl -s -o /dev/null https://www.cloudflare.com/api_json.html -d 'a=ban' -d 'tkn=7c8e62809d4183931347772b366e621003c63' -d 'email=patrick@blazilla.de' -d 'key=xx.xx.xx.xx' -- stdout: ''
Jan 22 12:06:03 webserver fail2ban.action[16399]: ERROR curl -s -o /dev/null https://www.cloudflare.com/api_json.html -d 'a=ban' -d 'tkn=7c8e62809d4183931347772b366e621003c63' -d 'email=patrick@blazilla.de' -d 'key=xx.xx.xx.xx' -- stderr: ''
Jan 22 12:06:03 webserver fail2ban.action[16399]: ERROR curl -s -o /dev/null https://www.cloudflare.com/api_json.html -d 'a=ban' -d 'tkn=7c8e62809d4183931347772b366e621003c63' -d 'email=patrick@blazilla.de' -d 'key=xx.xx.xx.xx' -- returned 7
Jan 22 12:06:03 webserver fail2ban.actions[16399]: ERROR Failed to execute ban jail 'wordpress-hard' action 'cloudflare' info 'CallingMap({'ipjailmatches': <function <lambda> at 0x7f49967edc80>, 'matches': '', 'ip': 'xx.xx.xx.xx', 'ipmatches': <function <lambda> at 0x7f49967edde8>, 'ipfailures': <function <lambda> at 0x7f49967edc08>, 'time': 1485083163.0328701, 'failures': 2, 'ipjailfailures': <function <lambda> at 0x7f49967eded8>})': Error banning xx.xx.xx.xx

You will find corresponding audit messages in /var/log/audit/audit.log:

type=AVC msg=audit(1485083254.298:17688): avc:  denied  { name_connect } for  pid=16575 comm="curl" dest=443 scontext=unconfined_u:system_r:fail2ban_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket

Make sure that you create a custom policy for Fail2Ban, and that you load the policy.

HPE ProLiant PowerShell SDK

This posting is ~7 years old. You should keep this in mind. IT is a fast-moving business. This information might be outdated.

Some days ago, my colleague Claudia and I started to work on a new project: a greenfield deployment consisting of some well-known building blocks: HPE ProLiant, HPE MSA, HPE Networking (now Aruba) and VMware vSphere. Nothing new for us, because we have done this a couple of times together. But it led us to the idea of automating some tasks, especially the configuration of the HPE ProLiants: changing BIOS settings and configuring the iLO.

Do not automate what you have not fully understood

Some of the wisest words I have ever said to a customer. Modifying BIOS and iLO settings is a well-understood task. But if you have to deploy a bunch of ProLiants, it is a monotonous, and therefore error-prone, process. Perfect for automation!

Scripting Tools for Windows PowerShell

To support the automation of HPE ProLiant deployments, HPE offers the Scripting Tools for Windows PowerShell. HPE provides these PowerShell modules free of charge. There are three different downloads:

  • iLO cmdlets
  • BIOS cmdlets
  • Onboard Administrator (OA) cmdlets

The iLO cmdlets include PowerShell cmdlets to configure and manage the iLO on HPE ProLiant G7, Gen8 and Gen9 servers. The BIOS cmdlets do not support G7 servers, so you can only configure and manage the legacy and UEFI BIOS of Gen8 (except DL580) and all Gen9 models. The OA cmdlets support the configuration and management of the HPE Onboard Administrator, which is used with HPE’s well-known ProLiant BL blade servers. The OA cmdlets need at least OA v3.11, whereby v4.60 is the latest available version. All you need to get started are

  • Microsoft .NET Framework 4.5, and
  • Windows Management Framework 3.0 or later

If you are using Windows 8 or 10, you already have PowerShell 4 or PowerShell 5, respectively.

Support for HPE ProLiant Gen9 iLO RESTful API

If you have ever watched a HPE ProLiant Gen9 booting up, you might have noticed the iLO RESTful API icon in the lower right corner. Depending on the server model, the BIOS cmdlets utilize the iLO 4 RESTful API. But the iLO RESTful API ecosystem is worth its own blog post. Stay tuned.

Documentation and examples

HPE offers simple documentation for the BIOS, iLO and OA cmdlets. You can find it in HPE’s Information Library. Documentation is important, but sometimes example code is necessary to ramp up quickly. Check HPE’s PowerShell SDK GitHub repository for examples.

Time to code

I’m keen and curious to automate some of my regular deployment tasks with these PowerShell modules. Some of these tasks are always the same:

  • change the power management and other BIOS settings
  • change the network settings of the iLO
  • change the initial password of the iLO administrator account and create additional iLO user accounts

Further automation tasks are not necessarily related to the HPE ProLiant PowerShell SDK, but to PowerShell in general, respectively VMware PowerCLI. PowerShell is great for automating the different aspects and modules of an infrastructure deployment. You can use it to build your own toolbox.

How to monitor ESXi host hardware with SNMP

This posting is ~7 years old. You should keep this in mind. IT is a fast-moving business. This information might be outdated.

The Simple Network Management Protocol (SNMP) is a protocol for the monitoring and configuration of network-attached devices. SNMP exposes data in the form of variables and values. These variables can be queried or set. A query retrieves the value of a variable; a set operation assigns a value to a variable. The variables are organized in a hierarchy, and each variable is identified by an object identifier (OID). The management information base (MIB) describes this hierarchy. MIB files (simple text files) contain metadata for each OID and are necessary to translate a numeric OID into a human-readable format. SNMP knows two device types:

  • the managed device which runs the SNMP agent
  • the network management station (NMS) which runs the management software

The NMS queries the SNMP agent with GET requests. Configuration changes are made using SET requests. The SNMP agent can inform the NMS about state changes using a SNMP trap message. The easiest way for authentication is the SNMP community string.

SNMP is pretty handy and it’s still used, especially for monitoring and managing networking components. SNMP has the benefit of being very lightweight. Monitoring a system with WBEM or via an API can cause slightly more load, compared to SNMP. Furthermore, SNMP is an Internet protocol standard, and nearly every device supports it.

Monitoring host hardware with SNMP

Why should I monitor my ESXi host hardware with SNMP? The vCenter Server can trigger alarms, and most customers use applications like VMware vRealize Operations, Microsoft System Center Operations Manager, or HPE Systems Insight Manager (SIM). There are better ways to monitor the overall health of an ESXi host. But sometimes you want to get some stats about the network interfaces (throughput), or you have a script that should do something if a NIC goes down or something else happens. Again, SNMP is very resource-friendly and widely supported.

Configure SNMP on ESXi

I focus on ESXi 5.1 and later. The ESXi host is called “the SNMP agent”. We don’t configure traps or trap destinations; we just want to poll the SNMP agent using SNMP GET requests. The configuration is done with esxcli. First of all, we need to set a community string and enable SNMP.

[root@esx1:~] esxcli system snmp set -c public -e true
[root@esx1:~] esxcli system snmp get
   Authentication:
   Communities: public
   Enable: true
   Engineid: 00000063000000a100000000
   Hwsrc: indications
   Largestorage: true
   Loglevel: info
   Notraps:
   Port: 161
   Privacy:
   Remoteusers:
   Syscontact:
   Syslocation:
   Targets:
   Users:
   V3targets:

That’s it! The necessary firewall ports and services are opened and started automatically.

Querying the SNMP agent

I use a CentOS VM to show you some queries. The Net-SNMP package contains the tools snmpwalk and snmpget. To install the Net-SNMP utilities, simply use yum.

[root@centos ~]# yum install net-snmp-utils.x86_64

Download the VMware SNMP MIB files, extract the ZIP file, and copy the contents to /usr/share/snmp/mibs.

[root@centos mibs]# ls -lt
total 3852
-rw-r--r--. 1 root root  50968 Jun  3 17:05 BRIDGE-MIB.mib
-rw-r--r--. 1 root root  59268 Jun  3 17:05 ENTITY-MIB.mib
-rw-r--r--. 1 root root  52586 Jun  3 17:05 HOST-RESOURCES-MIB.mib
-rw-r--r--. 1 root root  10583 Jun  3 17:05 HOST-RESOURCES-TYPES.mib
-rw-r--r--. 1 root root   7309 Jun  3 17:05 IANA-ADDRESS-FAMILY-NUMBERS-MIB.mib
-rw-r--r--. 1 root root  33324 Jun  3 17:05 IANAifType-MIB.mib
-rw-r--r--. 1 root root   3890 Jun  3 17:05 IANA-RTPROTO-MIB.mib
-rw-r--r--. 1 root root  76268 Jun  3 17:05 IEEE8021-BRIDGE-MIB.mib
-rw-r--r--. 1 root root  89275 Jun  3 17:05 IEEE8021-Q-BRIDGE-MIB.mib
-rw-r--r--. 1 root root  16082 Jun  3 17:05 IEEE8021-TC-MIB.mib
-rw-r--r--. 1 root root  44543 Jun  3 17:05 IEEE8023-LAG-MIB.mib
-rw-r--r--. 1 root root  71747 Jun  3 17:05 IF-MIB.mib
-rw-r--r--. 1 root root  16782 Jun  3 17:05 INET-ADDRESS-MIB.mib
-rw-r--r--. 1 root root  46405 Jun  3 17:05 IP-FORWARD-MIB.mib
-rw-r--r--. 1 root root 185967 Jun  3 17:05 IP-MIB.mib
-rw-r--r--. 1 root root    229 Jun  3 17:05 list-ids-diagnostics.txt
-rw-r--r--. 1 root root  77406 Jun  3 17:05 LLDP-V2-MIB.mib
-rw-r--r--. 1 root root  16108 Jun  3 17:05 LLDP-V2-TC-MIB.mib
-rw-r--r--. 1 root root  23777 Jun  3 17:05 notifications.txt
-rw-r--r--. 1 root root  39918 Jun  3 17:05 P-BRIDGE-MIB.mib
-rw-r--r--. 1 root root  84172 Jun  3 17:05 Q-BRIDGE-MIB.mib
-rw-r--r--. 1 root root   1465 Jun  3 17:05 README
-rw-r--r--. 1 root root 223872 Jun  3 17:05 RMON2-MIB.mib
-rw-r--r--. 1 root root 148032 Jun  3 17:05 RMON-MIB.mib
-rw-r--r--. 1 root root  22342 Jun  3 17:05 SNMP-FRAMEWORK-MIB.mib
-rw-r--r--. 1 root root   5543 Jun  3 17:05 SNMP-MPD-MIB.mib
-rw-r--r--. 1 root root   8259 Jun  3 17:05 SNMPv2-CONF.mib
-rw-r--r--. 1 root root  31588 Jun  3 17:05 SNMPv2-MIB.mib
-rw-r--r--. 1 root root   8932 Jun  3 17:05 SNMPv2-SMI.mib
-rw-r--r--. 1 root root  38048 Jun  3 17:05 SNMPv2-TC.mib
-rw-r--r--. 1 root root  28647 Jun  3 17:05 TCP-MIB.mib
-rw-r--r--. 1 root root  93608 Jun  3 17:05 TOKEN-RING-RMON-MIB.mib
-rw-r--r--. 1 root root  20951 Jun  3 17:05 UDP-MIB.mib
-rw-r--r--. 1 root root   3175 Jun  3 17:05 UUID-TC-MIB.mib
-rw-r--r--. 1 root root   2326 Jun  3 17:05 VMWARE-CIMOM-MIB.mib
-rw-r--r--. 1 root root  22411 Jun  3 17:05 VMWARE-ENV-MIB.mib
-rw-r--r--. 1 root root  53480 Jun  3 17:05 VMWARE-ESX-AGENTCAP-MIB.mib
-rw-r--r--. 1 root root   2328 Jun  3 17:05 VMWARE-HEARTBEAT-MIB.mib
-rw-r--r--. 1 root root   1699 Jun  3 17:05 VMWARE-NSX-MANAGER-AGENTCAP-MIB.mib
-rw-r--r--. 1 root root 146953 Jun  3 17:05 VMWARE-NSX-MANAGER-MIB.mib
-rw-r--r--. 1 root root  15641 Jun  3 17:05 VMWARE-OBSOLETE-MIB.mib
-rw-r--r--. 1 root root   2173 Jun  3 17:05 VMWARE-PRODUCTS-MIB.mib
-rw-r--r--. 1 root root   8305 Jun  3 17:05 VMWARE-RESOURCES-MIB.mib
-rw-r--r--. 1 root root   3736 Jun  3 17:05 VMWARE-ROOT-MIB.mib
-rw-r--r--. 1 root root  11142 Jun  3 17:05 VMWARE-SRM-EVENT-MIB.mib
-rw-r--r--. 1 root root   3872 Jun  3 17:05 VMWARE-SYSTEM-MIB.mib
-rw-r--r--. 1 root root   7017 Jun  3 17:05 VMWARE-TC-MIB.mib
-rw-r--r--. 1 root root   7611 Jun  3 17:05 VMWARE-VA-AGENTCAP-MIB.mib
-rw-r--r--. 1 root root   8777 Jun  3 17:05 VMWARE-VC-EVENT-MIB.mib
-rw-r--r--. 1 root root  38576 Jun  3 17:05 VMWARE-VCOPS-EVENT-MIB.mib
-rw-r--r--. 1 root root  26952 Jun  3 17:05 VMWARE-VMINFO-MIB.mib

Now we can use snmpwalk to “walk down the hierarchy”. This is only a small part of the complete output; the complete snmpwalk output has more than 4000 lines!

[root@centos mibs]# snmpwalk -m ALL -c public -v 2c esx1.lab.local
SNMPv2-MIB::sysDescr.0 = STRING: VMware ESXi 6.0.0 build-3825889 VMware, Inc. x86_64
SNMPv2-MIB::sysObjectID.0 = OID: VMWARE-PRODUCTS-MIB::vmwESX
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (402700) 1:07:07.00
SNMPv2-MIB::sysContact.0 = STRING:
SNMPv2-MIB::sysName.0 = STRING: esx1

Now we can search for interesting parts. If you want to monitor the link status of the NICs, try this:

[root@centos mibs]# snmpwalk -m ALL -c public -v 2c esx1.lab.local IF-MIB::ifDescr
IF-MIB::ifDescr.1 = STRING: Device vmnic0 at 03:00.0 bnx2
IF-MIB::ifDescr.2 = STRING: Device vmnic1 at 03:00.1 bnx2
IF-MIB::ifDescr.3 = STRING: Device vmnic2 at 04:00.0 bnx2
IF-MIB::ifDescr.4 = STRING: Device vmnic3 at 04:00.1 bnx2
IF-MIB::ifDescr.5 = STRING: Device vmnic4 at 06:00.0 bnx2
IF-MIB::ifDescr.6 = STRING: Device vmnic5 at 06:00.1 bnx2
IF-MIB::ifDescr.7 = STRING: Distributed Virtual VMware switch: DvsPortset-0
IF-MIB::ifDescr.8 = STRING: Virtual interface: vmk0 on port 33554442 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
IF-MIB::ifDescr.9 = STRING: Virtual interface: vmk1 on port 33554443 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
IF-MIB::ifDescr.10 = STRING: Virtual interface: vmk2 on port 33554444 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
IF-MIB::ifDescr.11 = STRING: Virtual interface: vmk3 on port 33554445 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27

As you can see, I used a subtree of the whole hierarchy (IF-MIB::ifDescr). This is the “translated” OID. To get the numeric OID, you have to add the option -O fn to snmpwalk.

[root@centos mibs]# snmpwalk -O fn -m ALL -c public -v 2c esx1.lab.local IF-MIB::ifDescr
.1.3.6.1.2.1.2.2.1.2.1 = STRING: Device vmnic0 at 03:00.0 bnx2
.1.3.6.1.2.1.2.2.1.2.2 = STRING: Device vmnic1 at 03:00.1 bnx2
.1.3.6.1.2.1.2.2.1.2.3 = STRING: Device vmnic2 at 04:00.0 bnx2
.1.3.6.1.2.1.2.2.1.2.4 = STRING: Device vmnic3 at 04:00.1 bnx2
.1.3.6.1.2.1.2.2.1.2.5 = STRING: Device vmnic4 at 06:00.0 bnx2
.1.3.6.1.2.1.2.2.1.2.6 = STRING: Device vmnic5 at 06:00.1 bnx2
.1.3.6.1.2.1.2.2.1.2.7 = STRING: Distributed Virtual VMware switch: DvsPortset-0
.1.3.6.1.2.1.2.2.1.2.8 = STRING: Virtual interface: vmk0 on port 33554442 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
.1.3.6.1.2.1.2.2.1.2.9 = STRING: Virtual interface: vmk1 on port 33554443 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
.1.3.6.1.2.1.2.2.1.2.10 = STRING: Virtual interface: vmk2 on port 33554444 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27
.1.3.6.1.2.1.2.2.1.2.11 = STRING: Virtual interface: vmk3 on port 33554445 DVS 6b a0 37 50 c6 24 04 b8-25 08 f5 ea 32 ef 48 27

You can use snmptranslate to translate an OID.

[root@centos mibs]# snmptranslate .1.3.6.1.2.1.2.2.1.2
IF-MIB::ifDescr
[root@centos mibs]# snmptranslate -O fn IF-MIB::ifDescr
.1.3.6.1.2.1.2.2.1.2

So far, we only have the description of the interfaces. With a little searching, we find the status of the interfaces (I stripped the output).

IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: down(2)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
IF-MIB::ifOperStatus.5 = INTEGER: up(1)
IF-MIB::ifOperStatus.6 = INTEGER: up(1)

ifOperStatus.1 corresponds with ifDescr.1, ifOperStatus.2 corresponds with ifDescr.2, and so on. The ifOperStatus values correspond with the status of the NICs in the vSphere Web Client.

nic_status_web_client
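Such a status list is easy to process from a script, which is exactly the "do something if a NIC goes down" use case mentioned above. A minimal sketch that parses the snmpwalk output for IF-MIB::ifOperStatus; the hostname and community string in the usage comment are placeholders:

```shell
# Print the ifIndex of every interface that is reported as down(2).
# Expects lines like "IF-MIB::ifOperStatus.3 = INTEGER: down(2)" on stdin.
report_down() {
  awk '/down\(2\)/ { split($1, a, "."); print "ifIndex " a[2] " is down" }'
}

# Usage (hostname and community are placeholders):
# snmpwalk -m ALL -c public -v 2c esx1.lab.local IF-MIB::ifOperStatus | report_down
```

The printed ifIndex corresponds with the instance of IF-MIB::ifDescr, so you can map it back to a vmnic.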

If you want to monitor the fans or power supplies, use these OIDs.

HOST-RESOURCES-MIB::hrDeviceDescr.35 = STRING: POWER Power Supply 1
HOST-RESOURCES-MIB::hrDeviceDescr.36 = STRING: POWER Power Supply 2
HOST-RESOURCES-MIB::hrDeviceDescr.37 = STRING: FAN Fan Block 1
HOST-RESOURCES-MIB::hrDeviceDescr.38 = STRING: FAN Fan Block 2
HOST-RESOURCES-MIB::hrDeviceDescr.39 = STRING: FAN Fan Block 3
HOST-RESOURCES-MIB::hrDeviceDescr.40 = STRING: FAN Fan Block 4

HOST-RESOURCES-MIB::hrDeviceStatus.35 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.36 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.37 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.38 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.39 = INTEGER: running(2)
HOST-RESOURCES-MIB::hrDeviceStatus.40 = INTEGER: running(2)
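If your monitoring system expects Nagios-style plugins, you can wrap the hrDeviceStatus walk in a small check that exits 0 when every device reports running(2) and 2 otherwise. This is a sketch under that assumption, not a finished plugin:

```shell
# Nagios-style check for HOST-RESOURCES-MIB::hrDeviceStatus output on stdin:
# exit 0 (OK) if every device reports running(2), exit 2 (CRITICAL) otherwise.
check_device_status() {
  # Collect the instance numbers of all devices not in state running(2).
  failed=$(awk -F'[. ]' '/INTEGER/ && !/running\(2\)/ { printf "%s ", $2 }')
  if [ -n "$failed" ]; then
    echo "CRITICAL: device(s) not running: $failed"
    return 2
  fi
  echo "OK: all monitored devices are running"
  return 0
}

# Usage (hostname and community are placeholders):
# snmpwalk -m ALL -c public -v 2c esx1.lab.local \
#   HOST-RESOURCES-MIB::hrDeviceStatus | check_device_status
```

The reported instance numbers map back to HOST-RESOURCES-MIB::hrDeviceDescr, so you can tell which fan or power supply failed.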

Many possibilities

SNMP offers a simple and lightweight way to monitor a managed device. It’s not a replacement for vCenter, vROps or SCOM, but it can be an addition, especially because SNMP is an Internet protocol standard.

HPE Hyper Converged 380 – A look under the hood

This posting is ~7 years old. You should keep this in mind. IT is a fast-moving business. This information might be outdated.

In March 2016, HPE CEO Meg Whitman announced a ProLiant-based HCI solution that should be easier to use and cheaper than Nutanix.

This isn’t HPE’s first dance on this floor. In August 2015, HP launched the Hyper Converged 250 System (HC250), which is based on the Apollo server platform. The hardware design of the HC250 comes close to a Nutanix block, because the Apollo platform supports up to four nodes in 2U. Let me say this clearly: the Hyper Converged 380 (HC380) is not a replacement for the HC250! And before the HC250, HPE offered the ConvergedSystem 200-HC StoreVirtual and 200-HC EVO:RAIL (different models).

The HC380 is based on the ProLiant DL380 Gen9 platform. The DL380 Gen9 is one of the best-selling x86 servers on the market, if not the best-selling. Instead of developing everything from scratch, HPE built the new HC380 from different, already available HPE products, with one exception: HPE OneView User Experience (UX). It was developed from scratch and consolidates all management and monitoring tasks into a single console. The use of already available components was the reason for the short time-to-market (TTM) of the HC380.

Currently, the HC380 can only run VMware vSphere (HPE CloudSystem uses VMware vSphere). Support for Microsoft Hyper-V and Citrix XenServer will be added later. If you wish to run Microsoft Hyper-V, check the HC250 or wait until it’s supported with the HC380.

What flavor would you like?

The HC380 is available in three editions (use cases):

  • HC380 (Virtualization)
  • HC380 (HPE CloudSystem)
  • HC380 (VDI)

All three use cases are orderable using a single SKU and include two DL380 Gen9 nodes (2U). You can add up to 14 expansion nodes, so you can have up to 16 dual-socket DL380 Gen9 nodes.

Each node comes with two Intel Xeon E5 CPUs. The exact CPU model has to be selected before ordering. The same applies to the memory (128 GB or 256 GB per node, up to 1.5 TB) and the disk groups (up to three disk groups, each with 4.5 to 8 TB usable capacity per block, 8 drives, either SSD/HDD or all HDD, with a maximum of 25 TB usable per node). The memory and disk group configuration depends on the specific use case (virtualization, CloudSystem, VDI). The same applies to the number of network ports (something between 8x 1 GbE and 6x 10 GbE plus 4x 1 GbE). For VDI, customers can add NVIDIA GRID K1, GRID K2 or Tesla M60 cards.

VMware vSphere 6 Enterprise or Enterprise Plus is pre-installed, and licenses can be bought from HPE. An interesting note from the QuickSpecs:

NOTE: HPE Hyper Converged 380 for VMware vSphere requires valid VMware vSphere Enterprise or higher, and vCenter licenses. VMware licenses can only be removed from the order if it is confirmed that the end-customer has a valid licenses in place (Enterprise License Agreement (ELA), vCloud Air Partner or unused Enterprise Purchasing Program tokens).

Hewlett Packard Enterprise supports VMware vSphere Enterprise, vSphere Enterprise Plus and Horizon on the HPE Hyper Converged 380.

No support for vSphere Standard or Essentials (Plus)! Let’s see how HPE will react to the fact that VMware will phase out vSphere Enterprise licenses.

The server includes 3y/3y/3y onsite support with next business day response. Nevertheless, at least 3-year HPE Hyper Converged 380 solution support is required, according to the latest QuickSpecs.

What’s under the hood?

As I already mentioned, the HC380 was built from well-known HPE products. Only HPE OneView User Experience (UX) was developed from scratch. OneView User Experience consolidates the following tasks into a single console (source: QuickSpecs):

  • Virtual machine (VM) vending (create, edit, delete)
  • Hardware/driver and appliance UI frictionless updates
  • Advanced capacity and performance analytics (optional)
  • Backup and restore of appliance configuration details
  • Role-based access
  • Integration with existing LDAP or Active Directory
  • Physical and virtual hardware monitoring

Pretty cool fact: HPE OneView User Experience (UX) will be available for the HC250 later this year. A 2-node cluster consists not only of the two DL380 Gen9 servers, but also of three VMs:

  • HC380 Management VM
  • HC380 OneView VM
  • HC380 Management UI VM

The Management VM is used for VMware vCenter (local install) and HPE OneView for vCenter. You can use a remote vCenter (or a vCenter Server Appliance), but you have to make sure that the remote vCenter has HPE OneView for vCenter integrated. The OneView VM runs HPE OneView for hardware/software management. The Management UI VM runs HPE OneView User Experience.

The shared storage is provided by HPE StoreVirtual VSA. A VSA runs on each node. As you might know, StoreVirtual VSA comes with an all-inclusive license. No need to buy additional licenses. You can have it all: snapshots, Remote Copy, clustering, thin provisioning, tiering etc. The StoreVirtual VSA delivers sustainable performance, good VMware vSphere integration and added value, for example support for Veeam Storage Snapshots.

In the case of a 2-node cluster, the 25 TB usable capacity per node in fact means 25 TB usable for the whole 2-node cluster. This is because of the Network RAID 1 between the two StoreVirtual VSAs: the data is mirrored between the VSAs. When adding more nodes, the data is striped across the nodes in the cluster (Network RAID 10+2).

Also important in the case of a 2-node cluster: the quorum. At least two StoreVirtual VSAs form a cluster, and as in every cluster, you need some kind of quorum. StoreVirtual 12.5 added support for an NFSv3-based quorum witness. This is in fact an NFS file share which has to be available to both nodes. It is only supported in 2-node clusters, and I highly recommend using it. I have a customer that uses a Raspberry Pi for this…
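If you want to provide such a quorum witness from a small Linux box (like the mentioned Raspberry Pi), it boils down to an NFSv3 export that both nodes can reach. A sketch; the directory name and the client subnet are placeholders, and the export options are my assumption, so check the StoreVirtual documentation for the exact requirements:

```shell
# Append an NFSv3 export that can serve as a StoreVirtual quorum witness.
# Arguments: share directory, client subnet, exports file (normally /etc/exports).
add_quorum_export() {
  share_dir="$1"; subnet="$2"; exports_file="$3"
  mkdir -p "$share_dir"
  echo "$share_dir $subnet(rw,sync,no_subtree_check)" >> "$exports_file"
  # On a real system, re-read the exports afterwards: exportfs -ra
}

# Example (placeholders): add_quorum_export /srv/quorum 192.168.100.0/24 /etc/exports
```

Both StoreVirtual VSAs are then pointed at this share as the quorum witness.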

Start the engine

You have to meet some requirements before you can start.

  • 1 GbE connections for each node’s iLO and 1 GbE ports
  • 1 GbE or 10 GbE connections for each node’s FlexLOM ports
  • a Windows-based computer directly connected to a node (Mac OS X or Linux should also work)
  • VMware vSphere Enterprise or Enterprise Plus licenses
  • enough IP addresses and VLANs (depending on the use case)

For general purpose server virtualization, you need at least three subnets and three VLANs:

  • Management
  • vMotion
  • Storage (iSCSI)

Although you have the choice between a flat (untagged) and a VLAN-tagged network design, I would always recommend the VLAN-tagged approach. It’s highly recommended to use multiple VLANs to keep the traffic separated. The installation guide includes worksheets and examples to help you plan the deployment. For a 2-node cluster you need at least:

  • 5 IP addresses for the management network
  • 2 IP addresses for the vMotion network
  • 8 IP addresses for the iSCSI storage network

You should leave room for expansion nodes. Proper planning saves you trouble later.

HP OneView InstantOn is used for the automated deployment. It guides you through the necessary configuration steps. HPE says that the deployment requires less than 60 minutes and all you need to enter are

  • IP addresses
  • credentials
  • VMware licenses

After the deployment, you have to install the StoreVirtual VSA licenses. Then you can create datastores and, finally, VMs.

hpehc380_ux

HPE/ hpe.com

Summary

Hyper-convergence has nothing to do with the form factor. Despite the fact that a 2-node cluster comes in 4U, the HC380 has everything you would expect from a HCIA. The customers will decide if HPE keeps its promise. The argument for the HC380 shouldn’t be the lower price compared to Nutanix or other HCI players. Above all, HPE should not repeat the mistake of the 200-HC EVO:RAIL: too buggy and too expensive. The HC380 combines known and mature products (ProLiant DL380 Gen9, StoreVirtual VSA, OneView). It’s now up to HPE.

I have several small and mid-sized customers that are running two- to six-node VMware vSphere environments. The HC380 for VDI can also be very interesting.

HP ProLiant BL460c Gen9: MicroSD card missing during ESXi 5.5 setup

This posting is ~7 years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Today, I was at a customer to prepare a two-node vSphere cluster for some MS SQL Server tests. Nothing fancy, just two HP ProLiant BL460c Gen9 blades and two virtual volumes from a HP 3PAR. Each blade had two 400 GB SSDs, two 64 GB M.2 SSDs and a 1 GB MicroSD card. Usually, I install ESXi to an SD card, in this case a MicroSD card. The SSDs were dedicated to PernixData FVP. Although I saw the MicroSD card in the boot menu, ESXi didn’t show it as an installation target.

bl460_no_sdcard

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

I’ve read a lot about similar observations from other users, but no solution seemed plausible. Switching from UEFI to legacy BIOS or playing with power management settings is not a solution. BIOS and ILO firmware were up to date. But disabling USB3 support seemed to be a possible fix.

You can disable USB3 support in the RBSU.

bl460_rbsu

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

After a reboot, the MicroSD card appeared as installation target during the ESXi setup.

bl460_with_sdcard

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

I never noticed this behaviour with ProLiant DL360 Gen9, 380 Gen9, 560 Gen9 or 580 Gen9. I only saw this with BL460c Gen9. The affected blade servers had System ROM I36 05/06/2015 and I36 09/24/2015, as well as ILO4 2.30 installed.

Reset the HP iLO Administrator password with hponcfg on ESXi

This posting is ~8 years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Sometimes you need to reset the ILO Administrator password. Sure, you can reboot the server, press F8 and then reset the Administrator password. But if you have installed an HP-customized ESXi image, there is a much better way to reset the password: HPONCFG.

Check the /opt/hp/tools directory. You will find a binary called hponcfg.

~ # ls -l /opt/hp/tools/
total 5432
-r-xr-xr-x 1 root root 5129574 Oct 28 2014 conrep
-r--r--r-- 1 root root 108802 Oct 28 2014 conrep.xml
-r-xr-xr-x 1 root root 59849 Jan 16 2015 hpbootcfg
-r-xr-xr-x 1 root root 251 Jan 16 2015 hpbootcfg_esxcli
-r-xr-xr-x 1 root root 232418 Jul 14 2014 hponcfg
-r-xr-xr-x 1 root root 12529 Oct 31 2013 hptestevent
-r-xr-xr-x 1 root root 250 Oct 31 2013 hptestevent_esxcli

All you need is a simple XML file. You can use the vi editor, or you can copy the necessary file with WinSCP to the root home directory on your ESXi host. I prefer vi. Change the directory to /opt/hp/tools, then open pwreset.xml.

~ # vi pwreset.xml

Press i to switch to the insert mode. Then paste this content into the file. You don’t have to know the current password!

<RIBCL VERSION="2.0">
<LOGIN USER_LOGIN="Administrator" PASSWORD="unknown">
<USER_INFO MODE="write">
<MOD_USER USER_LOGIN="Administrator">
<PASSWORD value="password"/>
</MOD_USER>
</USER_INFO>
</LOGIN>
</RIBCL>

Press ESC and then :wq<ENTER> to save the file and exit vi. Now use HPONCFG together with the XML file to reset the password.

~ # /opt/hp/tools/hponcfg -f pwreset.xml
HP Lights-Out Online Configuration utility

Version 4.4-0 (c) Hewlett-Packard Company, 2014
Firmware Revision = 1.85 Device type = iLO 3 Driver name = hpilo
iLO IP Address: 172.16.1.52
Script succeeded

That’s it! You can now login with “Administrator” and “password”.
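If you prefer not to use an interactive editor, the same file can also be written with a shell here-document. A sketch; the hponcfg call is shown commented out, since it only works on a host with the HP tools installed:

```shell
# Create the RIBCL reset file non-interactively with a here-document.
cat > /tmp/pwreset.xml <<'EOF'
<RIBCL VERSION="2.0">
<LOGIN USER_LOGIN="Administrator" PASSWORD="unknown">
<USER_INFO MODE="write">
<MOD_USER USER_LOGIN="Administrator">
<PASSWORD value="password"/>
</MOD_USER>
</USER_INFO>
</LOGIN>
</RIBCL>
EOF
# Then run hponcfg against it (uncomment on the ESXi host):
# /opt/hp/tools/hponcfg -f /tmp/pwreset.xml
```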

HP Service Pack for ProLiant 2015.04

This posting is ~8 years old. You should keep this in mind. IT is a short living business. This information might be outdated.

Some weeks ago, HP published an updated version of their HP Service Pack for ProLiant (SPP). The SPP 2015.04.0 adds

  • support for new HP ProLiant servers and options,
  • support for Red Hat Enterprise Linux 6.6, SUSE Linux Enterprise Server 12, VMware vSphere 5.5 U2 and (of course) VMware vSphere 6.0,
  • HP Smart Update Manager v7.2.0,
  • the HP USB Key Utility for Windows v2.0.0.0, which can now handle downloads greater than 4 GB (important, because this release may not fit on standard DVD media…),
  • select Linux firmware components in RPM format.

In addition, the SPP covers two important customer advisories:

  • ProLiant Gen9 Servers – SYSTEM ROM UPDATE REQUIRED to Prevent Memory Subsystem Anomalies on Servers With DDR4 Memory Installed Due to Intel Processor BIOS Upgrades (c04542689)
  • HP Virtual Connect (VC) – Some VC Flex-10/10D Modules for c-Class BladeSystem May Shut Down When Running VC Firmware Version 4.20 or 4.30 Due to an Erroneous High Temperature Reading (c04459474)

Two CAs fixed, but another CA arose (and it’s an ugly one…):

  • HP OneView 1.20 – Upgrading Virtual Connect Version 4.40 with Service Pack for ProLiant (SPP) 2015.04.0 Will Result in a Configuration Error and an Inability to Manage the HP Virtual Connect 8Gb 24-Port Fibre Channel Module (c04638459)

If you are using HP OneView >1.10 and 1.20.04, you will be unable to manage HP Virtual Connect 8Gb 24-port Fibre Channel modules after updating the module to firmware version 3.00 or later. This is also the case if you use the smart components from Virtual Connect firmware version 4.40! After the update, the VC module will enter a “Configuration Error” state. Currently, there is no fix. The only workaround is not to update the HP Virtual Connect 8Gb 24-port Fibre Channel module to firmware version 3.00. This will be fixed in a future HP OneView 1.20 patch.

Important to know: With this release, the SPP may not fit on standard DVD media! But to be honest: I’ve never burned the SPP to DVD, I always used USB media.

Check the release notes for more information about this SPP release. You can download the latest SPP version from the HP website. You need an active warranty or HP support agreement to download the SPP.

Is Nutanix the perfect fit for SMBs?

This posting is ~8 years old. You should keep this in mind. IT is a short living business. This information might be outdated.

There’s a world below clouds and enterprise environments with thousands of VMs and hundreds or thousands of hosts. A world that consists of at most three hosts. I’m working with quite a few customers that are using VMware vSphere Essentials Plus. Those environments typically consist of two or three hosts and something between 10 and 100 VMs. Just to mention it: I don’t have any VMware vSphere Essentials customers. I can’t see any benefit in buying this license. Most of these environments are designed for a lifetime of three to four years. After that time, I come again and replace it with new gear. I can’t remember any customer that upgraded his VMware vSphere Essentials Plus. Even if the demands on the IT infrastructure increase, the license stays the same. The hosts and storage get bigger, but the requirements stay the same: HA, vMotion, sometimes vSphere Replication, often (vSphere API for) Data Protection. Maybe this is a German thing, and customers outside of Germany are growing faster and invest more in their IT.

Hyperconverged, scale-out IT infrastructure for SMBs?

Think enterprise and break it down to smaller customers. That is said easily, but we saw so many technologies coming from the enterprise down to the SMBs over the last years. Think about SAN. 15 years ago, no SMB even thought about it. Today it’s standard.

I’ve taken this statement from the Nutanix website.

Nutanix simplifies datacenter infrastructure by integrating server and storage resources into a turnkey appliance that is deployed in just 30 to 60 minutes, and runs any application at any scale.

When working with SMBs, most of them have to deal with a tight budget. This means they follow the maximum principle: get the most hardware, software and service for their money. Customers do not like long implementation phases. Long implementation phases mean that lots of money can’t be invested in hardware or software. Every single Euro/ Dollar/ $CURRENCY spent on services can’t be invested in hardware and software.

Another important requirement for most SMBs is simple operation. I know a lot of customers with only one, two or three people doing everything around helpdesk, servers, networking etc. IT infrastructure, or IT in general, isn’t the main focus for many of them. It should just work. Every day. Until it’s replaced. This applies not only to server virtualization, it applies to IT in general. This often requires lean and simple designs, designs that follow the principle of error prevention. Because of this, it’s good practice to reduce the components used in a design and automate where it’s useful and valuable. And if a solution is robust, then this can only be an advantage.

Why Nutanix?

In my opinion, simplicity is the key to success. If you see Nutanix for the first time, you will be surprised how easy it is to manage. Deployment, operation, updates. It’s slick, it’s simple, it’s lightweight. Everything the customer needs is combined in 2U. The same applies to the support. I’ve followed the discussion on Twitter between Nutanix and VMware on who may/ can/ is allowed to provide support for VMware. It was started by a blog post of Chuck Hollis (10 Reasons why VMware is leading the hyperconverged industry). To make it short: I don’t share his opinion. In my opinion, Nutanix’ focus on customer experience is the key.

Simplicity and the ability to change

I don’t think that pre-configured systems like Fujitsu Cluster-in-a-box, VCE vBlocks or HP ConvergedSystems are the answer to simplified IT infrastructure for SMBs. They are not hyperconverged. They are pre-configured. That’s an important difference. Pre-configured doesn’t mean that it’s easy to manage or fast and easy to implement. SMBs want hyperconverged platforms to simplify their IT infrastructure. Okay, so why not buy any other hyperconverged platform on the market, like SimpliVity OmniCube, HP ConvergedSystems HC or VMware EVO:RAIL? Because these offerings are focused on VMware. The question was: Why Nutanix? Because you can run KVM, Microsoft Hyper-V and VMware ESXi on it. That’s a unique selling point (USP). You can offer the customer a hyperconverged platform that allows him to change to another hypervisor later. I think we all agree that VMware is the market leader. But Microsoft is catching up. All features of the Essentials Plus kit can be delivered with Microsoft Hyper-V (and much more if you add SCVMM). Remember: I’m talking about the typical Essentials Plus customer. VMware vSphere Essentials Plus includes everything such a customer needs: failover, live migration, data protection and, if needed, replication. In my experience, DRS, Host Profiles and vSphere Distributed Switches are nice, but SMBs can’t take advantage of them (exceptions are not excluded…). Add Microsoft’s SCVMM and the gap between VMware vSphere and Microsoft Hyper-V is even smaller. The licensing of Microsoft Windows Server makes it interesting for customers to take a look at Microsoft Hyper-V, especially if you take the licensing costs into account. Sure, it’s not all about CAPEX (capital expenditure), OPEX (operational expenditure) is also important. Don’t get me wrong, I love VMware. But it’s important to be prepared. If the customer decides to change to Microsoft Hyper-V, you should be able to deliver it.

How can it look like?

Depending on the computing and storage needs, take a closer look at the Nutanix NX-1000 or NX-3000 series. I think an NX-1350 or NX-3350/ 3360 block is a good catch. Add a VMware vSphere Essentials Plus kit (or some Microsoft Windows Server 2012 R2 licenses… maybe also System Center 2012), Veeam Backup Essentials, something to store the backups on, like a HP StoreOnce 2700, and your favorite switches for 10 GbE network connectivity (for example two HP 2920 switches in a stack with 10 GbE modules). A complete datacenter in 5U. This is only an example, but I think this should fit most SMB customers (depending on how you define SMB…).

Famous last words

Is Nutanix the perfect fit for SMBs? Yes! Easy to implement, easy to manage and robust. Nutanix stands out with its platform independence. This gives customers a choice with regard to the hypervisor used. Investment protection is a valuable asset if you constantly have to fight for budgets.