Category Archives: HPE

Virtually reseated: Reset blade in a HPE C7000 enclosure

After a reboot, a VMware ESXi 6.7 U3 told me that he has no compatible NICs. Fun fact: Right before the reboot everything was fine.

The ILO also showed no NICs. Unfortunately, I wasn’t onsite to pull the blade server and put it back in. But there is a way to do this “virtually”.

You have to connect to the IP address of the Onboard Administrator via SSH. Then issue the reset server command with the bay of the server you want to reset and an argument.

OA1-C7000> reset server 13

WARNING: Resetting the server trips its E-Fuse. This causes all power to be momentarily removed from the server. This command should only be used when physical access to the server is unavailable, and the server must be removed and
reinserted.

Any disk operations on direct attached storage devices will be affected. I/O
will be interrupted on any direct attached I/O devices.

Entering anything other than 'YES' will result in the command not executing.

Do you want to continue ? yes

Successfully reset the E-Fuse for device bay 13.

The server will power up automatically. Please note, that the OBA is unable to display certain information right after this operation. It will take a couple of minutes until all information, like serial number or device bay name are visible again.

Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7

I’ve got several mails and comments about this topic. It looks like that the latest ESXi 6.7 updates are causing some trouble on HPE ProLiant Gen10 servers.

I’ve blogged about recurring host hardware sensor state alarm messages some weeks ago. A customer noticed them after an update. Last week, I got the first comments under this blog post abot fan failure messages after applying the latest ESXi 6.7 updates. Then more and more customers asked me about this, because they got these messages too in their environment after applying the latest updates.

Last Saturday I tweeted my blog post to give a hint to my followers who may be experiencing the same problem.

Fortunately one of my followers (Thanks Markus!) pointed me to a VMware KB article with a workaround: Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7 (78989).

This is NOT a solution, but a workaround. Keep that in Mind.

Thanks again to Markus. Make sure to visit his awesome blog (MY CLOUD-(R)EVOLUTION) , especially if you are interested in vSphere, Veeam and automation!

VMware ESXi 6.7: Recurring host hardware sensor state alarm

If you found this blog post because you are searchting for a solution for a FAN FAILURE on your ProLiant Gen10 HW after applying the latest ESXi 6.7 patches, then use this shortcut for the workaround: Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7


I had a really annoying problem at one of my customers. After deploying new VMware ESXi hosts (HPE ProLiant DL380 Gen10) along with an upgrade of the vCenter Server Appliance to 6.7 U2, the customer reported recurring host hardware sensor state alarm messages in the vCenter for all hosts.

After acknowledging the alarm, it recurred after a couple of minutes or hours. The hardware was finde, no errors or warnings were noticed in the ILO Management Log. But the vCenter reported periodically a Sensor -1 type error in the Events window. The /var/log/syslog.log contained messages like this:

2019-11-29T04:39:48Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
 2019-11-29T04:39:49Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
 2019-11-29T04:39:50Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
 2019-11-29T04:39:51Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1
 2019-11-29T04:39:52Z sfcb-vmw_ipmi[4263212]: IpmiIfcSelGetInfo: IPMI_CMD_GET_SEL_INFO cc=0xc1

Sure, you can ignore this. But you shouldn’t ignore this, because these events can result in the vCenter database increasing in size. vCenter can crash once the SEAT partition size goes above the 95% threshold. So you better fix this!

Long story short: This bug is fixed with the latest November updates for ESXi 6.7 U3. A workaround is to disable the WBEM service. The WBEM service might be enabled after a reboot. In this case you have to disable the sfcbd-watchdog service.

But the best way to solve this is to install the latest patches (VMware ESXi 6.7, Patch Release ESXi670-201911001)

Veeam and StoreOnce: Wrong FC-HBA driver/ firmware causes Windows BSoD

This posting is ~2 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

One of my customers bought a very nice new backup solution, which consists of a

  • HPE StoreOnce 5100 with ~ 144 TB usable capacity,
  • and a new HPE ProLiant DL380 Gen10 with Windows Server 2016

as new backup server. StoreOnce and backup server will be connected with 8 Gb Fibre-Channel and 10 GbE to the existing network and SAN. Veeam Backup & Replication 9.5 U3a is already in use, as well as VMware vSphere 6.5 Enterprise Plus. The backend storage is a HPE 3PAR 8200.

This setup allows the usage of Catalyst over Fibre-Channel together with Veeam Storage Snapshots, and this was intended to use.

I wrote about a similar setup some month ago: Backup from a secondary HPE 3PAR StoreServ array with Veeam Backup & Replication.

The OS on the StoreOnce was up-to-date (3.16.7), Windows Server 2016 was installed using HPE Intelligent Provisioning. Afterwards, a drivers and firmware were updated using the latest SPP 2018.11 was installed. So all drivers and firmware were also up-to-date.

After doing zoning and some other configuration tasks, I installed Veeam Backup and Replication 9.5 U3, configured my Catalyst over Fibre-Channel repository. I configured a test backup… and the server failed with a Blue Screen of Death… which is pretty rare since Server 2008 R2.

geralt / pixabay.com/ Creative Commons CC0

I did some tests:

  • backup from 3PAR Storage Snapshots to Catalyst over FC repository – BSoD
  • backup without 3PAR Storage Snapshots to Catalyst over FC repository – BSoD
  • backup from 3PAR Storage Snapshots to Catalyst over LAN repository – works fine
  • backup without 3PAR Storage Snapshots to Catalyst over LAN repository – works fine
  • backup from 3PAR Storage Snapshots to default repository – works fine
  • backup without 3PAR Storage Snapshots to default repository – works fine

So the error must be caused by the usage of Catalyst over Fibre-Channel. I filed a case at HPE, uploaded gigabytes of memory dumps and heard pretty less during the next week.

HPE StoreOnce Support Matrix FTW!

After a week, I got an email from the HPE support with a question about the installed HBA driver and firmware. I told them the version number and a day later I was requested to downgrade (!) drivers and firmware.

The customer has got a SN1100Q (P9D93A & P9D94A) HBA in his backup server, and I was requested to downgrade the firmware to version 8.05.61, as well as the driver to 9.2.5.20. And with this firmware and driver version, the backup was running fine (~ 750 MB/s hroughput).

I found the HPE StoreOnce Support Matrix on the SPOCK website from HPE. The matrix confirmed the firmware and driver version requirement (click to enlarge).

Fun fact: None of the listed HBAs (except the Synergy HBAs) is supported with the latest StoreOnce G2 products.

Lessons learned

You should take a look at those support matrices – always! HPE confirmed that the first level recommendation “Have you trieed to update to the latest firmware” can cause similar problems. The fact, that the factory ships the server with the latest firmware does not make this easier.

Backup from a secondary HPE 3PAR StoreServ array with Veeam Backup & Replication

This posting is ~2 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

When taking a backup with Veeam Backup & Replication, a VM snapshot is created to get a consistent state of the VM. The snapshot is taken prior the backup, and it is removed after the successful backup of the VM. The snapshot grows during its lifetime, and you should keep in mind, that you need some free space in the datastore for snapshots. This can be a problem, especially in case of multiple VM backups at a time, and if the VMs share the same datastore.

Benefit of storage snapshots

If your underlying storage supports the creation of storage snapshots, Veeam offers an additional way to create a consistent state of the VMs. In this case, a storage snapshot is taken, which is presented to the backup proxy, and is then used to backup the data. As you can see: No VM snapshot is taken.

Now one more thing: If you have a replication or synchronous mirror between two storage systems, Veeam can do this operation on the secondary array. This is pretty cool, because it takes load from you primary storage!

Backup from a secondary HPE 3PAR StoreServ array

Last week I was able to try something new: Backup from a secondary HPE 3PAR StoreServ array. A customer has two HPE 3PAR StoreServ 8200 in a Peer Persistence setup, a HPE StoreOnce, and a physical Veeam backup server, which also acts as Veeam proxy. Everything is attached to a pretty nice 16 Gb dual Fabric SAN. The customer uses Veeam Backup & Replication 9.5 U3a. The data was taken from the secondary 3PAR StoreServ and it was pushed via FC into a Catalyst Store on a StoreOnce. Using the Catalyst API allows my customer to use Synthetic Full backups, because the creation is offloaded to StoreOnce. This setup is dramatically faster and better than the prior solution based on MicroFocus Data Protector. Okay, this last backup solution was designed to another time with other priorities and requirements. it was a perfect fit at the time it was designed.

This blog post from Veeam pointed me to this new feature: Backup from a secondary HPE 3PAR StoreServ array. Until I found this post, it was planned to use “traditional” storage snapshots, taken from the primary 3PAR StoreServ.

With this feature enabled, Veeam takes the snapshot on the 3PAR StoreServ, that is hosting the synchronous mirrored virtual volume. This graphic was created by Veeam and shows the backup workflow.

Veeam/ Backup process/ Copyright by Veeam

My tests showed, that it’s blazing fast, pretty easy to setup, and it takes unnecessary load from the primary storage.

In essence, there are only three steps to do:

  • add both 3PARs to Veeam
  • add the registry value and restart the Veeam Backup Server Service
  • enable the usage of storage snapshots in the backup job

How to enable this feature?

To enable this feature, you have to add a single registry value on the Veeam backup server, and afterwards restart the Veeam Backup Server service.

  • Location: HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication\
  • Name: Hp3PARPeerPersistentUseSecondary
  • Type: REG_DWORD (0 False, 1 True)
  • Default value: 0 (disabled)

Thanks to Pierre-Francois from Veeam for sharing his knowledge with the community. Read his blog post Backup from a secondary HPE 3PAR StoreServ array for additional information.

HPE Networking expert level certifications

This posting is ~2 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

A couple of days ago, I took the HP0-Y47 exam “Deploying HP FlexNetwork Core Technologies”. It was one of two required exams to achive the HPE ASE – Data Center Network Integrator V1, and the HP ASE – FlexNetwork Integrator V1 certification. It was a long planned upgrade to my HP ATP certification, and it is a necessary certification for the HPE partner status of my employer.

You might find it confusing that I’m talking about an HP ASE and a HPE ASE. That is not a typo. The HP ASE was released prior the HP/ HPE split. The HPE ASE was released after the split in HP and HPE.

The HP/ HPE ATP is a professional level certification, comparable to the Cisco Certified Network Associate (CCNA). The HP/ HPE ASE is an expert level certification, so the typical candidate for a HP/ HPE ASE certification is a professional with three to five years experience in designing and architecting complex enterprise-level networks.

Requirements

There are different ways to achieve this certification. Regardless of the way you chose, you need a certification from which you can upgrade. This does not have to be a HP/ HPE certification! If you hold a valid CCNA/ CCNP or JNCIP-ENT, you can upgrade from this certification without the need of a valid HP/ HPE ATP Networking certification.

If you want to earn the HPE ASE – Data Center Network Integrator V1, and the HP ASE – FlexNetwork Integrator V1 certification in a single step, you need at least one of these certifications:

  • HP ATP – FlexNetwork Solutions V3
  • HPE ATP – Data Center Solutions V1

Or if you want to upgrade from a non-HP/ HPE certification:

  • Cisco – CCNP (any CCNP regardless of technology)
  • Cisco – Certified Design Professional (CCDP)
  • Juniper – JNCIP-ENT

Now you need to pass two exams:

HP2-Z34 (Building HP FlexFabric Data Centers)

The HP2-Z34 exam focuses on deployment and implementation of HPE FlexFabric Data Center solutions. Therefore, the exams covers topics like

  • Multitenant Device Context (MDC)
  • Datacenter Bridging (DCB)
  • Multiprotocol Label Switching (MPLS)
  • Fibre Channel over Ethernet (FCoE)
  • Ethernet Virtual Interconnect (EVI),
  • Multi-Customer Edge (MCE),
  • Transparent Interconnection of Lots of Links (TRILL), and
  • Shortest Path Bridging Mac-in-Mac mode (SPBM).

HPE offers a study guide to prepare for this exam: Building HP FlexFabric Data Centers (HP2-Z34 and HP0-Y51). I used this guide to prepare for the exam (eBook). The guide was of an average quality. Its sufficient to prepare for the exam, but I used other materials to get a better understanding of some topics.

HP2 exams are web-based exams. To pass the HP2-Z34 exam, I had to answer 60 questions in 105 minutes, with a passing score of 70%. The exam was quite demanding, especially if you don’t have much real-world experience with some of the covered topics.

HP0-Y47 (Deploying HP FlexNetwork Core Technologies)

The HP0-Y47 exam covers the configuration, implementation, and the troubleshoot enterprise level HPE FlexNetwork solutions. The exam covers different topics, e.g.

  • Quality of Service (QoS)
  • redundancy (VRRP, Stacking)
  • multicast routing (IGMP, PIM)
  • dynamic routing (OSPF, BGP)
  • ACLs, and
  • port authentication/ port security (Mac-auth, Web-auth, 802.1x)

I used the HP ASE FlexNetwork Solutions Integrator (HP0-Y47) study guide to prepre myself for the exam. Unfortunately, it had the same average quality as the HP2 Z34 guide: Good enough to pass the exam, but don’t expect to much.

HP0-Y47 is a proctored exam. I had to answer 55 questions in 150 minutes, with a passing score of 65%. The exam is not very hard, if you were familiar with the covered topics. Experience with ProVision and Comware is absolutely necessary, because both platforms have their peculiarities, e.g. processing of ACLs, differences in Stacking technologies, commands, STP support etc.

It took me some time to prepare for both exams, despite the fact that I work with ProVision and Comware Switches every day. So I’m pretty happy that I passed both exams on the first try.

vSphere Distributed Switch health check fails on HPE Comware switches

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

During the replacement of some VMware ESXi hosts at a customer, I discovered a recurrent failure of the vSphere Distributed Switch health checks. A VLAN and MTU mismatch was reported. On the physical side, the ESXi hosts were connected to two HPE 5820 switches, that were configured as an IRF stack. Inside the VMware bubble, the hosts were sharing a vSphere Distributed Switch.

cre8tive / pixelio.de

The switch ports of the old ESXi hosts were configured as Hybrid ports. The switch ports of the new hosts were configured as Trunk ports, to streamline the switch and port configuration.

Some words about port types

Comware knows three different port types:

  • Access
  • Hybrid
  • Trunk

If you were familiar with Cisco, you will know Access and Trunk ports. If you were familiar with HPE ProCurve or Alcatel-Lucent Enterprise, these two port types refer to untagged and tagged ports.

So what is a Hybrid port? A Hybrid port can belong to multiple VLANs where they can be untagged and tagged. Yes, multiple untagged VLANs on a port are possible, but the switch will need additional information to bridge the traffic into correct untagged VLANs. This additional information can be  MAC addresses, IP addresses, LLDP-MED etc. Typically, hybrid ports are used for in VoIP deployments.

The benefit of a Hybrid port is, that I can put the native VLAN of a specific port, which is often referred as Port VLAN identifier (PVID), as a tagged VLAN on that port. This configuration allows, that all dvPortGroups have a VLAN tag assigned, even if the VLAN tag represents the native VLAN of a switch port.

Failing health checks

A failed health check rises a vCenter alarm. In my case, a VLAN and MTU alarm was reported. In both cases, VLAN 1 was causing the error. According to VMware, the three main causes for failed health checks are:

  • Mismatched VLAN trunks between a vSphere distributed switch and physical switch
  • Mismatched MTU settings between physical network adapters, distributed switches, and physical switch ports
  • Mismatched virtual switch teaming policies for the physical switch port-channel settings.

Let’s take a look at the port configuration on the Comware switch:

#
interface Ten-GigabitEthernet1/0/9
 port link-mode bridge
 description "ESX-05 NIC1"
 port link-type trunk
 port trunk permit vlan all
 stp edged-port enable
#

As you can see, this is a normal trunk port. All VLANs will be passed to the host. This is an except from the display interface Ten-GigabitEthernet1/0/9  output:

 PVID: 1
 Mdi type: auto
 Port link-type: trunk
  VLAN passing  : 1(default vlan), 2-3, 5-7, 100-109
  VLAN permitted: 1(default vlan), 2-4094
  Trunk port encapsulation: IEEE 802.1q

The native VLAN is 1, this is the default configuration. Traffic, that is received and sent from a trunk port, is always tagged with a VLAN id of the originating VLAN – except traffic from the default (native) VLAN! This traffic is sent without a VLAN tag, and if frames were received with a VLAN tag, this frames will be dropped!

If you have a dvPortGroup for the default (native) VLAN, and this dvPortGroup is sending tagged frames, the frames will be dropped if you use a “standard” trunk port. And this is why the health check fails!

Ways to resolve this issue

In my case, the dvPortGroup was configured for VLAN 1, which is the default (native) VLAN on the switch ports.

There are two ways to solve this issue:

  • Remove the VLAN tag from the dvPortGroup configuration
  • Change the PVID for the trunk port

To change the PVID for a trunk port, you have to enter the following command in the interface context:

[ToR-Ten-GigabitEthernet1/0/9] port trunk pvid vlan 999

You have to change the PVID on all ESXi facing switch ports. You can use a non-existing VLAN ID for this.

vSphere Distributed Switch health check will switch to green for VLAN and MTU immediately.

Please note, that this is not the solution for all VLAN-related problems. You should make sure that you are not getting any side effects.

Meltdown & Spectre: What about HPE Storage and Citrix NetScaler?

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

In addition to my shortcut blog post about Meltdown and Spectre with regard of Microsoft Windows, VMware ESXi and vCenter, and HPE ProLiant, I would like to add some additional information about HPE Storage and Citrix NetScaler.

When we talk about Meltdown and Spectre, we are talking about three different vulnerabilities:

  • CVE-2017-5715 (branch target injection)
  • CVE-2017-5753 (bounds check bypass)
  • CVE-2017-5754 (rogue data cache load)

CVE-2017-5715 and CVE-2017-5753 are known as “Spectre”, CVE-2017-5754 is known as “Meltdown”. If you want to read more about these vulnerabilities, please visit meltdownattack.com.

Due to the fact that different CPU platforms are affected, one might can guess that also  other devices, like storage systems or load balancers, are affected. Because of my focus, this blog post will focus on HPE Storage and Citrix NetScaler.

HPE Storage

HPE has published a searchable and continously updated list with products, that might be affected (Side Channel Analysis Method allows information disclosure in Microprocessors). Interesting is, that a product can be affected, but not vulnerable.

ProductImpactedComment
Nimble StorageYesFix under investigation
StoreOnceYESNot vulnerable – Product doesn’t allow arbitrary code execution.
3PAR StoreServYESNot vulnerable – Product doesn’t allow arbitrary code execution.
3PAR Service ProcessorYESNot vulnerable – Product doesn’t allow arbitrary code execution.
3PAR File ControllerYESVulnerable- further information forthcoming.
MSAYESNot vulnerable – Product doesn’t allow arbitrary code execution.
StoreVirtualYESNot vulnerable – Product doesn’t allow arbitrary code execution.
StoreVirtual File ControllerYESVulnerable- further information forthcoming.

The File Controller are vulnerable, because they are based on Windows Server.

So if you are running 3PAR StoreServ, MSA, StoreOnce or StoreVirtual: Relax! If you are running Nimble Storage, wait for a fix.

Citrix NetScaler

Citrix has also published an article with information about their products (Citrix Security Updates for CVE-2017-5715, CVE-2017-5753, CVE-2017-5754).

The article is a bit spongy in its statements:

Citrix NetScaler (MPX/VPX): Citrix believes that currently supported versions of Citrix NetScaler MPX and VPX are not impacted by the presently known variants of these issues.

Citrix believes… So nothing to do yet, if you are running MPX or VPX appliances. But future updates might come.

The case is a bit different, when it comes to the NetScaler SDX appliances.

Citrix NetScaler SDX: Citrix believes that currently supported versions of Citrix NetScaler SDX are not at risk from malicious network traffic. However, in light of these issues, Citrix strongly recommends that customers only deploy NetScaler instances on Citrix NetScaler SDX where the NetScaler admins are trusted.

No fix so far, only a recommendation to check your processes and admins.

The Meltdown/ Spectre shortcut blogpost for Windows, VMware and HPE

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

TL;DR

Jump to

I will try to update this blog post regularly!

Change History

01-13-2018: Added information regarding VMSA-2018-0004
01-13-2018: HPE has pulled Gen8 and Gen9 system ROMs
01-13-2018: VMware has updated KB52345 due to issues with Intel microcode updates
01-18-2018: Updated VMware section
01-24-2018: Updated HPE section
01-28-2018: Updated Windows Client and Server section
02-08-2018: Updated VMware and HPE section
02-20-2018: Updated HPE section
04-17-2018: Updated HPE section


Many blog posts have been written about the two biggest security vulnerabilities discovered so far. In fact, we are talking about three different vulnerabilities:

  • CVE-2017-5715 (branch target injection)
  • CVE-2017-5753 (bounds check bypass)
  • CVE-2017-5754 (rogue data cache load)

CVE-2017-5715 and CVE-2017-5753 are known as “Spectre”, CVE-2017-5754 is known as “Meltdown”. If you want to read more about these vulnerabilities, please visit meltdownattack.com.

Multiple steps are necessary to be protected, and all necessary information are often repeated, but were distributed over several websites, vendor websites, articles, blog posts or security announcements.

Two simple steps

Two (simple) steps are necessary to be protected against these vulnerabilities:

  1. Apply operating system updates
  2. Update the microcode (BIOS) of your server/ workstation/ laptop

If you use a hypervisor to virtualize guest operating systems, then you have to update your hypervisor as well. Just treat it like an ordinary operating system.

Sounds pretty simple, but it’s not. I will focus on three vendors in this blog post:

  • Microsoft
  • VMware
  • HPE

Let’s start with Microsoft. Microsoft has published the security advisory ADV180002  on 01/03/2018.

Microsoft Windows (Client)

The necessary security updates are available for Windows 7 (SP1), Windows 8.1, and Windows 10. The January 2018 security updates are ONLY offered in one of theses cases (Source: Microsoft):

  • An supported anti-virus application is installed
  • Windows Defender Antivirus, System Center Endpoint Protection, or Microsoft Security Essentials is installed
  • A registry key was added manually

To add this registry key, please execute this in an elevated CMD. Do not add this registry key, if you are running an unsupported antivirus application!! Please contact your antivirus application vendor! This key has to be added manually, only in case if NO antivirus application is installed, otherwise your antivirus application will add it!

reg ADD HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat /f /v cadca5fe-87d3-4b96-b7fb-a231484277cc /t REG_DWORD /d 0
OSUpdate
Windows 10 (1709)KB4056892
Windows 10 (1703)KB4056891
Windows 10 (1607)KB4056890
Windows 10 (1511)KB4056888
Windows 10 (initial)KB4056893
Windows 8.1KB4056898
Windows 7 SP1KB4056897

Please note, that you also need a microcode update! Reach out to your vendor. I was offered automatically to update the microcode on my Lenovo ThinkPad X250.

Update 01-28-2018

Microsoft has published an update to disable mitigation against Spectre (variant 2) (Source: Microsoft). KB4078130 is available for Windows 7 SP1, Windows 8.1 and Windows 10, and it disables the mitigation against Spectre Variant 2 (CVE 2017-5715) independently via registry setting changes. The registry changed are described in KB4073119.

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 1 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 1 /f

A reboot is required to disable the mitigation.

Windows Server

The necessary security updates are available for Windows Server 2008 R2, Windows Server 2012 R2, Windows Server 2016 and Windows Server Core (1709). The security updates are NOT available for Windows Server 2008 and Server 2012!. The January 2018 security updates are ONLY offered in one of theses cases (Source: Microsoft):

  • An supported anti-virus application is installed
  • Windows Defender Antivirus, System Center Endpoint Protection, or Microsoft Security Essentials is installed
  • A registry key was added manually

To add this registry key, please execute this in an elevated CMD. Do not add this registry key, if you are running an unsupported antivirus application!! Please contact your antivirus application vendor! This key has to be added manually, only in case if NO antivirus application is installed, otherwise your antivirus application will add it!

reg ADD HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat /f /v cadca5fe-87d3-4b96-b7fb-a231484277cc /t REG_DWORD /d 0
OSUpdate
Windows Server, version 1709 (Server Core Installation)KB4056892
Windows Server 2016KB4056890
Windows Server 2012 R2KB4056898
Windows Server 2008 R2KB4056897

After applying the security update, you have to enable the protection mechanism. This is different to Windows Windows 7, 8.1 or 10! To enable the protection mechanism, you have to add three registry keys:

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 0 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f

reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization" /v MinVmVersionForCpuBasedMitigations /t REG_SZ /d "1.0" /f

The easiest way to distribute these registry keys is a Group Policy. In addition to that, you need a microcode update from your server vendor.

Update 01-28-2018

The published update for Windows 7 SP1, 8.1 and 10 (KB4073119) is not available for Windows Server. But the same registry keys apply to Windows Server, so it is sufficient to change the already set registry keys to disable the mitigation against Spectre Variant 2 (CVE 2017-5715).

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 1 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 1 /f

A reboot is required to disable the mitigation.

VMware vSphere

VMware has published three important VMware Security Advisories (VMSA):

VMware Workstation Pro, Player, Fusion, Fusion Pro, and ESXi are affected by CVE-2017-5753 and CVE-2017-5715. VMware products seems to be not affected by CVE-2017-5754. On 09/01/2017, VMware has published VMSA-2018-0004, which also addresses CVE-2017-5715. Just to make this clear:

  • Hypervisor-Specific Remediation (documented in VMSA-2018-0002.2)
  • Hypervisor-Assisted Guest Remediation (documented in VMSA-2018-0004)

I will focus von vCenter and ESXi. In case of VMSA-2018-002, security updates are available for ESXi 5.5, 6.0 and 6.5. In case of VMSA-2018-0004, security updates are available for ESXi 5.5, 6.0, 6.5, and vCenter 5.5, 6.0 and 6.5. VMSA-2018-0007 covers VMware Virtual Appliance updates against side-channel analysis due to speculative execution.

Before you apply any security updates, please make sure that you read this:

  • Deploy the updated version of vCenter listed in the table (only if vCenter is used).
  • Deploy the ESXi security updates listed in the table.
  • Ensure that your VMs are using Hardware Version 9 or higher. For best performance, Hardware Version 11 or higher is recommended.

For more information about Hardware versions, read VMware KB article 1010675.

VMSA-2018-0002.2

OSUpdate
ESXi 6.5ESXi650-201712101-SG
ESXi 6.0ESXi600-201711101-SG
ESXi 5.5ESXi550-201709101-SG

In case of ESXi550-201709101-SG it is important to know, that this patch mitigates CVE-2017-5715, but not CVE-2017-5753! Please see KB52345 for important information on ESXi microcode patches.

VMSA-2018-0004

OSUpdate
ESXi 6.5ESXi650-201801401-BG, and
ESXi650-201801402-BG
ESXi 6.0ESXi600-201801401-BG, and
ESXi600-201801402-BG
ESXi 5.5ESXi550-201801401-BG
vCenter 6.56.5 U1e
vCenter 6.06.0 U3d
vCenter 5.55.5 U3g

The patches ESXi650-201801402-BG, ESXi 6.0 ESXi600-201801401-BG, and
ESXi550-201801401-BG will patch the microcode for supported CPUs. And this is pretty interesting! To enable hardware support for branch target mitigation (CVE-2017-5715 aka Spectre) in vSphere, three steps are necessary (Source: VMware):

  • Update to one of the above listed vCenter releases
  • Update the ESXi 5.5, 6.0 or 6.5 with
    • ESXi650-201801401-BG
    • ESXi600-201801401-BG
    • ESXi550-201801401-BG
  • Apply microcode updates from your server vendor, OR apply these patches for ESXi
    • ESXi650-201801402-BG
    • ESXi600-201801402-BG
    • ESXi550-201801401-BG

In case of ESXi 5.5, the hypervisor and microcode updates are delivered in a single update (ESXi550-201801401-BG).

Update 01-13-2018

Please take a look into KB52345 if you are using Intel Haswell and Broadwell CPUs! The KB article includes a table with affected CPUs.

All you have to do is:

  • Update your vCenter to the latest update release, then
  • Update your ESXi hosts with all available security updates
  • Apply the necessary guest OS security updats and enable the protection (Windows Server)

For the required security updates:

Make sure that you also apply microcode updates from your server vendor!

VMSA-2018-0007

This VMSA, published on 08/02/2018, covers several VMware Virtual appliances. Relevant appliances are:

  • vCloud Usage Meter (UM)
  • Identity Manager (vIDM)
  • vSphere Data Protection (VDP)
  • vSphere Integrated Containers (VIC), and
  • vRealize Automation (vRA)
ProductPatch pending?Mitigation/ Workaround
UM 3.xyesKB52467
vIDM 2.x and 3.xyesKB52284
VDP 6.xyesNONE
VIC 1.xUpdate to 1.3.1
vRA 6.xyesKB52497
vRA 7.xyesKB52377

 

HPE ProLiant

HPE has published a customer bulletin (document ID a00039267en_us) with all necessary information:

HPE ProLiant, Moonshot and Synergy Servers – Side Channel Analysis Method Allows Improper Information Disclosure in Microprocessors (CVE-2017-5715, CVE-2017-5753, CVE-2017-5754)

CVE-2017-5715 requires that the System ROM be updated and a vendor supplied operating system update be applied as well. For CVE-2017-5753, CVE-2017-5754 require only updates of a vendor supplied operating system.

Update 01-13-2018

The following System ROMs were previously available but have since been removed from the HPE Support Site due to the issues Intel reported with the microcode updates included in them. Updated revisions of the System ROMs for these platforms will be made available after Intel provides updated microcodes with a resolution for these issues.

Update 01-24-2018

HPE will be releasing updated System ROMs for ProLiant and Synergy Gen10, Gen9, and Gen8 servers including updated microcodes that, along with an OS update, mitigate Variant 2 (Spectre) of this issue. Note that processor vendors have NOT released updated microcodes for numerous processors which gates HPE’s ability to release updated System ROMs.

I will update this blog post as soon as HPE releases new system ROMs.

For most Gen9 and Gen10 models, updated system ROMs are already available. Check the bulletin for the current list of servers, for which updated system ROMs are available. Please note that you don’t need a valid support contract to download this updates!

Under Software Type, select “BIOS-(Entitlement Required”) – (Note that Entitlement is NOT required to download these firmware versions.

Update 02-09-2018

Nothing new. HPE has updates the bulletin on 31-01-2018 with an updated timeline for new system ROMs.

Update 02-25-2018

HPE hast published Gen10 system ROMs. Check the advisory: HPE ProLiant, Moonshot and Synergy Servers – Side Channel Analysis Method Allows Improper Information Disclosure in Microprocessors (CVE-2017-5715, CVE-2017-5753, CVE-2017-5754).

Update 04-17-2018

HPE finally published updated System ROMS for several Gen10, Gen9, Gen8, G7 and even G6 models, which also includes bread-and-butter servers like the ProLiant DL360 G6 to Gen10, and DL380 G6 to Gen10.

If you are running Windows on your ProLiant, you can use the online ROM flash component for Windows x64. If you are running VMware ESXi, you can use the systems ROMPaq firmware upgrade for for USB key media.

Data Protector Exchange GRE and IP-less Exchange DAG

This posting is ~3 years years old. You should keep this in mind. IT is a short living business. This information might be outdated.

When dealing with Microsoft Exchange restore requests, you will come across three different restore situations:

  • a database
  • a single mailbox
  • a single mailbox item (mail, calendar entry etc.)

Restoring a complete database is not a complicated task, but restoring a single mailbox, or a single mailbox item, is. First, you need to restore the mailbox, that includes the desired mailbox, into a recovery database. Then you can restore the mailbox, or the mailbox items, from the recovery database. Some of the tasks can only be done with the Exchange Management Shell.

The HPE Data Protector Granular Recovery Extension (GRE) for Microsoft Exchange helps you to simplify the necessary steps to recover a single mailbox, or mailbox items. But the GRE can only assist you during the restore. It hids the above described tasks behind a nice GUI. The backup of Microsoft Exchange is still something you have to do with HPE Data Protector.

Database Availability Group without an Administrative Access Point

With Exchange 2013 SP1, Microsoft introduced the IP-less Database Availability Group (DAG). This type of DAG does not need a Cluster Name Object (CNO), and therefore has no IP address. With Exchange 2016, the IP-less DAG is the default DAG configuration.

But how to backup a DAG, that has no IP address and no name? It is easier than imagined. You have to create a DNS A-Record that includes all IP addresses of the cluster nodes, resulting in a DNS round-robin A-Record. You also have to install the Data Protector Disk Agent and On-line Extension on all cluster nodes. After that, you simply import the DAG by using the DNS A-Record into Data Protector. Then you can proceed with the creation and configuration of a backup job, that uses the newly imported cluster.

Backup runs fine, but the GRE fails

During the test phase of a new Exchange 2016 cluster, a customer of mine discovered a strange error, when he tried to restore a mailbox, or mailbox item, using the Exchange GRE.

Either parsing of command builder output failed or no databases were backed up.

Data Protector Exchange GRE Error

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

The customer and I double-checked the installation of the GRE on both nodes. Everything was fine. We also found out, that Data Protector was able to list the backup objects. This is a shortened output of the command.

Object Name                                                     Object type
===============================================================================
exchange-2010.domain.tld:/25e0d4ac-5a63-4035-b718-09d07bbbba47/DB3 E2010
dag-backup.domain.tld:/c7982a7e-76cd-4759-9d03-019c0c410957/DB03 E2010
dag-backup.domain.tld:/5d3faa67-6fd9-493e-bf56-5044f75620a8/DB01 E2010

As you can see, dag-backup.domain.tld is the DNS A-Record, that was created to backup the DAG with Data Protector.

Connection between A-Record and DAG name

It took some time to get this sorted, but at the end, a new A-Record was the key. The DAG has a name, e.g. customer-dag1.domain.tld. But there is no matching A-Record, and the DAG has no IP address.

When the GRE searches for available database backups, it stumbles over the mismatch between the DAG name, that is reported by the Exchange organization, and the name of the Data Protector client that was used to backup the databases.

The key to success was to change the DNS A-Record from dag-backup.domain.tld to customer-dag1.domain.tld. Latter is the name of the DAG, that is given during DAG creation. After removing the Data Protector client, the re-import of the DAG with the new A-Record, and a successful backup, the customer was able to restore mailboxes and mailbox items using the GRE for Microsoft Exchange.

This process is not described in detail in the Data Protector documentation. All you find is this foot note in the Data Protector Platform Integration Matrix (page 12, foot note 19):

Microsoft Exchange Server DAG configured without a Cluster Administrator Access Point is supported with Round Robin DNS mapping of DAG name to all the node IPs.

Make sure that the DNS round-robin A-Record matches your DAG name.