Author Archives: Patrick Terlisten

About Patrick Terlisten

vcloudnine.de is the personal blog of Patrick Terlisten. Patrick has a strong focus on virtualization & cloud solutions, but also storage, networking, and IT infrastructure in general. He is a fan of Lean Management and agile methods, and practices continuous improvement wherever possible. Feel free to follow him on Twitter and/or leave a comment.

Once a year: How to update TLS certificates on ADFS server and proxies

You might have heard this news a few days ago: Starting with September 1, 2020, browsers and devices from Apple, Google, and Mozilla will show errors for new TLS certificates that have a lifespan greater than 398 days. Due to this move, you have to replace certificates much more often. And we all know: Replacing certificates can be a real PITA!

Image by skylarvision on Pixabay

Replacing TLS certificates used for ADFS and Office 365 can be a challenging task, and this blog post will cover the necessary steps.

ADFS Server

The first service, for which we will replace the certificate, is the ADFS server, or the ADFS server farm. At this point it is important to understand that we are dealing with two different points to which the certificate is bound:

  • the ADFS service communications certificate, and
  • the ADFS SSL certificate

The first step is to replace the service communication certificate. After importing the certificate with private key, you need to assign “read” permission to the ADFS service account. Right click on the certificate, then “All Tasks” > “Manage Private Keys”.

Make sure to import the certificate on all farm servers! Next step: Start the ADFS management console on the primary node. Select “Certificates” and then “Select service communication certificate” on the right window pane.

Now we have successfully replaced the service communication certificate. But we are not finished yet! Now we have to set the ADFS SSL certificate. Where you run the PowerShell command depends on your OS: on newer versions it is enough to run it on the primary node. If you are running Windows Server 2012 R2 or older, you have to run the PowerShell command on EVERY ADFS farm server!

You can get the certificate thumbprint using the Get-AdfsSslCertificate command. Set the ADFS SSL certificate with the Set-AdfsSslCertificate command and the thumbprint of the new certificate.

Then restart the ADFS service.
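The exact command is not reproduced above. A minimal sketch of this step could look like the following. The subject name `sts.example.com` and the way the certificate is looked up in the local machine store are assumptions for illustration only; adjust them to your environment.

```powershell
# Sketch only: 'sts.example.com' is a placeholder, not a value from this post.
$subject = 'sts.example.com'

# Look up the newly imported certificate in the local machine store
# and pick the one with the latest expiry date.
$cert = Get-ChildItem -Path Cert:\LocalMachine\My |
    Where-Object { $_.Subject -like "*$subject*" } |
    Sort-Object -Property NotAfter -Descending |
    Select-Object -First 1

# Show the currently bound ADFS SSL certificate(s) for comparison.
Get-AdfsSslCertificate

# Bind the new certificate as the ADFS SSL certificate.
Set-AdfsSslCertificate -Thumbprint $cert.Thumbprint

# Restart the ADFS service so the change takes effect.
Restart-Service -Name adfssrv
```

Comparing the thumbprint in the output of Get-AdfsSslCertificate before and after the change is a quick way to confirm that the new certificate is actually bound.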

ADFS Proxies

In most cases you will have one or more ADFS proxies in your DMZ. The ADFS proxy is nothing more than a Web Application Proxy (WAP) and therefore the PowerShell commands for WAP will be used.

First of all: Import the new certificate with the private key on all ADFS proxies, and then get the certificate hash of the new certificate. Then open an elevated PowerShell on each proxy.
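The command for this step is not shown above. Since the ADFS proxy is a Web Application Proxy, a sketch using the WAP cmdlet could look like this. The thumbprint is a placeholder that you have to replace with the hash of your new certificate.

```powershell
# Placeholder: replace with the thumbprint (hash) of the newly imported certificate.
$thumbprint = '<thumbprint of the new certificate>'

# Set the new certificate as the SSL certificate of the Web Application Proxy.
# Run this in an elevated PowerShell on each proxy.
Set-WebApplicationProxySslCertificate -Thumbprint $thumbprint
```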

Then we have to re-establish the trust between the proxies and the primary ADFS farm server. You will need the local (!) administrator account of the primary farm server.
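Re-establishing the trust can be sketched with Install-WebApplicationProxy. The federation service name `sts.example.com` and the thumbprint are placeholders and not taken from this post; the credential prompt expects the local administrator account of the primary farm server mentioned above.

```powershell
# Placeholders: adjust both values to your environment.
$thumbprint            = '<thumbprint of the new certificate>'
$federationServiceName = 'sts.example.com'

# Prompts for the local administrator account of the primary ADFS farm server.
$trustCredential = Get-Credential

# Re-establish the trust between this proxy and the ADFS farm.
Install-WebApplicationProxy -CertificateThumbprint $thumbprint `
    -FederationServiceName $federationServiceName `
    -FederationServiceTrustCredential $trustCredential
```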

The last step is to update the federated trust with Office 365.

Update the federated trust with Office 365

To update the federated trust with Office 365, you will need the Windows Azure Active Directory Module for Windows PowerShell and an elevated PowerShell. Connect to Office 365 and update the federated trust.
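A minimal sketch of this step, assuming the MSOnline module is installed and the commands are run on the primary ADFS server. The domain name `example.com` is a placeholder.

```powershell
# Assumes the MSOnline (Azure AD) PowerShell module is installed.
Import-Module MSOnline

# Prompts for Office 365 administrator credentials.
Connect-MsolService

# If this is not run on the primary ADFS server, the ADFS context has to be
# set first, e.g. Set-MsolADFSContext -Computer <name of the primary ADFS server>.

# Update the federation settings for the federated domain (placeholder name).
# Environments with several federated domains may need -SupportMultipleDomain.
Update-MsolFederatedDomain -DomainName 'example.com'
```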

That’s it! Bookmark this page and set a calendar entry for today + 12 months. :)

Passed Microsoft exam AZ-103 – Azure Administrator Associate

Six weeks ago, I passed the Microsoft AZ-103 exam and earned the Azure Administrator Associate certification. A last-minute pass, because AZ-104 had already been launched. But better late than never. I had to reschedule the exam a couple of times because the test center was closed due to COVID-19.

The Azure Administrator Associate is an administrator role-based certification and it is all about implementing, managing and monitoring Azure identity, governance, storage, compute, and virtual network solutions.

The exam covers a couple of topics and you should have knowledge and hands-on experience in administering Azure services using the Azure Portal, PowerShell, Azure CLI, and Azure Resource Manager templates.

Your knowledge is tested across a broad range of topics. These topics are:

  • Manage Azure identities and governance
  • Implement and manage storage
  • Deploy and manage Azure compute resources
  • Configure and manage virtual networking
  • Monitor and back up Azure resources

How to prepare for the exam

Fortunately I have a monthly Azure credit which I can use to gain new skills. I used this Azure credit together with the Microsoft Learning Path for AZ-103 (now AZ-104).

It is pretty important not to focus only on VMs, storage or networking. Web Apps was one of my blind spots, and I had to get my head around it. Azure identities and governance are not so hard if you are already familiar with Office 365.

I learned a lot from the Microsoft Documentation for Azure, and I was really impressed how much I was able to find, read and learn from there.

Next stop: Microsoft Certified: Azure Solutions Architect Expert

Microsoft has announced that it will retire all remaining exams associated with Microsoft Certified Solutions Associate (MCSA), Microsoft Certified Solutions Developer (MCSD), and Microsoft Certified Solutions Expert (MCSE) on January 31, 2021, so the role-based certifications introduced in September 2018 are the way to go.

I’m currently holding an MCSE for Core Infrastructure and one for Productivity. Based on this, the Azure Solutions Architect Expert is the next step for me.

Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7

I’ve got several mails and comments about this topic. It looks like the latest ESXi 6.7 updates are causing some trouble on HPE ProLiant Gen10 servers.

I blogged about recurring host hardware sensor state alarm messages some weeks ago. A customer noticed them after an update. Last week, I got the first comments under this blog post about fan failure messages after applying the latest ESXi 6.7 updates. Then more and more customers asked me about this, because they got these messages in their environments too after applying the latest updates.

Last Saturday I tweeted my blog post to give a hint to my followers who may be experiencing the same problem.

Fortunately one of my followers (Thanks Markus!) pointed me to a VMware KB article with a workaround: Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7 (78989).

This is NOT a solution, but a workaround. Keep that in mind.

Thanks again to Markus. Make sure to visit his awesome blog (MY CLOUD-(R)EVOLUTION), especially if you are interested in vSphere, Veeam and automation!

Missing Microsoft Teams calendar tab with on-premise Exchange

Microsoft Teams got a big push due to the current COVID-19 crisis and many of my customers deployed it in the past weeks. At ML Network, we have been using Microsoft Teams for more than a year, and we wouldn’t want to be without it anymore.

Source: Microsoft

We are running Exchange 2016 on-premises, currently CU16. We had been missing the calendar tab in Teams ever since we started with Microsoft Teams. When you do some research about this issue, you will find many threads and blog posts, but these are the two key facts:

  • it is supported with on-premises hybrid Exchange deployments
  • it works flawlessly with Exchange Online

Our Exchange is configured as a full hybrid deployment. I did this when we deployed Office 365 at our organization.

Let’s summarize:

  • Exchange 2016 CU16
  • Hybrid Deployment
  • Office 365 with Teams enabled
  • no calendar tab when the Exchange mailbox is hosted on-premises

OAuth FTW!

While doing an Exchange Hybrid deployment for one of my customers some weeks ago, I stumbled over an OAuth error message at the end of the Hybrid Configuration Wizard. The message was HCW8064

“HCW has completed, but was not able to perform the OAuth portion of your Hybrid configuration”

We were not able to fix this. Microsoft offers two solutions:

Yesterday I did the upgrade from CU15 to CU16 on our Exchange server and while watching the progress bar I did some research on this issue again. I found strong evidence that Microsoft Teams needs working OAuth to display the calendar tab and access the on-premises hosted mailbox. So I gave it a try and used the latest version of the HCW wizard.

What should I say? No OAuth configuration error and after a restart of Microsoft Teams, the calendar tab appeared.

Lessons Learned:

  • always use the latest CU for Exchange
  • always use the latest HCW Wizard

Connecting to Exchange Online with PowerShell

The task was simple: Change the alias and the primary SMTP address of a Microsoft Teams team. This can be done by changing the alias and the SMTP address of the underlying Office 365 group. But how? All you need is a PowerShell connection to Exchange Online.

For this you need PowerShell on your local computer and Office 365 credentials with the necessary privileges.

First we need to provide the necessary credentials.

A window will pop up and you must enter your Office 365 credentials.

The next step is to create a PowerShell remote session with Exchange Online.

Please note that basic auth will be disabled in October 2020!

To connect to this remote session, use Import-PSSession.

When you have finished your work, make sure to remove the remote session with Remove-PSSession!
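The individual commands are not reproduced above, so here is a sketch of the whole sequence using the basic auth remote session described in this post. The connection URI is the commonly used Exchange Online endpoint; the group identity, alias and SMTP address in the commented example are hypothetical placeholders.

```powershell
# Step 1: provide the Office 365 credentials (opens a credential prompt).
$UserCredential = Get-Credential

# Step 2: create the remote session with Exchange Online (basic auth).
$sessionParams = @{
    ConfigurationName = 'Microsoft.Exchange'
    ConnectionUri     = 'https://outlook.office365.com/powershell-liveid/'
    Credential        = $UserCredential
    Authentication    = 'Basic'
    AllowRedirection  = $true
}
$Session = New-PSSession @sessionParams

# Step 3: import the Exchange Online cmdlets into the local session.
Import-PSSession -Session $Session -DisableNameChecking

# Hypothetical example for the task described above (placeholder values):
# Set-UnifiedGroup -Identity 'old-alias' -Alias 'new-alias' -PrimarySmtpAddress 'new-alias@example.com'

# Step 4: clean up the remote session when you are done.
Remove-PSSession -Session $Session
```

Removing the session matters because the number of concurrent remote sessions per account is limited, and abandoned sessions count against that limit until they time out.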

Space reclamation of VMFS 5 Datastores using esxcli

It was a bit quiet here in January because of a new “private project” which has tied up some resources, and will pull more resources in the future.

But this will not stop me from documenting useful stuff. This one is nothing new, but commonly asked by some customers: How do I get my storage capacity back after deleting VMs?!

The outlined steps are all done using esxcli. You need to execute them on a single ESXi host, not on each host in the cluster.

Connect to one of your ESXi hosts using SSH. You can use this small PowerCLI command to enable SSH on a specific host.
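The PowerCLI command is not shown above. A sketch, assuming VMware PowerCLI is installed and you are already connected to vCenter with Connect-VIServer; the host name is a placeholder.

```powershell
# Placeholder host name: replace with one of your ESXi hosts.
$hostName = 'esx01.example.com'

# Find the SSH service (service key 'TSM-SSH') on the host and start it.
Get-VMHost -Name $hostName |
    Get-VMHostService |
    Where-Object { $_.Key -eq 'TSM-SSH' } |
    Start-VMHostService
```

The same pipeline with Stop-VMHostService instead of Start-VMHostService turns SSH off again once the work is done.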

The first step is to identify the datastore(s) from which you want to reclaim storage.

We will need the device name, and later the UUID. The next step is to identify if the device is detected as a thin-provisioned disk, and if it is VAAI-capable. I’ve shortened the esxcli output to the relevant lines.

Now we have to verify if all necessary VAAI options are supported.

Important for us is the “Delete” primitive. If this is supported, we can use UNMAP to reclaim storage.

This process will take some time, depending on the amount of storage that has to be reclaimed. It will also put some load on your storage, so you might want to run it during off-peak hours.
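The esxcli commands and their output are not reproduced above. As an alternative to an SSH session, the same esxcli namespaces can be reached from PowerCLI through Get-EsxCli. The following is only a sketch of the steps described in this post: the host name, device name and datastore UUID are placeholders, and it assumes an existing Connect-VIServer session.

```powershell
# Placeholders: adjust host, device and UUID to your environment.
$vmhost = Get-VMHost -Name 'esx01.example.com'
$esxcli = Get-EsxCli -VMHost $vmhost -V2

# Step 1: list datastores with device name and UUID
# (equivalent of: esxcli storage vmfs extent list).
$esxcli.storage.vmfs.extent.list.Invoke()

$device = '<device name from step 1>'
$uuid   = '<VMFS UUID from step 1>'

# Step 2: check whether the device is thin-provisioned
# (equivalent of: esxcli storage core device list -d <device>).
$esxcli.storage.core.device.list.Invoke(@{ device = $device })

# Step 3: check the VAAI primitives, especially "Delete"
# (equivalent of: esxcli storage core device vaai status get -d <device>).
$esxcli.storage.core.device.vaai.status.get.Invoke(@{ device = $device })

# Step 4: reclaim the space with UNMAP
# (equivalent of: esxcli storage vmfs unmap -u <UUID>).
$esxcli.storage.vmfs.unmap.Invoke(@{ volumeuuid = $uuid })
```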

VCAP6.5-DCV Design – Objective 2.4 Build manageability requirements into a vSphere 6.x logical design

This seems to be my last blog post for 2019 and it covers objective 2.4 (Build manageability requirements into a vSphere 6.x logical design) of the VCAP6.5-DCV Design exam. It is based on the VMware Certified Advanced Professional 6.5 in Data Center Virtualization Design (3V0-624) Exam Preparation Guide (last update August 2017).

The necessary skills and abilities are documented in the exam prep guide for the older VCAP6-DCV Design exam (3V0-622). I think they also apply to the current version of the exam:

  • Evaluate which management services can be used with a given vSphere Solution
  • Differentiate infrastructure qualities related to management
  • Differentiate available command line-based management tools (PowerCLI, vMA etc.)
  • Evaluate VMware Management solutions based on customer requirements
  • Build interfaces into the logical design for existing operations practices
  • Address identified operational readiness deficiencies
  • Define Event, Incident and Problem Management practices
  • Analyze Release Management practices
  • Determine request fulfillment and release management processes
  • Determine requirements for Configuration Management
  • Define change management processes based on business requirements
  • Based on customer requirements, identify required reporting assets and processes

While the last blog post covered the availability requirements, this blog post focuses on the manageability requirements of a logical design. It’s all about how to manage the proposed solution.

Evaluate which management services can be used with a given vSphere Solution

You can use different “services” to manage a vSphere environment.

  • vCenter and vMA

Both appliances offer you different services to connect to in order to manage your environment, like

  • vSphere Client (Web Client, C# Client)
  • SSH
  • APIs
  • PowerCLI

The different tools help you to manage the different vSphere components, like

  • HA
  • DRS
  • Networking (vDS, vSS)
  • Auto Deploy
  • Host Profiles
  • etc.

Differentiate infrastructure qualities related to management

The different infrastructure qualities are

  • Availability
  • Manageability
  • Performance
  • Recoverability
  • Security

Each infrastructure quality you consider affects the manageability of the proposed solution. For example: A single vCenter might not offer the required availability. Or a single datastore might not meet the required performance. But a highly available vCenter or an SDRS cluster affects the way you manage the solution.

Differentiate available command line-based management tools (PowerCLI, vMA etc.)

You should be able to differentiate between PowerCLI (PowerShell) and vMA (Appliance) or vCLI (command-line tools for ESXi).

Evaluate VMware Management solutions based on customer requirements

Depending on the customer’s requirements, some solutions might be out of scope. If the customer doesn’t have a vSphere Enterprise Plus license, there’s no way to use Storage DRS.

Build interfaces into the logical design for existing operations practices

This topic is about which existing interfaces (in terms of systems) the customer is already using and how to build them into the design. Think about Syslog servers, Active Directory for authentication (infrastructure quality design), Public Key Infrastructure (PKI) for certificates etc.

Address identified operational readiness deficiencies

Operational Readiness (OR) is the capability of an organization to (efficiently) deploy, operate, and maintain a system and/or its processes. Before the proposed solution goes into production, any deficits with regard to OR have to be identified and addressed.

Define Event, Incident and Problem Management practices

This sounds like ITIL, and I would assume that the definition of event, incident and problem of ITIL is meant. ITIL defines

  • Event: An event can be defined as any detectable or discernible occurrence that has significance for the management of the IT Infrastructure or the delivery of IT service and evaluation of the impact a deviation might cause to the services. Events are typically notifications created by an IT service, Configuration Item (CI) or monitoring tool. (Wikipedia)
  • Incident: An incident is an event that could lead to loss of, or disruption to, an organization’s operations, services or functions. (Wikipedia)
  • Problem: The Information Technology Infrastructure Library defines a problem as the cause of one or more incidents. (Wikipedia)

The design should include practices for event, incident and problem management. Most customers will already have practices for this, but they might be adjusted for the proposed solution.

Analyze Release Management practices

Release management is the process of managing, planning, scheduling and controlling the deployment of new or modified services. This topic covers the currently deployed Release Management processes of the customers.

Determine request fulfillment and release management processes

This topic is related to the prior topic. You should determine if the customer has already deployed request fulfillment and release management processes, and if so, you should check if they are suitable for the proposed solution.

The request fulfillment will allow users to request and receive standardized services. Think about the automated deployment of VMs after requesting a new VM using a portal web site.

Determine requirements for Configuration Management

Changes to the proposed solution will be required over time. Configuration Management covers the management of all Configuration Items (CI). Even if it’s not mentioned in this topic, Configuration Management is related to Change Management, because all changes to CIs have to be documented.

Define change management processes based on business requirements

The objective of change management in this context is to ensure that standardized methods and procedures are used for efficient and prompt handling of all changes to control IT infrastructure, in order to minimize the number and impact of any related incidents upon service. (Wikipedia)

If a customer already has ITSM processes in place, they most likely will have a change management process. This process has to be defined to fulfill the requirements of the proposed solution.

Based on customer requirements, identify required reporting assets and processes

Especially when it comes down to security, it’s important to talk about monitoring and logging. This topic is about

  • What CIs have to be monitored?
  • What events have to be logged/tracked?
  • How to keep track of changes to configuration items?
  • How to keep documentation up-to-date?

Summary

This objective is full of ITSM/ITIL. It’s pretty helpful if you are familiar with the concepts of ITSM/ITIL. You should have a good understanding of the different management tools and management solutions and services of a vSphere design.

VCAP6.5-DCV Design – Objective 2.3 Build availability requirements into a vSphere 6.x logical design

This blog post covers objective 2.3 (Build availability requirements into a vSphere 6.x logical design) of the VCAP6.5-DCV Design exam. It is based on the VMware Certified Advanced Professional 6.5 in Data Center Virtualization Design (3V0-624) Exam Preparation Guide (last update August 2017).

The necessary skills and abilities are documented in the exam prep guide for the older VCAP6-DCV Design exam (3V0-622). I think they also apply to the current version of the exam:

  • Evaluate which logical availability services can be used with a given vSphere solution
  • Differentiate infrastructure qualities related to availability
  • Describe the concept of redundancy and the risks associated with single points of failure
  • Explain class of nines methodology
  • Determine availability component of service level agreements (SLAs) and service level management processes
  • Determine potential availability solutions for a logical design based on customer requirements
  • Create an availability plan, including maintenance processes
  • Balance availability requirements with other infrastructure qualities
  • Analyze a vSphere design and determine possible single points of failure

Let’s start with…

Evaluate which logical availability services can be used with a given vSphere solution

VMware vSphere offers a broad range of features that allow you to create highly available solutions. When we take a look at the infrastructure, features like VMware HA, FT, or even multiple NICs at a distributed vSwitch help to increase availability. When we look at the application layer, other techniques can help us: for example, DRS anti-affinity rules can be used to place VMs on different hosts.

Differentiate infrastructure qualities related to availability

The infrastructure qualities are:

  • Availability
  • Manageability
  • Performance
  • Recoverability
  • Security

Availability and Recoverability are tied together. René van den Bedem has written a very good blog post about how recoverability affects availability.

Describe the concept of redundancy and the risks associated with single points of failure

This topic is pretty clear and should be easy to explain. You should be able to identify what a single point of failure is, and how you can avoid it. Examples for a single point of failure are:

  • only a single-port HBA in a server
  • only one network uplink from a Top-of-Rack switch to a Core-Switch
  • use of RAID 0

Explain class of nines methodology

This is also easy:

  • Two Nines – 99% – 3.65 days downtime per year
  • Three Nines – 99.9% – 8.76 hours downtime per year
  • Four Nines – 99.99% – 52.6 minutes downtime per year
  • Five Nines – 99.999% – 5.26 minutes downtime per year
  • Six Nines – 99.9999% – 31.56 seconds downtime per year

Important note: “Downtime” means “unplanned downtime”, not planned downtime, like in maintenance windows.
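The arithmetic behind these numbers is simple: the allowed downtime is the length of a year multiplied by (1 - availability). A small sketch, assuming a year of 365 days; using 365.25 days instead shifts the results slightly, which explains small rounding differences between published tables.

```powershell
# Assumption: a non-leap year of 365 days.
$hoursPerYear = 365 * 24

foreach ($availability in 99, 99.9, 99.99, 99.999, 99.9999) {
    # Fraction of the year the service is allowed to be down.
    $downtimeFraction = 1 - ($availability / 100)
    $downtimeHours    = $hoursPerYear * $downtimeFraction

    # Emit one row per availability class.
    [pscustomobject]@{
        AvailabilityPercent = $availability
        DowntimeHours       = [math]::Round($downtimeHours, 4)
        DowntimeMinutes     = [math]::Round($downtimeHours * 60, 2)
    }
}
```

Each additional nine reduces the allowed downtime by a factor of ten, which is why the step from three to five nines is so expensive to design for.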

Determine availability component of service level agreements (SLAs) and service level management processes

A Service Level Agreement (SLA) is a contract between two parties, usually a supplier and a customer. The SLA describes targets that should be met. This can be an availability expressed using the “class of nines methodology”. If this target is missed, the supplier often has to pay a penalty to the customer.

So it is pretty important to build a design that can fulfill the availability requirements. Depending on the requirements you may have to use VMware FT. If the availability requirements are lower, VMware HA may be sufficient. It is important that you can choose the best technique for the given SLA.

Determine potential availability solutions for a logical design based on customer requirements

Now it’s time to put things together. You know the different techniques that are offered by VMware vSphere, and you know the customer requirements. This allows you to determine the potential availability solutions for a logical design.

Create an availability plan, including maintenance processes

Again, I’d like to recommend the blog post of René van den Bedem. It’s all about RPO, RTO, MTD and how much unplanned downtime costs (result of a Business Impact Analysis).

Balance availability requirements with other infrastructure qualities

At some point you need to look at your design holistically and ensure that a decision that was made does not impact other requirements or other decisions.

Analyze a vSphere design and determine possible single points of failure

This is pretty self-explanatory and can be done together with the preceding step.

Summary

Availability is the main theme of this objective. Do not lose sight of the customer’s requirements. Increasing availability is often associated with immense additional costs.

Read the mentioned blog post from René. I also really recommend this vBrownBag video with Rebecca Fitzhugh.

Why we need a vSAN licensing for SMB customers

Not every customer is running full-blown vSphere Enterprise Plus licensing. To be honest, when I look at the number of sold licenses, most of my customers are running vSphere Essentials Plus. Not Essentials, nor Standard or Enterprise (Plus), but two or three hosts with Essentials Plus. And that’s perfectly fine!

Two or three hosts with 10 GbE and pretty often 12G SAS. Some of them with Fibre-Channel, nearly no one with iSCSI. My colleagues and I developed a pretty rock solid setup over the last years, which we sell like some kind of building block: HPE ProLiant, HPE MSA, Aruba Switches, vSphere Essentials Plus. A perfect setup for most of our customers, which run something between 10 and 30 VMs on it. Some of them also add Horizon View (Add-On) to it.

But requirements change. More customers ask for more hosts. When customers break out of the Essentials Plus licensing, it is often because of the host limitation. Fewer of them do this because they need DRS or even Storage vMotion.

Some of my customers have heard about vSAN and they like the idea behind it. Especially when you take into account, that hardware costs decrease and flash storage is getting cheaper. But when you discuss the idea of combining vSAN and Essentials licensing, you will hit the host limitation early.

VMware itself states in the vSAN licensing guide:

The 2-node vSAN deployment model is not restricted to a specific vSAN license edition. In other words, any of the licensing editions can be used with a 2-host configuration. vSphere Essentials Kit or vSphere Essentials Plus Kit licensing limits the number of hosts managed by vCenter Server Essentials to three. The vSAN witness host – virtual appliance or physical – is considered a host in these Essentials licensing bundles.

Source: VMware vSAN Licensing Guide

When you take a look at the Horizon Desktop licensing, or at the RoBo licensing, you will see another kind of limitation: Limiting the number of VMs, not the number of hosts. This is pretty interesting when you think about combining vSAN and Essentials licensing.

Why not offer an “HCI Essentials Kit” limited to 25 VMs, with the features offered by Essentials Plus and vSAN Standard? This would allow customers to run four or five hosts with vSAN. With a limit on the number of VMs instead of the number of hosts, customers could scale out their infrastructure in terms of capacity.

Hey VMware, you might think about this over the Christmas holiday. ;) There is a customer segment that is not yet sufficiently addressed by your sales team. This is a chance for more YoY growth. ;)

VMware ESXi 6.7: Recurring host hardware sensor state alarm

If you found this blog post because you are searching for a solution for a FAN FAILURE on your ProLiant Gen10 HW after applying the latest ESXi 6.7 patches, then use this shortcut for the workaround: Fan health sensors report false alarms on HPE Gen10 Servers with ESXi 6.7


I had a really annoying problem at one of my customers. After deploying new VMware ESXi hosts (HPE ProLiant DL380 Gen10) along with an upgrade of the vCenter Server Appliance to 6.7 U2, the customer reported recurring host hardware sensor state alarm messages in the vCenter for all hosts.

After acknowledging the alarm, it recurred after a couple of minutes or hours. The hardware was fine, no errors or warnings were noticed in the iLO Management Log. But vCenter periodically reported a Sensor -1 type error in the Events window. The /var/log/syslog.log contained messages like this:

Sure, you can ignore this. But you shouldn’t ignore this, because these events can result in the vCenter database increasing in size. vCenter can crash once the SEAT partition size goes above the 95% threshold. So you better fix this!

Long story short: This bug is fixed with the latest November updates for ESXi 6.7 U3. A workaround is to disable the WBEM service. The WBEM service might be re-enabled after a reboot. In this case you have to disable the sfcbd-watchdog service.
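The workaround can be sketched with PowerCLI. This is an illustration and not taken from the VMware KB article: it assumes an existing Connect-VIServer session, uses a placeholder host name, and reaches the esxcli `system wbem` namespace through Get-EsxCli.

```powershell
# Placeholder host name: replace with the affected ESXi host.
$vmhost = Get-VMHost -Name 'esx01.example.com'

# Disable the WBEM service
# (equivalent of: esxcli system wbem set --enable false).
$esxcli = Get-EsxCli -VMHost $vmhost -V2
$esxcli.system.wbem.set.Invoke(@{ enable = $false })

# If WBEM comes back after a reboot: stop the sfcbd-watchdog service
# and keep it from starting automatically.
$cimService = $vmhost | Get-VMHostService |
    Where-Object { $_.Key -eq 'sfcbd-watchdog' }
$cimService | Stop-VMHostService -Confirm:$false
$cimService | Set-VMHostService -Policy Off
```

Keep in mind that disabling WBEM also disables hardware monitoring via CIM, so this should be reverted once the patched ESXi build is installed.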

But the best way to solve this is to install the latest patches (VMware ESXi 6.7, Patch Release ESXi670-201911001)