Author Archives: Patrick Terlisten

About Patrick Terlisten

vcloudnine.de is the personal blog of Patrick Terlisten. Patrick has a strong focus on virtualization & cloud solutions, but also storage, networking, and IT infrastructure in general. He is a fan of Lean Management and agile methods, and practices continuous improvement wherever it is possible. Feel free to follow him on Twitter and/or leave a comment.

Outlook Web Access fails with “440 Login Timeout”

Today I faced an interesting problem. A customer told me that their Exchange 2010, which is currently part of an Exchange cross-forest migration project, has an issue with Outlook Web Access and the Exchange Control Panel. Both web sites fail with a white screen and a single message:

440 Login Timeout

I checked some basics, like the certificate and the configuration of the virtual directories, and found nothing suspicious. Most hints on the internet pointed towards problems with the IUSR_servername user, which is not used with IIS 7 and later. But authentication configuration and filesystem permissions were okay. The IIS and event logs were also pretty unhelpful.

More interesting was the change date of the web.config! This file is part of the OWA web app and it’s typically stored under C:\Program Files\Microsoft\Exchange Server\V14\ClientAccess\Owa.

Long story short: I found this entry in the file and removed it.

<add name="kerbauth" />

Looks like someone wanted to set up Kerberos auth for OWA, or did not reverse a change.
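To check a web.config for such an entry without reading the whole file by hand, the XML can be searched for `add` elements named `kerbauth`. The following Python sketch is only an illustration: the function name `find_kerbauth_entries` is hypothetical, the sample fragment is made up (the section in which the entry appears is an assumption), and the file is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET


def find_kerbauth_entries(xml_text: str) -> list[str]:
    """Return the tag paths of all <add name="kerbauth"> elements (case-insensitive)."""
    root = ET.fromstring(xml_text)
    hits = []

    def walk(element, path):
        current = f"{path}/{element.tag}"
        if element.tag == "add" and element.get("name", "").lower() == "kerbauth":
            hits.append(current)
        for child in element:
            walk(child, current)

    walk(root, "")
    return hits


if __name__ == "__main__":
    # Made-up fragment for illustration; a real web.config contains much more.
    sample = """
    <configuration>
      <system.webServer>
        <modules>
          <add name="kerbauth" />
        </modules>
      </system.webServer>
    </configuration>
    """
    for hit in find_kerbauth_entries(sample):
        print("found:", hit)
```

Taking a backup copy of the web.config before removing anything is a good idea, because the file is part of the OWA web app.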

Modify ProxyAddresses of Office 365 users without Exchange Online

As part of an Office 365 tenant rebuild, I had to move a custom domain to the new Office 365 tenant. The old tenant was not needed anymore, and the customer had to move to a Non-Profit tenant for compliance reasons. So the migration itself was no big deal:

  • disable AzureAD sync
  • change UPN of all users
  • remove the domain
  • connect the domain to the new tenant
  • set up a new AzureAD sync
  • assign licenses
  • time for a beer

That was my, honestly, naive plan for this migration.


Disabling the AzureAD sync was easy. Even the change from ADFS to Password Hash Sync was easy. Changing the UPN for all users was a bit challenging, but the PowerShell code in this article was quite helpful.

$users = Get-MsolUser -All | Where-Object {$_.UserPrincipalName -like "*customdomain.tld"} | Select-Object UserPrincipalName

foreach ($user in $users) {

   # Create the new User Principal Name
   # (-replace takes a regex, so the dot is escaped and the match is anchored at the end)
   $newUser = $user.UserPrincipalName -replace "customdomain\.tld$", "customdomain.onmicrosoft.com"

   # Set the new User Principal Name
   Set-MsolUserPrincipalName -UserPrincipalName $user.UserPrincipalName -NewUserPrincipalName $newUser

   # Display the new User Principal Name
   $newUser
}

But after this, I was still unable to remove the custom domain from the tenant. The domain was still referenced in the ProxyAddresses attribute, which was synced by the AzureAD sync…

Removing the domain from the users in the on-prem Active Directory was not a solution. The users were already cloud-only because the sync was switched off. With this in mind my plan was to modify the cloud-only users in the tenant. To be honest: This solution worked in this specific case!

The customer was using Microsoft Teams Commercial Cloud trial licenses, so I had no Exchange Online to edit the proxy addresses. But luckily, the Exchange Online Management PowerShell Module was quite helpful.

Get-MailUser | Select -ExpandProperty emailaddresses | ? {$_ -like "*customdomain.tld"}

This line of code gave me an idea how many users were affected… quite a lot… With my colleague Claudia I quickly developed some dirty PowerShell code to remove all proxy addresses that included the custom domain.

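# Fetch all mail users in the tenant
$users = Get-MailUser -ResultSize Unlimited

foreach ($u in $users) {

    # Remove every proxy address of this user that ends with the custom domain
    Get-MailUser -Identity $u.Alias | Select-Object -ExpandProperty EmailAddresses |
    Where-Object {$_ -like "*customdomain.tld"} |
    ForEach-Object {Set-MailUser -Identity $u.Alias -EmailAddresses @{remove="$_"}}

}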

It took about 45 minutes to modify ~2000 users. After this, I was able to remove the domain and connect it to the new tenant.

This solution worked in my case. Another way might be to use the AzureAD sync itself, masking out the custom domain and waiting until the domain is removed from all proxy addresses. But I didn’t test this.
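One detail about the matching is worth spelling out: a wildcard pattern like "*customdomain.tld" also matches addresses at other domains that merely end with the same string, for example notcustomdomain.tld. The Python sketch below illustrates a stricter comparison on the domain part only. It is not what was run against the tenant; the function name `split_proxy_addresses` and all sample addresses are made up for illustration.

```python
def split_proxy_addresses(addresses, domain):
    """Split proxy addresses into (keep, remove) by exact domain or subdomain match."""
    domain = domain.lower()
    keep, remove = [], []
    for address in addresses:
        # Proxy addresses look like "smtp:user@example.com"; take the part after the last "@".
        address_domain = address.rsplit("@", 1)[-1].lower()
        if address_domain == domain or address_domain.endswith("." + domain):
            remove.append(address)
        else:
            keep.append(address)
    return keep, remove


if __name__ == "__main__":
    sample = [
        "SMTP:jane@customdomain.onmicrosoft.com",
        "smtp:jane@customdomain.tld",
        "smtp:jane@mail.customdomain.tld",
        "smtp:jane@notcustomdomain.tld",
    ]
    keep, remove = split_proxy_addresses(sample, "customdomain.tld")
    print("remove:", remove)
    print("keep:  ", keep)
```

Whether subdomains should be removed as well depends on the tenant; the sketch treats them as part of the custom domain.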

Escaping special characters in proxy auth passwords in vCenter

EDIT: It seems that this was fixed in vCenter 7.0 U3.

While debugging a vCenter Lifecycle Manager, which was unable to download updates, I stumbled over a weird behaviour, which is (IMHO) by design.

Some of you might use a proxy server. And some of you might use a proxy server which requires credentials. In my case, my customer uses a Sophos SG appliance as a web proxy server with authentication. The customer created a user with a complex password. But I was unable to get a working internet connection.


I played a bit with curl on the bash of the vCenter. The proxy settings are stored under /etc/sysconfig/proxy. These settings are used to populate the http_proxy and https_proxy environment variables. It’s important to know that the credentials stored in /etc/sysconfig/proxy are only percent-encoded, also known as URL encoding. This is an encoding, not encryption, so someone with root access can grab the credentials from this file.
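To make it clear how little protection percent-encoding offers, here is a minimal Python sketch that encodes and decodes credentials in a proxy URL using the standard library. The helper names `build_proxy_url` and `read_proxy_credentials` are hypothetical, and the host, user and password are placeholders in the style of the examples in this post.

```python
from urllib.parse import quote, unquote, urlsplit


def build_proxy_url(user: str, password: str, host: str, port: int) -> str:
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" also encodes "/" and other reserved characters.
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def read_proxy_credentials(proxy_url: str) -> tuple[str, str]:
    """Decode user name and password from a proxy URL."""
    parts = urlsplit(proxy_url)
    return unquote(parts.username or ""), unquote(parts.password or "")


if __name__ == "__main__":
    url = build_proxy_url("username", "Passw0rd!", "proxy.domain.tld", 8080)
    print(url)
    print(read_proxy_credentials(url))
```

Decoding needs no key or secret, which is why the file should be treated like a plain-text password store.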

But then I noticed something weird. I set the http_proxy variable manually with

http://username:password@proxy.domain.tld:8080

and I got this error:

-bash: !": event not found

Okay… there was a ! in the password, and Bash treated it as the start of a history expansion and tried to interpret the part after the !. But it was part of the password, so I had to tell Bash to take it literally.

I escaped the ! in the password with a \. And to my surprise: The vCenter was able to download updates. I decoded the percent-encoded string in /etc/sysconfig/proxy and found the escaped ! (\!). For example: instead of Passw0rd! I had to enter Passw0rd\! in the password field.

Long story short: Use a password without special characters, otherwise escape them, because the password is stored in BASH variables.

On the road to… nowhere?

It’s been four months since my last blog post, and the blog frequency was quite low before that. This blog is, to be honest, a giant pile of stuff that has not worked as expected. Okay, some random thoughts or howto’s, but most blog posts are about stuff that failed in some way. That’s a bit “depressing”. I should write more about the fun things in my life.


For a pretty long time my focus was on infrastructure. And my focus _is_ on infrastructure – Networks, lots of storage, virtualization with VMware. And always full stack: Networking, Storage, Servers, Operating System, always with a little focus here and there. Sure, products shifted over time, but in the bigger picture, my focus was always on infrastructure and datacenter stuff. No client devices, no end user support, no managed services/admin tasks, no leadership. Technical stuff and projects. But my focus continued to shift. Microsoft Exchange for example. A product I really hate. Not really infrastructure. But I’m good at it and so I got projects and stuff to do. Or Office 365. Or Microsoft Azure. And since 2013 more leadership tasks. And since January 2020 I have held some kind of a higher management position.

I have been doing much less VMware over the past 24 months than I would like. Instead, much more Office 365 and Azure. And consulting for Microsoft stuff, transition to cloud, transition of IT services into managed services, or deployment of managed services. I lost my VCP/VCAP through, IMHO, unnecessarily complicated recertification requirements. That was very frustrating for me. Of course, I learned other things in return.

Companions from the last 20 years are now mostly in management positions. Head of … whatever. Most of them are not doing technical stuff anymore. And they are happy with it. It looks like a typical career path, but it’s one that I don’t necessarily like right now. I’m still doing technical stuff, even if I’m in a management position. Actually quite good, but it also feels kind of weird.

I’m turning 40 this year. 23 years in IT behind me, 25 years to go until retirement. Not even half-time. :/ A wife, three nice kids, we just moved to our new house. Actually everything should be really great, but currently I can’t see a career path for me that makes me happy. And this sucks pretty hard.

So, to make a long story short, come back from time to time. Add this blog to your RSS reader. I hope to post nice content here again soon.

Configure VMware Horizon View client device certificate authentication

Adding a second factor to your authentication is always a good idea. Typically the second factor is a One-Time Password (OTP) or a push notification. But what if you want to allow the login into your Horizon View environment only from specific devices? This implies that you need some kind of second factor that also identifies the device. At this point the arch enemy of many of us comes into play: Certificates!

To be honest: It is not so hard to get client device certificate authentication to work. All you need is:

  • Unified Access Gateway 2.6 or later
  • Horizon 7 version 7.5 or later
  • A certificate installed on the client device that Unified Access Gateway accepts

Configure X.509 authentication settings

The first step is to configure the UAG to accept a device certificate. To do so, log into the UAG admin interface, expand the authentication settings and open the X.509 settings.

You need to upload the Root CA certificate, which is used to sign the device certificates, as a Base64-encoded file. I always recommend enabling “Cert Revocation”. You can enable “Use CRL from Certificates” if the certificates include the URL to the CRL. Otherwise you can add the CRL location. This location must be accessible for the UAG! Click “Save” and you are ready to configure the Horizon settings.

Configure Horizon settings to use X.509 authentication

After you have configured the X.509 authentication, you have to enable the device certificate authentication for Horizon View. Expand “Horizon Settings” and enter the configuration settings.

It is important to select “Device X.509 Certificate AND Passthrough”.

Save the settings and you are ready to go. At this point a user must use a device with a valid device certificate.

Device Certificate

It is important to know that you have to create a new certificate template. The computer certificate template, which is included in a standard Microsoft PKI, cannot be used! It is mandatory to use the “Microsoft Enhanced RSA and AES Cryptographic Provider” in the template. It only works with this Cryptographic Service Provider (CSP)!

The easiest way is to duplicate the “Computer” template and change the necessary settings. First of all: The CSP must be changed to “Microsoft Enhanced RSA and AES Cryptographic Provider” and it must be the only provider.

The subject name of the certificate should automatically be populated with information from the Active Directory, in this case the computer name.

Because the certificate is only for authentication purposes, you should remove “Server Authentication” from the Application Policies. Otherwise this certificate could be used to run a webserver.

Depending on your policies, you should mark the private key as “not exportable”!

The last step is important. After you have enrolled the certificate on your computer, you need to grant permissions to the user that should be able to use the certificate for authentication! This is necessary because it is a device certificate, and only SYSTEM and the local administrators group have permissions to access the private key of the certificate.

That’s it. If you open the View Client and try to connect to your View environment, then you should get a certificate selection dialog. After choosing the correct certificate, you need to enter user credentials.

A connection to your View environment is only possible with a valid certificate and valid credentials.

VMware vCenter 7.0 U2 deployment fails at stage 2

Today I had to deploy a new vCenter appliance. Nothing fancy, new deployment. Stage 1 was easy, but stage 2 failed several times. I re-deployed the vCenter appliance two times, but when the deployment failed for the third time, I took a look at the logs.

The deployment failed without any error, but it didn’t finish. It stopped during the start of different services without any error.

First of all: Log into the appliance using SSH or the console. Use the root account and the root password you have entered during the setup.

A good place to start is the logs under /var/log/firstboot. I used ls -lt to get the most recently written logs. Most services write two logs: one ends with _stdout.log, the other with _stderr.log. The _stdout.log contains the log messages of the service. The _stderr.log contains the errors. I searched for a service that had written to a _stderr.log – and I found it: scafirstboot.py_10507_stderr.log.
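The search for services that have actually written errors can be sketched in a few lines of Python. This is only an illustration under two assumptions: that services without errors leave an empty _stderr.log behind, and that a Python interpreter is available where the logs are inspected. The function name `nonempty_stderr_logs` is hypothetical.

```python
from pathlib import Path


def nonempty_stderr_logs(log_dir):
    """Return non-empty *_stderr.log files in log_dir, newest first."""
    logs = [p for p in Path(log_dir).glob("*_stderr.log") if p.stat().st_size > 0]
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for log in nonempty_stderr_logs("/var/log/firstboot"):
        print(log.name, log.stat().st_size, "bytes")
```

Sorting by modification time mirrors what ls -lt does, so the most recently failing service shows up first.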

And this log gave me a hint what the root cause was. One of the last log entries was:

ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate is not yet valid

What what? A certificate not only has an end date, but also a date before which it is not valid – a start date. And this often indicates a problem with – NTP. And it was NTP. I had configured NTP for the vCenter, but not for the ESXi host on which I deployed the vCenter. -.- If it is not DNS, it’s NTP. Or an invalid certificate. Or both.
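Why a wrong clock produces exactly this error becomes obvious when the validity check is written down. The following Python sketch is a simplified model of the comparison, not the code vCenter runs; the function name `check_validity` and all timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def check_validity(not_before, not_after, now):
    """Classify a certificate's validity window relative to the given clock."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


if __name__ == "__main__":
    # Made-up values: a certificate issued by a machine with a correct clock.
    issued = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
    expires = issued + timedelta(days=730)

    correct_clock = issued + timedelta(minutes=5)
    lagging_clock = issued - timedelta(hours=2)  # e.g. a host without NTP

    print(check_validity(issued, expires, correct_clock))
    print(check_validity(issued, expires, lagging_clock))
```

A freshly issued certificate is the most sensitive case, because even a small clock offset between the issuing and the verifying side puts the verifier before the start date.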

Veeam B&R backup fails with “No scale-out repository extents are available”

One of my customers replaced the old Veeam environment with new gear. The hardware design was pretty simple:

  • two HPE ProLiant
  • per server two HPE D3610 enclosures with 6 TB disks
  • ~5 km between backup server and backup copy destination

One server was designed to act as the Veeam backup server and repository, and the second server was designed to act as the backup copy destination. Both servers were running Windows Server 2019 Standard. We planned to use Windows Deduplication and ReFS, but it turned out that we had to adjust the filesystem size to get Windows Dedup working. Windows Dedup supports filesystems up to 64 TB. Due to the 24x 6 TB disks, we had to create two logical volumes to stay under 64 TB usable capacity.

I created one Scale-Out Backup Repository per server and configured my backup jobs. At this point things got worse…

The backup ran fine, but as soon as the copy kicked in, the copy job failed. Error “No scale-out repository extents are available”.

Huh? Everything was fine. If no backup was running, the copy ran fine. Setting limits (throughput or concurrent tasks) didn’t fix it. So I opened a case at Veeam.

We had to take debug logs to come to a solution.

Solution

The support advised us to set a registry key:

Key: HKLM\SOFTWARE\Veeam\Veeam Backup and Replication\
Value Name: SobrForceExtentSpaceUpdate
Value Type: DWORD
Value Data: 1

After a restart of the Veeam services, the backup and copy job ran fine. No further issues.

This key is described in Veeam KB2282. The option was introduced with Backup & Replication 9.5 U2. The customer is running v10.0.1.4854. The key forces Veeam to update free space information with the real values, and it subtracts the estimated sizes of all the tasks currently going to the selected extent.
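The arithmetic described in that last sentence can be written down as a small Python sketch. This is only a model of the description above and not Veeam's implementation; the function names and all numbers are made up for illustration.

```python
def estimated_free_space(real_free_gb, running_task_estimates_gb):
    """Free space of an extent minus the estimated size of tasks already targeting it."""
    return real_free_gb - sum(running_task_estimates_gb)


def extent_can_take_task(real_free_gb, running_task_estimates_gb, new_task_gb):
    """True if the new task still fits after reserving space for running tasks."""
    return estimated_free_space(real_free_gb, running_task_estimates_gb) >= new_task_gb


if __name__ == "__main__":
    # Made-up numbers, in GB.
    print(estimated_free_space(10_000, [3_000, 2_500]))
    print(extent_can_take_task(10_000, [3_000, 2_500], 5_000))
```

Seen this way, it is plausible that a copy job only fails while backups are running: the running tasks reserve part of the extent, and stale free space values make the remainder look too small.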

WatchGuard Network Security Essentials Exam

Yesterday, I passed the first exam of the year. In this case the WatchGuard Network Security Essentials exam. The exam covers basic networking and firewalling skills, as well as the necessary knowledge to configure, manage, and monitor a WatchGuard Firebox. If you are familiar with networking and firewalls in general, this exam is a “low hanging fruit”. I had to take it due to partner conditions.

WatchGuard offers a pretty good study guide for this exam, which you can get for free. The exam is delivered by Kryterion and can be taken in a test center or as an online proctored exam.

The closed book exam consists of 70 questions. You have 2h and you need at least 70% to pass the exam. The exam covers six different topics:

  • Network and network security basics
  • Administration and setup
  • Monitoring, logging, and reporting
  • Networking and NAT
  • Policies, proxies, and security services
  • Authentication and VPN

I passed the exam with some preparation (I’ve only used the study guide). As long as you have experience with WatchGuard firewalls, which is mandatory IMHO, it is sufficient to read the study guide a couple of times.

VCAP-DCV Design 2021 – Objective 1.1 Gather and analyze business requirements

This blog post covers objective 1.1 (Gather and analyze business requirements) of the VCAP-DCV Design 2021 exam. It is based on the VMware Certified Advanced Professional 6.5 in Data Center Virtualization Design (3V0-624) Exam Preparation Guide (last update December 2019).

When you get the task to design something, you will instinctively start gathering information about the requirements that have to be fulfilled. Everything IT is doing should support the business in some way.

The necessary skills and abilities are documented in the exam prep guide for the older VCAP6-DCV Design exam (3V0-622). I think they also apply to the current version of the exam:

  • Associate a stakeholder with the information that needs to be collected
  • Utilize inventory and assessment data from a current environment to define a baseline state
  • Analyze customer interview data to explicitly define customer objectives for a conceptual design
  • Determine customer priorities for defined objectives
  • Ensure that Availability, Manageability, Performance, Recoverability and Security (AMPRS) considerations are applied during the requirements gathering process
  • Given results of the requirements gathering process, identify requirements for a conceptual design
  • Categorize requirements by infrastructure qualities to prepare for logical design requirements

Associate a stakeholder with the information that needs to be collected

Let’s start with the stakeholders and why they are important for us. But what is a stakeholder? A stakeholder is a person with an interest or concern in something, especially a business (Oxford). Stakeholders can be internal or external parties. An internal stakeholder is someone with a direct relationship to the company. An external stakeholder has no direct connection to the company, but is affected in some way. These can be suppliers, the government, or other groups. A stakeholder can be anyone, but in our context stakeholders are typically

  • C-Level Executives (CEO, CFO, CIO etc.)
  • Vice Presidents
  • Managers, but also
  • Engineers and end users

As always: It depends. :)

Utilize inventory and assessment data from a current environment to define a baseline state

We also need to understand the current environment and what is currently deployed at the company. Interviews with the stakeholders are important, but in most cases they will not answer all questions. Depending on what is currently deployed, different tools can be used to gain the necessary data. Some examples:

  • RVTools, PowerCLI, vSphere Web Client, vROps etc
  • Custom scripts
  • Windows Server Manager
  • Network Monitoring Tools, like HPE Intelligent Management Center
  • Asset Management

It is important to document the results of the assessment. This is the baseline state of the current environment.

Analyze customer interview data to explicitly define customer objectives for a conceptual design

Now we need to get back to the results of the interviews that we did with the stakeholders to define the goals and the scope of the design. We also need to understand the

  • Constraints,
  • Assumptions,
  • Requirements, and
  • Risks

When we talk about requirements, we have to distinguish between functional (WHAT) and non-functional (HOW) requirements.

This information will allow us to create a conceptual design, which is written down in a workbook document.

Determine customer priorities for defined objectives

The next step is to define the priorities of the defined objectives. It is important to weigh e.g. requirements and risks. Milestones have to be defined. They will help us to measure the success of the project and keep it on track.

Ensure that AMPRS considerations are applied during the requirements gathering process

AMPRS stands for

  • Availability
  • Manageability
  • Performance
  • Recoverability, and
  • Security

It is important to understand the meaning of each of these terms.

Availability considerations address the availability requirements of our design. These are typically expressed as percent uptime of a specific system. For example: 99.5% availability for file services.
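An availability percentage is easier to discuss with stakeholders once it is translated into allowed downtime. A minimal Python sketch of that arithmetic, assuming a 365-day year and a hypothetical function name `allowed_downtime_hours`: 99.5% availability allows roughly 43.8 hours of downtime per year.

```python
def allowed_downtime_hours(availability_percent, period_hours=365 * 24):
    """Maximum downtime in hours for a given availability over a period."""
    return period_hours * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.5, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% -> {hours:.1f} h per year")
```

Whether planned maintenance counts against this budget is a question for the requirements interviews; the formula itself does not answer it.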

Manageability considerations address the management and operational requirements of our design. This can be alerting, reports, access concepts etc.

Performance considerations express the required performance characteristics of the design. For example: mails per second at a given message size.

Recoverability considerations cover the ability to recover from an unexpected incident or disaster. This topic typically addresses backup and recovery of our design.

Security considerations cover the requirements around data control, access management, governance, risk management etc.

Given results of the requirements gathering process, identify requirements for a conceptual design

Now we have collected information from the relevant stakeholders, including the goals, scope, and CARR (constraints, assumptions, requirements, risks), and we have collected details about the current environment. Now it is time to put this information together and create a conceptual design.

The conceptual design must be approved by the stakeholders. This assures that everything is covered. Creating a conceptual design is an iterative process. The conceptual design is finished when the relevant stakeholders have approved it.

Categorize requirements by infrastructure qualities to prepare for logical design requirements

Sounds simple, but it can be challenging: The documented requirements have to be grouped by infrastructure categories, e.g.

  • Networking
  • Storage
  • Recovery
  • Compute
  • VM
  • Security

Based on the CARR and the AMPRS considerations, we make design decisions. These decisions affect each of the infrastructure categories. At this point, we can review each of our decisions, and mapping the requirements to the infrastructure will ease the creation of a high-level logical design.

Summary

Let me try to simplify this complex process a bit.

We were asked to solve a problem for a company. To solve this problem, we have to design a solution. To create this design, we have to identify the relevant stakeholders. These stakeholders will help us to gather information about the goals, the scope, about constraints, assumptions, requirements and risks. Especially when it comes to the requirements, we have to take availability, manageability, performance, recoverability and security considerations into account.

We can use different tools to collect information about the current environment.

At this point we know WHAT the company wants, and we know WHAT they are currently running.

Now we can start with the creation of a conceptual design, which has to be approved by the relevant stakeholders.

To prepare the logical design, we need to map the documented requirements to the different categories of the infrastructure.


VMware Certified Advanced Professional 6.5 – Data Center Virtualization Design Exam (VCAP-DCV Design 2021)

In August 2018 I passed the VCAP6-DCV Deployment exam. After a busy first half of 2019 it was time to start preparing for the VMware Certified Advanced Professional – Data Center Virtualization Design 2019 exam. But I lost focus, and in 2020 I had a lot to do, just nothing VMware-related, so I also missed my goal of taking the VCAP-DCV Design exam.

I have to push myself, so I decided to pick up my half-finished blog series again to get myself back on track.

There are many great study guides out there, but in most cases I need “my own study guide” to feel well prepared. I hope this blog series will keep me on track, and I stay focused. This is my third try to prepare for this exam… :/


In contrast to the Deploy exam, the Design exam is a multiple-choice exam: 130 minutes for 60 questions. Sounds easy, but it is said to be one of the hardest exams VMware offers.

The exam consists of three sections:

  • Section 1 – Create a vSphere 6.5 Conceptual Design
  • Section 2 – Create a vSphere 6.x Logical Design from an Existing Conceptual Design
  • Section 3 – Create a vSphere 6.x Physical Design from an Existing Logical Design

Each section contains several objectives.

  • Objective 3.1 – Transition from a logical design to a vSphere 6.x physical design
  • Objective 3.2 – Create a vSphere 6.x physical network design from an existing logical design
  • Objective 3.3 – Create a vSphere 6.x physical storage design from an existing logical design
  • Objective 3.4 – Determine appropriate compute resources for a vSphere 6.x physical design
  • Objective 3.5 – Determine virtual machine configuration for a vSphere 6.x physical design
  • Objective 3.6 – Determine data center management options for a vSphere 6.x physical design

I will try to cover each objective in a blog post and add a link here. Feel free to add comments, corrections and questions. :) The links already added point to existing blog posts, which I will revise.

Leave a comment if you have questions. :)