vcloudnine.de is the personal blog of Patrick Terlisten. Patrick has a strong focus on virtualization & cloud solutions, but also storage, networking, and IT infrastructure in general. He is a fan of Lean Management and agile methods, and practices continuous improvement wherever it is possible. Feel free to follow him on Twitter and/or leave a comment.
We have been dealing with COVID-19 for a year now, and from an IT perspective, 2020 was a pretty strange year. Many projects were not cancelled, but were placed on hold. But two kinds of projects went through the roof:
Microsoft 365, and
As you might have noticed, I blogged a lot about Exchange, Exchange Online and Horizon this year. The reason for this is pretty simple: That was what drove my business this year.
In early 2020, when we decided to move into our home offices, we deployed Horizon View on physical PCs at ML Network (my employer). This was a simple solution and it still works for us today.
Some of my customers also deployed Horizon View for the same reason: A secure and easy way to get a desktop. For some of them, the tech was new and they struggled with DEM, Linked Clones, customization etc. The solution in this case was easy: Full Clones with dedicated assignment.
One customer moved from Windows 7 and floating-assignment and Linked Clones to Windows 10 and Full Clones and dedicated assignment (not my project).
Another customer started to implement Horizon View with Horizon 2006, and he started with Instant Clones, dedicated assignment and DEM. I told him to go with Full Clones, but his IT company moved on with Instant Clones. Now he’s complaining about the growing complexity.
My 2 cents
Many customers struggle with Windows 10 and the customization of Windows 10. Tools like Dynamic Environment Manager (DEM) are powerful, but they can be quite complex, especially when it comes down to small IT orgs with 50, 100 or 200 desktops, where each member of the IT team has to be a jack of all trades.
I always recommend starting with Full Clones, just to get in touch with the technology. And I always recommend getting the requirements clear with the stakeholders and the users. Things like software that does not work, missing settings after a logoff/logon, or slow response times are the main difficulties that cause a VDI project to fail.
When you are familiar with the technology, proceed further with DEM, Instant Clones, floating assignment. But you should learn to walk, before you start to run.
Maybe I’m getting old. :D I’m not against modern technology and new features. I’m not a grumpy old senior consultant. But I think I’ve learned the hard way why it’s a bad idea to overburden IT-orgs and their users with new tech, especially in times like these.
December 31, 2020 will not only be the end of the miserable year 2020, it will also be the end of an era – the era of Adobe Flash! Adobe has announced that they will stop supporting Adobe Flash after December 31, 2020. Furthermore, Adobe will block Flash content from running in Flash Player beginning January 12, 2021. Adobe strongly recommends that all users immediately uninstall Flash Player. I got a popup a couple of times, asking me if I want to uninstall Adobe Flash. It’s still installed… :/
Adobe Flash isn’t a big thing in web development anymore, but there is a reason why I still have Adobe Flash installed – Admin Interfaces!
We all had to deal with Flash after VMware started with the vSphere Web Client. It was slow and, in places, painfully buggy. The newer HTML5-based Web Client was much better, but not feature complete until vSphere 6.7.
But the vSphere Web Client was not the only Flash-based admin interface in a VMware product. The Horizon Administrator, which was the main administration interface until Horizon 7.8, is also based on Flash. vRealize Operations also used Flash up to version 6.6.
If you want to remove Adobe Flash from your computer, you have to update all, or at least parts, of your VMware infrastructure.
The simple rule is: Update to the latest release and everything will be fine. If you are running vSphere 6.7 U3, the HTML5 based Web Client is feature complete. The same applies to Horizon View. If you are running 7.10 or a newer release, everything is fine.
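The rule for Horizon can be sketched as a small version check. This is only an illustration: the threshold (7.10) comes from the text above, while the function names and the sample version strings are made up for this example. Comparing versions as tuples of integers avoids the classic trap where the string "7.9" sorts after "7.10".

```python
# Minimum Flash-free release taken from the text above.
# Function names and sample inputs are hypothetical.
FLASH_FREE_MINIMUM = {
    "horizon": (7, 10),
}


def parse_version(text):
    """Turn a dotted version string such as '7.13.1' into (7, 13, 1)."""
    return tuple(int(part) for part in text.split("."))


def is_flash_free(product, version):
    """True if the release is at or above the Flash-free minimum."""
    major_minor = parse_version(version)[:2]
    return major_minor >= FLASH_FREE_MINIMUM[product]


for candidate in ("7.8", "7.9", "7.10", "7.13.1", "8.0"):
    print(candidate, is_flash_free("horizon", candidate))
```

Horizon 2006 reports itself as version 8.0, so it passes the check as well.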
If you cannot update, there is an easy approach: Disconnect the affected systems from the internet, or at least block their internet access. The alternative approach is not recommended: Stop the automatic updates of your web browser and use the Flash-based user interfaces with a browser that still supports Flash. Again: This is really not recommended!
While migrating a customer from Exchange 2010 to Exchange 2016, I had to create an Exchange Hybrid Deployment, because the customer wanted to use Microsoft Teams. Nothing fancy, and I’ve done this a couple of times.
Unfortunately, the Hybrid Configuration Wizard failed to create the migration endpoint. A quick check of the logs showed this error:
Microsoft.Exchange.MailboxReplicationService.MRSRemotePermanentException: The Mailbox Replication Service could not connect to the remote server because the certificate is invalid. The call to 'https://mail.contoso.com/EWS/mrsproxy.svc' failed. Error details: Could not establish trust relationship for the SSL/TLS secure channel with authority 'mail.contoso.com'. -->The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel. --> The remote certificate is invalid according to the validation procedure
The customer had no plans to move mailboxes to Exchange Online, so we didn’t care about this error. But the Calendar tab in Teams was not visible, and the Teams logs stated that Teams was unable to discover the mailbox. A typical sign of a non-working EWS connection.
It’s always TLS… or DNS… or NTP
The customer used a certificate from its own PKI, so it was not trusted by Microsoft. In addition, the Exchange server was located behind a Sophos XG which was running Webserver Protection (Reverse Proxy). But this was not the main cause of the problems.
The root cause was the certificate from the customer’s PKI.
And therefore you should make sure to use a proper certificate from a third-party CA for Exchange Hybrid Deployments. I really urge every customer to stop using self-signed certificates, or certificates from their own PKI, for external connections.
The customer has switched to a Let’s Encrypt certificate for testing purposes and the problems went away, without running the HCW again. He will now purchase a certificate from a 3rd party CA.
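A quick way to see what an external party sees is to attempt a TLS handshake from a machine that does not trust the internal PKI. The following is a minimal sketch using only the Python standard library; the function names are made up for this example, and the local trust store of the machine running it is only an approximation of what Microsoft's services trust.

```python
import socket
import ssl


def build_strict_context():
    """Context that verifies the chain and host name against the local trust store."""
    return ssl.create_default_context()


def tls_handshake_validates(host, port=443, timeout=5.0):
    """Return (True, '') if the handshake validates, else (False, reason)."""
    context = build_strict_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, ""
    except ssl.SSLCertVerificationError as err:
        # Raised for untrusted issuers, expired certificates, name mismatches, ...
        return False, err.verify_message
    except OSError as err:
        # Connection refused, DNS failure, timeout and similar problems
        return False, str(err)
```

Calling `tls_handshake_validates("mail.contoso.com")` (the placeholder host name from the error message above) from an external machine would be expected to fail for a certificate issued by a private CA.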
During the deployment of a vSAN cluster consisting of multiple HPE ProLiant DL380 Gen10 hosts, I noticed a memory health warning after updating the firmware using the Support Pack for ProLiant. The error was definitely not shown before the update, so it was clear that this was not a real issue with the hardware. Furthermore: All hosts showed this error.
The same day, a customer called me and asked me about a strange memory health error after he had updated all of his hosts with the latest SPP…
My first guess, that this was not caused by a hardware malfunction, was correct. HPE published an advisory about this issue:
To fix this issue, you have to update the iLO 5 firmware to version 2.31. You can do this manually using the iLO 5 interface, or you can add the file to the SPP. I’ve added the BIN file to the USB stick with the latest SPP.
If you want to update the firmware manually, simply upload the BIN file using the built-in firmware update function.
Navigate to Firmware & OS Software in the navigation tree, and then click Update Firmware
Select the Local file option and browse to the BIN file
To save a copy of the component to the iLO Repository, select the Also store in iLO Repository check box
To start the update process, click Flash
You can download the latest iLO 5 2.31 from HPE using this link. After the firmware update, the error will resolve itself.
Only ESXi 6.7 is affected, and only ESXi 6.7 running on HPE ProLiant hosts, regardless of whether they are ML, DL or BL series.
A couple of days ago, I wrote about our first steps to move our on-prem stuff to Azure. This post will cover how we adopted Office 365 and how we have started with our Azure deployment.
Our first step into Office 365 was Microsoft Teams. We needed a solution for calls (audio/video) and chat. We skipped Skype for Business and started with Microsoft Teams.
Our Microsoft Teams deployment was pretty simple: We used our Microsoft IUR Office 365 E3 plans. Microsoft Azure AD Connect was quickly deployed and the Microsoft Exchange Hybrid Configuration Wizard did the rest. Some weeks later we deployed ADFS/ADFS Proxy. We used this setup over several months and it was pretty slick and worked flawlessly. At this point, we only used Teams, Planner and OneDrive for Business (SharePoint).
Some months went by until we decided to move to Azure.
Resource groups in Azure
You can imagine a resource group (RG) as a container that contains one or more resources, like VMs, NICs, SQL instances etc. The resource group can contain all the resources for the solution, or only those resources that you want to manage as a group.
First question: What do we need to deploy?
The answer was easy:
in sum 9 VMs
Recovery Services Vault
Log Analytics Workspace
Second question: One or multiple resource groups?
An easy rule of thumb is that a resource group should contain only resources that share the same life cycle and sponsor.
Third question: Who needs delegated privileges to manage this stuff?
In our case there was no need for fine-grained RBAC. All of our technical staff have a personalized admin account and should be able to do whatever is necessary.
To connect our on-prem network to Azure, we had to set up a site-to-site VPN. This was the first thing we did after creating our first resource group. We used a Gen 1 Basic VPN Gateway, which was sufficient for our needs (max. 100 Mbit/s, no OpenVPN, no BGP).
Keep in mind to choose your networks and subnets wisely. If you need to deploy 9 VMs, don’t use 10.0.0.0/8. ;) In our case we added two network ranges with a single subnet in each network range. One for our server VMs, and a second subnet as gateway subnet.
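To put numbers on this, here is a small sketch using Python's `ipaddress` module. The address ranges are hypothetical examples, not the ones from our deployment. Azure reserves five addresses in every subnet, which is why the helper subtracts five.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves five addresses in each subnet


def usable_hosts(cidr):
    """Number of addresses left for VMs in an Azure subnet of this size."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - AZURE_RESERVED_PER_SUBNET


def overlaps(cidr_a, cidr_b):
    """True if the two ranges share any address (a problem for VPN routing)."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


# Hypothetical ranges for illustration only
for cidr in ("10.0.0.0/8", "10.10.0.0/24", "10.10.255.0/27"):
    print(cidr, usable_hosts(cidr))

print(overlaps("10.0.0.0/8", "10.20.30.0/24"))    # a /8 swallows other 10.x networks
print(overlaps("10.10.0.0/24", "192.168.20.0/24"))
```

A /24 leaves 251 usable addresses, which is plenty for 9 VMs, and a small range is far less likely to collide with an on-prem or customer network on the other side of the VPN.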
We deployed our VMs as B-Series VMs. A common mistake is to use the wrong VM size. Start small and right-size a VM if necessary. Most of our VMs are B2s (2 vCPUs, 4 GB RAM). Only the Exchange server (B4ms), the management server (B2ms) and the RDS server (B2ms) differ from this. This looks pretty small for Server 2019, but it is working pretty nicely.
After deploying the VMs, we assigned static IP addresses to them. To our surprise, most things in Azure are lacking proper IPv6 support. :( That hurt a lot.
For most VMs we used Standard HDDs instead of SSDs. Even for our file server, because the bottleneck is not the disk, it is the connection between clients and server. Besides this, we used managed disks for all VMs, and we deployed a second disk for data if necessary (Exchange, Domain Controller, file server etc.).
If a server had a DNAT in our on-prem network, we deployed a public IP, and secured the access to it.
All VMs are connected to the same Network Security Group (NSG), which we use to get control over what a VM can reach, and who can access a VM.
Over a couple of days we moved more and more services to Azure, starting with our Domain Controllers, PKI and file services. These were low-hanging fruit. The file server was easy because we already had a DFS namespace in place, so all we had to do was to change the DFS links and point them to the new file server. The data was copied using DFS Replication.
DHCP was moved to our on-prem firewall. A printserver was not necessary any more. Windows Updates were switched back to download from Microsoft and Delivery Optimization.
The applications that were running on our Linux and Windows application server were also easy to migrate. After a couple of days we had our server workload running on Azure.
To get our ERP running, we deployed a single RDS host (quick deployment), and deployed our ERP as a remote app. It was too slow to use over the VPN. Unfortunately, the application lacks a proper database backend. :/ But as a remote app, it is working pretty well.
A bigger challenge was Exchange, but not because of the mailbox migrations.
The migration to Exchange Online was pretty simple. Since our first HCW run, we used the central mail transport, so that all mails are received and sent by our on-prem mail gateway.
The mailbox migration was pretty easy and we had zero issues. Then we tried to switch the mail transport from central to Exchange Online. This was flawless too… except for the fact that our ticket system was unable to send e-mails.
Our ticket system relays its mail over our Exchange server. After switching the mail server in our ticket system to the new Azure-based VM, the mails got stuck in the outbound queue, even though the server tried to send the mail to our on-prem mail gateway. This quote from Microsoft explains the whole problem:
Starting on November 15, 2017, outbound email messages that are sent directly to external domains (such as outlook.com and gmail.com) from a virtual machine (VM) are made available only to certain subscription types in Microsoft Azure. Outbound SMTP connections that use TCP port 25 were blocked. (Port 25 is primarily used for unauthenticated email delivery.)
This change in behavior applies only to new subscriptions and new deployments since November 15, 2017.
This is the case for MSDN, Azure Pass, Azure in Open, Education, BizSpark, and Free Trial subscriptions!
If you created an MSDN, Azure Pass, Azure in Open, Education, BizSpark, Azure Sponsorship, Azure Student, Free Trial, or any Visual Studio subscription after November 15, 2017, you’ll have technical restrictions that block email that’s sent from VMs within these subscriptions directly to email providers. The restrictions are done to prevent abuse. No requests to remove this restriction will be granted.
If you’re using these subscription types, you’re encouraged to use SMTP relay services, as outlined earlier in this article or change your subscription type.
We accelerated our migration and disabled the central mail transport earlier than planned. Then we configured our Linux application server to authenticate against Exchange Online using SMTP Auth and SMTP Submission (587/tcp). Incoming mails are routed to the application server using an Exchange Online connector and a transport rule which matches specific mail addresses.
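For illustration, this is roughly what authenticated submission on port 587 looks like from a client's point of view. It is a minimal sketch with Python's standard library, not the configuration of our ticket system; the addresses and function names are placeholders, and `smtp.office365.com` is the Exchange Online endpoint for SMTP AUTH client submission.

```python
import smtplib
import ssl
from email.message import EmailMessage

SMTP_HOST = "smtp.office365.com"  # Exchange Online client submission endpoint
SMTP_PORT = 587                   # SMTP Submission instead of the blocked port 25


def build_message(sender, recipient, subject, body):
    """Create a simple plain-text mail."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_submission(msg, username, password):
    """Send the mail using STARTTLS and SMTP AUTH on port 587."""
    context = ssl.create_default_context()
    with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=30) as smtp:
        smtp.starttls(context=context)  # upgrade the connection before logging in
        smtp.login(username, password)
        smtp.send_message(msg)
```

Usage would look like `send_via_submission(build_message("tickets@example.com", "user@example.com", "Test", "Hello"), username, password)`, where the credentials belong to a mailbox that is allowed to use SMTP AUTH.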
The Azure-based Exchange VM is only needed because we still have Azure AD Connect running. Microsoft has planned to replace this with a new solution. Until then, we will run this Exchange 2016 in Azure. But it is not part of our mail flow.
Moving Azure AD Connect & decommissioning ADFS
Because we had to get rid of the ADFS server and ADFS Proxy, we deployed Pass-Through Authentication and Seamless SSO. Then we decommissioned the ADFS setup.
Moving Azure AD Connect was a bit quirky. We had conditional access already in place and the Azure AD Connect setup was unable to handle this. The synchronisation account was unable to sync, because it ran into an MFA request. We optimized our policies and got this sorted out.
Decommissioning old stuff
Whenever we moved a service successfully to Azure, we switched off the on-prem server and modified our documentation to reflect the changes. In the end, we were able to switch off three of our four ESXi hosts. The last ESXi host is still running for our Horizon View deployment and our firewall.
The next post will cover how we automated this, how we do backups and whatever you’re interested in. Leave a comment! :)
As part of an ongoing Exchange 2010 to 2016 migration, I had to replace the self-signed certificate with a certificate from the customer’s PKI. Everything went fine: the customer had a suitable template, we added the necessary hostnames and bound IIS and SMTP to the certificate. The mess started with an iisreset /noforce…
The iisreset took longer than expected. After that, I tried to log in to the ECP, entered username and password and got an error.
<Provider Name="MSExchange Front End HTTP Proxy" />
<TimeCreated SystemTime="2020-10-22T12:16:38.934123400Z" />
<Data>System.NullReferenceException: Object reference not set to an instance of an object. at
Microsoft.Exchange.HttpProxy.FbaModule.ParseCadataCookies(HttpApplication httpApplication) at
Microsoft.Exchange.HttpProxy.FbaModule.OnBeginRequestInternal(HttpApplication httpApplication) at
Microsoft.Exchange.Common.IL.ILUtil.DoTryFilterCatch(Action tryDelegate, Func`2 filterDelegate, Action`1 catchDelegate)
Pretty strange. We switched back to the self-signed certificate, did an iisreset and everything was fine again. So it was pretty obvious that the error was related to the certificate, or to be more precise, to the certificate template.
A bit of research confirmed this. The template was a modified v3 web server template from an Enterprise CA running Windows Server 2008 R2.
With Windows Server 2008, Microsoft introduced a new cryptographic API called Cryptography Next Generation (CNG), which separates cryptographic providers (algorithm implementation) from key storage providers (create, delete, export, import, open and store keys). The older CryptoAPI does not make this distinction: a single provider implements both the cryptographic algorithms and the key storage.
The modified template used CNG instead of CryptoAPI. We noticed this when we checked the certificate with certutil -store my <thumbprint>.
If the listed provider for the certificate is Microsoft Software Key Storage Provider, then you will have to re-import the certificate. If Microsoft RSA SChannel Cryptographic Provider is used, everything is fine.
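If you have to check several servers, the decision can be scripted. The following is a minimal sketch that scans saved certutil output for the provider line. It assumes that the output contains a line of the form `Provider = <name>`; the sample text below is abbreviated and hypothetical, and the function names are made up for this example.

```python
import re

# Provider names as quoted in the text above
CNG_KSP = "Microsoft Software Key Storage Provider"
LEGACY_CSP = "Microsoft RSA SChannel Cryptographic Provider"


def provider_from_certutil(output):
    """Return the provider name from certutil output, or None if absent."""
    match = re.search(r"^\s*Provider\s*=\s*(.+?)\s*$", output, re.MULTILINE)
    return match.group(1) if match else None


def needs_reimport(output):
    """True if the private key is stored with the CNG key storage provider."""
    return provider_from_certutil(output) == CNG_KSP


# Abbreviated, hypothetical sample output
sample = """
================ Certificate 0 ================
  Key Container = example-container
  Provider = Microsoft Software Key Storage Provider
"""

print(provider_from_certutil(sample))
print(needs_reimport(sample))
```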
You have to remove the certificate, then re-import it using
It was a bit quiet here due to the current COVID-19 pandemic. But now I’m back with a pretty interesting story on how my colleagues and I moved most of our on-prem server stuff to Microsoft Azure and Office 365.
It all started with the COVID-19 lockdown in Germany in March 2020. We moved into our home offices after setting up a small VMware Horizon View deployment to access our PCs using physical View Agents and manual desktop pools. Most projects were stopped, and we did most of our work remotely. No lay-offs or short-time work.
We were running a small VMware vSphere cluster for a couple of years. Nothing fancy: Two HPE ProLiants, vCenter, two DCs, File-/ Printserver, WSUS, Exchange, Linux machines for web services, Sophos UTM, a pfSense, View Connection Server, UAG, ADFS/ ADFS Proxy, PKI etc. In sum 18 VMs on two hosts, some VLANs with firewalls in between etc. We were running Exchange 2016, AzureAD Sync, Exchange Hybrid, but we only used Microsoft Teams from our Office 365 deployment. Veeam Backup & Replication was used for backups, a backup copy to a NAS and some Robocopy jobs that moved Veeam Backups to USB drives for DR. Everything was pretty simple and designed to work without much operational effort. Our focus is on our customers, not on our internal IT. It was stable, secure and pretty slick.
In March 2020 we asked “What if?”. What if we lose our offices due to a fire (we are located in a bigger office building and we had a couple of fire alarms this year due to remodeling work)? How can we work if our DSL line is cut? How can we get our backups offsite? How can we modernize our IT without big investments? Money that we don’t spend on our internal IT can be given to our employees. ;) (By the way, that’s the same reason why we try to drive smaller and more efficient cars…).
We developed a couple of ideas, including new servers, storage etc. and put that stuff into a datacenter. But in the end, we decided to move most of our stuff to Microsoft Azure and Office 365.
I want to share some of the things we have learned on the road to Azure.
We used the Azure Migrate Server Assessment tool to assess our vSphere environment. We wanted to get a ballpark figure for how we had to size the VMs. We knew that we wouldn’t need to migrate all VMs. For example, our virtual pfSense firewall, the vCenter, the Sophos UTM and our ADFS setup were not slated for migration.
After the first assessment, we started to play around with the Azure pricing calculator. Just to get an idea of how different VM sizes affect the costs.
As a Microsoft partner, we were able to use our internal user rights (IUR) for Microsoft Office 365 and Azure. Microsoft offers us 25 Office 365 E3 plans and a 6000 US-$ budget for Azure (Azure Sponsorship Subscription). Our plan was to stretch the Azure budget over 12 months, so that we don’t have additional costs until we re-apply for our Microsoft partnership. Starting with 6000 US-$ Azure budget, it makes ~16 US-$ per day for our complete Azure deployment.
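The daily figure is simple arithmetic. Spelled out as a short sketch, using only the numbers from the paragraph above:

```python
ANNUAL_BUDGET_USD = 6000  # Azure Sponsorship budget from the text
DAYS_PER_YEAR = 365
MONTHS_PER_YEAR = 12

daily_budget = ANNUAL_BUDGET_USD / DAYS_PER_YEAR
monthly_budget = ANNUAL_BUDGET_USD / MONTHS_PER_YEAR

print(f"{daily_budget:.2f} US-$ per day")
print(f"{monthly_budget:.2f} US-$ per month")
```

This gives roughly 16.44 US-$ per day, or 500 US-$ per month.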
Now, as we knew that we have 16 US-$ per day, we planned our Azure deployment. First of all, we planned the number of VMs. We had 18 VMs on-prem, and we managed to get down to 9 VMs.
two Domain Controllers
Remote Desktop Host (all-in-one Deployment)
SQL/ App server
Exchange 2016 for Hybrid Deployment
The View Connection Servers and UAG are still running on-prem. Our virtual pfSense will be moved to a WatchGuard Firebox soon. Sophos UTM and ADFS are gone. A dedicated WSUS server is not necessary any more, we moved back to simple Windows Update and Delivery Optimization.
Instead of D-Series VMs, we decided to go for B-Series VMs. The main reason for this was cost, but today I can say: The performance is quite good. I can’t see any reason for us to move to D-Series.
To connect to our Azure deployment, we had to set up a site-to-site VPN. We deployed a simple Gen1 Basic SKU VPN Gateway. We had no need for more than 100 Mbit/s (we’re using a 50/10 VDSL line at our office location), BGP or zone redundancy.
Backups are kept in a Recovery Services Vault with pretty simple policies. Either a VM only needs to be current, in which case we keep 7 restore points, or we need to keep more history, in which case we keep 7 daily, 5 weekly, 12 monthly and 3 yearly restore points. The latter is only the case for our file server.
Additional cost savings
But with this setup we would not get under 16 US-$ a day. :( So we took another approach to break the mark: We shut down VMs at night and on the weekends! It took a while until my colleagues and I got used to this. Nobody wants to shut down servers without a good reason.
But: We are currently at 18 US-$ per workday, and 10 US-$ each for Saturday and Sunday. Everything, except the domain controllers and the ticket system, is shut down at night and on the weekends.
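To see why 18 US-$ on workdays is still within budget, average the costs over a whole week. A short sketch with the figures from this post:

```python
WORKDAY_COST_USD = 18
WEEKEND_DAY_COST_USD = 10
ANNUAL_BUDGET_USD = 6000

weekly_cost = 5 * WORKDAY_COST_USD + 2 * WEEKEND_DAY_COST_USD
average_per_day = weekly_cost / 7
daily_budget = ANNUAL_BUDGET_USD / 365

print(f"weekly cost:     {weekly_cost} US-$")
print(f"average per day: {average_per_day:.2f} US-$")
print(f"daily budget:    {daily_budget:.2f} US-$")
print("within budget:", average_per_day <= daily_budget)
```

The weekly cost is 110 US-$, which averages to about 15.71 US-$ per day, just under the daily budget of about 16.44 US-$.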
We are using an Automation Account with some simple scripts and schedules to shut down VMs and start them again.
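The scripts themselves are not shown here. The decision they have to make can be sketched in a few lines, though. Everything in this example is an assumption for illustration: the VM names, the working hours and the function name are hypothetical, and the actual start and stop calls against Azure are left out.

```python
from datetime import datetime

# Hypothetical names for the VMs that stay on (domain controllers, ticket system)
ALWAYS_ON = {"dc01", "dc02", "tickets"}

# Assumed working hours; the real schedule is not stated in this post
WORK_START_HOUR = 6
WORK_END_HOUR = 20


def should_run(vm_name, now):
    """Decide whether a VM should be running at the given point in time."""
    if vm_name in ALWAYS_ON:
        return True
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    in_working_hours = WORK_START_HOUR <= now.hour < WORK_END_HOUR
    return is_weekday and in_working_hours


monday_noon = datetime(2020, 11, 9, 12, 0)
saturday_noon = datetime(2020, 11, 7, 12, 0)
print(should_run("fileserver", monday_noon))
print(should_run("fileserver", saturday_noon))
print(should_run("dc01", saturday_noon))
```

A scheduled job would evaluate this for each VM and start or deallocate it accordingly.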
The next blog post will be around how we planned the usage of Office 365, and how we started with Azure.
The View Agent Direct-Connection (VADC) Plug-In was designed as an extension to the Horizon Agent, which allows a Horizon Client to directly connect to a VM or physical machine without using a Horizon Connection Server.
The VADC is nothing new; it has been part of the Horizon View ecosystem for a couple of years now. Meanwhile, the VADC supports the Blast Extreme protocol, which makes it pretty interesting for remote access to lab environments or home office equipment.
There are a couple of requirements which I want to highlight:
VADC Plug-In has the following additional requirements:
The VM or physical machine must have a minimum of 128 MB of video RAM
For a virtual machine, you must install VMware Tools before you install Horizon Agent
A physical machine must run Windows 10 Enterprise version 1803 or version 1809; newer releases tend to work flawlessly
A VM supports Blast and PCoIP protocols
A physical machine supports Blast only
The installation of the VADC is divided into two steps:
Installation of the View Agent
Installation of the VADC
The View Agent has to be installed silently, because you are unable to add it to a Connection Server. The silent installation allows you to skip this step.
I used this command line to install the View Agent:
The second step is to install the VADC. This is pretty easy: Setup > Next, next, next. :)
Finally, you can start the View Client on another machine and add a Connection Server with the IP or FQDN of your newly installed VADC machine.
This is the output of netstat on my X250 after connecting using the VADC:
TCP 192.168.20.52:443 t480s:50996 ESTABLISHED
TCP 192.168.20.52:443 t480s:50997 ESTABLISHED
TCP 192.168.20.52:443 t480s:50998 ESTABLISHED
TCP 192.168.20.52:22443 t480s:51014 ESTABLISHED
TCP 192.168.20.52:32111 t480s:51027 ESTABLISHED
You might notice the typical Horizon View ports: 22443 for Blast Extreme and 32111 for USB redirection.
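If you want to check this on several machines, the netstat output can be summarized with a few lines of Python. This is a small sketch that parses the output shown above; the label for port 443 and the function name are additions for this example.

```python
# Port labels: 22443 and 32111 as described above; 443 is plain HTTPS
KNOWN_PORTS = {
    443: "HTTPS",
    22443: "Blast Extreme",
    32111: "USB redirection",
}

NETSTAT_OUTPUT = """\
TCP 192.168.20.52:443 t480s:50996 ESTABLISHED
TCP 192.168.20.52:443 t480s:50997 ESTABLISHED
TCP 192.168.20.52:443 t480s:50998 ESTABLISHED
TCP 192.168.20.52:22443 t480s:51014 ESTABLISHED
TCP 192.168.20.52:32111 t480s:51027 ESTABLISHED
"""


def summarize(netstat_output):
    """Count established TCP connections per local port."""
    counts = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue  # skip headers, UDP lines and other connection states
        local_port = int(fields[1].rsplit(":", 1)[1])
        key = (local_port, KNOWN_PORTS.get(local_port, "unknown"))
        counts[key] = counts.get(key, 0) + 1
    return counts


for (port, label), count in sorted(summarize(NETSTAT_OUTPUT).items()):
    print(f"{port:>5}  {label:<16} {count} connection(s)")
```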
This issue was a bit annoying. I faced it not in a customer environment, but on my second Lenovo laptop, an X250 with Windows 10 20H2. My intention was to use it headless in a docking station. So how should I access it? RDP? TeamViewer? Why not use the Horizon Direct Connection Plug-in?
The Horizon Direct Connection Plug-in is not a new feature and you can think of it as a View Agent without a Connection Server. You can access it using the View Client, but you don’t have to run the connection through a Connection Server. For pretty small environments or direct access a perfect fit!
In order to use the Horizon Direct Connection Plug-in, you have to install the View Agent. So I downloaded the latest View 2006 Agent (VMware-Horizon-Agent-x86_64-8.0.0-16530789.exe) and started the setup.
It fails right at the beginning. Okay, I just installed Windows Updates, so I rebooted my laptop. But the setup fails again. Next reboot. And it fails, and fails, and fails.
There are some registry keys you can check if you get such an error. “PendingFileRenameOperations” is one of the common causes when you face this problem. I found a script, but there was no reboot pending.
I finally found it: RunOnce.
There was an entry under HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce. After deleting this entry, the setup went through.
Fun fact: There is an MSI property for this when you want to install the View Agent silently: SUPPRESS_RUNONCE_CHECK.
After a reboot, a VMware ESXi 6.7 U3 host told me that it had no compatible NICs. Fun fact: Right before the reboot everything was fine.
The iLO also showed no NICs. Unfortunately, I wasn’t onsite to pull the blade server and put it back in. But there is a way to do this “virtually”.
You have to connect to the IP address of the Onboard Administrator via SSH. Then issue the reset server command with the bay of the server you want to reset and an argument.
OA1-C7000> reset server 13
WARNING: Resetting the server trips its E-Fuse. This causes all power to be momentarily removed from the server. This command should only be used when physical access to the server is unavailable, and the server must be removed and
Any disk operations on direct attached storage devices will be affected. I/O
will be interrupted on any direct attached I/O devices.
Entering anything other than 'YES' will result in the command not executing.
Do you want to continue ? yes
Successfully reset the E-Fuse for device bay 13.
The server will power up automatically. Please note that the Onboard Administrator is unable to display certain information right after this operation. It will take a couple of minutes until all information, like the serial number or device bay name, is visible again.