A customer of mine is currently refreshing his branch office server infrastructure. A part of this project is to demote the Active Directory Domain Controllers, that are currently running in each branch office. The customer has multiple branch offices and each branch office has an Active Directory Domain Controller which is acting as file-/ print- and DHCP server. Each branch office has its own Active Directory site. The Domain Controller and the used IP subnets are assigned to the corresponding AD site. Due to this configuration the clients at the branch office choose the site-local Domain Controller as logon server. This works totally flawless since a couple of years. Over the year bandwidth of site connection has increased and even small branch offices have a redundant MPLS connection to the HQ. And no one likes single domain AD forests with 20 or more Domain Controllers…
After demoting the Domain Controller in the first branch office I visited, my colleague and I discovered an interesting behaviour: The removal of the Domain Controller was flawless. Everything went fine. But when we tried to logon at a client, we got no GPOs and no network drives mapped. The name resolution was fine, so this was not our problem. I checked the content of the % LOGONSERVER% variable and discovered that it contained the name of the (now) demoted Domain Controller. After another logout and login, everything was fine. The client had chosen a new domain controller, now from the HQ AD site. This was correct and an expected behaviour. The branch office IP subnets were changed at the same time and the new IP subnets were assigned to the HQ AD site days before we demoted the DC.
Assumption: The client (Windows 7 Enterprise) used cached credentials to logon. These credentials included the old Domain Controller. During the second logon, a new Domain Controller is discovered based in the AD site.
To the lab!
Because I had to do the same in other branch offices, I searched for a solution. I used a couple of VMs to create a similar situation.
- 2 sites
- Each site has its own port group
- 1 IP subnet per site
- Router VM routes traffic between IP subnets
- Subnets were assigned to AD sites
- 1 DC per AD site
- Client gets IP address from a DHCP in the site
- Client is moved from one site to another site by switching port group
- Client uses “last” logonserver (from the old site)
- After the second logon, the site-local DC is chosen
- nltest shows the correct DC for the site
- If usage of cached credentials is disabled, client uses the site-local DC at every logon
I took some screenshots to clarify this. Logon as [email protected] on client2 in site 2.
I powered off the VM, switched the port group and powered on the VM. I logged on as [email protected] on client2 in site 1. You can see, that the client still uses DC2 as logon server. The client got an IP from site 1.
Logout and logon as [email protected] on client2 in site 1. Now the client uses DC als logon server.
I tried this several times. The behaviour was always the same. Then I disabled cached credentials using a GPO (“Interactive logon: Number of previous logons to cache” set to 0). Now the client always chose the site-local DC on the first attempt.
I don’t know if this is willed behavior. It’s reproducible and I don’t think that this is the result of a misconfiguration or a bug. If you demote a Domain Controller in a branch office, and it’s the only Domain Controller, the clients will try to reach it on the next logon. If the Domain Controller is still available, maybe because you moved it to another site, everything’s fine. But if it’s gone, you will get in trouble due to cached credentials.
In my case, the customer and I decided to assign all used IP subnets from the branch offices to the HQ AD site. Even if the branch offices still have a Domain Controller, the clients now chose the Domain Controller from the HQ. The Domain Controller in the branch offices now acts as only as DHCP and DNS until they are demoted.
- Moving a small on-prem environment to Azure/ O365 – Part 1 - October 18, 2020
- Virtually reseated: Reset blade in a HPE C7000 enclosure - July 19, 2020
- Update Manager fails with unknown error during host remediation - July 19, 2020