Tag Archives: active directory

NetScaler native OTP does not work for users with many group memberships

Some days ago, I have implemented one-time passwords (OTP) for NetScaler Gateway for one of my customers. This feature was added with NetScaler 12, and it’s a great way to secure NetScaler Gateway with a native NetScaler feature. Native OTP does not need any third party servers. But you need a NetScaler Enterprise license, because nFactor Authentication is a requirement.

To setup NetScaler native OTP, I followed the availbe guides on the internet.

The setup is pretty straightforward. But I used the AD extensionAttribute15  instead of userParameters, because my customer already used userParameters  for something else. Because of this, I had to change the search filter from  userParameters>=#@  to extensionAttribute15>=#@ .

Everything worked as expected… except for some users, that could not register their devices properly. They were able to register their device, but a test of the OTP failed. After logoff and logon, the registered device were not available anymore. But the device was added to the extensionAttribute. While I was watching the nsvpn.log with tail -f , I discovered that the built group string for $USERNAME  seemed to be cut off (receive_ldap_user_search_event). My first guess was, that the user has too many group memberships, and indeed, the users was member for > 50 groups. So I copied the user, and the copied user had the same problem. I removed the copied user from some groups, and at some point the test of the OTP worked (on the /manageotp website).

With this information, I quickly stumbled over this thread: netscaler OTP not woring for certain users. This was EXACTLY what I discovered. The advised solution was to change the “Group Attribute” from memberOf  to userParameter , or in my case, extensionAttribute15. Problem solved!

Logon problems after demoting a branch office Domain Controller

A customer of mine is currently refreshing his branch office server infrastructure. A part of this project is to demote the Active Directory Domain Controllers, that are currently running in each branch office. The customer has multiple branch offices and each branch office has an Active Directory Domain Controller which is acting as file-/ print- and DHCP server. Each branch office has its own Active Directory site. The Domain Controller and the used IP subnets are assigned to the corresponding AD site. Due to this configuration the clients at the branch office choose the site-local Domain Controller as logon server. This works totally flawless since a couple of years. Over the year bandwidth of site connection has increased and even small branch offices have a redundant MPLS connection to the HQ. And no one likes single domain AD forests with 20 or more Domain Controllers…

j4mib

After demoting the Domain Controller in the first branch office I visited, my colleague and I discovered an interesting behaviour: The removal of the Domain Controller was flawless. Everything went fine. But when we tried to logon at a client, we got no GPOs and no network drives mapped. The name resolution was fine, so this was not our problem. I checked the content of the % LOGONSERVER% variable and discovered that it contained the name of the (now) demoted Domain Controller. After another logout and login, everything was fine. The client had chosen a new domain controller, now from the HQ AD site. This was correct and an expected behaviour. The branch office IP subnets were changed at the same time and the new IP subnets were assigned to the HQ AD site days before we demoted the DC.

Assumption: The client (Windows 7 Enterprise) used cached credentials to logon. These credentials included the old Domain Controller. During the second logon, a new Domain Controller is discovered based in the AD site.

To the lab!

Because I had to do the same in other branch offices, I searched for a solution. I used a couple of VMs to create a similar situation.

  • 2 sites
  • Each site has its own port group
  • 1 IP subnet per site
  • Router VM routes traffic between IP subnets
  • Subnets were assigned to AD sites
  • 1 DC per AD site
  • Client gets IP address from a DHCP in the site
  • Client is moved from one site to another site by switching port group
  • Client uses “last” logonserver (from the old site)
  • After the second logon, the site-local DC is chosen
  • nltest shows the correct DC for the site
  • If usage of cached credentials is disabled, client uses the site-local DC at every logon

I took some screenshots to clarify this. Logon as [email protected] on client2 in site 2.

client_ad_site_logon_server_01

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

I powered off the VM, switched the port group and powered on the VM. I logged on as [email protected] on client2 in site 1. You can see, that the client still uses DC2 as logon server. The client got an IP from site 1.

client_ad_site_logon_server_02

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Logout and logon as [email protected] on client2 in site 1. Now the client uses DC als logon server.

client_ad_site_logon_server_03

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

I tried this several times. The behaviour was always the same. Then I disabled cached credentials using a GPO (“Interactive logon: Number of previous logons to cache” set to 0). Now the client always chose the site-local DC on the first attempt.

Solution?

I don’t know if this is willed behavior. It’s reproducible and I don’t think that this is the result of a misconfiguration or a bug. If you demote a Domain Controller in a branch office, and it’s the only Domain Controller, the clients will try to reach it on the next logon. If the Domain Controller is still available, maybe because you moved it to another site, everything’s fine. But if it’s gone, you will get in trouble due to cached credentials.

In my case, the customer and I decided to assign all used IP subnets from the branch offices to the HQ AD site. Even if the branch offices still have a Domain Controller, the clients now chose the Domain Controller from the HQ. The Domain Controller in the branch offices now acts as only as DHCP and DNS until they are demoted.

STOP c00002e2 after changing SCSI Controller to PVSCSI

Today I changed the SCSI controller type for my Windows VMs in my lab from LSI SAS to PVSCSI. Because the VMs were installed with LSI SAS, I used the procedure described in VMware KB1010398 (Configuring disks to use VMware Paravirtual SCSI (PVSCSI) adapters) to change the SCSI controller type. The main problem is, that Windows doesn’t have a driver for the PVSCSI installed. You can force the installation of the driver using this procedure (taken from KB1010398):

  1. Power off the virtual machine.
  2. Create a new temporary 1GB disk(SCSI 1:0) and assign a new SCSI controller (default is LSI LOGIC SAS).
  3. Change the new SCSI controller to PVSCSI for the new SCSI controller.
  4. Click Change Type.
  5. Click VMware Paravirtual and click OK.
  6. Click OK to exit the Virtual Machine Properties dialog.
  7. Power on the virtual machine.
  8. Verify the new disk was found and is visible in Disk Management. This confirms the PVSCSI driver is now installed.
  9. Power off the virtual machine.
  10. Delete the temporary 1GB vmdk disk and associated controller (SCSI 1:0).
  11. Change the original SCSI controller(SCSI 0:X) to PVSCSI as detailed in Steps 3 to 5.
  12. Power on the virtual machine.

Please note, that this change is not a supported method to change the controller type. Usually you should install a server with disks already attached to a PVSCSI controller.

The problem

I changed the controller type for a couple of VMs using the above described method. This worked really fine until I changed the controller type of my Domain Controller. The DC failed to boot with the new controller. I changed the controller type back to LSI SAS and the VM started without problems. A change in the type of controller led to another BSoD (Blue Screen of Death). But it was not the usual STOP 0x0000007B.

bsod_after_pvscsi_change

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

STOP: c00002e2 Directory Services could not start because of the following error:

A device attached to the system is not functioning.

Error Status: 0xc000000f.
Please shutdown this system and reboot into Directory Services Restore Mode, check the event log for more detailed information.

Something new… I booted into Directory Services Restore Mode (DSRM) and logged in with my directory services restore mode password (I hope you remember your password…).

boot_into_dsrm

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

I checked the disks and found my SYSVOL volume offline.

offline_sysvol_disk

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

I usually let DCPROMO install the NTDS and SYSVOL onto a separate drive instead using the system volume. In my case this drive was offline! This causes that Windows failed to start.

The solution

The solution was simple: Bring the volume online and reboot the VM.

online_sysvol

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Windows started without problems and the AD was also okay.

Final words

I think it’s clear, that you can’t replace the controller type without additional steps. And it should also be clear, that changing the controller type isn’t a good idea until you’re really know what you’re doing. I also think that it’s unsupported to change the controller type (but I haven’t found a statement if it’s supported or not). I did this a couple of times and I never had problem with the above described procedure. In this case, the problem is related to the controller type change. The cause for the BSoD was the offline NTDS/ SYSVOL volume. Bringing the volume online in DSRM solved the problem.

Active Directory property ‎’homeMDB‎’ is not writeable on recipient

During an Exchange 2013 migration project the  first attempt to migrate a mailbox failed with the following error:

The error message clearly stated, that this was a permission issue. A quick search pointed me to the right direction. I found a thread in the TechNet forums, in which the same error message were discussed. This error occurs, if the Exchange Trusted Subsystem group is missing in the ACL of the user object. This group contains the exchange server and it’s usually inherited from the domain object to all child containers and objects. I checked the ACL of the user and the Exchange Trusted Subsystem group was missing in the ACL. This was caused by the disabled permissions inheritance. An object ACL with disabled permissions inheritance is sometimes called a protected ACL. Bill Long wrote a nice Power Shell script to search for object which have permissions inheritance disabled.

How to enable permissions inheritance?

Start the Active Directory Users and Computers MMC. Click “View” and enable “Advanced Features”. Otherwise the needed tabs will not shown.

permissions_inheritance_1

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Open the user that can’t be migrated. Switch to the “Security” tab. Click “Advanced” and enable the “Include inheritable permissions from this object’s parent” check box. Click “OK” and confirm popups regarding added ACL entries.

permissions_inheritance_2

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Check if the Exchange Trusted Subsystem group is now part of the ACL. If yes, try the migration of the mailbox again.

Why is  permissions inheritance disabled?

Good question, the customer asked me the same question. But the answer is easy. As above mentioned, is a ACL with disabled permission inheritance called a protected ACL. When should a ACL be protected? If the ACL of the associated object shouldn’t be changed, when a permission from a parent object is inherited. A good example for such an object is the Administrator account. Active Directory protects members of the following groups

  • Enterprise Admins
  • Schema Admins
  • Domain Admins
  • Administrators
  • Account Operators
  • Server Operators
  • Print Operators
  • Backup Operators
  • Cert Publishers

and the following users

  • Administrator
  • krbtgt

with three simple steps:

  1. Set the value of the AdminCount attribute to 1.
  2. Disable permission inheritance
  3. Apply permissions from the AdminSDHolder template

This process runs once an hour on a Active Directory Domain Controller and recovers the ACL of the two user accounts and the member accounts of the mentioned groups to a protected state and well-known permissions. What happens if you add an user account to the Domain Admins group? AdminCount is set to 1, the permission inheritance is disabled and the permissions specified in the AdminSDHolder template are applied. If you remove the user account from the Domain Admins group, the changes are not rolled back and so  permission inheritance stays disabled! Changes to parent ACLs are not applied to the user account and, in the case of my customer, the Exchange Trusted Subsystem group wasn’t added to the user objects ACL.

You can search for accounts with AdminCount=1 easily with the PowerShell:

Why is this a problem? With AdminCount=1 and disabled permission inheritance you will face a lot of problems:

  • Changes to an object ACLs are rolled back after an hour
  • Send-As doesn’t work
  • ActiveSync doesn’t work
  • Language settings in Outlook Web Access isn’t saved
  • ..and some other nasty effects

I have seen this error so often. Mostly at customers, that use normal work accounts for administrative tasks, or that add normal user accounts to admin groups during the setup of new profiles or clients. The solution is easy: Don’t do it!