Category Archives: Security

Exam prep & experience: Citrix NetScaler Advanced Topics: Security, Management, and Optimization (1Y0-340)

In May 2018, Citrix released their new Citrix Certified Expert – Networking certification, which completet the networking certification path at the upper end (blog post on training.citrix.com). The track starts with the Associate (CCA-N), the lower-level certification is a requirement for achieving the higher-level certification, continues with the Professional (CCP-N), and ends with the Expert (CCE-N) certification. This is pretty cool, and I’m very happy that Citrix now offers the CCE-N, because the expert-level certification was missing all the time.

kmicican/ pixabay.com/ Creative Commons CC0

Everything is cool… except you have passed exam 1Y0-351 to gain your CCP-N. In this case, you have to pass 1Y0-340 until Dec 31 2018. Otherwise you have to start with the CCA-N, after the validity period of your CCP-N is over (3y after passing the exam).

Bad move, Citrix, bad move. I’m really disappointed. I passed 1Y0-351 in Nov 2017, and now, 12 months later, I have to book, pay, and pass 1Y0-340 if I not want to start with a CCA-N in Nov 2020. Bad move, Citrix, bad move!

Exam 1Y0-340 is titled as “Citrix NetScaler Advanced Topics: Security, Management, and Optimization”, where as 1Y0-351 was titeld as “Citrix NetScaler 10.5 Essentials and Networking”. You can assume that more in-depth knowledge is needed to pass the exam, as it was necessary for 1Y0-351. Note the “Advanced Topics” in the exam title.

But what are these “advanced topics”?  According to the exam prep guide, the perfect candidate for the 1Y0-340 exam can deploy and/or manage

  • Citrix NetScaler Application Firewall (AppFirewall) to secure application access in a Citrix NetScaler 12 environment, as well as
  • NetScaler Management and Analytics System (NMAS) to administer a Citrix NetScaler environment, or
  • Optimize NetScaler-managed application delivery traffic

Citrix NetScaler Application Firewall (AppFirewall)

You should take an in-depth look at these topics:

  • Application Firewall Overview
  • Application Firewall Profiles and Policies
  • Regular Expression
  • Attacks and Protections
  • Monitoring and Troubleshooting
  • Security and Filtering

NetScaler Management and Analytics System (NMAS)

  • NetScaler MAS: Introduction and Configuration
  • Managing and Monitoring NetScaler Instances
  • Managing NetScaler Configurations
  • NetScaler Web Logging

Optimize NetScaler-managed application delivery traffic

  • Integrated Caching
  • Front-End Optimization
  • Tuning and Optimizations

How to prep?

The exam prep guide referres to the NetScaler documentation, as also to training material. Unfortunately I don’t have access to the newer training material, only to the training material from my CNS-220 course. But hey: At least we have tons of publically available NetScaler 12.0 documentation available!

The exam prep guide has a section in which Citrix outlines sections, objectives and references. You will find links to the NetScaler 12.0 documentation, as well as knowledge base articles, or blog posts. Go through it. Read it carefully!

The exam prep guide also outlines the section titles and weights. Two areas stand out:

  • Section 4: Attacks and Protections, and
  • Section 8: Managing and Monitoring NetScaler Instances

The section weights are directly map to the number of questions in the exam. If the exam has 60 questions, and section 4 has a weight of 21%, at least 12 questions will relate to “Attacks and Protections”.

How did it go?

First things first: I passed with a good score. The exam had 62 questions and I needed at least 62% to pass the exam. I passed with 82%. As a non-native English speaker that took the exam in a country where english is a foreign language, I got 30 minutes extra, resulting in 120 minutes for 62 questions. Plenty of time…

What should I say? It was a multiple choice test. Read the questions carefully. The exam guide did not lie to me. It came pretty close to the topics that were described in the guide. For most questions, my first “educated guess” was right. Sometimes, the least dumb answer seemed to be correct. ;)

It was a bit frustrating that Citrix has changed product names. NetScaler is no “Application Delivery Controller”, MAS is now known as “Citrix Application Delivery Management”. There was a button which showed a mapping table “old name – new name”.

If you are experienced with Citrix ADC deployments and configuration, I think the exam prep guide is enough to pass the exam.

Good luck!

Using Let’s Encrypt DNS-01 challenge validation with local BIND instance

I’m using Let’s Encrypt certificates for a while now. In the past, I used the standalone plugin (TLS-SNI-01) to get or renew my certificates. But now I switched to the DNS plugin. I run my own name servers with BIND, so it was a very low hanging fruit to get this plugin to work.

Clker-Free-Vector-Images/ pixabay.com/ Creative Commons CC0

To get or renew a certificate, you need to provide some kind of proof that you are requesting the certificate for a domain that is under your control. No certificate authority (CA) wants to be the CA, that hands you out a certificate for google.com or amazon.com…

The DNS-01 challenge uses TXT records in order to validate your ownership over a certain domain. During the challenge, the Automatic Certificate Management Environment (ACME) server of Let’s Encrypt will give you a value that uniquely identifies the challenge. This value has to be added with a TXT record to the zone of the domain for which you are requesting a certificate. The record will look like this:

This record is for a wildcard certificate. If you want to get a certificate for a host, you can add one or more TXT records like this:

There is a IETF draft about the ACME protocol. Pretty interesting read!

Configure BIND for DNS-01 challenges

I run my own name servers with BIND on FreeBSD. The plugin for certbot automates the whole DNS-01 challenge process by creating, and subsequently removing, the necessary TXT records from the zone file using RFC 2136 dynamic updates.

First of all, we need a new TSIG (Transaction SIGnature) key. This key is used to authorize the updates.

This key has to be added to the named.conf. The key is in the .key file.

The key is used to authroize the update of certain records. To allow the update of TXT records, which are needed for the challenge, add this to the zone part of you named.con.

The records start always with _acme-challenge.domainname.

Now you need to create a config file for the RFC2136 plugin. This file also includes the key, but also the IP of the name server. If the name server is running on the same server as the DNS-01 challenge, you can use 127.0.0.1 as name server address.

Now we have everything in place. This is a --dry-run  from on of my FreeBSD machines.

This is a snippet from the name server log file at the time of the challenge.

You might need to modify the permissons for the directory which contains the zone files. Usually the name server is not running as root. In my case, I had to grant write permissions for the “bind” group. Otherwise you might get “permission denied”.

 

CloudFlare API v4 and Fail2ban: Fixing the unban action

In January 2017, I wrote an article about how to protect your WordPress blog using the WP Fail2Ban plugin, fail2ban on your Linux/ FreeBSD host, and CloudFlare. Back then, the fail2ban was using the CloudFlare API V1, which was already deprecated since November 2016.

Free-Photos/ pixabay.com/ Creative Commons CC0

Although the actions were updated later to use CloudFlare API V4, I still had problems with the unbaning of IP addresses. IP addresses were banned, but the unban action failed. 

This is the unban action, which is included in fail2ban (taken from fail2ban-0.10.3.1 which is shipped with FreeBSD 11.1-RELEASE-p10):

And this is the unban action, which finally solved this issue:

I found the solution at serverfault.com. The only difference is an additional tr -d '\n'  in the last line of the statement. Kudos to Jake for fixing this!

To prevent the action file to being overwritten, you should copy the original cloudflare.conf  located in the  action.d  directory, e.g. to mycloudflare.conf , and use the copied action file in your fail definition.

Windows Network Policy Server (NPS) server won’t log failed login attempts

This is just a short, but interesting blog post. When you have to troubleshoot authentication failures in a network that uses Windows Network Policy Server (NPS), the Windows event log is absolutely indispensable. The event log offers everything you need. The success and failure event log entries include all necessary information to get you back on track. If failure events would be logged…

geralt/ pixabay.com/ Creative Commons CC0

Today, I was playing with Alcatel-Lucent Enterprise OmniSwitches and Access Guardian in my lab. Access Guardian refers to the some OmniSwitch security functions that work together to provide a dynamic, proactive network security solution:

  • Universal Network Profile (UNP)
  • Authentication, Authorization, and Accounting (AAA)
  • Bring Your Own Device (BYOD)
  • Captive Portal
  • Quarantine Manager and Remediation (QMR)

I have planned to publish some blog posts about Access Guardian in the future, because it is a pretty interesting topic. So stay tuned. :)

802.1x was no big deal, mac-based authentication failed. Okay, let’s take a look into the event log of the NPS… okay, there are the success events for my 802.1x authentication… but where are the failed login attempts? Not a single one was logged. A short Google search showed me the right direction.

Failed logon/ logoff events were not logged

In this case, the NPS role was installed on a Windows Server 2016 domain controller. And it was a german installation, so the output of the commands is also in german. If you have an OS installed in english, you must replace “Netzwerkrichtlinienserver” with “Network Policy Server”.

Right-click the PowerShell Icon and open it as Administrator. Check the current settings:

As you can see, only successful logon and logoff events were logged.

The option /success:enable /failure:enable activeates the logging of successful and failed logon and logoff attempts.

Replace SSL certificates on Citrix NetScaler using the CLI

Sometimes you have to replace SSL certificates instead of updating them, e.g. if you switch from a web server SSL certificate to a wildcard certificate. The latter was my job today. In my case, the SSL certificate was used in a Microsoft Exchange 2016 deployment, and the NetScaler configuration was using multiple virtual servers. I’m using this little script for my NetScaler/ Exchange deployments.

skylarvision/ pixabay.com/ Creative Commons CC0

When using multiple virtual servers, replacing a SSL certificate using the GUI can be challenging, because you have to navigate multiple sites, click here, click there etc. Using the CLI, the same task is much easier und faster. I like the Lean mindset, so I’m trying to avoid “waste”, in this case, “waste of time”.

Update or replace?

There is a difference between updating or replacing of certificates. When using the same CSR and key as for the expired certificate, you can update the certificate. If you use a new certificate/ key pair, you have to replace it. Replacing a certificate  includes the unbinding of the old, and binding the new certificate.

Replacing a certificate

The new certificate usually comes as a PFX (PKCS#12) file. After importing it, you have to install (create) a new certificate/ key pair.

Do yourself a favor and add the expiration date to the name of the certificate/ key pair.

Now you can unbind the old, and bind the new certificate. Please note, that this causes a short outage of your service!

SSL Cert Unbind Causing NetScaler Crash

You should check what NetScaler software release you are running. There is a bug, which is fixed in 12.0 build 57.X, which causes the NetScaler appliance to crash if a SSL certificate is unbound and a SSL transaction is running. Check CTX230965 for more details.

Bypass stateful firewall on a Sophos XG

Usually, bypassing a firewall is not the best idea. But sometimes you have to. One case, where you want to bypass a firewall, is asymmetric routing.

MichaelGaida/ pixabay.com/ Creative Commons CC0

What is asymmetric routing? Imagine a scenario with two routers on the same network. One router offeres access to the internet, the other router provides access to other sites with site-2-site VPN tunnels.

Asymmetric Routing

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Host 1 uses R1 as default gateway. R1 has static routes configured to the networks reachable over the VPN, or it has learned them dynamically using a routing protocol from R2. A packet from host 1 arrives at R1, is routed to R2, and is sent over the VPN tunnel. The answer to this packet arrives at R2, and is sent directly to host 1, because host 1 is the destination. This works because R2 and host 1 are on the same network. This is asymmetric routing, because request and answer go different ways.

In case of routing, this is not a problem. But if R1 is a firewall, this firewall might be stubborn, because it does not see the whole traffic.

Bypass the stateful firewall

I recently had such a setup due to some technical debts. The firewall dropped that “Invalid Traffic”. Fortunately, there is a way to bypass the statefull firewall. You can create advanced firewall rules using the CLI. There is no way to create these rules using the GUI. And this only applies to the Sophos XG (former Cyberoam products).

Login to the device console and select option 4. Then enter on the console the following commands, one per destination:

Make sure that you have a static or dynamically learned route to the networks. This is not a routing entry, it only tells the firewall what traffic should bypass the stateful firewall.

DOT1X authentication failed on HPE OfficeConnect 1920 switches

The last two days, I have supported a customer during the implementation of 802.1x. His network consisted of HPE/ Aruba and some HPE Comware switches. Two RADIUS server with appropriate policies was already in place. The configuration and test with the ProVision based switches was pretty simple. The Comware based switches, in this case OfficeConnect 1920, made me more headache.

blickpixel/ pixabay.com/ Creative Commons CC0

The customer had already mac authentication running, so all I had to do, was to enable 802.1x on the desired ports of the OfficeConnect 1920. The laptop, which I used to test the connection, was already configured and worked flawless if I plugged it into a 802.1x enabled port on a ProVision based switch. The OfficeConnect 1920 simply wrote a failure to its log and the authentication failed. The RADIUS server does not logged any failure, so I was quite sure, that the switch caused the problem.

After double-checking all settings using the web interface of the switch, I used the CLI to check some more settings. Unfortunately, the OfficeConnect 1920 is a smart-managed switch and provides only a very, very limited CLI. Fortunately, there is a developer access, enabling the full Comware CLI. You can enable the full CLI by entering

after logging into the limited CLI. You can find the password using your favorite internet search engine. ;)

Solution

While poking around in the CLI, I stumbled over this option, which is entered in the interface context:

RADIUS is the authentication domain, which was used on this switch. The command specifies, that the authentication domain RADIUS has to be for 802.1x authentication requests. Otherwise the switch would use the default authentication domain SYSTEM, which causes, that the switch tries to authenticate the user against the local user database.

I have not found any way to specify this setting using the web GUI! If you know how, of if you can provide additional information about this “issue”, please leave a comment.

Demystifying “Interfaces on which heartbeats are not seen”

By accident, I found a heartbeat/ VLAN issue on a NetScaler cluster at one of my customers. The NetScaler ADC appliances have three interfaces connected to a switch stack. Two of the three interfaces were configured as a channel (LAG). This is a snippet from the config:

On the switch stack, the port to which interface 1/3 is connected, is configured as an access port. The ports, to which the channel is connected, is configured as a trunk port with some permitted VLANs. The customer is using HPE Comware based switches. The terminology is the same for Cisco. If you use HPE ProVision or Alcatel Lucent Enterprise, translate “access” to “untagged” and “trunk” to “tagged”. Because the channel is configured as a trunk port on the switch, the tagall option was set.

Issue

While examining the output of  show ha node I saw this:

Because interface 1/3 was not affected, this had to be a VLAN issue. During the initial troubleshooting, I was able to discover heartbeat packets in VLAN 1 and in VLAN 10.

Solution

The solution was easy: Remove the tagged option for VLAN 10 on LA/1.

instead of

Because of the configured tagall  option, all packets sourced by LA/1 are tagged with the corrosponding VLAN ID. But because it’s now explicitly configured without a tag for VLAN 10, VLAN 10 is now also the native VLAN for LA/1.

Now the NetScaler was sending heartbeat packets with a tag for VLAN 10, and the issue was solved.

Explanation

Heartbeat packets are always send without a VLAN tag (untagged). There are two exceptions:

  • The NSVLAN is configured with a specific VLAN ID, or
  • an interface used for hearbeats is configured with the tagall

In this case, the heartbeat packets are tagged with the ID of the native VLAN ID of the interface. A show interface of the channel showed, that the channel was using VLAN 1 as the native VLAN.

How does the NetScaler determine the native VLAN for an interface? The native VLAN is the VLAN, to which an interface is bound untagged. An interface can only be bound untagged to a single VLAN. But it can be bound tagged to multiple VLANs.

If you take a look at the config snippet at the top of this blog post, you might notice, that interface 1/3 is bound untagged to VLAN 10. So this is the native VLAN for interface 1/3. But this interface is not using the tagall  option. Therefore, heartbeat packets are not tagged. The channel LA/1 is bound tagged to VLAN 10. But it was also bound to VLAN 1, without the tagged  option. This caused, that VLAN 1 was used as the native VLAN for channel LA/1. And because LA/1 is configured with the tagall  option, the heartbeats were tagged with a tag for VLAN 1. That’s why I was able to see the heartbeats, that were send over channel LA/1, in VLAN 1.

In the end, the NetScaler appliances were sending heartbearts from interface 1/3 to VLAN 10, and from channel LA/1 to VLAN 1. This caused the message “Interfaces on which heartbeats are not seen: LA/1”.

Security: If it doesn’t hurt, you’re doing it wrong!

The Informationsverbund Berlin-Bonn (IVBB), the secure network of the german government , was breached by an unknown hacker group. Okay, a secure government network might be a worthy target for an attack, but your network not, right? Do you use the same password for multiple accounts? There were multiple massive data breaches in the past. Have you ever checked if your data were also compromised? I can recommend haveibeenpwned.com. If you want to have some fun, scan GitHub for -----BEGIN RSA PRIVATE KEY-----. Do you use a full disk encryption on your laptop or PC? Do you sign and/ or encrypt emails using S/MIME or PGP? Do you use different passwords for different services? Do you use 2FA/ MFA to secury importan services? Do you never work with admin privileges when doing normal office tasks? No? Why? Because it’s uncomfortable to do it right, isn’t it?

My focus is on infrastructure, and I’m trying to educate my customers that hey have to take care about security. It’s not the missing dedicated management network, or the usage of self signed certificates that makes an infrastructre unsecure. Mostly it’s the missing user management, the same password for different admin users, doing office work with admin privileges, or missing security patches because of “never touch a running system”, or “don’t ruin my uptime”. I don’t khow how often I heard the story of ransomware attacks, that were caused by admins opening email attachments with admin privileges…

My theory

Security must approach infinitely near the point, where it becomes unusable.

Correlation Between Security and Comfort

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Security is nothing you can take care about later. It has to be part of the design. It has to be part of the processes. Most security incidents doesn’t happen because of 0-day exploits. It’s because of default passwords for admin accounts, missing security patches, and because of lazy admins or developers.

Don’t be lazy. Do it right. Even if it’s uncomfortable.

NetScaler native OTP does not work for users with many group memberships

Some days ago, I have implemented one-time passwords (OTP) for NetScaler Gateway for one of my customers. This feature was added with NetScaler 12, and it’s a great way to secure NetScaler Gateway with a native NetScaler feature. Native OTP does not need any third party servers. But you need a NetScaler Enterprise license, because nFactor Authentication is a requirement.

To setup NetScaler native OTP, I followed the availbe guides on the internet.

The setup is pretty straightforward. But I used the AD extensionAttribute15  instead of userParameters, because my customer already used userParameters  for something else. Because of this, I had to change the search filter from  userParameters>=#@  to extensionAttribute15>=#@ .

Everything worked as expected… except for some users, that could not register their devices properly. They were able to register their device, but a test of the OTP failed. After logoff and logon, the registered device were not available anymore. But the device was added to the extensionAttribute. While I was watching the nsvpn.log with tail -f , I discovered that the built group string for $USERNAME  seemed to be cut off (receive_ldap_user_search_event). My first guess was, that the user has too many group memberships, and indeed, the users was member for > 50 groups. So I copied the user, and the copied user had the same problem. I removed the copied user from some groups, and at some point the test of the OTP worked (on the /manageotp website).

With this information, I quickly stumbled over this thread: netscaler OTP not woring for certain users. This was EXACTLY what I discovered. The advised solution was to change the “Group Attribute” from memberOf  to userParameter , or in my case, extensionAttribute15. Problem solved!