Tag Archives: networking

Failed to connect to IKEv2 VPN using iPhone USB tethering

Usually I use the iPhone WiFi hotspot feature. But lately I had to switch to USB tethering, because I had to work a whole workday over the hotspot. USB tethering saves battery, and the connection was more reliable for me. Please note that you need to install iTunes to use USB tethering, because the necessary Ethernet driver is only available with iTunes. Without this driver, Windows won’t recognize the iPhone as an Ethernet connection.

While using USB tethering, I noticed that the IKEv2 VPN connection to my office wasn’t working. I use the native Windows 11 VPN client. The office runs a WatchGuard T80 firewall with TotalSecurity Subscription. Interestingly, the VPN connection worked fine over the WiFi hotspot. I double-checked with another IKEv2 connection to a customer, and it showed the same issue: it failed over USB tethering, but worked fine over the WiFi hotspot.

Troubleshooting

First things first: The traffic log showed some interesting facts. The connection attempt was recognized by the firewall.

2023-06-24 16:22:24  iked  (212.117.xx.yy<->91.41.xx.yy) The peer is behind NAT
2023-06-24 16:22:24  iked  (212.117.xx.yy<->91.41.xx.yy) The local is NOT behind NAT
2023-06-24 16:22:24  iked  (212.117.xx.yy<->91.41.xx.yy) Processed IKE_SA_INIT request message successfully
2023-06-24 16:22:24  iked  (212.117.xx.yy<->91.41.xx.yy) 'IKE_SA_INIT response' message created successfully. length:496
2023-06-24 16:22:24  iked  (212.117.xx.yy<->91.41.xx.yy) Sent out IKE_SA_INIT response message (msgId=0) from 212.117.xx.yy:500 to 91.41.xx.yy:64172 for 'WG Default IKEv2 Gateway' gateway endpoint successfully.


2023-06-24 16:22:43  iked  (212.117.xx.yy<->80.187.xx.yy) The peer is behind NAT
2023-06-24 16:22:43  iked  (212.117.xx.yy<->80.187.xx.yy) The local is NOT behind NAT
2023-06-24 16:22:43  iked  (212.117.xx.yy<->80.187.xx.yy) Processed IKE_SA_INIT request message successfully
2023-06-24 16:22:43  iked  (212.117.xx.yy<->80.187.xx.yy) 'IKE_SA_INIT response' message created successfully. length:496
2023-06-24 16:22:43  iked  (212.117.xx.yy<->80.187.xx.yy) Sent out IKE_SA_INIT response message (msgId=0) from 212.117.xx.yy:500 to 80.187.xx.yy:500 for 'WG Default IKEv2 Gateway' gateway endpoint successfully.

The upper connection attempt was successful. Note the port used for the destination IP in the IKE_SA_INIT response: 64172/udp. The lower attempt was made over USB tethering, and it wasn’t successful. In this case, the firewall sent its response to 500/udp.
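If you have to compare many connection attempts, pulling the peer port out of the log can be scripted. The following is only a minimal Python sketch: the regular expression is my own and is written against the two log lines shown above, and the function name `peer_endpoints` is hypothetical. A peer port other than 500 suggests that the source port was rewritten somewhere along the path.

```python
import re

# Pattern for the "Sent out IKE_SA_INIT response" lines shown above.
# It is an assumption based on these two log lines only.
SENT_RE = re.compile(
    r"Sent out IKE_SA_INIT response message .*?"
    r"from (?P<local>[^\s:]+):(?P<local_port>\d+) "
    r"to (?P<peer>[^\s:]+):(?P<peer_port>\d+)"
)


def peer_endpoints(log_text):
    """Return (peer address, peer UDP port) for every IKE_SA_INIT response."""
    return [
        (m.group("peer"), int(m.group("peer_port")))
        for m in SENT_RE.finditer(log_text)
    ]


# The two "Sent out" lines from the traffic log, without timestamps.
SAMPLE = (
    "Sent out IKE_SA_INIT response message (msgId=0) from 212.117.xx.yy:500 "
    "to 91.41.xx.yy:64172 for 'WG Default IKEv2 Gateway' gateway endpoint successfully.\n"
    "Sent out IKE_SA_INIT response message (msgId=0) from 212.117.xx.yy:500 "
    "to 80.187.xx.yy:500 for 'WG Default IKEv2 Gateway' gateway endpoint successfully.\n"
)

for peer, port in peer_endpoints(SAMPLE):
    print(peer, port)
```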

This is a Wireshark capture of the unsuccessful connection attempt.

This capture is from a successful attempt.

You will notice the difference after the IKE_AUTH MID=01 Initiator Request (frames 620 and 1248). The response of the firewall is not received by the client. This behavior is often caused by MTU problems. A quick Google search turned up evidence that USB tethering might behave differently from the WiFi hotspot.
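A common way to check for MTU problems is to send pings with the Don't Fragment bit set and to search for the largest payload that still gets an answer. The following Python sketch shows the idea under a few assumptions: the function names are hypothetical, the probe relies on the Windows `ping` options `-f` (Don't Fragment) and `-l` (payload size), and a reply is detected by looking for `TTL=` in the output. The path MTU is the largest working payload plus 28 bytes (20 bytes IPv4 header, 8 bytes ICMP header).

```python
import subprocess

IP_HEADER = 20    # IPv4 header without options, in bytes
ICMP_HEADER = 8   # ICMP echo header, in bytes


def largest_unfragmented_payload(probe, low=0, high=1472):
    """Binary-search the largest payload size for which probe(size) is True.

    probe(size) must return True if an echo request with `size` payload
    bytes and the Don't Fragment bit set was answered.
    Returns -1 if not even the smallest payload gets through.
    """
    if not probe(low):
        return -1
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # mid works, look for something larger
        else:
            high = mid - 1   # mid is too large
    return low


def path_mtu(payload):
    """Convert an ICMP payload size into the corresponding IPv4 MTU."""
    return payload + IP_HEADER + ICMP_HEADER


def windows_ping_probe(host):
    """Build a probe around the Windows ping command (assumption: Windows)."""
    def probe(size):
        result = subprocess.run(
            ["ping", "-n", "1", "-w", "2000", "-f", "-l", str(size), host],
            capture_output=True,
            errors="replace",  # console output may not match the locale encoding
        )
        return "TTL=" in result.stdout
    return probe
```

On a Windows client, something like `path_mtu(largest_unfragmented_payload(windows_ping_probe("vpn.example.com")))` would run the search. The host name is a placeholder, and the target has to answer ICMP echo requests for this to work.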

Solution

Connect your iPhone using a USB-to-Lightning cable. A new Ethernet device should come up. Open an elevated command prompt or PowerShell and use the following commands to adjust the MTU of the Apple Ethernet device.

PS C:\Users\adm-terlisten> netsh interface ipv4 show interfaces

Idx     Met         MTU          State                Name
---  ----------  ----------  ------------  ---------------------------
  1          75  4294967295  connected     Loopback Pseudo-Interface 1
 10          50        1500  disconnected  WLAN
 19          25        1500  disconnected  LAN-Verbindung* 1
  9          25        1500  disconnected  LAN-Verbindung* 2
 11          25        1500  connected     Ethernet 2
  2          65        1500  disconnected  Bluetooth-Netzwerkverbindung
 20          35        1432  connected     iPhone Hotspot

PS C:\Users\adm-terlisten> netsh interface ipv4 set subinterface "iPhone Hotspot" mtu=1472 store=persistent
OK.

I renamed my Apple Ethernet device. In your case, it could be something like “Ethernet 4” or similar. That’s it. Enjoy your VPN connection.

Fun fact: Cisco AnyConnect with an IKEv2 connection had no problem at any time, regardless of whether I used the WiFi hotspot or USB tethering. I encountered the problem only with the native Windows VPN client.

Windows NPS – Authentication failed with error code 16

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

Today, a customer called me and reported what was, at first sight, a pretty weird error: only Windows clients were unable to log in to a WPA2-Enterprise wireless network. The setup itself was pretty simple: Cisco Meraki WiFi access points, a Windows Network Policy Server (NPS) on a Windows Server 2016 domain controller, and a Sophos SG 125 acting as DHCP server for the different WiFi networks.

Windows clients failed to authenticate, but Apple iOS, Android, and even Windows 10 Tablets had no problem.

The following error was logged in the Windows Security event log.

Authentication Details:
Connection Request Policy Name: Use Windows authentication for all users
Network Policy Name: Wireless Users
Authentication Provider: Windows
Authentication Server: domaincontroller.domain.tld
Authentication Type: PEAP
EAP Type: -
Account Session Identifier: -
Logging Results: Accounting information was written to the local log file.
Reason Code: 16
Reason: Authentication failed due to a user credentials mismatch. Either the user name provided does not map to an existing user account or the password was incorrect.

The credentials were definitely correct. The customer and I tried different user and password combinations.

I also checked the NPS network policy. When you choose PEAP as the authentication type, the NPS needs a valid server certificate. This is necessary because the EAP session is protected by a TLS tunnel. A valid certificate was in place, in this case a wildcard certificate. A second certificate was also available: a certificate for the domain controller, issued by the internal enterprise CA.

It was an educated guess, but I disabled the server certificate check for the WPA2-Enterprise connection, and the client was able to log in to the WiFi. This clearly showed that the certificate was the problem. But it was valid, all necessary CA certificates were in place, and there was no obvious reason why the certificate should cause the failure.

The customer told me that they had installed updates on Friday (today is Monday) and rebooted the domain controller. This also restarted the NPS service, and after this restart, the wildcard certificate was used for client connections.

I switched to the domain controller certificate, restarted the NPS, and all Windows clients were again able to connect to the WiFi.

Lessons learned

Try to avoid wildcard certificates, or at least check which certificate the NPS uses if you get an authentication error with reason code 16.

EAPoL forwarding on NEC VoIP phones

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

A customer is running their PCs behind their VoIP phones. Nothing unusual, most VoIP phones I know have an embedded ethernet switch, so that you only need one cable to connect PC and VoIP phone to your network.

As part of a network security project, my colleague and I implemented IEEE 802.1X port-based network access control in one of our customers’ networks. The setup consists of multiple Alcatel-Lucent Enterprise OmniSwitches (6450-P10 and 6860/E) and Aruba ClearPass.

We noticed that MAC address based authentication worked all the time, but 802.1x failed constantly if the client was connected to a VoIP phone (NEC DT700). The phones do not do any port authentication. We use a device classification rule and User Network Profiles to get them into their correct VLAN. But the connected PCs should do 802.1x-based port authentication.

Wireshark FTW!

We used Wireshark to take a look at the communication. We created a packet trace on a client behind a VoIP phone, and we mirrored the traffic of the port, to which the phone was connected. Our assumption was that the VoIP phones drop the EAP packets from the connected PC.

This is a packet trace from my ThinkPad X250 which was connected to the phone.

Patrick Terlisten/ vcloudnine.de/ Creative Commons CC0

You can see the repeating “Request, Identity” from the switch, and the answer from my laptop (Response, Identity). The destination of the response is a multicast MAC address. But this frame was not captured behind the VoIP phone! It was missing. In the packet trace that was created by mirroring the switch port to which the phone was connected, the “Request, Identity” was seen, but not the “Response, Identity”. The phone was dropping the EAP packets of my laptop!
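Some background that is not part of the traces above: EAPoL frames use EtherType 0x888E, and a supplicant usually sends them to the PAE group address 01:80:C2:00:00:03. This address belongs to a reserved range that standard bridges do not forward, which is a plausible reason why an embedded phone switch drops these frames unless it is told otherwise. The following Python sketch shows how such frames can be recognized in raw Ethernet data. The function names and the sample frame are made up for illustration.

```python
PAE_GROUP_ADDRESS = bytes.fromhex("0180c2000003")  # 802.1X PAE group address
ETHERTYPE_EAPOL = 0x888E
ETHERTYPE_VLAN = 0x8100  # 802.1Q tag


def is_eapol(frame):
    """Return True if the Ethernet frame carries EAPoL (with or without 802.1Q tag)."""
    if len(frame) < 14:
        return False
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype == ETHERTYPE_VLAN and len(frame) >= 18:
        # Skip the 4-byte 802.1Q tag and read the inner EtherType.
        ethertype = int.from_bytes(frame[16:18], "big")
    return ethertype == ETHERTYPE_EAPOL


def sent_to_pae_group(frame):
    """Return True if the destination MAC is the PAE group address."""
    return frame[:6] == PAE_GROUP_ADDRESS


# Hypothetical EAPOL-Start frame: destination, source, EtherType, then
# protocol version 1, packet type 1 (Start), body length 0.
sample = (
    PAE_GROUP_ADDRESS
    + bytes.fromhex("020000000001")
    + ETHERTYPE_EAPOL.to_bytes(2, "big")
    + bytes.fromhex("01010000")
)

print(is_eapol(sample), sent_to_pae_group(sample))
```

In Wireshark, the display filter `eapol` does the same job when you compare the trace taken on the client with the one taken on the mirrored switch port.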

RTFM!

The customer called the company that maintains the phones. But they did not understand our problem, so they enabled 802.1x on the phones. We disabled it again immediately.

I decided to take a look into the manual of the NEC DT700 and I found a point called “EAPoL forwarding” in the advanced network settings. After enabling this setting, EAP started working instantly.

Patrick Terlisten/ vcloudnine.de/ Creative Commons CC0

This is again a packet trace from my laptop, taken while it was connected to a VoIP phone. As you can see, the last EAP packet is “Success”!

EAPoL forwarding did the trick. :)

Windows Network Policy Server (NPS) server won’t log failed login attempts

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

This is just a short, but interesting blog post. When you have to troubleshoot authentication failures in a network that uses Windows Network Policy Server (NPS), the Windows event log is absolutely indispensable. The event log offers everything you need. The success and failure event log entries include all necessary information to get you back on track. That is, if failure events are logged at all…

Today, I was playing with Alcatel-Lucent Enterprise OmniSwitches and Access Guardian in my lab. Access Guardian refers to a set of OmniSwitch security functions that work together to provide a dynamic, proactive network security solution:

  • Universal Network Profile (UNP)
  • Authentication, Authorization, and Accounting (AAA)
  • Bring Your Own Device (BYOD)
  • Captive Portal
  • Quarantine Manager and Remediation (QMR)

I have planned to publish some blog posts about Access Guardian in the future, because it is a pretty interesting topic. So stay tuned. :)

802.1x was no big deal, but MAC-based authentication failed. Okay, let’s take a look into the event log of the NPS… okay, there are the success events for my 802.1x authentication… but where are the failed login attempts? Not a single one was logged. A short Google search pointed me in the right direction.

Failed logon/ logoff events were not logged

In this case, the NPS role was installed on a Windows Server 2016 domain controller. It was a German installation, so the output of the commands is also in German. If you have an English OS installed, you must replace “Netzwerkrichtlinienserver” with “Network Policy Server”.

Right-click the PowerShell Icon and open it as Administrator. Check the current settings:

Windows PowerShell
Copyright (C) 2016 Microsoft Corporation. Alle Rechte vorbehalten.

PS C:\Windows\system32> auditpol /get /subcategory:"Netzwerkrichtlinienserver"
Systemüberwachungsrichtlinie
Kategorie/Unterkategorie Einstellung
An-/Abmeldung
Netzwerkrichtlinienserver Erfolg

As you can see, only successful logon and logoff events were logged.

PS C:\Windows\system32> auditpol /set /subcategory:"Netzwerkrichtlinienserver" /success:enable /failure:enable
Der Befehl wurde erfolgreich ausgeführt.
PS C:\Windows\system32> auditpol /get /subcategory:"Netzwerkrichtlinienserver"
Systemüberwachungsrichtlinie
Kategorie/Unterkategorie Einstellung
An-/Abmeldung
Netzwerkrichtlinienserver Erfolg und Fehler

The options /success:enable and /failure:enable activate the logging of successful and failed logon and logoff attempts.
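If you want to check this setting on several servers, the auditpol output can be evaluated with a few lines of Python. This is only a sketch under assumptions: the function names are hypothetical, the German values are taken from the output above, and the English values “Failure” and “Success and Failure” are what an English installation prints.

```python
# Words that indicate failure auditing in the "Setting" column.
# "Fehler" is taken from the German output above, "Failure" is the
# value on an English installation.
FAILURE_WORDS = ("Failure", "Fehler")


def subcategory_setting(output, subcategory):
    """Return the setting text for a subcategory, or None if it is not listed."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(subcategory):
            return line[len(subcategory):].strip()
    return None


def audits_failures(output, subcategory):
    """Return True if failed attempts are audited for the subcategory."""
    setting = subcategory_setting(output, subcategory)
    return setting is not None and any(word in setting for word in FAILURE_WORDS)


BEFORE = """Systemüberwachungsrichtlinie
Kategorie/Unterkategorie Einstellung
An-/Abmeldung
Netzwerkrichtlinienserver Erfolg
"""

AFTER = """Systemüberwachungsrichtlinie
Kategorie/Unterkategorie Einstellung
An-/Abmeldung
Netzwerkrichtlinienserver Erfolg und Fehler
"""

print(audits_failures(BEFORE, "Netzwerkrichtlinienserver"))
print(audits_failures(AFTER, "Netzwerkrichtlinienserver"))
```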

Bypass stateful firewall on a Sophos XG

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

Usually, bypassing a firewall is not the best idea. But sometimes you have to. One case, where you want to bypass a firewall, is asymmetric routing.

What is asymmetric routing? Imagine a scenario with two routers on the same network. One router offers access to the internet, the other router provides access to other sites over site-to-site VPN tunnels.

Asymmetric Routing

Patrick Terlisten/ www.vcloudnine.de/ Creative Commons CC0

Host 1 uses R1 as default gateway. R1 has static routes configured to the networks reachable over the VPN, or it has learned them dynamically using a routing protocol from R2. A packet from host 1 arrives at R1, is routed to R2, and is sent over the VPN tunnel. The answer to this packet arrives at R2, and is sent directly to host 1, because host 1 is the destination. This works because R2 and host 1 are on the same network. This is asymmetric routing, because request and answer go different ways.

For plain routing, this is not a problem. But if R1 is a stateful firewall, it might drop the traffic, because it sees only one direction of the connection.
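Why a stateful firewall stumbles over this can be shown with a heavily simplified model of TCP connection tracking. The Python sketch below is my own illustration and not how any specific product works internally: the firewall only accepts packets that fit the connection state it has seen so far. With asymmetric routing, the SYN-ACK never passes the firewall, so the following ACK from the host does not match an established connection.

```python
class StatefulFirewall:
    """Toy connection tracker: knows only SYN, SYN-ACK and ACK."""

    def __init__(self):
        # (client, server) -> "SYN_SENT" or "ESTABLISHED"
        self.connections = {}

    def forward(self, src, dst, flags):
        """Return True if the packet is forwarded, False if it is dropped."""
        if flags == "SYN":
            self.connections[(src, dst)] = "SYN_SENT"
            return True
        if flags == "SYN-ACK":
            # Only valid as the answer to a SYN that was seen before.
            if self.connections.get((dst, src)) == "SYN_SENT":
                self.connections[(dst, src)] = "ESTABLISHED"
                return True
            return False
        # Everything else needs an established connection in either direction.
        state = self.connections.get((src, dst)) or self.connections.get((dst, src))
        return state == "ESTABLISHED"


# Symmetric path: the firewall sees all three packets.
fw = StatefulFirewall()
symmetric = [
    fw.forward("host1", "remote", "SYN"),
    fw.forward("remote", "host1", "SYN-ACK"),
    fw.forward("host1", "remote", "ACK"),
]

# Asymmetric path: the SYN-ACK goes from R2 directly to host 1,
# so the firewall never sees it.
fw = StatefulFirewall()
asymmetric = [
    fw.forward("host1", "remote", "SYN"),
    fw.forward("host1", "remote", "ACK"),
]

print(symmetric, asymmetric)
```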

Bypass the stateful firewall

I recently had such a setup due to some technical debt. The firewall dropped the packets as “Invalid Traffic”. Fortunately, there is a way to bypass the stateful firewall. You can create advanced firewall rules using the CLI. There is no way to create these rules using the GUI. And this only applies to the Sophos XG (formerly Cyberoam products).

Log in to the device console and select option 4. Then enter the following command on the console, once per destination network:

Console> set advanced-firewall bypass-stateful-firewall-config add source_network 192.168.99.0 source_netmask 255.255.255.0 dest_network 192.168.20.0 dest_netmask 255.255.255.0

Make sure that you have a static or dynamically learned route to the networks. This is not a routing entry, it only tells the firewall what traffic should bypass the stateful firewall.
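To check in advance which flows a bypass entry would cover, the source and destination networks can be modeled with Python's `ipaddress` module. This is a sketch with hypothetical function names. It only tests the direction given in the rule. Whether the firewall also applies the entry to the reverse direction is not something I can tell from the command.

```python
import ipaddress


def make_bypass_rule(source_network, source_netmask, dest_network, dest_netmask):
    """Build a (source, destination) network pair from the CLI parameters."""
    return (
        ipaddress.ip_network(f"{source_network}/{source_netmask}"),
        ipaddress.ip_network(f"{dest_network}/{dest_netmask}"),
    )


def matches(rule, src_ip, dst_ip):
    """Return True if a packet from src_ip to dst_ip is covered by the rule."""
    source, destination = rule
    return (
        ipaddress.ip_address(src_ip) in source
        and ipaddress.ip_address(dst_ip) in destination
    )


# The values from the example command above.
rule = make_bypass_rule("192.168.99.0", "255.255.255.0",
                        "192.168.20.0", "255.255.255.0")

print(matches(rule, "192.168.99.10", "192.168.20.5"))
print(matches(rule, "192.168.99.10", "192.168.30.5"))
```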

DOT1X authentication failed on HPE OfficeConnect 1920 switches

This posting is ~5 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

For the last two days, I have supported a customer during the implementation of 802.1x. His network consisted of HPE/ Aruba and some HPE Comware switches. Two RADIUS servers with appropriate policies were already in place. Configuring and testing the ProVision-based switches was pretty simple. The Comware-based switches, in this case OfficeConnect 1920, gave me more of a headache.

The customer already had MAC authentication running, so all I had to do was enable 802.1x on the desired ports of the OfficeConnect 1920. The laptop that I used to test the connection was already configured and worked flawlessly if I plugged it into an 802.1x-enabled port on a ProVision-based switch. The OfficeConnect 1920 simply wrote a failure to its log, and the authentication failed. The RADIUS server did not log any failure, so I was quite sure that the switch caused the problem.

DOT1X/6/DOT1X_AUTH_FAILURE: -IfName=GigabitEthernet1/0/1-UserName=DOM\USERNAME; DOT1X authentication failed

After double-checking all settings using the web interface of the switch, I used the CLI to check some more settings. Unfortunately, the OfficeConnect 1920 is a smart-managed switch and provides only a very, very limited CLI. Fortunately, there is a developer access, enabling the full Comware CLI. You can enable the full CLI by entering

_cmdline-mode on

after logging into the limited CLI. You can find the password using your favorite internet search engine. ;)

Solution

While poking around in the CLI, I stumbled over this option, which is entered in the interface context:

[1920-GigabitEthernet1/0/1] dot1x mandatory-domain RADIUS

RADIUS is the authentication domain that was used on this switch. The command specifies that the authentication domain RADIUS has to be used for 802.1x authentication requests. Otherwise, the switch would use the default authentication domain SYSTEM, which causes the switch to try to authenticate the user against the local user database.

I have not found any way to specify this setting using the web GUI! If you know how, or if you can provide additional information about this “issue”, please leave a comment.

HPE Networking expert level certifications

This posting is ~6 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

A couple of days ago, I took the HP0-Y47 exam “Deploying HP FlexNetwork Core Technologies”. It was one of two exams required to achieve the HPE ASE – Data Center Network Integrator V1, and the HP ASE – FlexNetwork Integrator V1 certification. It was a long-planned upgrade to my HP ATP certification, and it is a necessary certification for the HPE partner status of my employer.

You might find it confusing that I’m talking about an HP ASE and an HPE ASE. That is not a typo. The HP ASE was released prior to the HP/ HPE split. The HPE ASE was released after the split into HP and HPE.

The HP/ HPE ATP is a professional level certification, comparable to the Cisco Certified Network Associate (CCNA). The HP/ HPE ASE is an expert level certification, so the typical candidate for a HP/ HPE ASE certification is a professional with three to five years experience in designing and architecting complex enterprise-level networks.

Requirements

There are different ways to achieve this certification. Regardless of the way you choose, you need a certification from which you can upgrade. This does not have to be an HP/ HPE certification! If you hold a valid CCNA/ CCNP or JNCIP-ENT, you can upgrade from this certification without the need for a valid HP/ HPE ATP Networking certification.

If you want to earn the HPE ASE – Data Center Network Integrator V1, and the HP ASE – FlexNetwork Integrator V1 certification in a single step, you need at least one of these certifications:

  • HP ATP – FlexNetwork Solutions V3
  • HPE ATP – Data Center Solutions V1

Or if you want to upgrade from a non-HP/ HPE certification:

  • Cisco – CCNP (any CCNP regardless of technology)
  • Cisco – Certified Design Professional (CCDP)
  • Juniper – JNCIP-ENT

Now you need to pass two exams:

HP2-Z34 (Building HP FlexFabric Data Centers)

The HP2-Z34 exam focuses on deployment and implementation of HPE FlexFabric Data Center solutions. Therefore, the exam covers topics like

  • Multitenant Device Context (MDC)
  • Datacenter Bridging (DCB)
  • Multiprotocol Label Switching (MPLS)
  • Fibre Channel over Ethernet (FCoE)
  • Ethernet Virtual Interconnect (EVI),
  • Multi-Customer Edge (MCE),
  • Transparent Interconnection of Lots of Links (TRILL), and
  • Shortest Path Bridging Mac-in-Mac mode (SPBM).

HPE offers a study guide to prepare for this exam: Building HP FlexFabric Data Centers (HP2-Z34 and HP0-Y51). I used this guide to prepare for the exam (eBook). The guide was of average quality. It’s sufficient to prepare for the exam, but I used other materials to get a better understanding of some topics.

HP2 exams are web-based exams. To pass the HP2-Z34 exam, I had to answer 60 questions in 105 minutes, with a passing score of 70%. The exam was quite demanding, especially if you don’t have much real-world experience with some of the covered topics.

HP0-Y47 (Deploying HP FlexNetwork Core Technologies)

The HP0-Y47 exam covers the configuration, implementation, and troubleshooting of enterprise-level HPE FlexNetwork solutions. The exam covers different topics, e.g.

  • Quality of Service (QoS)
  • redundancy (VRRP, Stacking)
  • multicast routing (IGMP, PIM)
  • dynamic routing (OSPF, BGP)
  • ACLs, and
  • port authentication/ port security (Mac-auth, Web-auth, 802.1x)

I used the HP ASE FlexNetwork Solutions Integrator (HP0-Y47) study guide to prepare myself for the exam. Unfortunately, it had the same average quality as the HP2-Z34 guide: Good enough to pass the exam, but don’t expect too much.

HP0-Y47 is a proctored exam. I had to answer 55 questions in 150 minutes, with a passing score of 65%. The exam is not very hard if you are familiar with the covered topics. Experience with ProVision and Comware is absolutely necessary, because both platforms have their peculiarities, e.g. processing of ACLs, differences in stacking technologies, commands, STP support etc.

It took me some time to prepare for both exams, despite the fact that I work with ProVision and Comware Switches every day. So I’m pretty happy that I passed both exams on the first try.

vSphere Distributed Switch health check fails on HPE Comware switches

This posting is ~6 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

During the replacement of some VMware ESXi hosts at a customer, I discovered a recurrent failure of the vSphere Distributed Switch health checks. A VLAN and MTU mismatch was reported. On the physical side, the ESXi hosts were connected to two HPE 5820 switches, that were configured as an IRF stack. Inside the VMware bubble, the hosts were sharing a vSphere Distributed Switch.

The switch ports of the old ESXi hosts were configured as Hybrid ports. The switch ports of the new hosts were configured as Trunk ports, to streamline the switch and port configuration.

Some words about port types

Comware knows three different port types:

  • Access
  • Hybrid
  • Trunk

If you are familiar with Cisco, you will know Access and Trunk ports. If you are familiar with HPE ProCurve or Alcatel-Lucent Enterprise, these two port types correspond to untagged and tagged ports.

So what is a Hybrid port? A Hybrid port can belong to multiple VLANs, each of which can be untagged or tagged on the port. Yes, multiple untagged VLANs on a port are possible, but the switch needs additional information to bridge the traffic into the correct untagged VLAN. This additional information can be MAC addresses, IP addresses, LLDP-MED etc. Typically, hybrid ports are used in VoIP deployments.

The benefit of a Hybrid port is that I can put the native VLAN of a specific port, which is often referred to as the Port VLAN identifier (PVID), as a tagged VLAN on that port. This configuration allows all dvPortGroups to have a VLAN tag assigned, even if the VLAN tag represents the native VLAN of a switch port.

Failing health checks

A failed health check raises a vCenter alarm. In my case, a VLAN and MTU alarm was reported. In both cases, VLAN 1 was causing the error. According to VMware, the three main causes for failed health checks are:

  • Mismatched VLAN trunks between a vSphere distributed switch and physical switch
  • Mismatched MTU settings between physical network adapters, distributed switches, and physical switch ports
  • Mismatched virtual switch teaming policies for the physical switch port-channel settings.

Let’s take a look at the port configuration on the Comware switch:

#
interface Ten-GigabitEthernet1/0/9
 port link-mode bridge
 description "ESX-05 NIC1"
 port link-type trunk
 port trunk permit vlan all
 stp edged-port enable
#

As you can see, this is a normal trunk port. All VLANs will be passed to the host. This is an excerpt from the display interface Ten-GigabitEthernet1/0/9 output:

 PVID: 1
 Mdi type: auto
 Port link-type: trunk
  VLAN passing  : 1(default vlan), 2-3, 5-7, 100-109
  VLAN permitted: 1(default vlan), 2-4094
  Trunk port encapsulation: IEEE 802.1q

The native VLAN is 1, which is the default configuration. Traffic that is sent or received on a trunk port is always tagged with the VLAN ID of the originating VLAN, except traffic from the default (native) VLAN! This traffic is sent without a VLAN tag, and if frames are received with a VLAN tag for this VLAN, these frames will be dropped!

If you have a dvPortGroup for the default (native) VLAN, and this dvPortGroup is sending tagged frames, the frames will be dropped if you use a “standard” trunk port. And this is why the health check fails!

Ways to resolve this issue

In my case, the dvPortGroup was configured for VLAN 1, which is the default (native) VLAN on the switch ports.

There are two ways to solve this issue:

  • Remove the VLAN tag from the dvPortGroup configuration
  • Change the PVID for the trunk port

To change the PVID for a trunk port, you have to enter the following command in the interface context:

[ToR-Ten-GigabitEthernet1/0/9] port trunk pvid vlan 999

You have to change the PVID on all ESXi facing switch ports. You can use a non-existing VLAN ID for this.
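The effect of the PVID change on outgoing frames can be illustrated with a small model of a trunk port: frames from the PVID VLAN leave the port untagged, frames from all other permitted VLANs leave it tagged. The following Python sketch uses a hypothetical function name and covers the egress direction only. It is a simplification, not a description of the Comware implementation.

```python
def egress_tag(vlan, pvid, permitted):
    """Return the 802.1Q VLAN ID a trunk port puts on a frame from `vlan`.

    Returns None if the frame leaves the port untagged.
    Raises ValueError if the VLAN is not permitted on the port.
    """
    if vlan not in permitted:
        raise ValueError(f"VLAN {vlan} is not permitted on this port")
    if vlan == pvid:
        return None   # native VLAN: sent without a tag
    return vlan       # every other VLAN: sent with its own tag


ALL_VLANS = range(1, 4095)  # "port trunk permit vlan all" -> 1-4094

# Default configuration: PVID 1, so VLAN 1 leaves the port untagged.
print(egress_tag(1, pvid=1, permitted=ALL_VLANS))

# After "port trunk pvid vlan 999": VLAN 1 is tagged like any other VLAN.
print(egress_tag(1, pvid=999, permitted=ALL_VLANS))
```

With the PVID moved to an unused VLAN, VLAN 1 is handled like every other tagged VLAN on the port, which matches a dvPortGroup that tags its frames with VLAN ID 1.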

The vSphere Distributed Switch health check will switch to green for VLAN and MTU immediately.

Please note that this is not the solution for all VLAN-related problems. You should make sure that the change does not cause any side effects.

Demystifying “Interfaces on which heartbeats are not seen”

This posting is ~6 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

By accident, I found a heartbeat/ VLAN issue on a NetScaler cluster at one of my customers. The NetScaler ADC appliances have three interfaces connected to a switch stack. Two of the three interfaces were configured as a channel (LAG). This is a snippet from the config:

set channel LA/1 -tagall ON -throughput 0 -lrMinThroughput 0 -bandwidthHigh 0 -bandwidthNormal 0
...
bind vlan 10 -ifnum 1/3
bind vlan 10 -ifnum LA/1 -tagged
bind vlan 54 -ifnum LA/1 -tagged
bind vlan 55 -ifnum LA/1 -tagged

On the switch stack, the port to which interface 1/3 is connected is configured as an access port. The ports to which the channel is connected are configured as trunk ports with some permitted VLANs. The customer is using HPE Comware-based switches. The terminology is the same as for Cisco. If you use HPE ProVision or Alcatel-Lucent Enterprise, translate “access” to “untagged” and “trunk” to “tagged”. Because the channel is configured as a trunk port on the switch, the tagall option was set.

Issue

While examining the output of show ha node, I saw this:

Interfaces on which heartbeats are not seen : LA/1

Because interface 1/3 was not affected, this had to be a VLAN issue. During the initial troubleshooting, I was able to discover heartbeat packets in VLAN 1 and in VLAN 10.

Solution

The solution was easy: Remove the tagged option for VLAN 10 on LA/1.

bind vlan 10 -ifnum LA/1

instead of

bind vlan 10 -ifnum LA/1 -tagged

Because of the configured tagall option, all packets sourced by LA/1 are tagged with the corresponding VLAN ID. But because LA/1 is now explicitly bound to VLAN 10 without a tag, VLAN 10 is now also the native VLAN for LA/1.

> show channel

1)      Interface LA/1 (802.3ad Link Aggregate) #14
        flags=0x4100c020 <ENABLED, UP, AGGREGATE, UP, HAMON, HEARTBEAT, 802.1q, tagall>
        MTU=1500, native vlan=10, MAC=02:e0:ed:38:9d:d2, uptime 1362h58m51s

Now the NetScaler was sending heartbeat packets with a tag for VLAN 10, and the issue was solved.

Explanation

Heartbeat packets are always sent without a VLAN tag (untagged). There are two exceptions:

  • The NSVLAN is configured with a specific VLAN ID, or
  • an interface used for heartbeats is configured with the tagall option.

In this case, the heartbeat packets are tagged with the ID of the native VLAN of the interface. The show channel output showed that the channel was using VLAN 1 as the native VLAN.

> show channel

1)      Interface LA/1 (802.3ad Link Aggregate) #14
        flags=0x4100c020 <ENABLED, UP, AGGREGATE, UP, HAMON, HEARTBEAT, 802.1q, tagall>
        MTU=1500, native vlan=1, MAC=02:e0:ed:38:9d:d2, uptime 1362h55m13s

How does the NetScaler determine the native VLAN for an interface? The native VLAN is the VLAN, to which an interface is bound untagged. An interface can only be bound untagged to a single VLAN. But it can be bound tagged to multiple VLANs.

If you take a look at the config snippet at the top of this blog post, you might notice that interface 1/3 is bound untagged to VLAN 10. So this is the native VLAN for interface 1/3. But this interface is not using the tagall option. Therefore, heartbeat packets are not tagged. The channel LA/1 was bound tagged to VLAN 10. But it was also bound to VLAN 1, without the tagged option. As a result, VLAN 1 was used as the native VLAN for channel LA/1. And because LA/1 is configured with the tagall option, the heartbeats were tagged for VLAN 1. That’s why I was able to see the heartbeats that were sent over channel LA/1 in VLAN 1.

In the end, the NetScaler appliances were sending heartbeats from interface 1/3 to VLAN 10, and from channel LA/1 to VLAN 1. This caused the message “Interfaces on which heartbeats are not seen: LA/1”.
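The rules from this explanation can be condensed into a small Python model. It is a sketch based only on the behavior described in this post. The function names are hypothetical, the NSVLAN exception is left out, and the implicit untagged binding to VLAN 1 is modeled as a default value.

```python
def native_vlan(bindings, default=1):
    """Return the native VLAN of an interface.

    bindings maps a VLAN ID to True (bound tagged) or False (bound untagged).
    The native VLAN is the one VLAN bound untagged, or the default VLAN 1.
    """
    untagged = [vlan for vlan, tagged in bindings.items() if not tagged]
    if len(untagged) > 1:
        raise ValueError("an interface can be bound untagged to one VLAN only")
    return untagged[0] if untagged else default


def heartbeat_tag(bindings, tagall):
    """Return the VLAN tag of heartbeat packets, or None if they are untagged."""
    return native_vlan(bindings) if tagall else None


# Interface 1/3: bound untagged to VLAN 10, no tagall -> untagged heartbeats,
# which the switch access port puts into VLAN 10.
print(heartbeat_tag({10: False}, tagall=False))

# LA/1 before the fix: VLAN 10, 54 and 55 bound tagged, tagall ON.
print(heartbeat_tag({10: True, 54: True, 55: True}, tagall=True))

# LA/1 after the fix: VLAN 10 bound without the tagged option.
print(heartbeat_tag({10: False, 54: True, 55: True}, tagall=True))
```

Before the fix, the model yields a tag for VLAN 1 on LA/1, after the fix a tag for VLAN 10, which is the VLAN where the heartbeats from interface 1/3 end up as well.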

Citrix Certified Professional – Networking (CCP-N) exam experience

This posting is ~6 years old. You should keep this in mind. IT is a short-lived business. This information might be outdated.

Last Friday I passed the 1Y0-351 (Citrix NetScaler 10.5 Essentials and Networking) exam with a pretty good score. The exam was necessary, not only because I will do many more NetScaler projects in the future, but also because Citrix has made it mandatory to have a CCP-N in your company to sell Citrix NetScaler.

Preparation

My employer booked me a 5-day course (CNS-220 Citrix NetScaler Essentials and Traffic Management). Very nice, although I already had experience with NetScaler deployments. This training was designed for NetScaler 12.0, not for 10.5.

A training course might be recommended to prepare for an exam, but usually it is not sufficient to pass it. But I wanted to pass the exam on the first try, so I took a closer look at the Citrix NetScaler 10.5 Essentials and Networking Preparation Guide.

In addition to the student and lab material, I deployed three NetScaler VPX (10.5, 11.1 and 12.0) in my lab. I really recommend this, especially to learn the CLI and how to read the log files.

The exam

The exam 1Y0-351 is focused on NetScaler 10.5, and will not be available after January 19, 2018. The successor of this exam is 1Y0-340, which is based on NetScaler 12.0. It has been available since October 20, 2017. You might have noticed that my course was designed for 12.0, but I took the 10.5 exam. Well, I could not identify a question that would have had to be answered differently for NetScaler 12.0. But I really recommend taking the exam that matches your course.

You have to answer 72 questions in 120 minutes. I got 30 minutes extra, because I’m a non-native English speaker. I had to answer two surveys before the exam. One of them was a self-assessment of my NetScaler skills.

The questions were pretty fair: no trick questions, and no questions where multiple answers seemed to be correct. The exam met the exam objectives from the prep guide. And as I already wrote: You really should work with the CLI, and you really should know the important logs.

In sum: A challenging, but pretty fair exam. No marketing, no factual knowledge from spec sheets etc. If you are quite familiar with NetScalers, there is a good chance of passing the exam on the first attempt.