The Changing Landscape of Identity Networking

I was asked to travel to the 2013 InfoSec security conference in Europe this year, and speak about the trends I am seeing in the identity networking game, and possibly speculate on the future of identity in networking as I see it.  So I thought to myself: “what a great blog post this could make”.

The Phases of Identity Networking (as seen by an overworked identity nut like me):

Phase 1:  [Circa late 1990’s] Identity networking stems from the age-old question: “How do I Control Who Gains Access to the Network?”  Along comes IEEE 802.1X!  802.1X provides Extensible Authentication Protocol (EAP) over Local Area Network (LAN) capabilities, to allow a client to transmit their identity credential into the network before gaining access.

802.1X
  • 802.1X Provides the User or Device Credential
  • User allowed to Connect to Network
  • Enforcement may be VLAN or ACL

802.1X is now providing “WHO” (e.g.: Employee, Contractor, Guest, etc.).

Phase 2:  [Circa 2004] “What happens in Vegas, stays in Vegas.. Unless you pick up a Virus.”  We ran through a major growth period in the computer security industry due to the spread of destructive malware, which also led us to try and extend the “WHO” that 802.1X provides to include even more information.  Knowing that you are a valid employee was not enough, because even valid employees may have picked up a virus somewhere.

Before allowing you to have normal access to our network, we need to examine the device you are using and look for updated patches, Anti-Virus software installations, and so forth.

“But my desktop management team told me that’s what our Anti-Virus Management servers and patching systems like Microsoft’s SMS/SCCM will handle, so we are protected, right?”  WRONG.

I’ve lost track of the number of audits I’ve seen companies fail because their Anti-Virus management or patch system was not working, for any of a slew of reasons; usually the reason was some variant of a PUSH mechanism from the server, instead of a PULL mechanism from the client.  Somewhere along the line, the server was unsuccessful in pushing the update.

Regardless of the reason why it failed, the fact is: the company failed audit.  So they were looking to enforce the update before providing network access.  Enter Network Access Control (NAC) also called Network Admission Control.

The idea behind NAC is to ensure patches & Anti-Virus are all up to date, and the system is compliant with the company’s security policy (known as posture assessment) BEFORE allowing them to access the network.  If any part of the posture assessment fails, the user/device is put into a Quarantine state (usually a VLAN assignment or special ACL) where the machine is remediated (the items that failed the posture assessment are fixed).  After remediation, the user/machine is moved back into a normal full access mode.  Image 3 shows this concept from an image I grabbed out of one of the original Cisco Network Admission Control (NAC) Framework slide decks from years long past.

NAC Framework

  • 802.1X Provides the User or Device Credential
  • Resident Agent Typically Provides the Posture Data
  • User is Quarantined Until Remediated
  • Enforcement may be VLAN or ACL

So now 802.1X provided the “WHO”, while NAC is providing the “WHAT” (posture assessment of patches and installed/running applications, etc…)

Phase 3:  [Circa 2007] “What kind of device is that??”  Posture assessment (read: traditional NAC) was humming along nicely.  We saw very active growth in the NAC market space, and identity networking continued to grow at a nice leisurely pace.

Then something happened that started to change the game a bit.  Mobile phones got one very cool addition, Wi-Fi.  With the addition of Wi-Fi to smart phones, we started seeing an influx of devices that were trying to join corporate wireless networks, but didn’t support Posture Assessments (NAC). This was really just an interesting uptick in a networking trend, but nothing too pressing at the time.  Then along comes one more advancement that changed everything forever:  the iPhone (followed by the iPad)!

It is undeniable that the iPhone changed everything for this industry.  Now, we had to allow certain executives to access the corporate network with their iPhones/Androids/etc, but not everyone.  This allowed us to extend and improve an existing technology – endpoint Profiling.

Originally, endpoint profiling was developed as a tool to aid in 802.1X deployments.  The original concept was to help identify devices that could not authenticate with 802.1X; endpoints like printers, badge-readers, cameras, IP-Phones, and many more.  Once these devices were identified, their MAC addresses (the hardware address burned into the network interface card) were added to a list of devices allowed to bypass the 802.1X authentication and gain access.

Now with the extensive proliferation of these mobile devices that are Wi-Fi enabled, along with concepts such as Bring Your Own Device (BYOD) and Choose Your Own Device (CYOD), we were advancing endpoint profiling to be used with 802.1X.  The ability to tie an authentication policy to the device type became commonplace and paramount.

A common policy could be:

Employee’s 802.1X authentication on the corporate wireless network with a workstation whose posture was compliant = Full Access.  Same Employee’s 802.1X authentication on the same corporate wireless network with an iPad = Internet Only Access.

  • 802.1X Provides the User or Device Credential
  • Endpoint Profiling is Used to Determine Device Type
  • Workstations are Quarantined Until Remediated (NAC)
  • Mobile Devices are Provided Internet-Only Access
  • Enforcement may be VLAN or ACL
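
The example policy above can be sketched as a small decision function.  This is only an illustration in Python; the function name `authorize`, the device-type strings, and the access labels are assumptions made for the example, not the syntax of any real policy server.

```python
def authorize(authenticated, device_type, posture_compliant=False):
    """Combine WHO (802.1X result) with WHAT (device type and posture)."""
    if not authenticated:
        return "Deny"  # 802.1X authentication failed
    if device_type == "workstation":
        # Workstations are posture-assessed; non-compliant ones are quarantined.
        return "Full Access" if posture_compliant else "Quarantine"
    if device_type in {"ipad", "iphone", "android"}:
        return "Internet Only"  # profiled mobile devices
    return "Deny"  # unknown device types get no access in this sketch


print(authorize(True, "workstation", posture_compliant=True))
print(authorize(True, "ipad"))
```

The same employee credential produces different results depending only on the profiled device type, which is the point of tying authentication policy to profiling.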

At this stage we are still looking at “WHAT”, but it’s a different “WHAT”.  Instead of security policy compliance (posture), it is looking for device types using endpoint profiling.

One important note: it’s predicted that over 15 billion devices will be network connected by the year 2015.  That is only 2 years away from the writing of this blog post!

Phase 4:  [Circa 2012] “I don’t do ‘OR’! ‘OR’ makes you choose!” 802.1X is a fantastic technology.  It uses Extensible Authentication Protocol (EAP) to provide an identity to an authenticating device.  Let’s state that again, but focus on the key word:  “It uses EAP to provide AN identity to an authenticating device”.  That’s right; EAP is only carrying a single credential per transaction.  So an Enterprise that is using Microsoft’s Active Directory to manage its Windows workstations has two options per workstation to authenticate to the network:  Machine-Auth *OR* User-Auth.

Why are there two auth options for Windows?  Back in the late 1990’s when 802.1X was starting to pick up adoption, Microsoft came to an important realization:  If no network connectivity is provided until a user logs into the Windows box, then the connectivity between Active Directory and the workstation will be broken! In a brilliant move, they created a concept of a Machine-State and a User-State for their supplicant (the software ‘driver’ that speaks 802.1X).  So, when a user is not logged into a Windows box, the machine itself may be logged into the network using either the machine account and password created when the machine joined AD; or it may even use an Active-Directory issued certificate.  Then, once the user types CTRL-ALT-DEL & logs into the workstation, the endpoint will start a new EAP session that will use the User Credentials instead of the machine’s credentials.

Much like Dr. Ken Jeong in the Coke Zero Commercial, we don’t want “OR”, we want “AND”.  Immediately, administrators were trying to find ways to tie together the machine authentication and the user authentication, to be part of a single Authorization.  This way an organization can be assured that it is a valid corporate asset and a valid user.  Cisco created a concept of Machine Access Restrictions (MAR) – which maintained a cache of mac-addresses that passed machine authentication.  Then when a user authentication comes into the RADIUS server, it looks at the cache to see if a machine has already authenticated from that mac-address & if so, access would be allowed.  There’s more to it than just that, but we are not focusing on MAR – as there are limitations.  I want AND.  AND means that I don’t want limitations, I want it to work no matter what!
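
The basic MAR idea described above can be sketched in a few lines of Python.  This is a simplified model, not Cisco's implementation; the class name `MarCache` and its methods are hypothetical, and a real cache would also need things like entry aging.

```python
class MarCache:
    """Hypothetical cache of MAC addresses that passed machine authentication."""

    def __init__(self):
        self._machines = set()

    def machine_authenticated(self, mac):
        # Remember that a machine authentication succeeded from this MAC address.
        self._machines.add(mac.lower())

    def allow_user(self, mac, user_credentials_ok):
        # User access needs valid user credentials AND an earlier
        # machine authentication seen from the same MAC address.
        return user_credentials_ok and mac.lower() in self._machines


cache = MarCache()
cache.machine_authenticated("00:11:22:AA:BB:CC")
print(cache.allow_user("00:11:22:aa:bb:cc", user_credentials_ok=True))
```

Note that the sketch only remembers that some machine authentication happened from that MAC address earlier; it does not tie the two credentials together in one transaction.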

Enter EAP Chaining!  EAP Chaining is an enhancement that allows for multiple EAP credentials to be transmitted in a single transaction.  Created by Cisco as part of EAP-FASTv2, it was adopted by the IETF for the upcoming Tunnel EAP (TEAP) standard.  I could write an entire blog on EAP Chaining (and probably will), but let’s just explain that in a single transaction we are able to combine Machine and User Authentications and authorize based on that combo.  The mix can be any combo of certificates or names & passwords, which is a fantastic and long needed enhancement for 802.1X and EAP.

  • 802.1X Provides the User AND Device Credential
  • May Continue to use NAC/Posture where workstations are Quarantined Until Remediated
  • EAP Chaining is not dependent on infrastructure, since EAP does not terminate there, it terminates at the Policy Server.
  • Only Cisco AnyConnect and Cisco ISE support EAP Chaining at date of Blog Posting
  • Enforcement may be VLAN or ACL

At this stage we are still looking at “WHO” and “WHAT”.  We are using 802.1X to provide both, and it can be combined with Profiling and Posture.

Phase 5:  [Circa 2013] “Mobile Device Management” Providing Internet-Only access to mobile devices may be a workable policy for some, but certainly not all.  These mobile devices are continuing to grow in adoption and proliferation at very exciting rates.  As these are becoming more important for conducting business transactions every day, it also becomes important to manage these devices.  I find this trend to be truly intriguing because companies like RIM had this down to a Science – no, not science – RIM made managing Blackberry’s an Art Form!  It was absolutely beautiful, and still is as a matter of opinion.

However, the Blackberry mobile devices were not what the end-user wanted.  The end-user wants to use whatever endpoint they will be most productive with.  If I find my productivity is increased by using an iPhone or an Android device, then by golly – let me use an iPhone or Android device!  Enter Mobile Device Management companies such as:  AirWatch, ZenPrise (acquired by Citrix), Mobile Iron, Good Technologies, SAP Afaria, and others including Blackberry Enterprise Service 10 which will manage iOS and Android now, too.

While this blog is not focused on MDM, it was necessary to discuss these briefly, because they provide a level of posture capability for mobile devices!  Specifically, your policy server may need to have knowledge of an endpoint being managed by an MDM or owned by the company versus one that an employee just purchased off the shelf of his favorite neighborhood electronics store.

With MDM integration to your policy server you are able to establish a new type of posture, looking to see device ownership, if the device has been jail broken/rooted, if the device has encryption enabled, if the pin lock is enabled, etc.  This adds even more to our identity decision, and we can provide different levels of access for devices based on these attributes.

  • 802.1X Provides the User or Device Credential
  • Mobile Devices are profiled as they access the network.
  • Policy Server is able to query the MDM, looking for “posture”
  • Enforcement may be VLAN or ACL
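
To make the MDM "posture" idea concrete, here is a rough Python sketch of a policy decision built on the attributes mentioned above.  The `MdmPosture` fields, the `mobile_access` function, and the access labels are all assumptions for illustration; a real policy server would obtain these attributes by querying the MDM.

```python
from dataclasses import dataclass


@dataclass
class MdmPosture:
    registered: bool       # device is managed by the MDM
    corporate_owned: bool  # company asset versus personal device
    jailbroken: bool       # jail broken / rooted
    encrypted: bool        # storage encryption enabled
    pin_lock: bool         # PIN lock enabled


def mobile_access(p):
    """Map MDM attributes to an access level (labels are illustrative)."""
    if not p.registered or p.jailbroken:
        return "Internet Only"
    if p.encrypted and p.pin_lock:
        return "Full Access" if p.corporate_owned else "Limited Access"
    return "Internet Only"


print(mobile_access(MdmPosture(True, True, False, True, True)))
```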

At this stage we are still looking at “WHAT”, which is being provided by the MDM service.

Phase 6:  [~2013] “Location, Location, Location” Location is another factor that is commonly asked for.  We can look at the source Network Access Device to determine whether the access was wired, wireless, or VPN, as well as where that device is in the network.  We could even integrate with location services within the wireless infrastructure, or use Geo Location from the GPS units that are enabled on all these mobile devices.  Now, let’s look at physical security controls:  badging systems, etc.  Should a user be allowed to log into a network if they have not badged into that building?  Should they be allowed to log in if there is currently a fire-alarm in that building?  The possibilities are endless!

  • Using the location of an endpoint as part of the policy decision
  • Identity to include location in campus, branch
  • Type of Access is Wired, Wireless, or VPN
  • Attributes could even include Geo Location
  • Physical Security controls could be included as part of the conglomeration of attributes, such as badged in or state of the fire alarm.
  • Was the authentication attempted during normal business hours or was it while the location should be closed?

At this stage we are examining “WHERE”, “WHEN” and “HOW” which is beginning to round out our identity conglomerate or “context”.

Leading us down a path to a Contextual Identity

As you start to put all these phases together, you realize that an identity is growing way beyond just the username credential – it is leading us down a path of an identity including the: WHO, WHAT, WHERE, WHEN, AND HOW; which I like to call a “Contextual Identity”.

That Contextual Identity will be used in-place of the traditional use of a single credential.  Now we will gain the ability to construct business relevant policies that take the entire context of the user access into consideration, providing granular levels of access or even bulk access.
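
One way to picture a Contextual Identity is as a single record carrying all five attributes, with policy written against the whole record instead of one credential.  The Python below is a sketch under assumed names (`ContextualIdentity`, `decide`) and made-up rules; it is not how any particular product models context.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ContextualIdentity:
    who: str    # e.g. "employee", "contractor", "guest"
    what: str   # device type / posture result
    where: str  # e.g. "campus", "branch"
    when: time  # time of the authentication attempt
    how: str    # "wired", "wireless", or "vpn"


def decide(ctx):
    """Illustrative policy that uses the whole context, not one credential."""
    business_hours = time(8, 0) <= ctx.when <= time(18, 0)
    if ctx.who == "employee" and ctx.what == "compliant-workstation":
        return "Full Access" if business_hours else "Limited Access"
    if ctx.who == "guest" and ctx.how == "wireless":
        return "Internet Only"
    return "Deny"


ctx = ContextualIdentity("employee", "compliant-workstation", "campus", time(9, 30), "wired")
print(decide(ctx))
```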

So, what’s missing?  Where do we go from here?

Honestly, this is a beast of a question.  It’s so easy to describe where we’ve been, and much harder to try and predict where this phenomenally dynamic industry will head to next.

Now that we have this contextual identity, what I see is the requirement for EVERYTHING to have some form of scalable connectivity to use that context.  Solutions like firewalls, intrusion prevention systems, macro-analytical tools like SIEM and Cyber Threat Defending solutions, web security appliances, application access, cloud access – they all require a tie-in to that context.

So how do we do this?  There are so many protocols to choose from: Syslog, Security Device Event Exchange (SDEE), Simple Network Management Protocol (SNMP), Interface for Meta Access Points (IF-MAP)?

No!  Honestly there is nothing in existence today that will scale for the environments of tomorrow. Not yet.  We need a new industry-standard communication bus that can take the place of all these miscellaneous technologies used today, something that can truly scale.  Keep an eye out.  You never know what might appear in the near future.

Scalable Enforcement Mechanism

Something else that is missing, but DOES exist today and will get proposed as an industry standard shortly, is a scalable enforcement mechanism.  The traditional methods of enforcement like VLAN and ACL assignment just won’t cut it anymore.  An easy-to-manage method of classifying endpoints based on context and permitting or denying based on that classification, regardless of topology or Layer-3 protocol (IPv4 or IPv6, IPX/SPX even), would simplify operations & reduce OPEX tremendously.

That solution exists today, and we know it as Security Group Tagging.  I know of a large organization that moved to Security Group Tagging and reduced the full-time staff managing firewall rule maintenance from 24 to 6 FTEs.  Those 18 FTEs were able to move to other projects and the organization became much more efficient.  Think about it!

What if your policy could move from mapping subnets to hosts, to a simple spreadsheet like below?
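
As a rough illustration of the idea (the tag names and rules here are invented for the example), a tag-based policy can be written as a small matrix of source tag, destination tag, and action, with no subnets or host addresses involved:

```python
# Hypothetical security group tags and rules; anything not listed is denied.
POLICY = {
    ("Employee", "HR-Servers"): "permit",
    ("Employee", "Internet"): "permit",
    ("Contractor", "Internet"): "permit",
    ("Contractor", "HR-Servers"): "deny",
}


def is_permitted(src_tag, dst_tag):
    """Look up the action for a (source tag, destination tag) pair."""
    return POLICY.get((src_tag, dst_tag), "deny") == "permit"


print(is_permitted("Employee", "HR-Servers"))
print(is_permitted("Contractor", "HR-Servers"))
```

Because the lookup is keyed on classification rather than topology, the same table applies no matter which subnet or Layer-3 protocol the endpoint happens to use.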

Well, I guess that’s enough for this blog.  Expect a lot more from me on this Security Group Tagging stuff, because it’s coming in a big way.

Talk at ya soon!!!

-Aaron

 

References and Further Reading

Blackberry BES working with iOS and Android:
http://us.blackberry.com/business/software/bes-10.html?LID=us:bb:software:businesssoftware:bes-10&LPOS=us:bb:software

EAP-Chaining Deployment Guide (Author: John Eppich): http://www.cisco.com/en/US/solutions/collateral/ns340/ns414/ns742/ns744/docs/howto_80_eapchaining_deployment.pdf

IETF Tunneled EAP (TEAP):
http://datatracker.ietf.org/doc/draft-ietf-emu-eap-tunnel-method/

Security Group Tagging:
http://www.cisco.com/en/US/netsol/ns1051/index.html

 

 

How to hack the certificate for a Cisco Identity Services Engine node

I just got back from a few weeks traveling around Europe, presenting at Cisco Live Europe, and meeting with customers & partners…  It is obvious that this blog is very much needed for a lot of the deployments that we discussed, so as promised in the Load Balancing Blog, I am following up with a blog on how to “hack” the certificate for a Cisco Identity Services Engine (ISE) node; so that we may include entries in the Subject Alternative Name (SAN) field.

Why do we need to do this? 

There are numerous occasions where you will want to reach ISE with a DNS name that is not exactly the same as its hostname.  If you’ve ever tried to reach an https:// web site by IP address, you have most likely seen the web browser complain that the certificate name is mismatched and require you to accept the warning in order to proceed.  An example is shown below.

Cisco ISE has a few different portals that you may connect to:

  • Sponsor Portal:  https://ISE:8443/sponsorportal/   – This portal is for Employees of your company to login & create guest accounts.  Obviously telling an employee to connect to this URL will be very tedious, and a more friendly name is needed.
  • MyDevices Portal:  https://ISE:8443/mydevices/  – This portal is for Employees of your company to login & manage their personal devices which they have/can register to provide network access to those devices. Obviously telling an employee to connect to this URL will be very tedious, and a more friendly name is needed.

So, ISE can use HTTP host-headers to use friendly names, and redirect traffic destined to that friendly name to the correct URL/port.  This is set under Administration -> Web Portal Management -> Settings -> General -> Ports:  http://blog.woland.com/images/Blog3/02-FriendlyNames.png

If you were to use “hotspot.CompanyX.com”, it would not match what was in ISE’s certificate for the web portals.  The certificate will only match the actual hostname (such as: atw-cp-ise04.cisco.com).  This results in a certificate mismatch error – and the user experience is less than desirable.

How do I fix this?

Standard X.509 certificates provide fields to allow a certificate to match more than one URL.  This is known as the Subject Alternative Name field.  This certificate field may be populated with other DNS names, other IP Addresses, and more.

Using the Subject Alternative Name field will prevent the certificate errors.  However, Cisco ISE does not provide the ability to populate these fields when generating a Certificate Signing Request (CSR) to be sent to the Certificate Authority for signing.
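
The matching rule behind the error can be sketched in Python.  This is a simplified model of the standard HTTPS rule (if DNS entries exist in the SAN field they are used, otherwise the subject CN is used); the function name is an assumption, and wildcards and IP address entries are deliberately left out.

```python
def certificate_matches(requested_host, common_name, san_dns=()):
    """Simplified hostname check: SAN DNS names win over the CN when present."""
    names = san_dns if san_dns else [common_name]
    return requested_host.lower() in {n.lower() for n in names}


# Certificate with only the real hostname: the friendly name mismatches.
print(certificate_matches("hotspot.CompanyX.com", "atw-cp-ise04.cisco.com"))

# Certificate whose SAN lists both the hostname and the friendly name.
san = ["atw-cp-ise04.cisco.com", "hotspot.CompanyX.com"]
print(certificate_matches("hotspot.CompanyX.com", "atw-cp-ise04.cisco.com", san))
```

One consequence of this rule is that once SAN entries are present, the node's real hostname should be listed there too, not only the friendly names.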

What is the “Hack”?

While the ISE user interface may not provide the ability to populate the SAN field with its own Certificate Signing Request (CSR), it is still just an X.509 certificate; which is a standard.  Why don’t we just export the public certificate & private key from ISE, and use OpenSSL to generate the CSR instead?

Note: We have tried this with Mac OS, since OpenSSL is built into it, but it did not work for us.  We did have success with using OpenSSL on Windows & Linux.  I am going to focus on using the Windows implementation of OpenSSL for this blog entry.  You can download OpenSSL from here: http://gnuwin32.sourceforge.net/packages/openssl.htm

Let’s Begin!

Step 1:  To begin, you should generate a new self-signed certificate for the ISE node.  Set the Key length to be your desired key length (2048 for example).

Afterwards, you can reconnect to ISE and it will be using the new certificate.  Here I am viewing the new certificate, just to show you some fields.  There is no Subject Alternative Name field, and you can see below that the subject is CN=atw-cp-ise01.ise.local (the fqdn of the ISE node).

Step 2:  Export the Public & Private Certificate from ISE.  The default format is a zip file that contains both the public & private key.  In this case:  “atwcpise01iselocalatwcpis.zip”.

Step 3:  Extract the zip file & copy the .pem & .pvk files to the OpenSSL binary directory (C:\Program Files (x86)\GnuWin32\bin).

Step 4:  Create a customized configuration file for OpenSSL Certificate Signing Requests named openssl.cnf.

A really nice walk through of the openssl.cnf file can be found here: http://www.phildev.net/ssl/opensslconf.html

The contents of my openssl.cnf file:
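
For readers who want a starting point, here is a minimal sketch of what such a file can look like, generated with a few lines of Python.  It is not necessarily the exact file used in this walk-through: the helper name `build_openssl_cnf` is hypothetical, the hostnames are placeholders taken from earlier examples, and the key usage lines are assumptions that your CA template may override.

```python
def build_openssl_cnf(common_name, dns_names, ip_addresses=()):
    """Return openssl.cnf text that requests Subject Alternative Names in a CSR."""
    lines = [
        "[ req ]",
        "distinguished_name = req_distinguished_name",
        "req_extensions = v3_req",
        "prompt = no",
        "",
        "[ req_distinguished_name ]",
        f"CN = {common_name}",
        "",
        "[ v3_req ]",
        "keyUsage = digitalSignature, keyEncipherment",
        "extendedKeyUsage = serverAuth",
        "subjectAltName = @alt_names",
        "",
        "[ alt_names ]",
    ]
    # SAN entries are numbered per type: DNS.1, DNS.2, ... and IP.1, IP.2, ...
    lines += [f"DNS.{i} = {name}" for i, name in enumerate(dns_names, start=1)]
    lines += [f"IP.{i} = {ip}" for i, ip in enumerate(ip_addresses, start=1)]
    return "\n".join(lines) + "\n"


print(build_openssl_cnf(
    "atw-cp-ise01.ise.local",
    ["atw-cp-ise01.ise.local", "hotspot.CompanyX.com"],
))
```

Saving the printed text as openssl.cnf gives a file that can be passed to the `-config` option in the next step.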

Step 5:  Now that your openssl.cnf file is ready with your certificate customizations, you will use OpenSSL to create a custom CSR file using the following command:

openssl req -key [PVK_file] -new -out [CSR_filename] -config [your_openssl.cnf_file]

Example:

Step 6:  Request a new Certificate from the CA. I used a Microsoft CA in this example.

Step 7:  Choose an Advanced certificate Request

Step 8:  Paste in the contents of the certificate request file generated in Step 5.  Ensure the Certificate Template type is “Web Server”.

Step 9:  Download the certificate in Base 64 (PEM) format.  For best results do not use DER format, and do not use the certificate chain.

Step 10:  Under Local Certificates, select Add -> Import Local Server Certificate

Step 11:  Import the Original Private key and new CA signed public key into ISE.

  1. For Certificate File, choose the new CA signed certificate that you just downloaded from the CA.
  2. For the Private Key File, select the original private key that you exported.

Step 12:  Your ISE node will now be using the new CA signed certificate, with the Subject Alternative Names in it.

 

 

EAP Primer

The more interaction I have with customers who are getting started with Identity projects, the more I realize that a simple explanation & comparison of the differences between EAP types is needed.

For example, the general opinion that I get from customers is that EAP-TLS is the most secure EAP type to use, since it is X.509 certificate based.  Ok, I can accept that opinion; but did you realize that EAP-TLS might also be used as the Inner-Method of PEAP or EAP-FAST?  No, not a cut down version, the SAME EAP-TLS protocol that can be used in isolation, may also be used within a PEAP or EAP-FAST tunnel.

So, for this blog entry, I would like to examine the main (most common) EAP types and their uses.

EAP is an authentication framework that defines the transport and usage of identity credentials.  EAP encapsulates the usernames, passwords, certificates, tokens, OTPs, etc. that a client is sending for purposes of authentication.  In fact, did you know that 802.1X is really ‘just’ defining EAP over LAN?

There are many different EAP types, and each one has its own benefits and downsides.

  • EAP-MD5:  Uses a “Message Digest algorithm” to hide the credentials in a HASH.  The HASH is sent to the server where it is compared to a local hash to see if the credentials were accurate.  However, EAP-MD5 does not have a mechanism for mutual authentication.  That means the server is validating the client, but the client does not Authenticate the Server (i.e.: does not check to see if it should trust the server).
    EAP-MD5 is common on IP Phones, and it is also possible that some switches will send MAC Authentication Bypass (MAB) requests using EAP-MD5.
  • EAP-TLS:  An EAP type that uses TLS (Transport Layer Security) to provide the secure identity transaction.  This is very similar to SSL and the way encryption is formed between your web browser and a secure web site.  EAP-TLS has the benefit of being an open IETF standard, and is considered “universally supported”.
    EAP-TLS uses X.509 certificates and provides the ability to support mutual authentication, where the client must trust the server’s certificate, and vice-versa.  It is considered among the most secure EAP Types, since password capture is not an option; the endpoint must still have the private-key.
    Note:  EAP-TLS is quickly becoming the EAP type of choice when supporting BYOD in the Enterprise.
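
To make the EAP-MD5 exchange a little more concrete: it follows the CHAP model, where the server sends a random challenge and the client answers with an MD5 hash over the packet identifier, the password, and the challenge.  The Python below is a sketch of just that hash comparison, with hypothetical function names; it does not build real EAP packets.

```python
import hashlib
import hmac
import os


def md5_challenge_response(identifier, password, challenge):
    """MD5 over: one-byte identifier, then the shared password, then the challenge."""
    return hashlib.md5(bytes([identifier]) + password + challenge).digest()


def server_verify(identifier, stored_password, challenge, response):
    """Server recomputes the hash from its own copy of the password and compares."""
    expected = md5_challenge_response(identifier, stored_password, challenge)
    return hmac.compare_digest(expected, response)


challenge = os.urandom(16)  # server-chosen random challenge
response = md5_challenge_response(1, b"phone-secret", challenge)  # client side
print(server_verify(1, b"phone-secret", challenge, response))
```

Nothing in this exchange lets the client check who sent the challenge, which is the missing mutual authentication mentioned above.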

Tunneled EAP Types:

The EAP-Types above transmit their credentials immediately.  These next two EAP types form encrypted tunnels first and then transmit the credentials within the tunnel.

  • PEAP:  Protected EAP.  Originally proposed by Microsoft, this EAP Tunnel type has quickly become the most popular and widely deployed EAP method in the world.  PEAP will form a potentially encrypted TLS tunnel between the client and server, using the x.509 certificate on the server in much the same way the SSL tunnel is established between a web browser and a secure web site.  After the tunnel has been formed, PEAP will use another EAP type as an “inner method” – authenticating the client using EAP within the outer tunnel.
    • EAP-MSCHAPv2:  using this inner method, the client’s credentials are sent to the server encrypted within an MSCHAPv2 session.  This is the most common inner-method, as it allows for simple transmission of username and password, or even computer-name and computer-passwords to the RADIUS server, which in-turn will authenticate them to Active Directory.
    • EAP-GTC:  EAP-Generic Token Card (GTC).  This inner method was created by Cisco as an alternative to MSCHAPv2 that allows generic authentications to virtually any identity store, including One-Time-Password (OTP) token servers, LDAP, Novell E-Directory and more.
    • EAP-TLS:  While rarely used, and not widely known, PEAP is capable of using EAP-TLS as an inner method.

  • EAP-FAST:  Flexible Authentication via Secure Tunnel (FAST) is very similar to PEAP.  FAST was created by Cisco Systems as an alternative to PEAP that allows for faster re-authentications and supports faster wireless roaming.  Just like PEAP, FAST forms a TLS outer-tunnel and then transmits the client credentials within that TLS tunnel.  Where FAST differs from PEAP is the ability to use Protected Access Credentials (PACs).  A PAC can be thought of like a secure “cookie”, stored locally on the host as “proof” of a successful authentication.
    • EAP-MSCHAPv2:  using this inner method, the client’s credentials are sent to the server encrypted within an MSCHAPv2 session.  This is the most common inner-method, as it allows for simple transmission of username and password, or even computer-name and computer-passwords to the RADIUS server, which in-turn will authenticate them to Active Directory.
    • EAP-GTC:  EAP-Generic Token Card (GTC).  This inner method was created by Cisco as an alternative to MSCHAPv2 that allows generic authentications to virtually any identity store, including One-Time-Password (OTP) token servers, LDAP, Novell E-Directory and more.
    • EAP-TLS:  EAP-FAST is capable of using EAP-TLS as an inner method.  This has become quite popular with EAP-Chaining.

EAP Chaining with EAP-FASTv2:  As an enhancement to EAP-FAST, a differentiation was made to have a User PAC and a Machine PAC.  After a successful machine-authentication, ISE will issue a Machine-PAC to the client.  Then when processing a user-authentication, ISE will request the Machine-PAC to prove that the machine was successfully authenticated, too.  This is the first time in 802.1X history that multiple credentials have been able to be authenticated within a single EAP transaction, and it is known as “EAP Chaining”.  The IETF is creating a new open standard based on EAP-FASTv2 and at the time I wrote this Blog entry, it was to be referred to as “EAP-TEAP” (tunneled EAP), which should eventually be supported by all major vendors.

How to properly use a Load-Balancer in Cisco’s Identity Services Engine

So, this is my first blog post on here.  Hope it goes well.

One of the most commonly asked questions of late is how to properly use a load-balancer with Cisco’s Identity Services Engine.  Here are some basic guidelines to use when configuring a Load Balancer for the ISE Policy Services Nodes (PSNs).

Understanding terms:

  • PSN = Policy Services Node.  The PSN is the ISE persona that handles all of the RADIUS requests and makes the policy decisions.  If you are using profiling, the PSN is also handling the profiling for you.
  • PAN = Policy Administration Node.  The PAN is the ISE persona that handles all the database synchronization/replication, and provides the administrative GUI.  This node must talk to the PSN directly, without going through NAT.
  • VIP = Virtual IP Address.  This is the IP Address that the Load Balancer listens on; it redirects traffic destined to the VIP to the real IP Addresses of the servers in the Server Farm.
  • Server Farm = The Grouping of servers that will be load balanced when traffic is destined to the VIP
  • Endpoint = the actual device accessing the network.
  • NAD = Network Access Device.  The Access-Layer device (switch / wireless controller) that provides and enforces network access to the endpoint.
  • SNAT = Source Network Address Translation.  Function of load balancers to hide the source ip address of the NAD, which allows the load-balancer to run “out of band”.
  • Server NAT = the reverse of Source NAT.  This is hiding the IP Address of the actual ISE PSN when it initiates communication to the NAD for things like Change of Authorization (CoA), and replacing that IP Address with the VIP instead.

General Guidelines

When using a Load-Balancer (anyone’s) you must ensure a few things.

  • Each PSN must be reachable by the PAN / MNT directly, without  having to go through NAT (Routed mode LB, not NAT).  No Source-NAT.  This  includes the Accounting messages, not just the Authentication ones.
    • This means the Load-Balancer must be in the direct path between the clients and the ISE PSNs.
    • Some organizations have used Policy Based Routing (PBR) to accomplish the path, without physically locating the Load-Balancer between the clients and the PSNs.
  • Endpoints (clients) must be able to reach each Policy Services Node Directly (not going through the VIP) – for redirections / Centralized Web Authentication / Posture Assessments / Native Supplicant Provisioning, and more.
  • You may want to “hack” the certs to include the VIP FQDN in the SAN field (my next blog post should cover this trick).
  • Perform sticky (aka: persistence) based on Calling-Station-ID and Framed-IP-address
  • VIP gets listed as the RADIUS server of each NAD for all 802.1X related AAA.
  • Dynamic-Authorization (CoA):
    • If you use Server NAT to replace the PSN IP address with the VIP Address for Change of Authorization, then you would use the VIP address as the Dynamic-Authorization (CoA) client.
    • Otherwise, use the real IP Address of the PSN, not the VIP.
  • The LoadBalancer(s) get listed as NADs in ISE so their test authentications may be answered, to keep the probes alive.
  • ISE uses the Layer-3 Address to identify the NAD, not the NAS-IP-Address in the RADIUS packet…  This is a big reason to avoid SNAT.
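
The persistence guideline above can be sketched as a lookup table keyed on Calling-Station-ID (the endpoint's MAC address).  This Python model is only an illustration with hypothetical names; real load balancers implement persistence in their own configuration, and the sketch leaves out Framed-IP-address and entry timeouts.

```python
import hashlib


class StickyBalancer:
    """Keep every RADIUS packet for one endpoint on the same PSN."""

    def __init__(self, psns):
        self.psns = list(psns)
        self.table = {}  # Calling-Station-ID -> chosen PSN

    def pick(self, calling_station_id):
        key = calling_station_id.lower()
        if key not in self.table:
            # First packet for this endpoint: choose a PSN and remember it.
            digest = hashlib.sha256(key.encode()).digest()
            index = int.from_bytes(digest[:4], "big") % len(self.psns)
            self.table[key] = self.psns[index]
        return self.table[key]


lb = StickyBalancer(["psn1", "psn2", "psn3"])
first = lb.pick("00-11-22-AA-BB-CC")
print(first == lb.pick("00-11-22-aa-bb-cc"))
```

Keeping authentication and accounting for one endpoint on the same PSN is what lets that node hold the complete session.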

Failure Scenarios:

  • The VIP is the RADIUS Server, so if the entire VIP is down, then the NAD should fail over to the Secondary DataCenter VIP (listed as the secondary RADIUS server on the NAD).
  • Use probes on the Load-Balancers to ensure that RADIUS is responding, as well as HTTPS  (at minimum).
    • LB Probes should send test RADIUS messages to each PSN periodically, to ensure that RADIUS is responding, not just look for open UDP ports.
    • LB Probe should also examine the response for HTTPS, not just look for the open port(s).
  • Use node-groups with the L2-adjacent PSN’s behind the VIP.
    • If the session was in process and one of the PSN’s in a node-group fails, then another member of the node-group will issue a CoA-reauth; forcing the session to begin again.
    • At this point, the LB should have failed the dead PSN due to the probes configured in the LB; and so this new authentication request will reach the LB & be directed to a different PSN…
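
The probe logic above boils down to "a PSN stays in the server farm only while it answers both checks".  A minimal Python sketch follows, assuming the two probe functions are supplied by the caller (here they are stand-ins; a real probe would send a test RADIUS request and an HTTPS request and inspect the replies).

```python
def healthy_psns(psns, radius_probe, https_probe):
    """Return the PSNs that answer both a test RADIUS request and an HTTPS request."""
    return [p for p in psns if radius_probe(p) and https_probe(p)]


# Stand-in probe results for illustration only.
radius_ok = {"psn1": True, "psn2": False, "psn3": True}
https_ok = {"psn1": True, "psn2": True, "psn3": False}

print(healthy_psns(["psn1", "psn2", "psn3"], radius_ok.get, https_ok.get))
```

A new authentication that arrives after a failure would then only be directed to a node from the returned list.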

Why can’t we use Source NAT (SNAT)?

One of the most common questions when load balancing, is: “Why can’t we use SNAT?”  Source NAT is a fantastic thing for general Load-Balancing – but not with ISE.  The reasons listed below pertain to ISE version 1.1.x; and may change with ISE 1.2+

Reason #1:  Network Access Device (NAD) will be wrong:
With SNAT, the source Network Access Device will show up in ISE as being the Load-Balancer, NOT the Network Access Device.

ISE uses sessionized network authentication.  This means ISE is tracking the session along with the NAD – so the NAD & ISE stay in-sync about the state and location of the endpoint…  This session also gives ISE the NAD address to send Change of Authorizations to, as well as the location of the endpoint.

  • The source NAD is used in many different ISE Policies, especially for location data.
  •  If all nodes always appear to be coming from the Load-Balancer, instead of the NAD – how can we know the location of the endpoint?

Location is not nearly as big of a problem as the Change of Authorizations – which are key to a successful deployment.

  • ISE records the Layer-3 Address of the NAD from the Layer-3 headers.
    • There is a RADIUS field known as NAS-IP-Address, which embeds the IP Address of the Network Device in the RADIUS Packet.
    • However, ISE does not currently use that field; and therefore the L3 IP Address of the NAD must be correct for Change of Authorization to be sent to the correct device.
      • If the NAD appears as the IP Address of the Load-Balancer, then ISE will send the  Change of Authorization to the Load-Balancer – not the switch.
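
The effect of SNAT on Change of Authorization can be shown with a tiny model in Python.  The class and function names are hypothetical and the addresses are documentation examples; `coa_destination` simply models the ISE 1.1.x behavior described above, where the Layer-3 source address is used and NAS-IP-Address is ignored.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RadiusRequest:
    src_ip: str          # Layer-3 source address seen by the receiver
    nas_ip_address: str  # NAS-IP-Address attribute inside the RADIUS packet


def apply_snat(request, lb_ip):
    """SNAT rewrites only the Layer-3 source; the RADIUS payload is untouched."""
    return replace(request, src_ip=lb_ip)


def coa_destination(request):
    """Where a CoA would be sent if only the Layer-3 source is recorded."""
    return request.src_ip


from_switch = RadiusRequest(src_ip="192.0.2.10", nas_ip_address="192.0.2.10")
print(coa_destination(from_switch))                             # the switch
print(coa_destination(apply_snat(from_switch, "198.51.100.5")))  # the load balancer
```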

 

Reason #2:  URL Redirection and Web Portals:
Next, ISE 1.1.x only has one interface that can be used for all functions.  Yes, ISE can run RADIUS on any of ISE’s four interfaces, but the Gigabit 0/0 interface is the ONLY interface for Management Traffic.  Also, the fqdn of the Policy services node is embedded into the certificate for ISE 1.1.x; and that is what gets used for URL Redirection for WebAuth & Device Registration & Supplicant Provisioning, etc…

So, when the URL Redirection occurs, the endpoints will need to talk to ISE Directly (not the VIP) – and reach the web portals.  The Portals can ONLY exist on the Gigabit 0/0 Interface in 1.1.x.  (This may change in a future version of ISE).

Reason #3:  Routing Tables:
Unless you add a static route to ISE for every NAD Subnet, ISE does not have the ability in 1.1.x to return traffic on a different subnet through a different Gateway, only its default Gateway.  Therefore, the Load-Balancer MUST be the Default-Gateway for the ISE PSNs (or at least in the path).

Since the Load-balancer must be the default Gateway, then all Management Traffic is also flowing through the Load-Balancer, unless you physically locate the Policy Administrative Node (PAN) and Monitoring & Troubleshooting Node (MNT) behind the load-balancer as well (just don’t include those in the ServerFarm).

I hope that helps.

Aaron