Wednesday, January 1, 2020

Aruba: Connectivity to Mobility Master for DMZ Deployments

I recently had the opportunity to deploy a remote worker solution using Aruba Mobility Controllers and Remote APs (RAPs). One of the first steps I took in preparing to deploy the solution was to download the Aruba Validated Reference Design, or VRD, for remote AP deployments. There was a lot of good information in that VRD, but it was a little dated. The VRD was written for ArubaOS 6.x deployments, and as a result didn't have any information about Mobility Master.

One thing the VRD does make clear is that for RAP deployments, the Mobility Controllers should be located in your DMZ. My deployment was going to use a cluster of three Mobility Controllers in a DMZ, managed by a Mobility Master that wasn't in the DMZ.

One thing that distinguishes Aruba controllers from the "other" guys' is that they are configured with a Controller IP, or LMS-IP. This is the one IP interface on the controller that can terminate a connection with an AP. Because my controllers had to go in the DMZ, my Controller IP had to be a DMZ address. I don't know about your DMZ, but mine has only one way in and out, and that choke point didn't allow access to the internal network (which is good security design). The connection between my Mobility Controllers and my Mobility Master would have to be over a different interface than the DMZ one.

I initially set up my controllers with an IP for a management VLAN, and set port G0/0/0 to that VLAN. This was how I set up my connection to Mobility Master. From there, I created my DMZ interface on G0/0/4 and other settings. I made sure I had my static routes set up correctly, with the default route going out the DMZ and a static route to my Mobility Master.

Now to change the Controller IP from my management VLAN to the DMZ address. As soon as I did that, my Mobility Master lost connectivity with the controllers! I thought it must have been a routing issue, but that wasn't the problem. The controllers would do an automatic config rollback and re-establish connectivity with Mobility Master, but every time I changed the Controller IP, connectivity to MM would break.

To see why the connection to MM was failing, I first had to understand what that connection looked like. The connection between Mobility Controllers and Mobility Master is an IPSec tunnel. That tunnel is established when performing the controller's initial setup through the serial port. During initial setup, you are asked for the IP address of the Mobility Master, and which method to authenticate with. If you are connecting to a Virtual Mobility Master, which I was, the two methods are PSK with IP, or PSK with MAC (where PSK stands for pre-shared key). If you choose PSK with IP, the initial setup uses the IP address of the Mobility Master entered earlier with the pre-shared key to establish the tunnel. At the Mobility Master side, you also enter the IP address of the controller and the same PSK.

This is where my problem was. Since the secure connection between the MC and MM was defined using the IP of the MC, that connection broke when changing the controller IP. To support my configuration, I had to switch the configuration to use PSK with MAC.

Using PSK with MAC is a little trickier than PSK with IP, because you have to pick the correct MAC addresses. Once you do though, it doesn't matter what you set for the Controller IP.

The moral of this story: If you plan on deploying Mobility Controllers in a DMZ for RAPs and need to use an interface for management that won't be the same as your Controller IP, use PSK with MAC as your authentication method for setting up the connection to Mobility Master.

Sunday, July 29, 2018

Cisco + Apple Partnership - Phase 2: iOS Wi-Fi Analytics

Cisco introduced "Phase 2" of its partnership with Apple starting with release 8.5 of wireless controller code. Phase 2 brings a feature called Wi-Fi Analytics. This feature allows certain iOS devices to communicate useful information to the controller during association and disassociation.

According to Cisco and Apple, Wi-Fi Analytics requires iOS 11, and the functionality is limited to certain devices. According to this Cisco document, Wi-Fi Analytics only works on iPhone 7 or higher and iPad Pro or higher.

So how does an iOS device know it is connected to a Cisco wireless network running 8.5 code? Similar to what was seen with Fastlane and Adaptive 11r, beacons and probe responses from Cisco APs include a vendor-specific Apple information element. The difference is that this IE will appear even if Fastlane and Adaptive 11r are not enabled.

One feature of Phase 2 is beacon reports. Beacon reports are defined in 802.11k, and they allow a client to report to the infrastructure how it sees the wireless environment. This is an important metric; it's easy to learn how an AP hears a client, but that is just half the picture. Knowing how clients hear the AP's signal can be a valuable troubleshooting tool, and it allows the controller to optimize 802.11k neighbor reports for future clients.

The neighbor reports appear in the Cisco controller dashboard under client details. Below is a report from an iPad for the one AP that it heard broadcasting the SSID it was connected to.

Client Scan Report
This information is sent via an unsolicited 802.11k beacon report. A Wireshark capture is shown below.

802.11k Beacon Report
The Received Channel Power Indicator (RCPI) is defined as a number between 0 and 255, with values from 0 to 220 indicating a receive power between -110 and 0 dBm in 0.5 dB steps. The conversion formula is supposed to be RCPI / 2 - 110 = dBm, but that does not appear to give a feasible value in this case, as 0xbe (190) would equal -15 dBm. There may be some proprietary characteristics in play. Another indication of this is the Operating Class value of 241, which appears invalid.

Running a 'debug 11k all' on the controller while the neighbor report is sent generates the following output.

Received a 11k Action frame with code 1 from mobile station E4:E0:A6:xx:xx:xx
payloadLen = 31, subIe ID 39 len 29
Measurement report:
Token ID: 96, Mode late: 0, Mode incapable: 0, Mode refused: 0, Type: 5
Found 802.11k beacon report element ID
Regulatory class: 241, Channel number: 165, Measure duration: 46012, Condensed Phy Type: 0, Reported Frame Phy Type: 0, RCPI: 190, RSNI: 27, BSSID: 58ac.78xx.xxxx, Antenn
payloadLen before sub= 31
payloadLen after sub = 0

That's all for this blog. In the future I hope to cover some of the other features included in Phase 2.

Thursday, November 16, 2017

Cisco's Adaptive 11r

There is an excellent post on the Cisco Support forums about options for implementing 802.11r Fast Transition. In summary, there are three ways to implement 802.11r for a WLAN on Cisco wireless:
  • "Pure" mode, where the only Authentication Key Management (AKM) method listed in the Robust Security Network (RSN) Information Element is an FT method. Common FT methods are 802.1X FT or PSK FT. Clients that don't support 802.11r will not be able to connect to this type of WLAN. They may not even see it.
  • "Mixed" mode, where both FT and non-FT AKM methods are included in the AKM suite. This mode allows both clients that do and don't support FT to connect. There will still be clients that get confused by the presence of an FT AKM. Notably, if you change an existing WLAN to mixed mode FT, macOS clients may not be able to connect until you delete the WLAN profile and re-connect.
  • Adaptive 11r. In this mode, the beacon does not advertise the FT AKM at all, but will use FT when supported clients connect. 
Let's look at beacons for different types of FT networks. First, here are the relevant IEs for a non-FT network. 
Beacons and probe responses for a non-FT network will contain non-FT AKM methods in the RSN IE, like PSK shown above. There will also not be a Mobility Domain IE. Now let's look at a mixed mode network.

RSN and Mobility Domain IEs
The RSN IE contains two AKM entries: regular PSK and FT using PSK. In addition to the FT AKM, beacons and probe responses will contain the Mobility Domain IE. Next, things get a little strange with Adaptive 11r.

Adaptive 11r RSN and MD IEs
With Adaptive 11r enabled on a WLAN, the RSN IE does not have any FT methods, but the Mobility Domain IE is present. Beacons will also contain the below IE, even if Aironet IE is disabled on the WLAN. 

Adaptive 11r Aironet IE
This looks similar to the Aironet IE that appears when Fast Lane is enabled. This IE is telling iOS clients that Adaptive 11r is available to use. Similarly, a compatible iOS device will tell the Cisco WLAN that it is, well, an iOS device, by adding a vendor IE into its association request.

Apple Vendor IE
To summarize what we know so far, beacons and probe responses from an Adaptive 11r WLAN will contain an RSN IE with only non-FT elements, but will also include a Mobility Domain IE. This is paradoxically saying that the WLAN does not support FT and supports FT at the same time. So what happens when clients try to connect? First, let's see an association request from a client that doesn't support Adaptive 11r (a Motorola Android phone).

Association Request From Android Phone

This frame looks normal, and is what you would expect when a client is connecting to a non-FT WLAN. There is no Mobility Domain IE, which implies that the client saw that there was no FT AKM method in the RSN IE. The client determined that the network did not support FT, and did not include the Mobility Domain IE. The expanded RSN IE shows that the client will use PSK as the Authentication Key Management. What happens when a client that supports Adaptive 11r connects? 

Association Request from an iPad
The RSN IE shows that the AKM chosen was FT using PSK, which is not advertised in the beacons! This is the secret sauce of Adaptive 11r. You can also see the magic happen if you run "debug client" and "debug ft events enable" while an Adaptive 11r client connects. Below is the output of debug commands while my iPad connected. Interesting lines are highlighted in red.

f4:5c:89:xx:xx:xx Recevied management frame ASSOCIATION REQUEST  on BSSID xx:xx:xx:df:a0:30 destination addr xx:xx:xx:df:a0:3f
f4:5c:89:xx:xx:xx Updating 11r vendor IE 

f4:5c:89:xx:xx:xx Marking this mobile as TGr capable.
RSNIE in Assoc. Req.: (20)

     [0000] 01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 00 0f

     [0016] ac 04 0c 00

f4:5c:89:xx:xx:xx Processing RSN IE type 48, length 20 for mobile f4:5c:89:xx:xx:xx
f4:5c:89:xx:xx:xx Selected Unicast cipher CCMP128 for client device
f4:5c:89:xx:xx:xx RSN Capabilities:  12
f4:5c:89:xx:xx:xx Marking Mobile as non-11w Capable 
f4:5c:89:xx:xx:xx Validating FT AKM's on WLAN
f4:5c:89:xx:xx:xx Setting adaptive AKM 4 into RSN Data at 19

f4:5c:89:xx:xx:xx Sending assoc-resp with status 0 station:f4:5c:89:xx:xx:xx AP:xx:xx:xx:df:a0:30-01 on apVapId 1
f4:5c:89:xx:xx:xx VHT Operation IE: width 20/0 ch 165 freq0 0 freq1 0 msc0 0x3f msc1 0x3f
f4:5c:89:xx:xx:xx Including FT Mobility Domain IE (length 5) in Initial assoc Resp to mobile 
f4:5c:89:xx:xx:xx Sending R1KH-ID as 58:8d:09:cd:75:40
f4:5c:89:xx:xx:xx Sending R0KH-ID as:10.-114.-68.66
f4:5c:89:xx:xx:xx Including FT IE (length 98) in Initial Assoc Resp to mobile 

The most interesting part of the output is "Setting adaptive AKM 4 into RSN Data at 19". AKM 4 is short for FT using PSK, and "Data at 19" specifies the position in the RSN IE that defines the AKM method. If you issue a "show client detail" command for an Adaptive 11r client, you will see that the AKM method listed is an FT one.
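The hex dump in the debug output can be decoded by hand to check this. Below is a minimal Python sketch that walks the 20-byte RSN IE body shown above, assuming the standard RSN layout of version, group cipher, pairwise cipher list, AKM list, and capabilities. The function name parse_rsn_body and the small AKM name table are mine, not anything from Cisco.

```python
import struct

# RSN IE body copied from the debug output above (20 bytes, IE header excluded).
RSN_BODY = bytes.fromhex(
    "01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 00 0f ac 04 0c 00"
)

# AKM suite types under the 00-0F-AC OUI that matter for this post.
AKM_NAMES = {1: "802.1X", 2: "PSK", 3: "FT-802.1X", 4: "FT-PSK"}


def parse_rsn_body(body):
    """Return (akm_types, akm_offsets, capabilities) for an RSN IE body."""
    pos = 2 + 4                      # skip version (2) and group cipher suite (4)
    (pairwise_count,) = struct.unpack_from("<H", body, pos)
    pos += 2 + 4 * pairwise_count    # skip the pairwise count and its suites
    (akm_count,) = struct.unpack_from("<H", body, pos)
    pos += 2
    akm_types, akm_offsets = [], []
    for _ in range(akm_count):
        akm_offsets.append(pos + 3)  # the suite type is the 4th byte of each suite
        akm_types.append(body[pos + 3])
        pos += 4
    (capabilities,) = struct.unpack_from("<H", body, pos)
    return akm_types, akm_offsets, capabilities


akm_types, akm_offsets, capabilities = parse_rsn_body(RSN_BODY)
print([AKM_NAMES.get(t, t) for t in akm_types])  # AKM suites the client selected
print(akm_offsets)                                # offsets within the IE body
print(capabilities)                               # RSN capabilities as an integer
```

The AKM type byte sits at offset 17 of the body. Adding the two bytes of element ID and length gives 19, which lines up with "RSN Data at 19", although that reading of the debug message is my own interpretation. The capabilities value of 12 matches the "RSN Capabilities: 12" line.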

(Cisco Controller) >show client detail f4:5c:89:xx:xx:xx
Client MAC Address............................... f4:5c:89:xx:xx:xx
Client Username ................................. N/A
AP MAC Address................................... xx:xx:xx:df:a0:30
AP Name.......................................... wap004-011-ap01   
AP radio slot Id................................. 1  
Client State..................................... Associated     
Client User Group................................ 
Client NAC OOB State............................. Access
Wireless LAN Id.................................. 14 
Wireless LAN Network Name (SSID)................. Fastlane
Wireless LAN Profile Name........................ Fastlane

Policy Type...................................... WPA2
Authentication Key Management.................... FT-PSK
Encryption Cipher................................ CCMP-128 (AES)

Roaming with an Adaptive 11r compatible client is the same as roaming with regular old FT. When the client sends authentication and reassociation requests to a new AP, it includes Mobility Domain and Fast BSS Transition IEs. Roam time with the iPad I tested with was less than 10 ms. (That's how long it took to go from Authentication to the first data packet sent by the iPad. Getting the iPad to roam in the first place was a challenge, given the environment I was testing with).

I like this feature from Cisco and Apple. There appears to be no risk in breaking connectivity for non-iOS devices if you enable it, and the upside for supported devices is really good. Hopefully this blog gave readers some insight into how this feature works. 

Wednesday, October 25, 2017

Writing Custom IDS Signatures for Cisco WLCs

With the recent discovery of flaws in WPA/WPA2, network operators are paying more attention to the security monitoring capabilities of their wireless infrastructure. Although the severity of KRACK is debatable, paying attention to threats to your wireless network is still a wise practice.

Cisco wireless controllers provide two methods of security monitoring with no licensing requirements other than AP adoption licenses: rogue AP scanning and IDS signature monitoring. The risks of rogue APs are well known. This blog will focus on IDS signature monitoring.

IDS signature monitoring works by listening for common attacks against wireless networks including deauthentication and EAPOL floods. It does this by looking for specific patterns in individual 802.11 frames and analyzing the frequency with which those packets are heard. If enough matching frames are received within a defined window of time, an alert is triggered and (optionally) a trap is sent to a NMS. Cisco WLCs come with a set of pre-defined signatures.

Standard IDS Signatures
Clicking on the signature precedence number will load a page that allows you to edit some parameters of the signature (more details later), but you will not be able to edit the patterns that the IDS signature will look for in frames. It will list them, but you can't edit them. If you want to make your own signatures, you will need to upload a plain-text signature file containing the signature definitions.

To get an idea of how to create your own signatures, you can upload the standard signature file from a controller to a TFTP server. The signature file is in plain text, and extraordinarily well annotated. Go to Commands -> Upload file. Select Signature File from the File Type drop-down, and enter your details for how to transfer the file.

Uploading the Standard Signatures
The standard signature file is very well documented, and describes the syntax required to make a signature. Rather than duplicating the information in the standard signature file, I will focus on the most important aspect of a signature: patterns. A pattern specifies a section of the 802.11 frame to extract and inspect. The section extracted can be one or more bytes, and it is selected by specifying an offset value in bytes (starting at zero), and where to start the offset from: the header or the frame body. The extracted bytes are compared to a user-supplied bitmask with a binary AND operation. The result of the AND is compared to a user-defined value, and if they are equal, the pattern is a match. Signatures can contain one or more patterns, and each pattern must evaluate to true (be a match) for the whole signature to match a frame. The format of each pattern is start:offset:value:mask.


The start parameter is a bit that defines where you start the offset from: 0 for the frame header and 1 for the frame body.

You might be asking "Why the AND operation?" Why not just specify exactly what to match? The answer to that question is: sometimes it's not important that a whole byte matches, but that a specific bit matches. The bitmask with AND operation allows the user to specify exactly which bits are important. If a bit in the mask is set to 0, whether or not that bit matches is not important. If the bit in the mask is set to 1, it is required for the pattern to result in a match.

Here's an example. Suppose you wanted to write a signature that matched all frames transmitted by a locally administered MAC address. The IEEE defines locally administered (non-unique) MAC addresses as having the 2nd least-significant bit in the first octet set to 1. To determine if a transmitting address is locally administered, we only need to look at the 2nd LSB in the first octet of the address. A pattern to match this would be 0:10:0x02:0x02. This pattern reads "Start at the frame header and go to the 10th byte (starting from 0), extract 1 byte from that position (implied by the length of the mask), AND it with 0x02, and if the result is 0x02, there is a match." Since locally administered MAC addresses will have the 2nd LSB equal to 1, a mask of 0x02 is all that is needed. None of the other bits matter in making the determination.
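To make the mask-and-compare step concrete, here is a small Python sketch of how a single-byte pattern such as 0:10:0x02:0x02 could be evaluated. This is only an illustration of the logic, not the controller's code; the helper names and the sample addresses are hypothetical.

```python
def byte_pattern_matches(section, offset, value, mask):
    """Check one single-byte pattern: (section[offset] AND mask) == value."""
    if offset >= len(section):
        return False                 # frame too short to contain the byte
    return (section[offset] & mask) == value


def build_header(transmitter):
    """Build a hypothetical 24-byte management frame header."""
    frame_control = bytes([0xB0, 0x00])
    duration = bytes(2)
    receiver = bytes.fromhex("ffffffffffff")
    bssid = bytes.fromhex("001122334455")
    sequence = bytes(2)
    return frame_control + duration + receiver + transmitter + bssid + sequence


# The transmitter address starts at header offset 10 (2 + 2 + 6 bytes precede it).
local_header = build_header(bytes.fromhex("02aabbccddee"))      # locally administered
universal_header = build_header(bytes.fromhex("f45c89aabbcc"))  # vendor-assigned

print(byte_pattern_matches(local_header, 10, 0x02, 0x02))       # True
print(byte_pattern_matches(universal_header, 10, 0x02, 0x02))   # False
```

The first octet 0xF4 is 1111 0100 in binary, so the 0x02 bit is clear and the pattern does not match, no matter what the other seven bits are.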

Before we go any further, I should point out that it's very helpful to have a packet analyzer like Wireshark or Omnipeek handy while writing IDS signatures. If you have pcaps of the kinds of frames that you want to catch with an IDS filter, it makes it much easier to write the patterns.

There was something that confused me though, and it was how patterns that inspected the frame control field were written. Below is a screen grab of an authentication frame from Wireshark.

Authentication Frame
Wireshark shows the Frame Control field as 0xB000; the first byte is the type/subtype, and the second byte represents the flags. When you look at the standard signature file for the "Auth Flood" signature however, the pattern looks like this:

It appears that the order of the flags and type/subtype bytes is reversed in the pattern, at least as compared to Wireshark. Other fields like addresses do not appear to have the same reversal. I'm going to take it on faith that Cisco wrote the signatures correctly, and use the same format when writing signatures that have to match a pattern in the frame control field.
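One likely explanation (my assumption, not something the signature file states) is byte order. 802.11 transmits multi-byte fields least significant byte first, so the two bytes that Wireshark displays in wire order as 0xB000 form the 16-bit value 0x00B0 when read as a little-endian integer. MAC addresses are written in wire order rather than as integers, which would explain why they show no reversal. A short Python sketch of the idea:

```python
import struct

# First two bytes of an authentication frame as they appear on the air:
# the type/subtype byte first, then the flags byte.
wire_bytes = bytes([0xB0, 0x00])

as_displayed = wire_bytes.hex()                   # bytes in wire order
(as_integer,) = struct.unpack("<H", wire_bytes)   # little-endian 16-bit value

print(as_displayed)            # b000
print(f"0x{as_integer:04X}")   # 0x00B0

# In integer form, a mask of 0x00FF keeps the type/subtype byte
# and ignores the flags byte.
print((as_integer & 0x00FF) == 0x00B0)   # True
```

Read this way, a frame control value/mask pair such as 0x00D0 and 0x00FF puts the type/subtype in the low byte and the flags in the high byte.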

Each signature requires at least one pattern and other parameters. You can read about them in the standard signature file, but I will outline them here too.

  • Version: The version of signature syntax. There is only one allowed value: 0. 
  • FrmType: The type of frame you want to inspect. The two options are data and mgmt (for management frames).
  • Interval: The amount of time, in seconds, over which frames are collected and compared against the signature. 
  • Freq: The number of frames that must match the signature within the interval before an action is taken. This essentially defines a rate for matching packets.
  • Action: What to do when a signature matches frequency frames in interval time. There are only two options: none (do nothing) and report. 
  • Quiet: The amount of time that must pass since the last frame that matched the signature in order for the alert to be cleared. 
  • Track: Defines how to track the alarm event when the signature is matched. Options are tracking by signature, tracking by offending MAC, or both. 
  • MacFreq: Optional parameter that sets the frequency of matched frames required to trigger the alert action if tracking the signature by MAC address.
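To show how Interval and Freq work together as a rate threshold, here is a rough Python sketch. It assumes a sliding window over the timestamps of matching frames. How the controller actually windows its counts is not something I have verified, and the function name signature_fires is made up for this example.

```python
def signature_fires(match_times, interval, freq):
    """Return True if at least `freq` matching frames arrive within
    any span shorter than `interval` seconds."""
    times = sorted(match_times)
    left = 0
    for right, current in enumerate(times):
        # Drop timestamps that are a full interval (or more) older than this one.
        while current - times[left] >= interval:
            left += 1
        if right - left + 1 >= freq:
            return True
    return False


# Three matching frames in under a second, then a straggler.
timestamps = [0.0, 0.2, 0.4, 5.0]
print(signature_fires(timestamps, interval=1, freq=3))   # True
print(signature_fires(timestamps, interval=1, freq=4))   # False
print(signature_fires([12.5], interval=1, freq=1))       # True
```

With Freq set to 1, a single matching frame is enough to trip the signature, which comes in handy for the attack signatures later in this post.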
Now let's take a look at writing some useful signatures. Shortly after the announcement of the KRACK vulnerabilities, a sample script for the 802.11r FT attack was published. The attack script came with a sample pcap, which showed the malicious FT reassociation request frames.

Malicious FT Reassociation Request
Curiously, the FT attack script sets the "More data buffered at AP" bit in the flags section. This isn't normal; the transmitting station is usually a client. If the attack script is not modified from its original form, a pattern to detect this frame would be 0:0:0x2020:0x20FF. The 0x00FF part of the mask will ensure that only reassociation requests are matched, and the 0x2000 part of the mask will match any frame with the "more data" flag set. Together, the 0x20FF mask will only match reassociation request frames with the "more data" flag set. The signature can be made very sensitive by setting the Freq or MacFreq values to 1, so a single frame will trigger the alert.

Another recent vulnerability is CVE-2017-11120, which can crash devices using certain chipsets by sending them 802.11k Neighbor Report Response messages with out-of-bounds operating class and channel values. The published exploit for this flaw sends a series of 11k Neighbor reports with operating class and channel values starting from 225 (0xE1) and ending at 240 (0xF0). These high values for channel number would never normally be seen in a Neighbor Report response frame. For the United States regulatory domain, you would likely never see anything over 165, but 0xE0 is the maximum allowed value. Let's take a look at a Neighbor Report Response frame.

802.11k Neighbor Report Response
I highlighted portions in the frame that correspond to sections an IDS signature will need to match. We will need at least three pattern definitions to match the three segments of the frame to uniquely identify it.

  • The frame control field of 0x00D0 (not shown above). 
  • The category and action code fields, 0x0505, in red above. 
  • The channel field in the neighbor report, in blue above. 
The pattern to match the frame control field can be cloned from the standard signatures. It will be 0:0:0x00D0:0x00FF. To match the category and action codes, a pattern of 1:0:0x0505:0xFFFF will work nicely. Since the action and category codes are seen immediately at the start of the frame body, I used a start of '1' and offset of 0. 

For the channel byte, I want to match anything greater than 0xE0. In binary, 0xE0 is 1110 0000, so anything greater than 0xE0 binary ANDed with 0xE0 will be 0xE0. A pattern of 1:16:0xE0:0xE0 will match any channel value greater than or equal to 0xE0. How can I not match if the channel is equal to 0xE0? The pattern syntax allows a NOT operator, !. If I use a pattern of 1:!16:0xE0:0xFF, this will match as long as the channel value is NOT 0xE0. Combining all four patterns in a single signature will guarantee it only matches Neighbor Report responses that have channel values that are out of bounds. 
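The bit arithmetic behind the two channel patterns is easy to check by brute force. The Python sketch below applies both tests to every possible channel byte. It only models the AND/compare logic described above, and the function name is my own.

```python
def channel_is_out_of_bounds(channel):
    """Apply the two channel tests from the text to a single channel byte."""
    top_bits_set = (channel & 0xE0) == 0xE0      # value 0xE0, mask 0xE0
    not_exactly_e0 = (channel & 0xFF) != 0xE0    # NOT (value 0xE0, mask 0xFF)
    return top_bits_set and not_exactly_e0


matched = [c for c in range(256) if channel_is_out_of_bounds(c)]
print(matched[0], matched[-1], len(matched))   # 225 255 31
print(channel_is_out_of_bounds(165))           # False
print(channel_is_out_of_bounds(0xE0))          # False
```

Every value from 225 (0xE1) through 255 matches, which covers the 225 to 240 range used by the published exploit, while legitimate channels such as 165 and the boundary value 0xE0 do not.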

After you have defined your signatures, you will need to download them to your controller. I recommend just editing the standard signature file by deleting all of the pre-defined signatures, adding in your own signatures, and changing the line that has the keyword "Revision" to read "Revision = custom". If you don't put "custom" in, when you download the signature file to the controller it will replace the standard signatures with your custom ones. After you download them, the new custom signatures will appear in the Custom Signatures section. Below is an example of my FT Reassociation attack signature after it was downloaded to the controller. 

Custom Signature Details
If you want traps to be sent for signature matches, make sure that they are enabled under Management -> SNMP -> Trap Controls. 

Trap Controls
Along with the technical details on custom signatures, here are some observations I have made while working on this blog: 
  • Signatures will match packets transmitted by other APs on the same controller. There is no logic to exclude frames transmitted from other authorized APs. 
  • Matching frames with variable length information elements, or IEs that can appear in arbitrary order, is very difficult. 
  • If you are not using monitor mode APs, your detection is best done while the AP is on-channel. This could lead to you missing attacks happening on channels not served by your APs. You can make the filters more effective by setting the Freq or MacFreq parameters in custom signatures down to 1, so a single frame will trigger the alert. 
Finally, here is a link to a custom signature file containing the signatures that I discussed in this blog. If you wish, you can upload them to your lab controllers and try it out. 

Friday, September 29, 2017

Using SQL Queries to Analyze AP Neighbor Information

There's been debate on the state of 2.4 GHz Wi-Fi. Some say 2.4 GHz is deceased, that it has kicked the bucket, shuffled off its mortal coil, joined the choir invisible. Others say it's not quite dead yet. I'm in the latter group, primarily because I have devices and applications that rely on it.

One thing most Wi-Fi engineers will agree on: if you have a high-density 5 GHz network of dual-radio APs, you will need to turn off some of the 2.4 GHz radios. There are only 3 non-overlapping channels available, and since 2.4 GHz propagates better than 5 GHz, leaving all radios enabled will result in large amounts of co-channel interference. And that's bad.

*Heavy Sigh*
So how do you decide which radios to turn off?

If you have a Cisco wireless network with lightweight APs, you can leverage RRM to help. RRM uses Neighbor Discovery Packets (NDP) to measure the RF "distance" between radios. The end result is two sets of tables: the receive neighbor table and the transmit neighbor table. The receive neighbor table contains a list, for each radio, of the other radios it can hear NDP packets from. The list contains which channel the neighbor was heard on, and the signal strength it was last heard at. The transmit neighbor table contains a list, again for each radio, of how well other radios can hear it. The transmit neighbor table also contains channel and power information for each neighbor.

With this information, you could set a criterion for disabling a 2.4 GHz radio. If a radio has more than X receive neighbors on the same channel as it does, with a power greater than Y, that radio should be disabled. Another criterion could be if a radio has more than A transmit neighbors on the same channel as it does with power greater than B, that radio should be disabled.

Getting the information to determine this is not easy through the CLI or web interfaces. You could do it, but it would be very time-consuming and tedious. Thankfully, there is WLCCA. WLCCA is a GUI tool that can read the output of the "show run-config" command and parse it for very valuable information. One of the things it can do is export the receive neighbor table in CSV format. Once it is in CSV format, it can be imported into a database and SQL can be used to get the information we want.

For WLCCA to work, you need to give it the output of "show run-config." One way of getting the output is logging the output of a CLI session to text file and issuing the command. Another way is to establish a CLI session and use the transfer upload commands to send the output to a TFTP server. This is the preferred method, especially if you have a controller with hundreds of APs. You may need to extend the timeouts on your TFTP server past the defaults for the transfer to complete successfully.

After you have the exported run-config, open WLCCA and import the file.

You'll be prompted with the dialog below to choose certain analysis options. You can uncheck most of the options. After clicking OK, you'll be prompted to select the file.

After importing the file, WLCCA can export the neighbor table. You do this from the Report Center menu.

A file dialog will open to allow you to choose the location and name of the exported file. The name you enter will be appended with "-Nearby" and given a .csv extension. Repeat these steps and export the AP Configuration List (CSV). This file will have "-APsConfig" appended to the file name you enter.

The next step is to import the CSV files into a database. For this blog, I chose Microsoft Access. Before importing in Access, the CSV files need to be cleaned up a bit to make the process smoother.

Open the CSV file for the neighbor list with Excel or a text editor. You will need to delete the first and third lines of the text file, and change the column headings so they don't contain spaces. Below is an example of the Nearby file with the edits made.

You'll also need to edit the APsConfig file in a similar way. This file has many columns, but for our purposes only four are necessary: the ID, AP name, 2.4 GHz channel, and 5 GHz channel. I used Excel for this. My polished data looked like this.

Now that the raw data is in an acceptable format, it can be imported in Access. Launch Access and create a new blank desktop database. Click on the External Data tab, then Text File to launch the wizard. First select the Nearby CSV file. Choose "Import the source data into a new table in the current database," and click OK. Next you'll be asked how to parse the file. Select "delimited" as shown and click Next.

Next, select the delimiter as a comma, and check the box to indicate that the first row contains field names.

Click Next, then Next again. Select the option to let Access add a primary key field, then click Next.

Give the new table a name. I use RxNbrs, which I will reference later in queries. Click Finish.

Repeat these steps to import the APsConfig file. There is one different step in the import process for the APsConfig file: you can choose the ID field as the primary key instead of asking Access to add one for you. When you finish, name the table APsConfig.

I promise, we're getting to the good stuff now. The first SQL query I will write will create the transmit neighbor table from the receive neighbor table. Click on the Create tab, then Query Design.

Don't add any tables, just click Close on the Show Table dialog. Switch to SQL view by selecting it in the upper left corner. Here's the query that will get the transmit neighbor table from the receive neighbor table:

SELECT RxNbr AS [TxAP], AP AS [TxNbr], Power, Slot, Channel
FROM RxNbrs;

The tx neighbor table is really just the inverse of the rx neighbor table! Click on the Save icon and name the query TxNbrs.
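If you want to try the inversion without Access, the same idea can be sketched with Python's built-in sqlite3 module. The three sample rows are made up, and I am assuming the query reads from the RxNbrs table with the column names used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RxNbrs (AP TEXT, RxNbr TEXT, Power INTEGER, Slot INTEGER, Channel INTEGER)"
)
# Hypothetical rows: AP heard RxNbr at Power dBm on Channel.
conn.executemany(
    "INSERT INTO RxNbrs VALUES (?, ?, ?, ?, ?)",
    [
        ("ap01", "ap02", -55, 0, 1),
        ("ap01", "ap03", -70, 0, 1),
        ("ap02", "ap01", -58, 0, 1),
    ],
)

# A saved Access query behaves much like a view, so model TxNbrs as one.
conn.execute(
    "CREATE VIEW TxNbrs AS "
    "SELECT RxNbr AS TxAP, AP AS TxNbr, Power, Slot, Channel FROM RxNbrs"
)

rows = conn.execute(
    "SELECT TxAP, TxNbr, Power FROM TxNbrs ORDER BY TxAP, TxNbr"
).fetchall()
for row in rows:
    print(row)
```

Each row is the same observation with the two AP columns swapped: "ap01 heard ap02 at -55" becomes "ap02 was heard by ap01 at -55".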

Now the real fun begins. Let's combine the information in the APsConfig table and the RxNbrs table to see what radios have Rx neighbors on the same channel they are configured for. Go to the Create tab again, and click Query Design. Click close on the Show Table dialog, and enter SQL view. Here is the query.

SELECT APsConfig.Name, COUNT(RxNbrs.RxNbr)
FROM APsConfig LEFT JOIN RxNbrs ON (APsConfig.[2dot4channel] = RxNbrs.Channel) AND (APsConfig.Name = RxNbrs.AP)
WHERE RxNbrs.Power > -61
GROUP BY APsConfig.Name
HAVING COUNT(RxNbrs.RxNbr) > 2
ORDER BY APsConfig.Name;

This query will produce a list of 2.4 GHz radios that have 3 or more neighbors on the same channel that they hear at a power greater than -61 dBm. You can tune the power level and count to your liking. I chose -61 because that is the level at which, even if the neighbor was transmitting at minimum power, the radio would hear it at about -82 dBm. (Remember, NDP packets are sent at the maximum power supported by the radio.) My environment has several radios that meet these criteria.
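Here is a small Python and sqlite3 sketch of the same query against made-up data, in case you want to experiment with the thresholds before touching real exports. The AP names and power values are hypothetical, and I renamed the 2.4 GHz channel column to ch24 to keep the SQLite syntax simple.

```python
import sqlite3

# Hypothetical sample data; real rows would come from the WLCCA CSV exports.
AP_ROWS = [  # (ID, Name, ch24, ch5)
    (1, "ap01", 1, 36), (2, "ap02", 1, 40), (3, "ap03", 1, 44),
    (4, "ap04", 1, 48), (5, "ap05", 6, 52),
]
RX_ROWS = [  # (AP, RxNbr, Power, Slot, Channel)
    ("ap01", "ap02", -50, 0, 1), ("ap01", "ap03", -55, 0, 1),
    ("ap01", "ap04", -60, 0, 1),
    ("ap02", "ap01", -50, 0, 1), ("ap02", "ap03", -75, 0, 1),
    ("ap02", "ap04", -59, 0, 1), ("ap02", "ap05", -40, 0, 6),
]


def rx_candidates(conn, min_power=-61, min_count=2):
    """Radios hearing more than min_count co-channel neighbors above min_power."""
    query = """
        SELECT APsConfig.Name, COUNT(RxNbrs.RxNbr) AS NbrCount
        FROM APsConfig LEFT JOIN RxNbrs
          ON APsConfig.ch24 = RxNbrs.Channel AND APsConfig.Name = RxNbrs.AP
        WHERE RxNbrs.Power > ?
        GROUP BY APsConfig.Name
        HAVING COUNT(RxNbrs.RxNbr) > ?
        ORDER BY APsConfig.Name
    """
    return conn.execute(query, (min_power, min_count)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE APsConfig (ID INTEGER PRIMARY KEY, Name TEXT, ch24 INTEGER, ch5 INTEGER)")
conn.execute("CREATE TABLE RxNbrs (AP TEXT, RxNbr TEXT, Power INTEGER, Slot INTEGER, Channel INTEGER)")
conn.executemany("INSERT INTO APsConfig VALUES (?, ?, ?, ?)", AP_ROWS)
conn.executemany("INSERT INTO RxNbrs VALUES (?, ?, ?, ?, ?)", RX_ROWS)

candidates = rx_candidates(conn)
print(candidates)   # [('ap01', 3)]
```

In this toy data ap01 hears three co-channel neighbors above -61 dBm and is flagged, while ap02 only hears two (its other neighbors are either too weak or on channel 6) and is not.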

Save this query as RxCandidates. Next, let's do the same analysis for the tx neighbors. Here is the query:

SELECT APsConfig.Name, Count(TxNbrs.TxNbr) AS [CountOfTxNbr]
FROM APsConfig LEFT JOIN TxNbrs ON TxNbrs.Channel = APsConfig.[2dot4channel] AND TxNbrs.TxNbr = APsConfig.Name
WHERE TxNbrs.slot=0 AND TxNbrs.power >-61
GROUP BY APsConfig.Name
HAVING Count(TxNbrs.TxNbr) > 2

This query is getting the list of radios that have more than 2 tx neighbors that see the radio with a power greater than -61 dBm. Again, my environment has plenty of those.

Save this query as TxCandidates. Now we have a pretty good picture of which radios can see other radios at high RSSI, and which radios can be heard by others at high RSSI. We can select radios that meet both criteria by executing the following query:

SELECT Name
FROM RxCandidates
WHERE Name IN (SELECT Name FROM TxCandidates);

Out of 480 APs in my sample deployment, 56 matched both my Rx and Tx criteria. This tells me that these 2.4 GHz radios occupy spaces that are already well covered by other radios, and can probably be disabled without affecting coverage. There are always caveats; make sure that the 2.4 GHz radio isn't necessary for RTLS or other services.
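For anyone who prefers scripting to Access, the three queries can be roughly approximated in Python. This is only a sketch under assumptions: the rows are dictionaries using the column names from the queries, `channels` stands in for the APsConfig table, and the function name and sample data are made up for illustration.

```python
from collections import Counter


def candidates(channels, rows, key, min_power=-61, min_neighbors=3, slot=None):
    """Return AP names that have at least min_neighbors matching rows.

    channels maps AP name -> configured 2.4 GHz channel (stand-in for APsConfig).
    key is the column joined to the AP name ("AP" or "TxNbr" in the queries).
    """
    counts = Counter()
    for row in rows:
        name = row[key]
        if channels.get(name) != row["Channel"]:
            continue                  # not on the AP's configured channel
        if row["Power"] <= min_power:
            continue                  # too weak to count
        if slot is not None and row["Slot"] != slot:
            continue                  # optional slot filter, as in the tx query
        counts[name] += 1
    return {name for name, n in counts.items() if n >= min_neighbors}


# Hypothetical sample data.
channels = {"AP-1": 6, "AP-2": 1}
rx_rows = [{"AP": "AP-1", "Power": p, "Slot": 0, "Channel": 6} for p in (-50, -55, -60)]
rx_rows.append({"AP": "AP-2", "Power": -70, "Slot": 0, "Channel": 1})
tx_rows = [{"TxNbr": "AP-1", "Power": p, "Slot": 0, "Channel": 6} for p in (-45, -52, -58)]

rx_candidates = candidates(channels, rx_rows, key="AP")
tx_candidates = candidates(channels, tx_rows, key="TxNbr", slot=0)
both = rx_candidates & tx_candidates   # radios that meet both criteria
```

The set intersection on the last line plays the role of the `IN (SELECT ...)` subquery.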

I know that the WLCCA tool is awesome, and has built-in reporting to help you find redundant radios. I just like working with data in SQL. If you are interested, try this in your own environment. If you get stuck, reach out to me on Twitter and I'll see if I can help.

Saturday, August 26, 2017

FRA and Macro/Micro Cell Operation - Part 2

Part 1 of this blog series looked at how Cisco 2800/3800 APs running in dual-5 GHz mode can steer clients from the macro cell to the micro cell using 802.11v BSS Transition Management frames. In this installment, I will look at what methods can be used if your clients don't support 802.11v.

Before going into the details of the other method (probe suppression), here is what I have observed while testing a mix of clients:
  • Both Android and iOS devices responded well to 802.11v Transition Management Requests. Sometimes the iOS device I was testing with would reject the request with a reason code 6, but most of the time it accepted the transition. 
  • If there are enough clients connected to the macro cell to warrant a transition to the micro cell, a client that does support 802.11v will be moved, even if it was in the macro cell "first." 
  • According to the latest RRM White Paper, if a client does not support 802.11v, but does support 802.11k, it can be transitioned, but not as gracefully. The client must request a neighbor report, and the returned neighbor list will be limited to the BSSID of the micro cell. The client will then be disassociated, after which it will hopefully connect to the micro cell. I was not able to replicate this; it was hard to find a client that supported 11k but not 11v. Turning off 802.11v on the WLAN resulted in no clients being transitioned at all, whether or not they supported 11k. 
The commands to configure and verify probe suppression are shown below. Probe suppression can be configured to suppress only probe responses, or both probe responses and auth responses.

(Cisco Controller) >config advanced client-steering probe-suppression enable probe-and-auth

(Cisco Controller) >show advanced client-steering

Client Steering Configuration Information

  Macro to micro transition threshold............ -55 dBm
  micro to Macro transition threshold............ -65 dBm
  micro-Macro transition minimum client count.... 1
  micro-Macro transition client balancing win.... 1
  Probe suppression mode......................... probe-and-auth
  Probe suppression validity window.............. 100 s
  Probe suppression aggregate window............. 200 ms
  Probe suppression transition aggressiveness.... 3
  Probe suppression hysteresis................... -6 dBm

The macro to micro transition threshold has a similar meaning with probe suppression as it did with 11v transition. If a new client is a transition candidate, probes received on the macro radio with an RSSI stronger than the macro to micro threshold will have their responses suppressed.

Probe suppression steering introduces four new parameters, only two of which are user configurable. The parameters perform the same function as those under Wireless -> Advanced -> Band Select, but have slightly different names.

The probe suppression aggregate window is the amount of time within which a burst of probes from a client on a single channel is considered a single probe. This is similar to the Scan Cycle Period Threshold value in Band Select. Sometimes clients will send out probes in bursts. Below is a Motorola G4 probing on 5 GHz. It sends bursts of 5 probes on the same channel, just milliseconds apart. The client-steering engine will treat these 5 probes as a single probe because they all happened within 200 ms.

Probe Bursts
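To make the aggregation idea concrete, here is a small Python sketch. It is not Cisco's implementation; it assumes the window is measured from the first probe of each burst, and the function name and timestamps are hypothetical.

```python
def aggregate_probes(timestamps_ms, window_ms=200):
    """Group probe timestamps so that each burst counts as a single probe."""
    bursts = []
    for t in sorted(timestamps_ms):
        if bursts and t - bursts[-1][0] <= window_ms:
            bursts[-1].append(t)   # still inside the current burst's window
        else:
            bursts.append([t])     # first probe of a new burst
    return bursts


# Five probes a few milliseconds apart, then another burst about 6 seconds later.
bursts = aggregate_probes([0, 3, 6, 9, 12, 6000, 6004])
probe_count = len(bursts)          # what the steering logic would count
```

With these sample timestamps, seven probe frames collapse into two countable probes.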

The probe suppression validity window is the maximum amount of time that can elapse between probes (or bursts of probes) from a single client received on the macro radio before the earlier probes stop counting. The default value is 100 seconds, and it acts as an age-out timer.

The validity window works with the transition aggressiveness value, which corresponds to the probe cycle count value under Band Select. The transition aggressiveness value sets a limit on the number of times probe responses from the macro radio will be suppressed. The default is 3. If a probing client was a candidate to have probe responses from the macro cell suppressed, and the client had probed out on the macro channel 3 times within 100 seconds, the fourth probe (or burst) on the macro radio would be answered. This allows clients to connect to the macro cell if they refuse to connect to the micro cell because the RSSI at the client is too low.
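The interaction between the validity window and the aggressiveness value can be modeled in a few lines of Python. This is a sketch of my understanding, not the controller's actual logic. It assumes the client is already a suppression candidate, that the window is measured between consecutive probes (or bursts), and that the AP keeps answering once the limit is reached. The class and method names are hypothetical.

```python
class MacroProbeSuppressor:
    """Toy model of probe suppression on the macro radio."""

    def __init__(self, aggressiveness=3, validity_s=100.0):
        self.aggressiveness = aggressiveness
        self.validity_s = validity_s
        self.history = {}   # client MAC -> (suppressed count, time of last probe)

    def should_respond(self, mac, now_s):
        count, last = self.history.get(mac, (0, None))
        if last is not None and now_s - last > self.validity_s:
            count = 0                           # earlier probes aged out
        if count >= self.aggressiveness:
            self.history[mac] = (count, now_s)  # limit reached: answer the probe
            return True
        self.history[mac] = (count + 1, now_s)  # suppress and remember it
        return False


# A client that probes the macro channel every 6 seconds (hypothetical MAC).
suppressor = MacroProbeSuppressor()
answers = [suppressor.should_respond("aa:bb:cc:00:00:01", t) for t in (0, 6, 12, 18)]
```

In this model the first three probes go unanswered and the fourth gets a response, which matches the behavior described above.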

The probe suppression hysteresis is a user configurable value between -3 and -6 dBm, with the default being -6. When Cisco uses the word hysteresis, it refers to a dampening method to prevent clients from bouncing back and forth between radios. In the context of Client Roaming, under Wireless -> 802.11a/b, the hysteresis value tells CCX clients to move to a new AP only if the RSSI value is 3 dB better than the current AP. I stumbled across the meaning of the hysteresis in probe suppression by trying to adjust the values of the transition RSSI thresholds.

(Cisco Controller) >config advanced client-steering transition-threshold macro-to-micro -60

Value must be greater than micro to Macro RSSI - probe suppression hysteresis

(Cisco Controller) >config advanced client-steering transition-threshold micro-to-macro -60

Value must be less than Macro to micro RSSI + probe suppression hysteresis

In this case, it looks like the -6 dBm hysteresis means that probes for clients already associated to the AP would have to be 6 dB weaker/stronger to get moved to the other cell. This makes sense, as you don't want the client bouncing back and forth between the micro and macro cells because of small differences in RSSI that could just be from different client device orientations.
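Taken literally, the two error messages rearrange to the same inequality: the macro-to-micro threshold minus the micro-to-macro threshold must be greater than the size of the hysteresis. The Python sketch below checks that reading against the values I tried. It assumes the controller applies the messages exactly as worded; the function name is hypothetical.

```python
def thresholds_valid(macro_to_micro, micro_to_macro, hysteresis=-6):
    """Check both CLI constraints, which reduce to one inequality.

    macro_to_micro > micro_to_macro - hysteresis
    micro_to_macro < macro_to_micro + hysteresis
    Both are equivalent to: macro_to_micro - micro_to_macro > -hysteresis
    """
    return macro_to_micro - micro_to_macro > -hysteresis


defaults_ok = thresholds_valid(-55, -65)       # 10 dB gap, wider than 6 dB
first_attempt = thresholds_valid(-60, -65)     # macro-to-micro changed to -60
second_attempt = thresholds_valid(-55, -60)    # micro-to-macro changed to -60
```

Both of my attempts leave only a 5 dB gap between the thresholds, which is why the controller rejected them.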

My testing with probe suppression for client steering was mostly subjective. Since the clients did not associate, I could not use "show client detail" to see the RSSI of the probe requests at the AP. I could definitely see probe suppression in action over the air. Below is a capture on channels 44 and 161. The macro cell was on channel 161, and you can see probes on 161 being ignored.

Probe Suppression of Macro Cell
The client connects to the micro cell on channel 44.

Other testing I conducted involved the transition aggressiveness factor. My Moto G4 cycles through the 5 GHz channels in about 6 seconds. With a transition aggressiveness factor of three, the first three probe cycles are suppressed and the fourth is answered, so it should take about 24 seconds (four cycles of about 6 seconds each) to see probe responses from the macro cell. My observations lined up with this prediction within a few seconds.

Overall, I didn't find the probe suppression method of client steering to be as predictable as the 11v method, but it did work satisfactorily. Given that most clients now support 11v, I would prefer using that method over probe suppression.

Sunday, August 13, 2017

FRA and Macro/Micro Cell Operation - Part 1

NOTE: This blog is not about the merits, performance, or lack thereof with dual-5 GHz radios. This is a blog about the operation of dual-cell APs, specifically how the AP transitions clients between the macro and micro cells.

Cisco 2800/3800 APs support Flexible Radio Assignment (FRA), which allows the 2.4 GHz radio to flip to either a monitor radio or another client-serving 5 GHz radio. When the AP is operating in this dual-5 GHz mode, the normal radio (slot 1) powers the macro cell, and the flexible radio (slot 0) powers the micro cell. The terms macro and micro are used for two reasons. The first is the antenna: when the flexible radio is put into 5 GHz mode, either automatically through the FRA algorithm or manually, the radio switches to a more directional antenna than the normal 2.4 GHz antenna. See the antenna radiation patterns of the 3800 from the AP2800/3800 Deployment Guide:

AP3800 Antenna Patterns
The second reason is the reduction in power on the flexible radio when operating in dual-5 GHz mode. The flexible radio is locked into transmitting at the lowest power supported, which is usually 2 dBm. The reduction in power, along with a mandatory separation of at least 100 MHz between the micro and macro radios, is to reduce the near-field effects of the two radios interfering with one another.

Looking at the elevation pattern (right), you can see that the macro radio has a "dead spot" directly below the AP. The micro radio (blue line) has a 15 dB advantage over the macro radio in this dead spot. Unfortunately, because of the power limit on the micro radio, most clients will still perceive the macro radio as having a higher signal strength. This is even more true when not directly under the AP.

In order to take advantage of the micro cell, the AP/Controller has to have a way to nudge clients that connect to the macro cell over to the micro cell. This mechanism is called client steering, and the default method to steer clients is the 802.11v BSS Transition Request.

To see how client steering works with flexible radios, the following settings must be made on the controller:
  • Flexible Radio Assignment must be enabled globally under Wireless -> Advanced. 
  • The flexible radio in the AP2800/3800 can either be set to Auto or client serving. When in auto mode, the FRA algorithm determines if it is better to leave the radio in 2.4 GHz client serving mode, monitor mode, or 5 GHz client serving mode. For my testing I manually configured the flexible radio as 5 GHz client serving. 
  • BSS Transition Management must be enabled for the WLAN that the clients will connect to. 
Unlike normal BSS Transition Management between distinct APs, Optimized Roaming is not required for 11v frames to be used to transition clients between the macro and micro cells. Radio power and channel settings can be left to auto. The power will not be adjustable on the micro radio, even if it is set to manual. For ease of testing, I removed DFS channels from the 802.11a channel plan. This is how my AP3800 looked:

AP Name                          Channel    TxPower       Allowed Power Levels    
-------------------------------- ---------- ------------- ------------------------
FRA-AP                           44*        *8/8 ( 2 dBm) [22/19/16/13/10/7/4/2]
FRA-AP                           149*       *2/7 (16 dBm) [19/16/13/10/7/4/2/0]

It's also helpful to see the BSSID values that are assigned to the macro and micro cells, to be able to confirm values in the BSS Transition Management requests.

(Cisco Controller) >show ap wlan 802.11a FRA-AP

Site Name........................................ TestGroup
Site Description................................. 

WLAN ID          Interface          BSSID                            
-------         -----------        --------------------------       
14              xxxxxxxxxx           58:ac:78:xx:xx:3f  
(Cisco Controller) >show ap wlan 802.11-abgn FRA-AP

Site Name........................................ TestGroup
Site Description................................. 

WLAN ID          Interface          BSSID                         
-------         -----------        --------------------------     
14              xxxxxxxxxx           58:ac:78:xx:xx:30            

Note the difference in the last octet of the BSSIDs between the macro and micro cells. I will reference this later.

The command to list the client steering parameters is shown below, along with the default values and an explanation of each:

(Cisco Controller) >show advanced client-steering

Client Steering Configuration Information

  Macro to micro transition threshold............ -55 dBm
  micro to Macro transition threshold............ -65 dBm
  micro-Macro transition minimum client count.... 3
  micro-Macro transition client balancing win.... 3
  Probe suppression mode......................... disabled
  Probe suppression validity window.............. 100 s
  Probe suppression aggregate window............. 200 ms
  Probe suppression transition aggressiveness.... 3
  Probe suppression hysteresis................... -6 dBm

Macro to micro transition threshold: This is a value in RSSI above which a client can be transitioned from the macro cell to the micro cell. The default is -55 dBm. This is the RSSI at the AP. For example, if a client connects to the macro cell and its RSSI at the AP is greater than -55 dBm, it is a candidate to be transitioned to the micro cell. 

Micro to macro transition threshold: This is a value in RSSI below which a client can be transitioned from the micro cell to the macro cell. The default is -65 dBm. For example, if a client connected to the micro cell initially and had an RSSI at the AP of less than -65 dBm, it will be transitioned to the macro cell. Given the power difference between the macro and micro cells, it is rare for clients to be transitioned from the micro cell to the macro cell.

micro-Macro transition minimum client count: The minimum number of clients in either macro or micro cells that will trigger a transition for the next client connecting. The default is 3. For example, if there are 3 clients in the macro cell, the 4th client that tries to connect to the macro cell will be transitioned to the micro cell, if it meets the requirements for macro-to-micro transition threshold RSSI. 

micro-Macro transition client balancing window: This specifies the minimum difference in client count between the macro and micro cells that must exist before a client can be transitioned between cells. The default is three. Imagine a scenario where there are 5 clients in the macro cell and 3 in the micro cell. The difference in clients between the cells is 2, which is below the default balancing window value of 3. The next client that connects to the macro cell will not be transitioned to the micro cell, even if it meets the RSSI requirements. Now there are 6 clients in the macro cell and 3 in the micro cell, and the difference in client count now meets the balancing window requirement. The next client that connects to the macro cell will be transitioned to the micro cell, IF it meets the RSSI requirement. 
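Putting the RSSI threshold, minimum client count, and balancing window together, the macro-to-micro decision can be sketched in Python as follows. This is my reading of the parameter descriptions, not Cisco's code. It assumes the client counts are taken before the new client joins, and the function name is hypothetical.

```python
def eligible_for_micro(rssi_dbm, macro_clients, micro_clients,
                       threshold_dbm=-55, min_clients=3, balancing_window=3):
    """Decide whether a client joining the macro cell is a transition candidate."""
    if rssi_dbm <= threshold_dbm:
        return False    # not strong enough at the AP
    if macro_clients < min_clients:
        return False    # too few clients in the macro cell to bother
    # The macro cell must lead the micro cell by at least the balancing window.
    return macro_clients - micro_clients >= balancing_window


fourth_client = eligible_for_micro(-50, macro_clients=3, micro_clients=0)
five_vs_three = eligible_for_micro(-50, macro_clients=5, micro_clients=3)
six_vs_three = eligible_for_micro(-50, macro_clients=6, micro_clients=3)
weak_client = eligible_for_micro(-60, macro_clients=6, micro_clients=3)
```

The four calls mirror the examples above: the 4th client is a candidate, 5 versus 3 is not, 6 versus 3 is, and a client heard at -60 dBm is never a candidate regardless of the counts.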

I only had two 802.11v capable clients to test with, so I changed both the transition minimum client count and transition client balancing window to 1.

(Cisco Controller) >show advanced client-steering

Client Steering Configuration Information

  Macro to micro transition threshold............ -55 dBm
  micro to Macro transition threshold............ -65 dBm
  micro-Macro transition minimum client count.... 1
  micro-Macro transition client balancing win.... 1

With these parameters, I could connect one client to the macro cell, and the second client to connect would (hopefully) get transitioned to the micro cell. I used 'debug client' and specified the MAC addresses for both clients.

I connected the first client to the SSID and confirmed through the CLI that it had connected to the macro cell. When using an AP set up in macro/micro mode, you will see extra lines in the 'debug client' output that contain XOR:

f8:95:c7:xx:xx:xx Association received from mobile on BSSID 58:ac:78:xx:xx:xx AP FRA-AP
f8:95:c7:xx:xx:xx Station:  F8:95:C7:xx:xx:xx  trying to join WLAN with RSSI 208. Checking for XOR roam conditions on AP:  58:AC:78:xx:xx:xx  Slot: 1
f8:95:c7:xx:xx:xx Station:  F8:95:C7:xx:xx:xx  is not eligible for XOR roam on AP  58:AC:78:xx:xx:xx 

The first client is not eligible for transition to the micro cell because it is the only client in the macro cell. Let's see what happens when the second client connects to the macro cell.

80:00:6e:xx:xx:xx Processing assoc-req station:80:00:6e:xx:xx:xx AP:58:ac:xx:xx:xx-01 ssid : xxxxxx thread:1a722e30
80:00:6e:xx:xx:xx Station:  80:00:6E:xx:xx:xx  trying to join WLAN with RSSI 212. Checking for XOR roam conditions on AP:  58:AC:78:xx:xx:xx  Slot: 1
80:00:6e:xx:xx:xx Station:  80:00:6E:xx:xx:xx  scheduled to transition to new BSS on AP  58:AC:78:xx:xx:xx

We see in the second line that the debug output shows an RSSI value of 212. I'm not sure how this scale of RSSI equates to a power level in dBm, but it appears to be above the threshold of -55 dBm. The third line indicates that the client will be scheduled for transition.
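One possible reading, and this is purely a guess that nothing in the debug output or documentation confirms, is that the value is a signed 8-bit dBm figure printed as an unsigned byte. The Python sketch below shows what the debug values would translate to under that assumption; the function name is hypothetical.

```python
def debug_rssi_to_dbm(raw):
    """Reinterpret an unsigned byte (0-255) as a signed 8-bit value.

    ASSUMPTION: the debug RSSI is a two's-complement dBm value. This is
    unverified and only offered as one way to read the numbers.
    """
    return raw - 256 if raw > 127 else raw


# Values seen in the debug output in this post.
guesses = {raw: debug_rssi_to_dbm(raw) for raw in (208, 212, 217)}
```

Under this guess, 212 would be -44 dBm, which is at least consistent with the client being above the -55 dBm threshold.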

Before the client is transitioned, it sends an 802.11k Neighbor Report request. The debug output is interesting.

80:00:6e:xx:xx:xx Got action frame from this client.
80:00:6e:xx:xx:xx Station:  80:00:6E:xx:xx:xx  sent 802.11K neighbor request to AP  58:AC:78:xx:xx:xx 
80:00:6e:xx:xx:xx Station:  80:00:6E:xx:xx:xx  sent request with RSSI (0) to XOR roam capable AP  58:AC:78:xx:xx:xx  Slot 1
80:00:6e:xx:xx:xx Station:  80:00:6E:xx:xx:xx  limiting neighbors to sibling radios on AP  58:AC:78:xx:xx:xx 

Because the client-steering engine had already decided to transition this client to the micro cell, it limits the list of neighbors it will send back to the BSSIDs on the micro cell.

Note that the transition of the client is scheduled; it doesn't happen immediately. Perhaps this is to prevent flapping of clients transitioning between the micro and macro cells as clients join and leave the cell. It may also be delayed to allow clients in motion to roam to other macro cells, instead of being pulled back into the micro cell of a far away AP. My testing indicates that the amount of time that elapses between the association of the client that triggers the XOR roam and the transmission of the 802.11v BSS Transition Management Request varies. If I find more information I will update this blog.

Below we see the sequence of events as the second client is transitioned to the micro cell.

80:00:6e:xx:xx:xx apf80211vSendPacketToMs: 802.11v Action Frame sent successfully to wlc
80:00:6e:xx:xx:xx Setting Session Timeout to 4 sec - starting session timer for the mobile 
80:00:6e:xx:xx:xx Setting Session Timeout to 40 sec - starting session timer for the mobile 
80:00:6e:xx:xx:xx Got action frame from this client.
80:00:6e:xx:xx:xx Processing assoc-req station:80:00:6e:xx:xx:xx AP:58:ac:78:xx:xx:xx-00 ssid : xxxxx thread:18f453d8
80:00:6e:xx:xx:xx Station:  80:00:6E:xx:xx:xx  trying to join WLAN with RSSI 217. Checking for XOR roam conditions on AP:  58:AC:78:xx:xx:xx  Slot: 0
80:00:6e:xx:xx:xx Station:  80:00:6E:xx:xx:xx  is not eligible for XOR roam on AP  58:AC:78:xx:xx:xx

Line 1 is the debug entry for sending the 11v BSS Transition Request, which is acknowledged in line 4. The client re-associates in line 5. Line 6 indicates that the client is associating to the XOR radio slot 0, which is the micro cell. In line 7 we see that the client is not eligible for transition from the micro back to the macro cell: the difference in the number of clients between the cells (0) is not greater than the transition client balancing window (1).

Over the air, we see a BSS Transition Management request that includes a candidate list. The only entry in the candidate list is the BSSID of the WLAN on the micro radio.

BSS Transition Management Request
Wireshark does not completely decode the candidate list entries, but they are in the same format as an 802.11k Neighbor Report. I highlighted the important elements: the BSSID and the channel number. The BSSID matches what we expect, and so does the channel number, 44. The client responds to the request with the following action frame.

BSS Transition Management Response

The important fields here are the BSS Transition Response Status Code and the BSS Transition Target BSS. The Status Code communicates whether the client accepts or rejects the request, and a value of 0 indicates that the client accepts it. The Transition Target BSS indicates the BSSID that the client intends to transition to. In this case, it matches the BSSID in the candidate list from the request frame.
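If you are reading these responses out of a capture, a small lookup table helps. The Python sketch below lists the status codes as I recall them from the BSS Transition Management status code table in the IEEE 802.11 standard. Treat the wording as approximate and check it against the standard; the function names are hypothetical.

```python
# BTM Response status codes, as recalled from IEEE 802.11 (verify before relying on them).
BTM_STATUS = {
    0: "Accept",
    1: "Reject - unspecified reason",
    2: "Reject - insufficient Beacon or Probe Response frames from candidates",
    3: "Reject - insufficient available capacity from candidates",
    4: "Reject - BSS termination undesired",
    5: "Reject - BSS termination delay requested",
    6: "Reject - STA BSS transition candidate list provided",
    7: "Reject - no suitable BSS transition candidates",
    8: "Reject - leaving ESS",
}


def describe_btm_status(code):
    """Return a readable description, or 'Reserved' for unlisted codes."""
    return BTM_STATUS.get(code, "Reserved")


def client_accepted(code):
    """A status code of 0 means the client accepted the transition."""
    return code == 0


accepted = client_accepted(0)
rejection = describe_btm_status(6)
```

A status code of 0 is the accept case shown in the capture above; any other value means the client declined the request.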

Now we see that the second client has been transitioned to the micro radio on slot 0.

(Cisco Controller) >show client summary 
MAC Address       AP Name           Slot Status        WLAN  Auth Protocol         Port Wired Tunnel  Role
----------------- ----------------- ---- ------------- ----- ---- ---------------- ---- ----- ------- ----------------
80:00:6e:xx:xx:xx FRA-AP             0   Associated     14   Yes   802.11n(5 GHz)   13   No    No      Local           
f8:95:c7:xx:xx:xx FRA-AP             1   Associated     14   Yes   802.11n(5 GHz)   13   No    No      Local    

That's it for Part 1 of this series on client steering between macro/micro cells on an AP3800. Next I will look at what can be done if clients do not support 11v.