Sunday, April 16, 2017

Broadcast Key Rotation - Part 2

At the end of my last blog, I discussed what happens if a client misses the broadcast key rotation for the AP it is connected to. We know that a client that misses the key rotation will be disconnected, but how many retries are made before the client is removed?

Here are the default EAP settings for a Cisco controller-based wireless network:


The parameters EAPOL-Key Timeout and EAPOL-Key Max Retries should be the answer to the question. With the default settings, there should be three attempts at sending the broadcast key to a client, spaced 1 second apart. If the client has not accepted the new key by the time the third attempt times out, it is disconnected.

I tested this with an AP set to minimum power on channel 165 and my Moto G4. After connecting, I moved my phone away from the AP and watched the debug output on the controller console to see if the key rotation was successful. Eventually I got far enough away, and put enough attenuation between my phone and the AP, for the key rotation to fail.

First, let's have a look at the output of debug dot1x all:


The first seven lines show that the key rotation has started and that the first attempt is being made at transmitting the new key to my phone. Take note of the third line, where it states "message 5 - group." About 1 second later, the first retransmission happens, after the timeoutEvt message appears in the log. Note that at the end of that line it reads "message  = M5." M5 must mean the group key, based on line 3 in the debug. Another second goes by, and there is another timeoutEvt message. The key is transmitted one more time. Another second goes by, and at 15:04:33 the client is disconnected.
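The timing described above can be sketched in a few lines of Python. This is only an illustration of the behavior seen in the debug, not the controller's actual logic. The function name is made up, and the defaults of 1000 ms and 2 retries are assumptions inferred from the three transmissions spaced one second apart.

```python
def group_key_timeline(timeout_ms=1000, max_retries=2):
    """Return (transmit_offsets, disconnect_offset) in seconds.

    Assumed model: the key is sent once, resent after each timeout until
    max_retries is used up, and the client is removed when the timer
    expires after the last retry.
    """
    timeout_s = timeout_ms / 1000
    transmits = [i * timeout_s for i in range(max_retries + 1)]
    disconnect = transmits[-1] + timeout_s
    return transmits, disconnect


transmits, disconnect = group_key_timeline()
print(transmits, disconnect)  # [0.0, 1.0, 2.0] 3.0
```

Under this model, three attempts one second apart put the disconnect about three seconds after the first transmission, which lines up with the client being removed at 15:04:33.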

It appears that the EAPOL-Key Timeout and EAPOL-Key Max Retries parameters do indeed control the behavior of broadcast key retransmissions. While I was logging the debug output, I was also capturing frames on channel 165 with an Aruba IAP. I fired up Wireshark, applied the filter that shows the key rotation frame (wlan.ta = wlan.sa = BSSID, wlan.ra = Moto G4), and scrolled to 15:04:30. And there was nothing there! The key rotation frames were not seen over the air.
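For anyone who wants to try the same search, the filter described above can be written out as a Wireshark display filter. The small Python helper below just builds that string. The helper name and the MAC addresses are placeholders, not values from this capture.

```python
def key_rotation_filter(bssid, client_mac):
    """Build a Wireshark display filter for frames the AP itself sends to one client."""
    return (
        f"wlan.ta == {bssid} && wlan.sa == {bssid} "
        f"&& wlan.ra == {client_mac}"
    )


# Placeholder addresses for illustration only.
print(key_rotation_filter("00:11:22:33:44:55", "66:77:88:99:aa:bb"))
```

Note that this matches any frame the AP originates and addresses to the client, not only key rotation frames, so it is a starting point to scroll through rather than an exact match.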

The debug output clearly says that the key was sent three times, so what happened? To find out, I had to go back in the capture to 15:03:48, where I saw this:



Less than a minute before the key rotation, my client sent a Null Data Packet (NDP) to the AP saying that it was going into sleep mode. I wrote a Wireshark filter to find every Null Data Packet from my phone and the power management state each one signaled. The message at 15:03:48 was the last one sent by the phone on channel 165.
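The exact filter used isn't shown here, but a filter along these lines would do the job. This is a sketch, not the original filter: the helper name and MAC address are placeholders, and including QoS Null frames alongside plain Null frames is an assumption, since many clients send the QoS variant.

```python
NULL_DATA = 0x0024  # Data frame, subtype Null (no data)
QOS_NULL = 0x002C   # Data frame, subtype QoS Null (no data)


def null_frame_filter(client_mac, power_mgmt=None):
    """Build a Wireshark display filter for Null/QoS Null frames sent by a client.

    power_mgmt=True keeps only "going to sleep" frames, False keeps only
    "staying awake" frames, and None keeps both.
    """
    parts = [
        f"wlan.ta == {client_mac}",
        f"(wlan.fc.type_subtype == {NULL_DATA:#06x} "
        f"|| wlan.fc.type_subtype == {QOS_NULL:#06x})",
    ]
    if power_mgmt is not None:
        parts.append(f"wlan.fc.pwrmgt == {int(power_mgmt)}")
    return " && ".join(parts)


# Placeholder client address for illustration only.
print(null_frame_filter("66:77:88:99:aa:bb", power_mgmt=True))
```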

Since the AP had not received another NDP from my phone by the time the broadcast key rotation started, the AP believed the client was still asleep. An AP will not transmit a queued frame to an associated client that it believes is asleep. It first needs to learn that the client is awake, by receiving an NDP with the Power Management bit set to 0 (Wireshark shows this as "STA will stay up"). This goes for key rotation frames too. Looking further in the capture, the AP did not attempt to transmit de-authentication frames to the client either.
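To make that sequence concrete, here is a toy model of the behavior described above. It is not how a Cisco AP or controller is implemented. The class, its method names, and the default timer values are all assumptions for illustration. The only rule it encodes is that a frame for a client believed to be asleep is buffered rather than transmitted.

```python
class ToyAP:
    """Toy model of AP power-save buffering for a single client."""

    def __init__(self, timeout_s=1.0, max_retries=2):
        self.timeout_s = timeout_s
        self.max_retries = max_retries
        self.client_asleep = False
        self.on_air = []  # offsets (s) at which a key frame actually left the radio

    def receive_from_client(self, power_mgmt_bit):
        # Power Management bit: 1 = client is going to sleep, 0 = staying awake.
        self.client_asleep = bool(power_mgmt_bit)

    def rotate_group_key(self):
        """Return 'updated' or 'disconnected' for one key rotation."""
        for attempt in range(self.max_retries + 1):
            if not self.client_asleep:
                self.on_air.append(attempt * self.timeout_s)
                return "updated"  # assume an awake, in-range client answers
            # Client believed asleep: the frame stays buffered, nothing is sent.
        return "disconnected"


ap = ToyAP()
ap.receive_from_client(power_mgmt_bit=1)  # last frame heard said "going to sleep"
print(ap.rotate_group_key(), ap.on_air)   # disconnected []
```

In the model, the controller's retry counter runs out while the list of frames on the air stays empty, which mirrors the mismatch between the debug output and the capture.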

What's the takeaway here? The combination of power save measures and key rotation can result in clients being disconnected from a WLAN without knowing they have been kicked off. Clients are expected to be awake at the DTIM beacon (the beacon where the DTIM counter value is zero) to receive broadcast and multicast traffic, which is delivered at that interval. It's known, though, that some clients ignore the DTIM interval in beacons, preferring to save battery power over receiving that traffic.

Personally, I recommend increasing the broadcast key rotation interval from the default 1 hour to something longer, like 12 or 24 hours. If you have a WLAN that is not supporting voice, consider increasing the DTIM period to 3. This will allow clients that do honor the DTIM interval to conserve power, while avoiding problems with clients that don't honor it.
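As a rough feel for what those numbers mean, the arithmetic can be sketched as follows. The beacon interval of 100 TU is an assumption (a common default, not a value stated in this post), and the function names are made up. One TU is 1024 microseconds.

```python
TU_US = 1024  # one 802.11 time unit, in microseconds


def dtim_interval_ms(dtim_period, beacon_interval_tu=100):
    """Time between DTIM beacons in milliseconds (beacon interval is an assumed default)."""
    return dtim_period * beacon_interval_tu * TU_US / 1000


def rotations_per_day(rotation_interval_hours):
    """How many broadcast key rotations happen in 24 hours."""
    return 24 / rotation_interval_hours


for period in (1, 3):
    print(f"DTIM period {period}: every {dtim_interval_ms(period)} ms")

for hours in (1, 12, 24):
    print(f"Rotation every {hours} h: {rotations_per_day(hours)} per day")
```

With a 100 TU beacon interval, a DTIM period of 3 means a client that honors DTIM wakes for broadcast traffic about every 307 ms instead of every 102 ms. Moving from a 1 hour to a 24 hour rotation cuts the chances of missing a key update from 24 per day to 1.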



