Friday, April 21, 2017

RRM Neighbor Timeout Factor and Channel Utilization

If you work with Cisco wireless networks, I highly recommend that you read their Radio Resource Management white paper. If you want to understand how RRM works and how to tune it for your environment, this is the document you need to read.

The foundation of all RRM operations is the neighbor table. It is a list of each radio in the system, how other radios hear it, and how it hears other radios. The neighbor table is built using Neighbor Discovery Protocol, or NDP, frames. NDP frames are sent at the maximum power and minimum data rate supported by the radio and channel. How often NDP frames are sent depends on the Neighbor Packet Frequency setting under 802.11a/b - RRM - General. The Neighbor Packet Frequency determines how often a radio goes off-channel (yes, I said off-channel) to transmit an NDP frame on each channel in the band. Which channels are used to transmit NDP frames is controlled by the Noise/Interference/Rogue/CleanAir Monitoring Channels setting, also in 802.11a/b - RRM - General.

The default setting is Country Channels, which means the radio will try to send NDP frames on every channel allowed in your defined regulatory domain for that band. (DFS channels are special; see the RRM white paper for more details.)

Just like all frames sent on a wireless medium, NDP frames must follow the rules of the DCF. The medium must be idle on the specific channel the NDP frame is going to be sent on. Unlike regular frames, NDP frames will not wait very long for the opportunity to transmit; remember, the radio is off-channel and can't serve clients. If the frame can't be sent, the radio will simply re-schedule the next NDP transmission for that channel at the defined Neighbor Packet Frequency.

To compensate for short-term problems with transmitting NDP frames, RRM operation uses a Neighbor Pruning Interval value. After a neighbor is discovered, it will stay in the Neighbor Table for a specific amount of time, even if NDP transmission fails due to high channel utilization.

Prior to 8.0, the Neighbor Pruning Interval was fixed at 1 hour. In 8.0, it is fixed at 15 minutes.

Let's look at the defaults for 7.6 code and do an exercise. The Neighbor Packet Frequency is 60 seconds by default in 7.6. In order for a radio to drop off the neighbor table, the NDP would need to fail to transmit 60 times in a row. That's a lot of chances for an NDP to get through, which would result in a very stable Neighbor Table.

In 8.0 and above, things change. The default Neighbor Packet Frequency increases to 3 minutes, and the Neighbor Pruning Interval shortens to 15 minutes. This means a neighbor could drop off the table if 5 NDP transmissions in a row fail.
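The arithmetic behind both exercises is just the pruning interval divided by the packet frequency. Here is a minimal sketch of it; the function name `misses_to_prune` is made up for illustration and is not a controller setting or command.

```python
def misses_to_prune(pruning_interval_s: int, packet_frequency_s: int) -> int:
    """Consecutive failed NDP transmissions before a neighbor is pruned.

    Hypothetical helper: divides the Neighbor Pruning Interval by the
    Neighbor Packet Frequency, both given in seconds.
    """
    return pruning_interval_s // packet_frequency_s


# 7.6 defaults from the text: 1 hour pruning interval, 60 second NDP frequency.
print(misses_to_prune(60 * 60, 60))    # 60

# 8.0 defaults from the text: 15 minute pruning interval, 180 second NDP frequency.
print(misses_to_prune(15 * 60, 180))   # 5
```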

Why shorten the Neighbor Pruning Interval? Since the neighbor table is used for both DCA and TPC, a neighbor dropping out of the table could result in radios near it increasing their power. A shorter Neighbor Pruning Interval results in faster adjustment to the loss of an AP.

In 8.1 and above, Cisco introduced a new parameter called the Neighbor Timeout Factor, or NTF for short. The NTF allows the user to adjust the Neighbor Pruning Interval in the following way:

$$ \text{Neighbor Pruning Interval} = \text{NTF} \times \text{Neighbor Packet Frequency} $$

In order for a radio to drop out of the Neighbor Table in 8.1 and above, the NDP transmission would have to fail NTF times in a row.

Now let's take a look at how this all ties in with channel utilization. Suppose there are two APs: "A" on channel 36 and "B" on channel 149. The two radios are close enough to one another to "hear" each other's NDP frames. Every 180 seconds, "A" goes off-channel to send an NDP frame on channel 149.

Now suppose that channel utilization on "B" is x%. It's a bit of a simplification, but this means that "A" has an x% chance to fail its NDP transmission on channel 149.

The chance that "A" will fail to transmit its NDP frame on channel 149 NTF times in a row is

$$ P_{\text{fail}} = \left(\frac{x}{100}\right)^{\text{NTF}} $$

Let's say that x is 50% and NTF is 5. The chance that "A" would fail to transmit an NDP frame on channel 149 five times in a row would be about .03, or 3%. Conversely, the chance that NDP transmission would succeed on at least one of the 5 attempts would be 97%, or

$$ 1 - 0.5^{5} \approx 0.97 $$
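If you want to try other numbers, the calculation is easy to script. This is a rough sketch using the same simplification as above, namely that each attempt fails independently with a probability equal to the channel utilization. The function names are my own, not anything from Cisco.

```python
def chance_all_attempts_fail(utilization_pct: float, ntf: int) -> float:
    """Chance that NTF consecutive NDP attempts all fail.

    Simplifying assumption: each attempt fails independently with
    probability equal to the channel utilization.
    """
    return (utilization_pct / 100.0) ** ntf


def chance_neighbor_survives(utilization_pct: float, ntf: int) -> float:
    """Chance that at least one of NTF attempts gets through."""
    return 1.0 - chance_all_attempts_fail(utilization_pct, ntf)


# 50% utilization on the target channel, NTF of 5.
print(f"{chance_all_attempts_fail(50, 5):.5f}")   # 0.03125
print(f"{chance_neighbor_survives(50, 5):.5f}")   # 0.96875
```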

Suppose you wanted the chance of at least one NDP frame being transmitted out of 5 attempts to be 99%. What would the channel utilization have to be under for this to happen?

$$ 1 - \left(\frac{x}{100}\right)^{5} \ge 0.99 \quad\Longrightarrow\quad x \le 100 \times 0.01^{1/5} \approx 39.8 $$

Channel utilization on "B" would need to be under 40% for a 99% chance that "A" stays in the Neighbor Table. Keep in mind that channel utilization is not the only factor in NDP transmission; you also have to deal with scan defer settings for voice traffic.
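The same question can be answered for any target and any NTF by solving for the utilization directly. The sketch below assumes the simplified model where each NDP attempt fails with a probability equal to the channel utilization; `max_utilization_pct` is a hypothetical helper name.

```python
def max_utilization_pct(target_success: float, ntf: int) -> float:
    """Highest channel utilization (percent) that still gives at least
    `target_success` odds of one NDP frame getting through in NTF tries.

    Solves 1 - (x/100)**ntf >= target_success for x, under the
    simplified independent-failure model.
    """
    return 100.0 * (1.0 - target_success) ** (1.0 / ntf)


# 99% target with 5 attempts.
print(round(max_utilization_pct(0.99, 5), 1))   # 39.8
```

The result is a ceiling for a single neighbor relationship; it says nothing about scan defer or anything else that keeps a radio from going off-channel.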

The takeaways:

  • The longer the Neighbor Pruning Interval (the higher the NTF), the more stable RRM will be. The tradeoff is not adjusting to the loss of APs as quickly.
  • Use the formulas above to calculate what channel utilization you need to stay under in order for NDP transmission to succeed for a given NTF. Plug that number into your trap thresholds. 
  • Dense environments with high channel utilization or voice clients will need higher NTF values. The default of 5 may not be enough. 
  • What works for 5 GHz may not work for 2.4 GHz. Consider using different NTF values for each band.
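To put the NTF takeaways into practice, you can turn the question around: given the utilization you actually see on a band, how high does the NTF need to be? Here is a minimal sketch under the same simplified model, where each NDP attempt fails with a probability equal to the channel utilization. The `min_ntf` name and the `max_ntf` search cap of 60 are arbitrary choices for the example, not documented controller limits.

```python
from typing import Optional


def min_ntf(utilization_pct: float, target_success: float,
            max_ntf: int = 60) -> Optional[int]:
    """Smallest NTF giving at least `target_success` odds that one NDP
    frame gets through, or None if no value up to `max_ntf` is enough.

    `max_ntf` is an arbitrary search cap chosen for this sketch.
    """
    p_fail = utilization_pct / 100.0
    for ntf in range(1, max_ntf + 1):
        if 1.0 - p_fail ** ntf >= target_success:
            return ntf
    return None


# Hypothetical per-band utilization figures, 99% target.
print(min_ntf(40, 0.99))   # 6
print(min_ntf(50, 0.99))   # 7
print(min_ntf(70, 0.99))   # 13
```

Run it once with your 5 GHz numbers and once with your 2.4 GHz numbers; the two bands will often come out with different answers.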







