

Evan J. Wilkerson
06/30/2021

Following a migration from one manufacturer's controller platform to a newer one in our production environment, I began to receive reports from end users of voice issues on company-provided mobile phones at our branch locations. Client stations were dropping audio during phone calls in both the upstream and downstream directions while the user was moving around. The same symptoms did not occur when the user was not on a call or was using other, non-VoIP applications.

Branch locations had a standard setup for voice-over-wireless (VoWLAN), including support for 802.11k/v, 802.11r for fast BSS transition (FT), U-APSD power saving, and WMM with well-tuned EDCA parameters to prioritize real-time voice traffic. They were also locked to the 5 GHz band when possible. The voice clients themselves were VoWLAN-friendly, holding Voice-Enterprise and WMM-Admission Control certifications from the Wi-Fi Alliance, and were 802.11ac capable with support for two spatial streams.

After obtaining enough detail from end users, I went to the lab to recreate the issue and better understand its symptoms. One thing that caught my attention was that the client would drop its connection only when I was using the voice application and was on an active call. When I was not on a call, or was using a non-VoIP application, I experienced no issues when reassociating to a new access point (AP).

Because of this difference, I began to suspect an issue involving QoS-related operations. After performing an over-the-air (OTA) capture, I confirmed proper 802.1D priority tags and, with that, correct access category assignment: AC_VO for voice in this case. What stood out in the capture was the client's behavior immediately following the BSS transition. I saw a standard open system authentication exchange followed by a reassociation request and response. However, the client station reacted to the reassociation response with a deauthentication containing an unspecified reason code.
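The priority check described above relies on the standard WMM mapping from 802.1D user priority (UP) values to access categories. A minimal sketch of that mapping follows; the names `UP_TO_AC` and `access_category` are illustrative and not taken from any capture tool.

```python
# WMM mapping of 802.1D user priority (UP) values to access categories.
# The table and helper names are illustrative, not from any specific tool.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",  # background
    0: "AC_BE", 3: "AC_BE",  # best effort
    4: "AC_VI", 5: "AC_VI",  # video
    6: "AC_VO", 7: "AC_VO",  # voice
}


def access_category(user_priority: int) -> str:
    """Return the WMM access category for an 802.1D user priority (0-7)."""
    if user_priority not in UP_TO_AC:
        raise ValueError(f"user priority must be 0-7, got {user_priority}")
    return UP_TO_AC[user_priority]


# Voice frames marked with UP 6 should land in the voice queue.
print(access_category(6))  # AC_VO
```

In a capture, the UP value would be read from each QoS data frame and checked against the category expected for that traffic type.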

I began to inspect the contents of the FT OTA exchange. The open system authentication frame upstream from the client had all of the fields I would expect: the PMK-R0 name, mobility domain (MD) ID, supplicant nonce, and R0 key-holder ID. The open system authentication frame downstream from the AP added the R1 key-holder ID and authenticator nonce. The reassociation request from the client to the AP is where I noticed something I didn’t typically see in a generic FT exchange. The reassociation request added a message integrity code (MIC) to the FT information element (IE), which I expected, but the MIC control element count was set to five. This field contains the number of elements used in calculating the MIC, which are typically the robust security network (RSN), MD, and FT IEs, for a count of three. It just so happened that this environment was using traffic specification (TSPEC) for call admission control (CAC). Because of this, the client was including a resource request in the reassociation. That request added two more MIC-protected elements to the MPDU: a resource information container (RIC) IE and a TSPEC IE.

Wireshark offered more visibility into these additional IEs. Even though it flagged the reassociation request as a malformed MPDU, I saw evidence of the additional RIC request in the form of the RIC IE. The other element, which Wireshark was unable to decode, would have been the TSPEC IE used for traffic stream setup. The problem occurred within the AP’s response. The AP sent a reassociation response with the MIC control element count set to five but neglected to include the RIC and TSPEC IEs, resulting in a MIC failure.
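The mismatch can be illustrated with a small consistency check between the declared element count and the MIC-protected elements actually present in a frame. This is a simplified sketch, not how a station validates the MIC (a real receiver recomputes the MIC over the protected elements). The element labels and function names are hypothetical, and the frame contents are reduced to the elements named in this account.

```python
# Elements described in this account as covered by the FT MIC.
# Labels and helper names are hypothetical, chosen for illustration only.
PROTECTED_ELEMENTS = {"RSN", "MD", "FT", "RIC", "TSPEC"}


def count_protected(elements):
    """Count the MIC-protected elements actually present in a frame."""
    return sum(1 for name in elements if name in PROTECTED_ELEMENTS)


def element_count_matches(declared_count, elements):
    """True if the declared MIC element count equals what the frame carries."""
    return declared_count == count_protected(elements)


# Client reassociation request: declares five and carries five.
reassoc_request = ["RSN", "MD", "FT", "RIC", "TSPEC"]
# AP reassociation response: declares five but omits RIC and TSPEC.
reassoc_response = ["RSN", "MD", "FT"]

print(element_count_matches(5, reassoc_request))   # True
print(element_count_matches(5, reassoc_response))  # False
```

A response that declares five protected elements while carrying only three cannot yield a MIC the client is able to verify, which matches the failure seen in the capture.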

This failure is what caused the clients to deauthenticate themselves and perform a full 802.1X/EAP authentication. While the disconnection was often fairly brief (500 ms to 700 ms), there were many instances where audio dropped for more than five seconds, an unacceptable user experience for any type of application, let alone one used for real-time voice.

After gathering my findings, I coordinated with the phone and AP manufacturers to ensure both sides were aware of the issue. Even though its product was not at fault, the phone manufacturer confirmed it was able to reproduce the issue in its lab environment. The AP manufacturer eventually did the same and opened an accompanying defect against its product.

This whole ordeal caused us to rethink our short-term security strategy. While we were able to remedy the issue by disabling CAC, we had grown tired of the problems we faced supporting voice using WPA2-Enterprise at our scale. We had used EAP-PEAP, but had opted not to require client-side trust of the server certificate, and the same username and password were already used on all VoIP clients. The primary benefit we still had over WPA2-Personal at this point was an element of forward secrecy: knowing the credentials would grant only network access, not the ability to decrypt other wireless traffic. After weighing our options, we eventually decided to convert all of our VoIP SSIDs to WPA2-Personal, not only for its reliability, but also to allow for simple site redundancy with the use of local AP authentication.

We have taken proper steps to secure the credential. The passphrase is longer than 20 characters, includes special characters, and is stored in a centralized password manager available only to the network administration team. We have updated the security policy to prohibit sharing the passphrase, both over the phone and in plain-text form. We achieved this by using a staging application on the client devices that only needs to scan a 2D barcode containing encrypted configuration settings, which allows us to update the passphrase at regular intervals by coordinating with branch sites and sending out updated provisioning barcodes. We continue to thoroughly test 802.1X/EAP and WPA3 deployments in our labs, knowing full well that WPA2-Personal isn’t a long-term solution for us.