Troubleshooting Wireless LAN - Tips and Tricks
Surendra BG – WNBU Escalations
• Where do we start?
• Troubleshooting Basics
• AP Troubleshooting
• Client Troubleshooting
• 8.1 Flex debug enhancements
• AP Radio Troubleshooting
• Troubleshooting Tools
• Troubleshooting Scenarios
Agenda
Where do we start?
[Figure: network diagram of the client connection path. Labels: client supplicant, driver and radio; 802.11 and 802.11 management on channel 1; EAP; CAPWAP; EoIP; WLC; RADIUS; ISE; DHCP; IP]
• Client can’t connect….
Troubleshooting Basics
• Troubleshooting 101
• Clearly define the problem
• Understand any possible triggers
• Know the expected behaviour
• Reproducibility
• Do not jump to conclusions
Problem
Definition
Questions
Tests
Solution(s)
Analysis
AP Troubleshooting
WLC Discovery
DTLS/Join
Image Download
Configuration Check
REG
More information:
http://www.cisco.com/c/en/us/support/docs/wireless-mobility/wireless-lan-wlan/70333-lap-registration.html
AP Join Process
Broadcast on Local Subnet
Locally stored WLC IP Address/Static IP
DHCP Option 43
DNS Resolution for "CISCO-CAPWAP-CONTROLLER.localdomain"
WLC Discovery Mechanisms
Troubleshooting Lightweight APs
• Make sure the AP is getting an address from DHCP (check the DHCP server leases for the AP’s MAC address)
• If the AP’s address is statically set, ensure it is correctly configured
• Try pinging from AP to controller and vice versa
• If pings are successful, ensure the AP has at least one method to discover the WLC
• Console or telnet/ssh into the controller to run debugs
• If you do not have access to the APs, use “show cdp neighbors <interface x/y> detail” on the connected switch to verify whether the AP has an IP address
Can the AP and the WLC communicate?
Important Outputs
On the WLC
- show msglogs
- show traplogs
On the AP
- show tech
- show log
- show capwap client rcb
- show capwap client config
- show capwap reap status
Show Commands
Important Outputs
• On the WLC:
- debug mac addr <AP Ethernet/Radio mac>
- debug capwap events enable
- debug capwap errors enable
- debug dtls all enable
- debug pm pki enable
• On the AP
- debug dhcp detail
- debug capwap client detail
- debug capwap client events/error
Debugs to be enabled
Exercise: AP Troubleshooting 1
Scenario 1
• Company X has had its wireless network deployed for many years. After a power outage, none of the APs are registering.
• WLC and AP access is available
What questions would you ask?
What do you check first?
Problem Description
Scenario 1
• WLC show msglogs
Jul 10 16:13:52.443 spam_lrad.c:6164 LWAPP-3-PAYLOAD_ERR: Join request does not contain valid certificate in certificate payload - AP 00:11:22:33:44:55
*osapiBsnTimer: Oct 29 11:05:04.571: #DTLS-3-HANDSHAKE_FAILURE: openssl_dtls.c:2962 Failed to complete DTLS handshake with peer 192.168.202.8
AP Show log
*Oct 29 18:01:56.107: %PKI-3-CERTIFICATE_INVALID_EXPIRED: Certificate chain validation has failed.
The certificate (SN: 7E3446C40000000CBD95) has expired. Validity period ended on 14:38:08 UTC Oct 26 2021
Peer certificate verification failed 001A
*Oct 29 18:01:56.107: DTLS_CLIENT_ERROR: ../capwap/base_capwap/capwap/base_capwap_wtp_dtls.c:496
Certificate verified failed!
*Oct 29 18:01:56.107: %DTLS-5-SEND_ALERT: Send FATAL : Bad certificate Alert to 192.168.10.10:5246
*Oct 29 18:01:56.107: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.10.10:5246
Going forward..
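The expiry check behind these logs can be sketched in a few lines of Python. This is a minimal illustration, not a Cisco tool: the function names `parse_cert_expiry` and `cert_expired` are hypothetical, and the regular expression assumes the message layout shown in the AP log above.

```python
import re
from datetime import datetime

# Matches the expiry stamp in the %PKI-3-CERTIFICATE_INVALID_EXPIRED message.
EXPIRY_RE = re.compile(
    r"Validity period ended on (\d{2}:\d{2}:\d{2}) UTC (\w{3})\s+(\d{1,2}) (\d{4})"
)

def parse_cert_expiry(log_text):
    """Return the certificate expiry time found in the log, or None."""
    match = EXPIRY_RE.search(log_text)
    if match is None:
        return None
    clock, month, day, year = match.groups()
    return datetime.strptime(f"{clock} {month} {day} {year}", "%H:%M:%S %b %d %Y")

def cert_expired(log_text, controller_time):
    """True when the controller clock is past the logged expiry time."""
    expiry = parse_cert_expiry(log_text)
    return expiry is not None and controller_time > expiry

sample = ("The certificate (SN: 7E3446C40000000CBD95) has expired. "
          "Validity period ended on 14:38:08 UTC Oct 26 2021")
print(parse_cert_expiry(sample))
print(cert_expired(sample, datetime(2021, 10, 29, 18, 1, 56)))
```

Comparing the parsed expiry with the output of "show time" on the WLC shows quickly whether the clock or the certificate lifetime is the problem.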
Scenario 1
• Show time – on the WLC
Workaround:
- Disable NTP and set the WLC clock to a time several weeks before the certificate expiration
Watch out for…
CSCuq19142 LAP/WLC MIC or SSC lifetime expiration causes DTLS failure
https://supportforums.cisco.com/document/12453081/lightweight-ap-fail-create-capwaplwapp-connection-due-certificate-expiration
Solution :
Upgrade to the new code posted on CCO
What to look at next?
Exercise: AP Troubleshooting 2
Scenario 2
• Company X purchased a few new access points to extend the network, but the new APs are not joining the WLC
What questions would you ask?
What do you check first?
Problem Description
Scenario 2
• WLC msglogs
*spamApTask4: May 09 16:14:47.135: #LWAPP-3-RD_ERR4: capwap_ac_sm.c:3122 The system detects an invalid regulatory domain802.11bg:-A 802.11a:-A for AP 1c:aa:07:43:ef:10
*spamApTask4: May 09 16:14:47.135: #LOG-3-Q_IND: spam_lrad.c:10956 Country code (AE ) not configured for AP 1c:aa:07:43:ef:10[...It occurred 2 times.!]
*spamApTask4: May 09 16:14:47.134: #LWAPP-3-RD_ERR8: spam_lrad.c:10956 Country code (AE ) not configured for AP 1c:aa:07:43:ef:10
Going Forward..
Scenario 2
• Make sure the regulatory domain of your APs matches the country code(s) configured on your WLCs
http://www.cisco.com/c/en/us/products/collateral/wireless/aironet-1300-series/product_data_sheet0900aecd80537b6a.html#wp9005314
• In a multiple-controller environment, the RRM startup mode is invoked after an RF Group leader is elected
Solution:
Exercise: AP Troubleshooting 3
Scenario 3
• Company X purchased a few new access points to extend the network. The new APs are on a different VLAN and are not joining the WLC
What questions would you ask?
What do you check first?
Problem Description
Scenario 3
Debug capwap events/errors enable
Ignoring Primary discovery request received on a wrong VLAN (10) on interface (1) from AP a4:0c:c3:47:e0:a0
WLC Uplink Switchport Configuration
interface TenGigabitEthernet1/1
switchport trunk encapsulation dot1q
switchport trunk native vlan 12
switchport mode trunk
end
Going Forward..
Scenario 3
- Best practice
• The AP is not sending the discovery request to the management IP address.
• The controller is reporting a broadcast discovery request from a VLAN that is not configured on the controller.
• This typically occurs when the trunk allows all VLANs instead of being restricted to the wireless VLANs
Solution: Make sure the AP has at least one way to discover the WLC.
Best Practices Guide
http://www.cisco.com/c/en/us/td/docs/wireless/technology/wlc/8-0/82463-wlc-config-best-practice.html
Solution:
Exercise: AP Troubleshooting 4
Scenario 4
• Company X purchased a few new access points to extend the network, but the new APs are not joining the WLC
What questions would you ask?
What do you check first?
Problem Description
Scenario 4
WLC msglogs / Syslog
*spamApTask2: Apr 01 15:08:53.256: %CAPWAP-3-DISC_MAX_AP1: capwap_ac_sm.c:1554 Dropping discovery request from AP c8:f9:f9:ed:29:70 - limit for maximum APs supported 375 reached
*spamApTask2: Apr 01 15:08:53.256: %LWAPP-3-COUNT_ERR3: spam_lrad.c:5760 Mismatch in the internal count (used entries 376, normal 375, mesh 0)
*spamApTask6: Apr 01 15:08:53.046: %CAPWAP-3-DISC_MAX_AP1: capwap_ac_sm.c:1554 Dropping discovery request from AP c8:f9:f9:ed:14:10 - limit for maximum APs supported 375 reached
*spamApTask6: Apr 01 15:08:53.046: %LWAPP-3-COUNT_ERR3: spam_lrad.c:5760 Mismatch in the internal count (used entries 376, normal 375, mesh 0)
Going Forward..
Scenario 4
• Do we have enough AP count licenses?
• Check the maximum allowed AP count in the WLC configuration
Solution:
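The msglog signatures seen in AP Scenarios 1 to 4 can be collected into a small lookup for a first pass over a saved "show msglogs" file. This is only a sketch: the table `AP_JOIN_SIGNATURES` and the function `classify_ap_join_line` are hypothetical names, and the hints simply restate the conclusions of the scenarios above.

```python
# Substring -> hint, taken from the log lines shown in Scenarios 1 to 4.
AP_JOIN_SIGNATURES = [
    ("PAYLOAD_ERR", "join request certificate invalid: check certificate expiry and WLC clock"),
    ("HANDSHAKE_FAILURE", "DTLS handshake failed: check certificate expiry and WLC clock"),
    ("RD_ERR", "regulatory domain / country code mismatch between AP and WLC"),
    ("received on a wrong VLAN", "discovery arrived on a VLAN not configured on the WLC: check the trunk"),
    ("DISC_MAX_AP", "maximum supported AP count reached: check licenses and AP limit"),
]

def classify_ap_join_line(line):
    """Return the hint for the first matching signature, or None."""
    for needle, hint in AP_JOIN_SIGNATURES:
        if needle in line:
            return hint
    return None

line = ("*spamApTask2: Apr 01 15:08:53.256: %CAPWAP-3-DISC_MAX_AP1: capwap_ac_sm.c:1554 "
        "Dropping discovery request from AP c8:f9:f9:ed:29:70 - limit for maximum APs "
        "supported 375 reached")
print(classify_ap_join_line(line))
```

A lookup like this does not replace reading the log, but it helps to sort a large msglog by probable cause before going deeper.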
Exercise: AP Troubleshooting 5
Scenario 5
• Company X deployed IPv6 for the WLC/AP infrastructure, and the APs are not registering to the WLC
What questions would you ask?
What do you check first?
Problem Description
Scenario 5
AP CAPWAP event/error debug
*Jan 23 18:25:26.211: %LINEPROTO-5-UPDOWN: Line protocol on Interface Dot11Radio0, changed state to up
*Jan 23 18:25:44.311: %CLEANAIR-6-STATE: Slot 1 enabled
*Jan 23 18:25:51.091: %CLEANAIR-6-STATE: Slot 0 enabled
*Jan 23 18:26:20.603: %CAPWAP-3-DATA_KEEPALIVE_ERR: Failed to receive data keep-alive
*Jan 23 18:26:20.603: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 2605:6000:F24C:B000::51:5246
*Jan 23 18:26:25.667: %WIDS-6-DISABLED: IDS Signature is removed and disabled.
*Jan 23 18:26:36.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 2605:6000:F24C:B000
Going Forward..
Scenario 5
WAN link capture
Scenario 5
Uplink PCAP from the AP side
Scenario 5
• We saw a lot of TCP retransmissions and malformed packets in the PCAP. This was the clue!
• The controller and AP support UDP-Lite, which mandates a checksum over the datagram header only
• The IP protocol ID is 136, and it uses the same CAPWAP ports as UDP
• Enabling UDP-Lite requires that the network firewall allows protocol 136
• If protocol 136 is not allowed, the firewall drops the packets.
Solution:
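The firewall behaviour described above can be modelled as a simple default-deny rule check. This is a toy sketch, not firewall syntax: the rule set and function names are hypothetical, and it assumes the standard CAPWAP control and data ports 5246 and 5247.

```python
IP_PROTO_UDP = 17        # IANA protocol number for UDP
IP_PROTO_UDPLITE = 136   # IANA protocol number for UDP-Lite
CAPWAP_PORTS = (5246, 5247)  # CAPWAP control and data ports

def firewall_permits(rules, protocol, dst_port):
    """Default-deny: traffic passes only if a (protocol, port) rule exists."""
    return (protocol, dst_port) in rules

def blocked_capwap_ports(rules, protocol):
    """List the CAPWAP ports the rule set would drop for this protocol."""
    return [p for p in CAPWAP_PORTS if not firewall_permits(rules, protocol, p)]

# Hypothetical rule set written with plain UDP in mind only.
udp_only_rules = {(IP_PROTO_UDP, 5246), (IP_PROTO_UDP, 5247)}

print(blocked_capwap_ports(udp_only_rules, IP_PROTO_UDP))
print(blocked_capwap_ports(udp_only_rules, IP_PROTO_UDPLITE))
```

The point of the model is that a rule written for "UDP 5246/5247" does not match protocol 136, even though the port numbers are the same.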
Exercise: AP Troubleshooting 6
Scenario 6
• After a WAN outage, the FlexConnect APs connected to the WLC no longer join, and the only workaround is to reboot the AP. We want the RCA: why does a reboot fix the problem? Three APs are currently affected. Please help.
What questions would you ask?
What do you check first?
Problem Description
Scenario 6
WLC AP join Debugs – capwap errors/events/dtls..
Going Forward..
*spamApTask3: Dec 10 10:39:47.385: [PA] b0:00:b4:a1:87:50 Discovery Request from 172.22.98.192:6931
*spamApTask3: Dec 10 10:39:47.385: [PA] b0:00:b4:a1:87:50 ApModel: AIR-CAP1602I-A-K9
*spamApTask3: Dec 10 10:39:47.385: [PA] b0:00:b4:a1:87:50 Join Priority Processing status = 0, Incoming Ap's Priority 4, MaxLrads = 200, joined Aps =169
*spamApTask3: Dec 10 10:39:47.386: [PA] apModel: AIR-CAP1602I-A-K9
*spamApTask3: Dec 10 10:39:47.386: [PA] apType = 29 apModel: AIR-CAP1602I-A-K9
*spamApTask3: Dec 10 10:39:47.386: [PA] apType: Ox1d bundleApImageVer: 8.0.120.0
*spamApTask3: Dec 10 10:39:47.386: [PA] version:8 release:0 maint:120 build:0
*spamApTask3: Dec 10 10:39:47.386: [PA] b0:00:b4:a1:87:50 Discovery Response sent to 172.22.98.192 port 6931
*spamApTask3: Dec 10 10:39:47.386: [PA] b0:00:b4:a1:87:50 Discovery Response sent to 172.22.98.192:6931
*spamApTask3: Dec 10 10:39:47.396: [PA] b0:00:b4:a1:87:50 Discovery Request from 172.22.98.192:6931
Scenario 6
Further Debugs:
(WLC) >show dtls connections
AP Name              Local Port     Peer IP           Peer Port      Ciphersuite
-------------------- -------------  ----------------  -------------  ------------------------------
AP1                  Capwap_Ctrl    172.22.108.199    6919           TLS_RSA_WITH_AES_128_CBC_SHA
AP2                  Capwap_Ctrl    172.22.42.198     6931           TLS_RSA_WITH_AES_128_CBC_SHA
- Oops! Port 6931 is already in use, which means it cannot be used for another AP.
After the reboot, the AP used a port that was not already in use: 6932
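The port check done by eye above can be sketched as a small parser. This follows the slide's reading that a stale Capwap_Ctrl entry on the same peer port blocks the join; the function names are hypothetical and the row pattern assumes the column layout of the "show dtls connections" output shown above.

```python
import re

# One data row: AP name, local port label, peer IPv4, peer port, ciphersuite.
ROW_RE = re.compile(r"^(\S+)\s+(Capwap_\w+)\s+(\d{1,3}(?:\.\d{1,3}){3})\s+(\d+)\s+(\S+)$")

def parse_dtls_connections(output):
    """Return (ap_name, local_port, peer_ip, peer_port) for each data row."""
    rows = []
    for line in output.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            name, local, ip, port, _cipher = match.groups()
            rows.append((name, local, ip, int(port)))
    return rows

def entries_using_port(rows, source_port):
    """Existing DTLS entries whose peer port equals the AP's source port."""
    return [(name, ip) for name, _local, ip, port in rows if port == source_port]

sample = """AP Name Local Port Peer IP Peer Port Ciphersuite
-------------------- ------------- ---------------- ------------- ------------------------------
AP1 Capwap_Ctrl 172.22.108.199 6919 TLS_RSA_WITH_AES_128_CBC_SHA
AP2 Capwap_Ctrl 172.22.42.198 6931 TLS_RSA_WITH_AES_128_CBC_SHA"""

rows = parse_dtls_connections(sample)
print(entries_using_port(rows, 6931))
```

The source port to test comes from the "Discovery Request from <ip>:<port>" line in the WLC debug.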
Scenario 6
Solution: RCA Found!!
- DTLS Capwap_Ctrl connections are not cleared, so APs cannot join if they try to connect using the same source port
- Observed in APs connected through a WAN (MPLS Q-in-Q) with no NAT configured.
APs are stuck in DTLS ClientHello until the AP is rebooted and connects from a different source port.
Tracked under:
https://tools.cisco.com/bugsearch/bug/CSCuu65672
Exercise: AP Troubleshooting 7
Scenario 7
• The customer has deployed DMVPN across the sites, and the APs deployed in India are not able to register to the WLC in Australia. The AP and the WLC can ping each other, but the APs are not registering.
What questions would you ask?
What do you check first?
Problem Description
Scenario 7
WLC AP join Debugs – capwap errors/events/dtls..
- NOTHING
AP CAPWAP debugs - capwap errors/events/dtls..
- NOTHING apart from AP sending DHCP and getting the right IP address and no DISCOVERY Request sent out
- On the DHCP Server We have this config –
ip dhcp pool vlan10
network 10.10.10.0 255.255.255.0
default-router 10.10.10.1
option 150 ip 10.10.10.10
Going Forward..
Scenario 7
Solution: RCA Found!!
- Yes, it's option 43, not option 150
Document:
http://www.cisco.com/c/en/us/support/docs/wireless-mobility/wireless-lan-wlan/97066-dhcp-option-43-00.html
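For reference, Cisco lightweight APs generally expect option 43 as a type-length-value string: type 0xf1, a length of four bytes per controller, then the controller management IP addresses in hex. The helper below is a minimal sketch of that encoding; the function name `option43_hex` is hypothetical, and using 10.10.10.10 as the WLC address is an assumption carried over from the option 150 line in the DHCP pool above.

```python
import ipaddress

def option43_hex(controller_ips):
    """Build the option 43 hex string: f1 + length + controller IPs."""
    payload = b"".join(ipaddress.IPv4Address(ip).packed for ip in controller_ips)
    if not payload or len(payload) > 255:
        raise ValueError("need between 1 and 63 controller addresses")
    return "f1" + format(len(payload), "02x") + payload.hex()

# Assumed WLC management address, taken from the pool shown in the scenario.
print(option43_hex(["10.10.10.10"]))
```

The resulting string is what goes after "option 43 hex" in the DHCP pool; the exact syntax and any model-specific variations are covered in the document linked above.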
Client Troubleshooting
Steps to Building an 802.11 Connection
1. Listen for Beacons
2. Probe Request/Response
3. Authentication Request/Response
4. Association Request
5. Association Response
6. (Optional: EAPOL Authentication)
7. (Optional: Encrypt Data)
8. Move User Data
State 1: Unauthenticated, Unassociated
State 2: Authenticated, Unassociated
State 3: Authenticated, Associated
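802.1X Authentication
Roles: Supplicant, Authenticator, Server
1. EAPOL-START (supplicant to authenticator)
2. EAP-ID-Request (authenticator to supplicant)
3. EAP-ID-Response (supplicant to authenticator), forwarded as RADIUS (EAP-ID_Response) to the server
4. Rest of the EAP conversation
5. Radius-Access-Accept (server to authenticator), then EAP-Success (authenticator to supplicant)
The supplicant derives the session key from the EAP process or the certificate and authentication exchange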
Understanding the Client State
Name            Description
8021X_REQD      802.1x (L2) Authentication Pending
DHCP_REQD       IP Learning State
WEBAUTH_REQD    Web (L3) Authentication Pending
RUN             Client Traffic Forwarding
(Cisco Controller) >show client detail 00:16:ea:b2:04:36
Client MAC Address............................... 00:16:ea:b2:04:36
…..
Policy Manager State............................. WEBAUTH_REQD
00:16:ea:b2:04:36 10.10.1.103 DHCP_REQD (7) Change state to RUN (20) last state RUN (20)
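State-change lines like the one above can be pulled out of a long client debug with a short script. This is a sketch under the assumption that the line layout matches the sample; `parse_state_change` is a hypothetical name and the descriptions are copied from the state table above.

```python
import re

# Descriptions copied from the client state table in this section.
STATE_DESCRIPTIONS = {
    "8021X_REQD": "802.1x (L2) Authentication Pending",
    "DHCP_REQD": "IP Learning State",
    "WEBAUTH_REQD": "Web (L3) Authentication Pending",
    "RUN": "Client Traffic Forwarding",
}

STATE_RE = re.compile(
    r"((?:[0-9a-f]{2}:){5}[0-9a-f]{2}) \S+ (\w+) \((\d+)\) "
    r"Change state to (\w+) \((\d+)\)"
)

def parse_state_change(line):
    """Return a dict with mac, old and new state, or None if no match."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    mac, old, _old_id, new, _new_id = match.groups()
    return {"mac": mac, "old": old, "new": new,
            "meaning": STATE_DESCRIPTIONS.get(new, "not in table")}

line = ("00:16:ea:b2:04:36 10.10.1.103 DHCP_REQD (7) "
        "Change state to RUN (20) last state RUN (20)")
print(parse_state_change(line))
```

Listing these transitions in time order shows at which state a failing client stops progressing.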
Client Flow
The Route Toward the RUN State!
Assoc → 8021X_REQD → DHCP_REQD → WEBAUTH_REQD → RUN
On the WLC
On the AP
Debugs
On the Flex AP
Debug capwap flex
Debug capwap client config
Debug capwap flexconnect mgmt
Debug capwap flexconnect pmk
Debug capwap flexconnect cckm
Debug capwap flexconnect dot11r
Debugs
Client-based debugging exists on the WLC, but the AP side lacked this ability.
Debug output on the AP could not be filtered.
Specific AP :
debug flexconnect client ap <ap-name> add/delete <addr1> {<addr2> <addr3> <addr4>}
debug flexconnect client ap <ap-name> syslog <server-ip/disable>
For a Flex group :
debug flexconnect client group <group-name> add/delete <addr1> {<addr2> <addr3> <addr4>}
debug flexconnect client group <group-name> syslog <server-ip/disable>
Debugs on Flex APs – 8.1 new features
Pre and Post 8.1- Sample of client debug
Only client specific debug messages
+Complete Client lifecycle debugging
Exercise:
Client Troubleshooting 1
Scenario 1
• Company X configured a new WLAN, and none of the users can see it even though SSID broadcast is enabled.
What questions would you ask?
What do you check first?
Problem Description
Scenario 1
Going Forward..
Scenario 1
• The WLAN ID is greater than 16. By default, only the first 16 WLANs are broadcast on an AP
• Create a new AP group and add this WLAN and the AP to this group
Solution :
Exercise:
Client Troubleshooting 2
Scenario 2
• Company X configured a new WLAN with pre-shared key and the client with the MAC address 00:17:7c:2f:b6:1c is not able to connect..
What questions would you ask?
What do you check first?
Problem Description
Scenario 2
*apfMsConnTask_6: May 10 19:19:15.943: 00:17:7c:2f:b6:1c Processing assoc-req station:00:17:7c:2f:b6:1c AP:f8:c2:88:3d:c9:00-00 thread:150e4ca0
*apfMsConnTask_6: May 10 19:19:15.943: 00:17:7c:2f:b6:1c Adding mobile on LWAPP AP f8:c2:88:3d:c9:00(0)
*apfMsConnTask_6: May 10 19:19:15.944: 00:17:7c:2f:b6:1c processSsidIE statusCode is 0 and status is 0
*apfMsConnTask_6: May 10 19:19:15.944: 00:17:7c:2f:b6:1c processSsidIE ssid_done_flag is 0 finish_flag is 0
*apfMsConnTask_6: May 10 19:19:15.944: 00:17:7c:2f:b6:1c STA - rates (4): 130 132 139 150 0 0 0 0 0 0 0 0 0 0 0 0
*apfMsConnTask_6: May 10 19:19:15.944: 00:17:7c:2f:b6:1c suppRates statusCode is 0 and gotSuppRatesElement is 1
*apfMsConnTask_6: May 10 19:19:15.944: 00:17:7c:2f:b6:1c STA - rates (12): 130 132 139 150 12 18 24 36 48 72 96 108 0 0 0 0
*Dot1x_NW_MsgTask_4: May 10 19:19:15.947: 00:17:7c:2f:b6:1c Initiating RSN PSK to mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:19:15.947: 00:17:7c:2f:b6:1c EAP-PARAM Debug - eap-params for Wlan-Id :5 is disabled - applying Global eap timers and retries
*Dot1x_NW_MsgTask_4: May 10 19:19:15.947: 00:17:7c:2f:b6:1c Disable re-auth, use PMK lifetime.
*Dot1x_NW_MsgTask_4: May 10 19:19:15.948: 00:17:7c:2f:b6:1c dot1x - moving mobile 00:17:7c:2f:b6:1c into Force Auth state
WLC Debugs..
Scenario 2
*Dot1x_NW_MsgTask_4: May 10 19:19:15.948: 00:17:7c:2f:b6:1c Starting key exchange to mobile 00:17:7c:2f:b6:1c, data packets will be dropped
*Dot1x_NW_MsgTask_4: May 10 19:19:15.948: 00:17:7c:2f:b6:1c Sending EAPOL-Key Message to mobile 00:17:7c:2f:b6:1c state INITPMK (message 1), replay counter 00.00.00.00.00.00.00.00
*osapiBsnTimer: May 10 19:19:16.972: 00:17:7c:2f:b6:1c 802.1x 'timeoutEvt' Timer expired for station 00:17:7c:2f:b6:1c and for message = M2
*dot1xMsgTask: May 10 19:19:16.972: 00:17:7c:2f:b6:1c Retransmit 1 of EAPOL-Key M1 (length 121) for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:19:16.979: 00:17:7c:2f:b6:1c Received EAPOL-Key from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:19:16.980: 00:17:7c:2f:b6:1c Ignoring invalid EAPOL version (1) in EAPOL-key message from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:19:16.980: 00:17:7c:2f:b6:1c Received EAPOL-key in PTK_START state (message 2) from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:19:16.980: 00:17:7c:2f:b6:1c Received EAPOL-key M2 with invalid MIC from mobile 00:17:7c:2f:b6:1c version 2
*apfLbsTask: May 10 19:19:17.572: 00:17:7c:2f:b6:1c Copy AP LOCP - mode:2 slotId:128, apMac 0xf8:c2:88:3d:c9:0
*apfLbsTask: May 10 19:19:17.572: 00:17:7c:2f:b6:1c Copy WLAN LOCP EssIndex:5 aid:1 ssid:PVT-TEST
*
WLC Debugs..
Scenario 2
*Jan 27 05:07:41.606: C92542A3 r 1 71/69 17- 8801 13A 6F9E10 2FB69A 6F9E10 0000 q7 l129
EAPOL key desc 02 010A 0000 0000 0000 0000 0000 2888 5356 F66C 4026 1CFB
*Jan 27 05:07:42.456: C9323EDD t 1 0 - 8802 000 2FB69A 6F9E10 6F9E10 50C0 q7 l129
EAPOL2 EAPOL key desc 02 008A 0010 0000 0000 0000 0000 6468 F741 147D
*Jan 27 05:07:42.460: C9324C8A r 1 69/70 15- 8801 13A 6F9E10 2FB69A 6F9E10 0010 q7 l129
EAPOL key desc 02 010A 0000 0000 0000 0000 0000 533F 3E2F 198C 7C8C B070
*Jan 27 05:07:43.425: C9410B12 t 1 0 - 8802 000 2FB69A 6F9E10 6F9E10 50D0 q7 l129
EAPOL2 EAPOL key desc 02 008A 0010 0000 0000 0000 0000 6468 F741 147D
*Jan 27 05:07:43.429: C941199E-0 2FB69A - pak flags 1
*Jan 27 05:07:43.429: C9411996 r 1 70/68 19- 8801 13A 6F9E10 2FB69A 6F9E10 0020 q7 l129
EAPOL key desc 02 010A 0000 0000 0000 0000 0000 22C1 DC16 84C9 77EE 8ED2
*Jan 27 05:07:44.394: C94FD7B6 t 1 0 - C000 000 2FB69A 6F9E10 6F9E10 0000 deauth l 2
reason 15
AP Radio Debug
Wireless PCAP
Scenario 2
• The client fails the 4-way handshake at M2 (invalid MIC). This is usually caused by a wrong pre-shared key, or sometimes by the client reauthenticating with the wrong keys
• For the AP deauth reason and association status codes, refer to the document below
https://supportforums.cisco.com/document/141136/80211-association-status-80211-deauth-reason-codes
Solution :
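The reason codes that close each AP radio debug in these scenarios (15, 8 and 23) are standard IEEE 802.11 reason codes. The sketch below pulls them out of a radio debug and attaches the standard meaning; `decode_reasons` is a hypothetical helper, the table covers only the three codes that appear in this deck, and the pattern assumes the "deauth l 2 / reason 15" layout shown above.

```python
import re

# Subset of IEEE 802.11 reason codes seen in this deck.
REASON_CODES = {
    8: "Disassociated because sending STA is leaving (or has left) BSS",
    15: "4-Way Handshake timeout",
    23: "IEEE 802.1X authentication failed",
}

# Frame type, frame length, then the reason code (often on the next line).
FRAME_RE = re.compile(r"\b(deauth|disass) l\s+\d+\s+reason\s+(\d+)")

def decode_reasons(debug_text):
    """Return (frame_type, code, meaning) for each deauth/disassoc frame."""
    results = []
    for frame, code in FRAME_RE.findall(debug_text):
        number = int(code)
        results.append((frame, number, REASON_CODES.get(number, "not in this table")))
    return results

sample = ("*Jan 27 05:07:44.394: C94FD7B6 t 1 0 - C000 000 2FB69A 6F9E10 6F9E10 "
          "0000 deauth l 2\nreason 15")
print(decode_reasons(sample))
```

Reason 15 here matches the WLC-side symptom: M1 is retransmitted and M2 arrives with an invalid MIC, so the handshake never completes.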
Exercise:
Client Troubleshooting 3
Scenario 3
• Company X has configured an 802.1X-enabled WLAN. Everything worked fine until last evening, but this morning none of the wireless users can connect!
The whole wireless network is down
What questions would you ask?
What do you check first?
Problem Description
Scenario 3
*Dot1x_NW_MsgTask_4: May 10 19:50:29.241: 00:17:7c:2f:b6:1c Sending EAP-Request/Identity to mobile 00:17:7c:2f:b6:1c (EAP Id 1)
*Dot1x_NW_MsgTask_4: May 10 19:50:29.285: 00:17:7c:2f:b6:1c Reset the reauth counter since EAPOL START has been received!!!
*Dot1x_NW_MsgTask_4: May 10 19:50:29.285: 00:17:7c:2f:b6:1c reauth_sm state transition 1 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:53
*Dot1x_NW_MsgTask_4: May 10 19:50:29.285: 00:17:7c:2f:b6:1c Received EAPOL START from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:29.285: 00:17:7c:2f:b6:1c dot1x - moving mobile 00:17:7c:2f:b6:1c into Connecting state
*Dot1x_NW_MsgTask_4: May 10 19:50:29.285: 00:17:7c:2f:b6:1c Sending EAP-Request/Identity to mobile 00:17:7c:2f:b6:1c (EAP Id 2)
*Dot1x_NW_MsgTask_4: May 10 19:50:29.285: 00:17:7c:2f:b6:1c reauth_sm state transition 0 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:71
*Dot1x_NW_MsgTask_4: May 10 19:50:29.298: 00:17:7c:2f:b6:1c Received EAPOL EAPPKT from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:29.298: 00:17:7c:2f:b6:1c Received EAP Response packet with mismatching id (currentid=2, eapid=1) from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:29.298: 00:17:7c:2f:b6:1c reauth_sm state transition 0 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:71
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c Received EAPOL EAPPKT from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c Received Identity Response (count=1) from mobile 00:17:7c:2f:b6:1c
WLC debug client..
Scenario 3
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c Resetting reauth count 1 to 0 for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c EAP State update from Connecting to Authenticating for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c dot1x - moving mobile 00:17:7c:2f:b6:1c into Authenticating state
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c reauth_sm state transition 0 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:71
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c Entering Backend Auth Response state for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:29.309: 00:17:7c:2f:b6:1c reauth_sm state transition 0 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:71
*apfLbsTask: May 10 19:50:29.572: 00:17:7c:2f:b6:1c Copy CCX LOCP 1
*apfLbsTask: May 10 19:50:29.572: 00:17:7c:2f:b6:1c Copy MobilityData LOCP status:0, anchorip:0x0
*Dot1x_NW_MsgTask_4: May 10 19:50:41.913: 00:17:7c:2f:b6:1c Processing AAA Error 'Timeout' (-5) for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 19:50:41.914: 00:17:7c:2f:b6:1c Sent Deauthenticate to mobile on BSSID 58:f3:9c:43:69:d0 slot 0(caller 1x_auth_pae.c:1581)
WLC debug client..
Scenario 3
*Jan 15 07:21:36.347: 5BCB7C8F t 1 0 - 8802 000 2FB698 6F9E11 6F9E11 1890 q7 l87
EAPOL3 EAP id 2 req ident 0 "networkid=peapradius ,nasid=SURBG-5760,portid=0"
*Jan 15 07:21:36.374: 5BCBEDBF r 1 72/68 15- 8801 17A 6F9E11 2FB698 6F9E11 0000 q7 l13
EAPOL start
*Jan 15 07:21:36.379: 5BCC00E4 t 1 0 - 8802 000 2FB698 6F9E11 6F9E11 18A0 q7 l87
EAPOL3 EAP id 2 req ident 0 "networkid=peapradius,nasid=SURBG-5508,portid=0"
*Jan 15 07:21:38.515: 5BECA39F-0 2FB698 - pak flags 1
*Jan 15 07:21:38.515: 5BECA397 r 1 67/63 22- 8801 17A 6F9E11 2FB698 6F9E11 0010 q7 l22
EAP id 2 resp ident "surbg"
*Jan 15 07:22:17.159: 5E3AFECD r 1 67/63 22- A000 13A 6F9E11 2FB698 6F9E11 C1D0 disass l 2
reason 8
AP Radio Debugs
Wireless PCAP
Scenario 3
• From the logs we see that we are getting
Returning AAA Error 'Timeout' (-5) for mobile
• This means the AAA server is unreachable; the client is then deauthenticated.
• Where to check?
- RADIUS server logs
- Check the network connectivity between the WLC and the RADIUS server
- Check for a shared secret mismatch and reconfigure it if needed.
Solution :
Exercise:
Client Troubleshooting 4
Scenario 4
• The client with the MAC address 00:17:7c:2f:b6:1c is not able to connect to the network..
What questions would you ask?
What do you check first?
Problem Description
Scenario 4
*Dot1x_NW_MsgTask_4: May 10 21:19:52.436: 00:17:7c:2f:b6:1c Received EAPOL EAPPKT from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.436: 00:17:7c:2f:b6:1c Received Identity Response (count=1) from mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.436: 00:17:7c:2f:b6:1c Resetting reauth count 1 to 0 for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.436: 00:17:7c:2f:b6:1c EAP State update from Connecting to Authenticating for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.438: 00:17:7c:2f:b6:1c Entering Backend Auth Req state (id=80) for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.438: 00:17:7c:2f:b6:1c WARNING: updated EAP-Identifier 2 ===> 80 for STA 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.444: 00:17:7c:2f:b6:1c Resetting reauth count 0 to 0 for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.445: 00:17:7c:2f:b6:1c reauth_sm state transition 0 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:71
*Dot1x_NW_MsgTask_4: May 10 21:19:52.445: 00:17:7c:2f:b6:1c Entering Backend Auth Response state for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:52.446: 00:17:7c:2f:b6:1c Processing Access-Challenge for mobile 00:17:7c:2f:b6:1c
WLC client debug
*Dot1x_NW_MsgTask_4: May 10 21:19:58.094: 00:17:7c:2f:b6:1c Entering Backend Auth Req state (id=87) for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:58.094: 00:17:7c:2f:b6:1c Sending EAP Request from AAA to mobile 00:17:7c:2f:b6:1c (EAP Id 87)
*Dot1x_NW_MsgTask_4: May 10 21:19:58.094: 00:17:7c:2f:b6:1c Reusing allocated memory for EAP Pkt for retransmission to mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:58.094: 00:17:7c:2f:b6:1c reauth_sm state transition 0 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:71
*Dot1x_NW_MsgTask_4: May 10 21:19:58.101: 00:17:7c:2f:b6:1c Entering Backend Auth Response state for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:58.102: 00:17:7c:2f:b6:1c reauth_sm state transition 0 ---> 0 for mobile 00:17:7c:2f:b6:1c at 1x_reauth_sm.c:71
*Dot1x_NW_MsgTask_4: May 10 21:19:58.103: 00:17:7c:2f:b6:1c Processing Access-Reject for mobile 00:17:7c:2f:b6:1c
*Dot1x_NW_MsgTask_4: May 10 21:19:58.104: 00:17:7c:2f:b6:1c Removing PMK cache due to EAP-Failure for mobile 00:17:7c:2f:b6:1c (EAP Id 87)
Scenario 4
*Jan 15 04:32:48.475: D7C337E r 1 68/62 20- B000 13A 6F9E11 2FB698 6F9E11 78A0 auth l 6
*Jan 15 04:32:48.475: D7C340C-0 2FB698 - newauth
*Jan 15 04:32:48.476: D7C38C3 t 1 0 - B000 001 2FB698 6F9E11 6F9E11 0000 auth l 6
*Jan 15 04:32:48.479: D7C43F8 r 1 69/62 14- 0000 13A 6F9E11 2FB698 6F9E11 78B0 assreq l 139
*Jan 15 04:32:48.495: D7C8598 t 1 0 - 1000 000 2FB698 6F9E11 6F9E11 0000 assrsp l 151
*Jan 15 04:32:48.497: D7C8B78 t 1 0 - 8802 000 2FB698 6F9E11 6F9E11 0710 q7 l87
EAPOL3 EAP id 1 req ident 0 "networkid=peapradius,nasid=SURBG-5760,portid=0"
*Jan 15 04:32:48.529: D7D09A4 r 1 69/62 21- 8801 17A 6F9E11 2FB698 6F9E11 0000 q7 l13
EAPOL start
*Jan 15 04:32:48.533: D7D170A t 1 0 - 8802 000 2FB698 6F9E11 6F9E11 0720 q7 l87
EAPOL3 EAP id 1 req ident 0 "networkid=peapradius ,nasid=SURBG-5508,portid=0"
*Jan 15 04:32:53.643: DCB28C0 r 1 69/62 20- 8801 17A 6F9E11 2FB698 6F9E11 0010 q7 l22
EAP id 1 resp ident "Surbg"
*Jan 15 04:32:53.649: DCB3734 t 1 0 - C000 000 2FB698 6F9E11 6F9E11 0000 deauth l 2
reason 23
AP Radio Debug
Wireless PCAP
Scenario 4
• From the WLC and the AP debugs we see Access-Reject
• Where to check next?
- RADIUS SERVER logs
- Find the exact reason; in this example it is wrong credentials
Solution:
Exercise:
Client Troubleshooting 5
Scenario 5
• Company X has deployed a FlexConnect local-switching solution with a two-SSID BYOD deployment
- Clients that access the intranet work fine
- Clients that access the Internet right after reaching the RUN state are not redirected (10.X.X.X subnet)
What questions would you ask? Clients are already in the RUN state!
What do you check first?
Problem Description
Scenario 5
Client side Wireshark PCAP
Scenario 5
AP port SPAN session capture
Scenario 5
AP show command
Scenario 5
- Important thing to notice
- The client sends a TCP SYN to the Google IP 74.X.X.X, and the WWW counter increments. The server's SYN-ACK comes back with source 74.X.X.X and a destination in the 10.X.X.X network, so it hits DENY statement 220 and the incoming traffic is dropped.
- As soon as we changed the ACL DENY statement, everything started working
Solution :
Exercise:
Client Troubleshooting 6
Scenario 6
• Company X has deployed Flex-Connect local switch solution AAA-Override + Profiling (RADIUS NAC enabled) for the clients
- After the session timeout or when the client roams to another AP, the client is not able to ping the default gateway though the client machine says connected and the WLC shows the client in RUN state
What questions would you ask? Clients are already in the RUN state!
What do you check first?
Problem Description
Scenario 6
WLC show output –
Going Forward..
Scenario 6
Client Side PCAP
Scenario 6
AP Side show command: show capwap reap association
Scenario 6
- The WLC showed the client in Flex-Local switching state
- AP reap associations showed the Client in Central Switching state
There lies the problem.. This was due to
CSCuo63103 Client local switching to central mode load, aaa override, radius nac
https://tools.cisco.com/bugsearch/bug/CSCuo63103
- Isolating the problem is very important. As stated earlier, this issue took a week to isolate because of an incorrect problem description and very little data.
Solution :
Exercise:
Client Troubleshooting 7
Scenario 7
• Company X deployed the Bonjour mDNS gateway for Apple TV support, but users are unable to see the Apple TV services, although AirPrint works without any issues.
What do you check first?
What debugs to run?
Problem Description
Scenario 7
*Bonjour_Msg_Task: Mar 14 13:33:38.582: d0:23:db:58:cf:c0 Parsing 2 bonjour Questions.
*Bonjour_Msg_Task: Mar 14 13:33:38.583: d0:23:db:58:cf:c0 Query Service Name: _raop._tcp.local., Type: C, Class: 8001.
*Bonjour_Msg_Task: Mar 14 13:33:38.583: qNameStr:_raop._tcp.local., bonjServiceNameStr:_raop._tcp.local., bonjSpNameStr:_raop._tcp.local.
*Bonjour_Msg_Task: Mar 14 13:33:38.583: Service Name:_raop._tcp.local. is supported in Master-service-db, Name: AirTunes
*Bonjour_Msg_Task: Mar 14 13:33:38.583: d0:23:db:58:cf:c0 Skipping the Service:_raop._tcp.local.. Not supported by client's profile:default-mdns-profile
*Bonjour_Msg_Task: Mar 14 13:33:38.583: d0:23:db:58:cf:c0 Query Service Name: _airplay._tcp.local., Type: C, Class: 8001.
*Bonjour_Msg_Task: Mar 14 13:33:38.583: qNameStr:_airplay._tcp.local., bonjServiceNameStr:_airplay._tcp.local., bonjSpNameStr:_airplay._tcp.local.
*Bonjour_Msg_Task: Mar 14 13:33:38.583: Service Name:_airplay._tcp.local. is supported in Master-service-db, Name: AppleTV
*Bonjour_Msg_Task: Mar 14 13:33:38.583: d0:23:db:58:cf:c0 Service:_airplay._tcp.local. is supported by client's profile:default-mdns-profile
*Bonjour_Msg_Task: Mar 14 13:33:38.583: Sending Bonjour Response
*Bonjour_Msg_Task: Mar 14 13:33:38.583: Service Provider Name: , Msal Service Name: AppleTV
WLC client debug + debug mdns all enable
Scenario 7
• We saw the error
*Bonjour_Msg_Task: Mar 14 13:34:16.991: d0:23:db:58:cf:c0 Skipping the Service:_raop._tcp.local.. Not supported by client's profile:default-mdns-profile
• The quickest way to troubleshoot this kind of issue is to collect the mDNS debugs, open the file in a text editor, search for “Not Supported” or “Master Db”, and add the missing services to the WLC Master DB
Solution:
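The text search suggested above can also be scripted. This is a minimal sketch that assumes the "Skipping the Service" message layout shown in the debug; `skipped_services` is a hypothetical helper name.

```python
import re

# Captures the service string and the mDNS profile from the skip message.
SKIP_RE = re.compile(
    r"Skipping the Service:(\S+)\. Not supported by client's profile:(\S+)"
)

def skipped_services(debug_text):
    """Return the set of (service, profile) pairs the WLC skipped."""
    return set(SKIP_RE.findall(debug_text))

sample = """*Bonjour_Msg_Task: Mar 14 13:33:38.583: d0:23:db:58:cf:c0 Skipping the Service:_raop._tcp.local.. Not supported by client's profile:default-mdns-profile
*Bonjour_Msg_Task: Mar 14 13:33:38.583: d0:23:db:58:cf:c0 Service:_airplay._tcp.local. is supported by client's profile:default-mdns-profile"""

for service, profile in sorted(skipped_services(sample)):
    print(service, "is missing from", profile)
```

Each pair names a service string and the profile that lacks it, which is the list of services to review on the WLC.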
Exercise:
Client Troubleshooting 8
Scenario 8
• Company X deployed FlexConnect local switching with local authentication to serve the clients. Users are complaining about frequent client dropouts. Please help find the root cause.
What do you check first?
What debugs to run?
Problem Description
Scenario 8
It's Flex local auth, so enable Flex AP debugs
Scenario 8
Wireless PCAP
Scenario 8
From the AP debugs and Wireless PCAP
- We see that the client sent the EAP ID response to the AP
- AP ACKed the client response
- But the client never received it
- This tells us that the client was either far away from the AP or that there are serious RF issues (high channel utilisation, too many packet retries, data rate downshifts, etc.)
Conclusion:
Bad RF. We recommended moving to 5 GHz, and after the migration everything worked well
Solution:
Exercise:
Client Troubleshooting 9
Scenario 9
• Company X deployed FlexConnect local switching with central authentication to serve 802.11r clients. Occasionally the clients drop from the network and do not reconnect for a while.
What do you check first?
What debugs to run?
Problem Description
Scenario 9
WLC Debug -
Association Failed on REAP AP BSSID 88:1d:fc:30:14:4b (slot 1), status 55 0 Invalid FTIE MIC value in Re-assoc req
apfMsDeleteByMscb Scheduling mobile for deletion with deleteReason 8, reasonCode 1
Sent Deauthenticate to mobile on BSSID 88:1d:fc:30:11:40 slot 1(caller apf_ms.c:7074)
Wireless OTA PCAP -
At the time of the problem, we saw many authentication responses from the AP that the client never ACKed, so the AP keeps retransmitting until it reaches the maximum retry count for that packet
WLC client debug / OTA / AP Uplink / WLC Uplink Captures
Scenario 9
AP Uplink PCAP -
- On the AP Ethernet port capture we see AP sending and receiving the 11r auth req/res in 100 ms
WLC Uplink PCAP
- On the WLC we don’t see any delay
WLC client debug / OTA / AP Uplink / WLC Uplink Captures
Scenario 9
• For 11r over FlexConnect, the delay must be no more than 20 ms; a longer delay results in FTIE MIC errors between the AP and the client while performing the 11r auth req/res
• We tracked this issue under 11r hardening via CSCuw94949
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCuw94949
• The AP handles both the Assoc and the 11r Auth req/res and just updates the WLC, so the AP does not depend on the WLC to respond to the 11r auth req that it sent
• This bug, CSCuw94949, is fixed in 8.0 MR3 (beta code available on the Escalation page).
Solution:
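The 20 ms budget can be checked against capture timestamps. This is a minimal sketch, assuming the request and response times for each 11r auth exchange have already been read from the AP uplink capture; the tuple layout, function name, and sample MAC addresses are hypothetical.

```python
MAX_FT_DELAY_MS = 20.0  # delay budget quoted in the slide

def slow_ft_exchanges(exchanges, limit_ms=MAX_FT_DELAY_MS):
    """Return (client, delay_ms) for exchanges slower than limit_ms.

    exchanges: iterable of (client_mac, request_time_s, response_time_s),
    with both times taken from the same capture point.
    """
    slow = []
    for client, t_request, t_response in exchanges:
        delay_ms = (t_response - t_request) * 1000.0
        if delay_ms > limit_ms:
            slow.append((client, round(delay_ms, 3)))
    return slow

# Hypothetical timestamps (seconds since the start of the capture).
capture = [
    ("aa:bb:cc:00:00:01", 10.000, 10.012),  # 12 ms, within budget
    ("aa:bb:cc:00:00:02", 12.500, 12.600),  # 100 ms, as seen in this scenario
]

print(slow_ft_exchanges(capture))
```

Running the same check on the AP uplink and WLC uplink captures helps show on which segment the delay is introduced.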
Client Issues – Take Away
• Client can be removed for numerous reasons
WLAN change, AP change, configured interval
• Start with Client Debug to see if there is a reason for a client’s deauthentication
• Further Troubleshooting
Client debug should give some indication of what kind of deauth is happening
Packet capture or client logs may be required to see exact reason
Never forget Radio status and RF conditions
AP Radio Troubleshooting
Blackbox Recorder
• AP keeps recordings at multiple levels
ap3600-sw3-0-2#dir
Directory of flash:/
33 drwx 2240 Oct 29 2014 11:53:23 +00:00 ap3g2-k9w8-mx.wnbu_bt.201410071110
2 -rwx 125850 Oct 13 2014 16:16:26 +00:00 event.r0
15 -rwx 64 Oct 29 2014 11:53:37 +00:00 sensord_CSPRNG0
16 -rwx 64 Oct 29 2014 11:53:37 +00:00 sensord_CSPRNG1
3 -rwx 965 Oct 8 2014 09:02:03 +00:00 lwapp_mm_mwar_hash.cfg
18 -rwx 56220 Oct 29 2014 11:53:46 +00:00 event.log
19 drwx 384 Oct 29 2014 11:56:45 +00:00 configs
4 -rwx 280 Oct 29 2014 11:56:40 +00:00 lwapp_officeextend.cfg
5 -rwx 75 Oct 29 2014 11:56:38 +00:00 capwap-saved-config
6 -rwx 126063 Oct 13 2014 16:16:45 +00:00 event.r1
7 -rwx 95008 Oct 29 2014 11:53:34 +00:00 lwapp_reap.cfg.bak
9 -rwx 50428 Oct 29 2014 12:01:57 +00:00 lwapp_non_apspecific_reap.cfg
8 -rwx 7192 Oct 29 2014 11:56:44 +00:00 private-multiple-fs
14 -rwx 95008 Oct 29 2014 11:56:52 +00:00 lwapp_reap.cfg
31 -rwx 60856 Oct 29 2014 11:52:32 +00:00 event.capwap
12 -rwx 359 Oct 29 2014 11:56:38 +00:00 env_vars
AP Radio Issue Example
ap3600-sw3-0-2#more event.r0
Oct 13 16:16:07.727: %EVT-5-NTC: Spectrum FW down slot=1:
Oct 13 16:16:07.727: %EVT-5-NTC: Spectrum FW recoverable err cnt=0:
Oct 13 16:16:07.727: %EVT-5-NTC: mesg: regwdog
Oct 13 16:16:07.727: %EVT-5-NTC: file: src/de_cmd.c
Oct 13 16:16:07.727: %EVT-5-NTC: line/id: 856 / 0
Oct 13 16:16:16.703: %EVT-5-NTC: Radio d0 FST 25 Flags 2020105 BCN 0
Oct 13 16:16:16.703: %EVT-5-NTC: Radio d0 RST 71 Flags 202010D BCN 0
Oct 13 16:16:16.703: %EVT-5-NTC: Spectrum FW reset slot=0 No CleanAir msmts
0,196,0,0,0,4,4,4,4,0/0/1C05F/F648031
*** Show controller output ***
!
interface Dot11Radio0
Radio ElliotNess 2.4, Base Address 8478.ac99.53f0, BBlock version 0.00, Software version 4.18.1
Serial number: FOC16444LCU
Unused dynamic SQRAM memory: 0x0000C8F0 (50 KB)
Unused dynamic SDRAM memory: 0x000886B8 (545 KB)
Spectrum FW version: 1.15.2
Event Log
Oct 29 11:52:22.723: %EVT-5-NTC_PROC: dot11_return_serving_channel:DFS Enable
Process: Dot11 driver
Oct 29 11:52:23.303: %CAPWAP-3-ERRORLOG: Retransmission count for packet exceeded max(CAPWAP_WTP_EVENT_REQUEST., 4)
Oct 29 11:52:27.159: %EVT-5-NTC_PROC: dot11_set_rm_scan:DFS Disable
Process: Dot11 Offchannel PROCESS
Oct 29 11:52:27.563: %EVT-5-NTC_PROC: mh_driver_off_chnl_complete: to Channel 136 for d1
Process: Dot11 driver
Oct 29 11:52:27.563: %EVT-5-NTC_PROC: dot11_return_serving_channel: d1 Channel 40
Process: Dot11 driver
Oct 29 11:52:27.563: %EVT-5-NTC_PROC: dot11_return_serving_channel:DFS Enable
Process: Dot11 driver
Oct 29 11:52:32.415: %EVT-4-WRN: Write of flash:/event.capwap done
Oct 29 11:52:32.459: %LWAPP-3-CLIENTERRORLOG: Switching to Standalone mode
Oct 29 11:52:32.459: %CAPWAP-3-ERRORLOG: GOING BACK TO DISCOVER MODE
Oct 29 11:52:32.459: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.100.54:5246
Oct 29 11:52:32.527: %EVT-5-NTC: CAPWAP state change 5
Oct 29 11:52:42.531: %CAPWAP-3-ERRORLOG: Go join a capwap controller
Oct 29 11:48:18.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 192.168.252.56 peer_port: 5246
Oct 29 11:48:18.711: %CAPWAP-5-DTLSREQSUCC: DTLS connection created sucessfully peer_ip: 192.168.252.56 peer_port: 5246
Oct 29 11:48:18.711: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.252.56
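Event files such as event.log and event.capwap can be long, so it may help to filter them by severity. The sketch below assumes the "timestamp: %FACILITY-SEVERITY-MNEMONIC: text" layout seen in the entries above, with lower severity numbers being more severe; the function names are illustrative.

```python
import re

# Layout assumed from the event log entries shown in the slide.
EVENT_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d\.\d+): "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*)"
)

def parse_event(line):
    """Return a dict for one event line, or None for continuation lines."""
    match = EVENT_RE.match(line.strip())
    if not match:
        return None
    event = match.groupdict()
    event["severity"] = int(event["severity"])
    return event

def errors_only(lines, max_severity=3):
    """Keep events at severity 3 (error) or more severe."""
    events = (parse_event(line) for line in lines)
    return [e for e in events if e and e["severity"] <= max_severity]

sample = [
    "Oct 29 11:52:32.415: %EVT-4-WRN: Write of flash:/event.capwap done",
    "Oct 29 11:52:32.459: %CAPWAP-3-ERRORLOG: GOING BACK TO DISCOVER MODE",
    "Oct 29 11:52:32.527: %EVT-5-NTC: CAPWAP state change 5",
    "Process: Dot11 driver",
]

for event in errors_only(sample):
    print(event["ts"], event["facility"], event["text"])
```

Note that the entries in a recording are not always in time order (the 11:48 DTLS lines follow the 11:52 lines above), so sort by timestamp only after confirming the dates match.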
Sh Controller dX
ap3600-sw3-0-2#sh cont d0
!
interface Dot11Radio0
Radio EN 2.4GHz, Base Address 8478.ac99.53f0, BBlock version 0.00, Software version 4.18.2
Serial number: FOC16444LCU
Unused dynamic SQRAM memory: 0x0000C7C0 (49 KB)
Unused dynamic SDRAM memory: 0x000841E8 (528 KB)
Spectrum FW version: 1.15.2
Number of supported simultaneous BSSID on Dot11Radio0: 16
Carrier Set: Spain (ES) (-E)
Uniform Spreading Required: No
Configured Frequency: 2412 MHz Channel 1
Allowed Frequencies:
2412( 1) 2417( 2) 2422( 3) 2427( 4) 2432( 5) 2437( 6) 2442( 7) 2447( 8) 2452( 9) 2457( 10)
2462( 11) 2467( 12) 2472( 13)
Listen Frequencies:
2412( 1) 2417( 2) 2422( 3) 2427( 4) 2432( 5) 2437( 6) 2442( 7) 2447( 8) 2452( 9) 2457( 10)
2462( 11) 2467( 12) 2472( 13) 2484( 14)
Beacon Flags: 0, Interface Flags 2020105, Interface Events 0, Mode 9; Beacons are enabled; Probes are enabled
Sh Controller dX – What?
• Interface status
Beacon Flags: 0, Interface Flags 20105, Interface Events 0, Mode 9; Beacons are enabled; Probes are enabled
• Channel
Configured Frequency: 2412 MHz Channel 1
• QBSS
QBSS Load: 0x24
• TX Queues
Transmit queues: Limit 419 Current 20 In-Progress 20
Sh Controller dX – What?
• Driver blocks
Driver TX blocks: in use 0, high 0, at reset 0, fail 0 drop 0
• Authentications in progress
8021x auth in prog 0 allowed 0
• SSID client count
Vlan BSSID Clients PSP Pri/U/M HT Encr Key0 Key1 Key2 Key3 SSIDs MFP
0n 4450 0 0 0 0 0 0 0 0
2 4451 1 0 0 3 3 3 0 0 imago 0
3 4452 2 0 0 3 3 3 0 0 setup-wifi 0
4 4453 3 5 1 3 3 3 0 204 128 x128 eduroam 0
• Radio Resets
Last radio reset code: 37
Radio resets - total:9 retries:0 failed:0
Reset Stats: Start Cnt: 6, Recovery: Cnt 0, Last Ret: 0, Fails: 0, Recvry Status: Stalled NO, In Prog NO
Code/Count: 37/00006 1 62/00002 1 67/00001 0
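The "Code/Count" line is easier to compare across APs once it is split into per-code counters. A minimal sketch, assuming each entry has the form code/count and ignoring the extra digit that follows each pair (its meaning is not stated in the slide); the function name is illustrative.

```python
import re

def parse_reset_counts(line):
    """Map radio reset code -> number of resets from a 'Code/Count:' line."""
    return {int(code): int(count)
            for code, count in re.findall(r"(\d+)/(\d+)", line)}

line = "Code/Count: 37/00006 1 62/00002 1 67/00001 0"
counts = parse_reset_counts(line)

for code, count in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"reset code {code}: {count} resets")

# The per-code counts add up to the 'Radio resets - total:9' value above.
print("total:", sum(counts.values()))
```

The most frequent code is the one to look up first in the radio reset code reference.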
Sh Controller dX – What?
• Queue status
       ---- Active ------   In-Progress   --------------- Counts --------------
       Cnt  Quo  Bas Max Cl  Cnt Quo Bas   Sent Discard Fail Retry Multi
Uplink   0   64    0   0  0    0   5   0      0       0    0     0     0
Voice    0  512    0   0  0    0  60   0      0       0    0     0     0
Video    0 1024    0   0  0    0 200   0      0       0    0     0     0
Best     0 1024    0   0  0    0 200   0   6404       0    0     0     0
• Radio commands
Radio commands - total:126328 delayed:0 elapsed:0 timeouts:0 time high 00000000 low 00000000 usecs 00000000
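The queue status table can be parsed to spot queues that deserve a closer look. This is a sketch under assumptions: the thirteen column names are inferred from the header rows above, and the review rule (non-zero in-progress, discard, or fail counters) is an illustrative heuristic, not a Cisco-defined threshold.

```python
# Column names inferred from the 'Active / In-Progress / Counts' header.
COLUMNS = [
    "active_cnt", "active_quo", "active_bas", "active_max", "active_cl",
    "inprog_cnt", "inprog_quo", "inprog_bas",
    "sent", "discard", "fail", "retry", "multi",
]

def parse_queue_rows(text):
    """Return {queue_name: {column: value}} for each data row."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != len(COLUMNS) + 1:
            continue  # header or unrelated line
        name, values = parts[0], parts[1:]
        if not all(value.isdigit() for value in values):
            continue
        rows[name] = dict(zip(COLUMNS, map(int, values)))
    return rows

def queues_to_review(rows):
    """Heuristic: queues with frames in progress, discarded, or failed."""
    return [name for name, row in rows.items()
            if row["inprog_cnt"] or row["discard"] or row["fail"]]

sample = """\
Cnt Quo Bas Max Cl Cnt Quo Bas Sent Discard Fail Retry Multi
Uplink 0 64 0 0 0 0 5 0 0 0 0 0 0
Voice 0 512 0 0 0 0 60 0 0 0 0 0 0
Video 0 1024 0 0 0 0 200 0 0 0 0 0 0
Best 0 1024 0 0 0 0 200 0 6404 0 0 0 0
"""

rows = parse_queue_rows(sample)
print(rows["Best"]["sent"], queues_to_review(rows))
```

Comparing two snapshots taken a few seconds apart is more telling than a single one: an in-progress count that stays non-zero while the sent counter does not move suggests a stuck queue.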
Client Status
• Clients at different AP Components
a2-ap3600-sw2-12#sh capwap reap association
Address : 001d.4546.a204 Name : NONE
IP Address : 0.0.0.0 IPv6 Address : ::
Gateway Address : 0.0.0.0 Netmask Address : 0.0.0.0
Interface : Dot11Radio 0
Bridge-group : 0
reap_flags_1 : 0x0 ip_learn_type : 0x0 transient_static_ip : 0x0
Device : WGB-client Software Version : NONE
CCX Version : NONE Client MFP : Off
State : Assoc Parent : 6073.5c7e.e002
SSID : wgb-roamer
WLAN : 1
Hops to Infra : 0 Clients Associated: 0 Repeaters associated: 0
11w Status : Off
REAP Data Switching: Local

Address : 6073.5c7e.e002 Name : wgb2600-sw2-4
IP Address : 0.0.0.0 IPv6 Address : ::
Gateway Address : 0.0.0.0 Netmask Address : 0.0.0.0
Interface : Dot11Radio 0
Bridge-group : 4
reap_flags_1 : 0x0 ip_learn_type : 0x0 transient_static_ip : 0x0
Device : WGB Software Version : NONE
CCX Version : 5 Client MFP : Off
Exercise:
Radio Troubleshooting 1
Scenario 1
• Company X has deployed mesh access points across the port to scan the devices that come in and go out of the harbor. All of a sudden none of the clients can connect, and rebooting the access points fixes the issue. We have one AP in the failed state; please isolate the problem.
What questions would you ask?
What do you check first?
Problem Description
Scenario 1
- Debugs on the WLC are not giving any data
- AP Radio debug for txev / xmt
Going Forward..
Scenario 1
Wireless PCAP
Going Forward..
Scenario 1
- On the AP show controller output we saw..
- This was due to a stuck transmit voice queue, which left the AP out of transmit buffers
https://tools.cisco.com/bugsearch/bug/CSCur71315
Solution :
Exercise:
Radio Troubleshooting 2
Scenario 2
• Company X has deployed a wireless network with multiple SSIDs in Anchor-Foreign design, the session timeout is set to 30 minutes and the idle timeout is set to 5 minutes.
• Active clients are getting dropped from the network even before the session timeout kicks in. Troubleshoot the problem.
What questions would you ask?
What do you check first?
Problem Description
Scenario 2
*DHCP Socket Task: Aug 22 11:28:45.202: 58:55:ca:69:91:2a DHCP server id: 1.1.1.1 rcvd server id: 1.1.1.1
*DHCP Socket Task: Aug 22 11:28:45.202: 58:55:ca:69:91:2a DHCP successfully bridged packet to STA
*spamApTask0: Aug 22 11:29:41.454: 58:55:ca:69:91:2a Cleaning up state for STA 58:55:ca:69:91:2a due to event for AP 00:26:cb:94:e2:10(0)
*apfReceiveTask: Aug 22 11:29:41.454: 58:55:ca:69:91:2a apfSendDisAssocMsgDebug (apf_80211.c:1855) Changing state for mobile 58:55:ca:69:91:2a on AP 00:26:cb:94:e2:10 from Associated to Disassociated
*apfReceiveTask: Aug 22 11:29:41.454: 58:55:ca:69:91:2a Sent Disassociate to mobile on AP 00:26:cb:94:e2:10-0 (reason 1, caller apf_ms.c:4983)
*apfReceiveTask: Aug 22 11:29:41.454: 58:55:ca:69:91:2a Scheduling deletion of Mobile Station: (callerId: 45) in 10 seconds
*apfMsConnTask_2: Aug 22 11:29:49.700: 58:55:ca:69:91:2a Association received from mobile on AP 00:26:cb:94:e3:a0
Debug client on Foreign
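A quick way to rule out the configured timers is to measure how long the client stayed connected between two debug lines. The sketch below assumes the "Mon DD HH:MM:SS.mmm:" timestamp layout of the debug client output; because the log carries no year, an arbitrary one is supplied, and the helper names are illustrative.

```python
import re
from datetime import datetime

SESSION_TIMEOUT_S = 30 * 60  # from the problem description
IDLE_TIMEOUT_S = 5 * 60      # from the problem description

TS_RE = re.compile(r"([A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d\.\d{3}):")

def log_time(line, year=2000):
    """Parse the debug timestamp; the year is an arbitrary assumption."""
    stamp = TS_RE.search(line).group(1)
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")

def seconds_between(first_line, second_line):
    return (log_time(second_line) - log_time(first_line)).total_seconds()

dhcp_done = ("*DHCP Socket Task: Aug 22 11:28:45.202: 58:55:ca:69:91:2a "
             "DHCP successfully bridged packet to STA")
disassoc = ("*apfReceiveTask: Aug 22 11:29:41.454: 58:55:ca:69:91:2a "
            "Sent Disassociate to mobile on AP 00:26:cb:94:e2:10-0 "
            "(reason 1, caller apf_ms.c:4983)")

gap = seconds_between(dhcp_done, disassoc)
print(f"client dropped after {gap:.3f} s")
print("before idle timeout:", gap < IDLE_TIMEOUT_S)
```

Here the client is disassociated well inside both timers, and the cleanup is logged as "due to event for AP", which points at the AP radio rather than at a timeout.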
Scenario 2
Logs from the AP
Scenario 2
Solution:
• Show controller
• We saw the radio resetting with reason code 65 - Rogue Location Discovery Protocol stop.
• Disabling RLDP fixed the issue
• From 7.0 onwards, RLDP can be enabled or disabled on demand:
config rogue ap rldp schedule ?
We have documented the radio reset codes here
http://www.cisco.com/c/en/us/support/docs/wireless/aironet-3700-series/117869-technote-ap-00.html
Troubleshooting Tools
• The “Debug client”
• Wireless Sniffer
• Example: Linksys USB600N with Omnipeek
• TAC can publish Omnipeek-RA if you have compatible HW
• Windows 7 with Netmon 3.4 https://supportforums.cisco.com/docs/DOC-16398
• Mac OS X 10.6+ https://supportforums.cisco.com/docs/DOC-19212
• Wired Packet Capture
• Spectrum Analyser
• Spectrum Expert with Card or Clean-Air AP
• AP Packet Capture
On the WLC
On the AP
Debugs
WLC Config Analyser (WLCCA)
• Support Forums DOC-1373
• Simplifies config check on the WLC
• Performs Audit checks
Wireless Capture – AP as Sniffer
• Use Wireshark 1.12.x or higher
Capture traffic - Not parsed
PC traffic
Wireless Capture – AP as Sniffer
• RF info now available
RF
Wireless Capture – Quick tips
• Log Time Correlation – Find unique IDs
*pemReceiveTask: Mar 13 08:32:39.877: 18:34:51:c4:d0:33 Sending a gratuitous ARP for 10.10.16.96, VLAN Id 252
*DHCP Socket Task: Mar 13 08:32:41.345: 18:34:51:c4:d0:33 DHCP received op BOOTREQUEST (1) (len 335,vlan 0, port 13, encap 0xec03)
*DHCP Socket Task: Mar 13 08:32:41.345: 18:34:51:c4:d0:33 DHCP (encaptype 0xec03) mstype 0ff:ff:ff:ff:ff:ff
*DHCP Socket Task: Mar 13 08:32:41.345: 18:34:51:c4:d0:33 DHCP processing DHCP REQUEST (3)
*DHCP Socket Task: Mar 13 08:32:41.345: 18:34:51:c4:d0:33 DHCP op: BOOTREQUEST, htype: Ethernet, hlen: 6, hops: 0
*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 DHCP xid: 0x636ccd64 (1668074852), secs: 1024, flags: 0
*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 DHCP chaddr: 18:34:51:c4:d0:33
*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 DHCP ciaddr: 0.0.0.0, yiaddr: 0.0.0.0
*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 DHCP siaddr: 0.0.0.0, giaddr: 0.0.0.0
*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 DHCP requested ip: 10.10.16.96
*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 DHCP successfully bridged packet to DS
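One unique ID that works well for correlating the debug with a capture is the DHCP transaction ID (xid). The sketch below pulls the xid values out of the debug lines and builds a Wireshark display filter; it assumes the "DHCP xid: 0x..." wording shown above and uses the `bootp.id` field name from Wireshark 1.12.x and 2.x (Wireshark 3.0 and later call it `dhcp.id`). The function name is illustrative.

```python
import re

XID_RE = re.compile(r"DHCP xid: (0x[0-9a-fA-F]+)")

def dhcp_xid_filter(debug_lines, field="bootp.id"):
    """Build a display filter matching every xid found in the debug."""
    xids = []
    for line in debug_lines:
        match = XID_RE.search(line)
        if match and match.group(1) not in xids:
            xids.append(match.group(1))
    return " || ".join(f"{field} == {xid}" for xid in xids)

debug = [
    "*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 "
    "DHCP xid: 0x636ccd64 (1668074852), secs: 1024, flags: 0",
    "*DHCP Socket Task: Mar 13 08:32:41.346: 18:34:51:c4:d0:33 "
    "DHCP requested ip: 10.10.16.96",
]

print(dhcp_xid_filter(debug))
print(dhcp_xid_filter(debug, field="dhcp.id"))
```

Applying the resulting filter in the capture isolates the same DHCP transaction, so the debug timestamp and the frame timestamp can be lined up even when the clocks differ.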
Wireless Capture – Quick tips
• IO Graph to find problems
iOS Capture
• You can sniff iDevice traffic
• Useful for application troubleshooting (voice, webauth, apps)
• No wireless data
• Additional debug levels possible through Apple TAC
iOS Capture
• Get Xcode installed (free on AppStore)
• Find the UDID
Click here until you
see it
iOS Capture
• Start rvictl tool to create virtual interface
iOS Capture
• Use Wireshark on the new virtual interface
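The rvictl steps can be summarised as a short command sequence run on the Mac the device is attached to. This sketch only builds and prints the commands; it assumes the first virtual interface is named rvi0, and it swaps in tcpdump for the capture step as a command-line stand-in for opening the interface in Wireshark. The helper name, output file name, and the UDID value are hypothetical.

```python
import shlex

def ios_capture_commands(udid, interface="rvi0", outfile="ios-capture.pcap"):
    """Return the command lines for one capture session, in order.

    interface: rvi0 is assumed for the first device; confirm the real
    name with 'rvictl -l' once the virtual interface is started.
    """
    return [
        ["rvictl", "-s", udid],                       # start virtual interface
        ["tcpdump", "-i", interface, "-w", outfile],  # capture on it
        ["rvictl", "-x", udid],                       # remove it when done
    ]

# Placeholder UDID; use the value found in Xcode.
for argv in ios_capture_commands("EXAMPLE-UDID"):
    print(shlex.join(argv))
```

The sequence is printed rather than executed so it can be reviewed first; the capture step usually needs elevated privileges.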
Q & A
Learn online with Cisco Live!
Visit us online after the conference for full access to session videos and presentations.
www.CiscoLiveAPAC.com
Thank you