Cisco 4500 Troubleshooting guide
Troubleshooting Cisco Catalyst 4500 Series Switches BRKCRS-3142
© 2013 Cisco and/or its affiliates. All rights reserved. BRKCRS-3142 Cisco Public
Session Goals
At the end of this session, you should be able to:
Understand system resources and monitor their usage
Identify all areas of packet loss
Trace hardware packet path
Make use of newer tools
This content is based on questions we see in the field. Feedback is welcome!
Agenda
Products Overview
Troubleshooting
  – Method
  – System Resources
  – Packet path / loss
  – VSS
  – PoE
  – NetFlow
Tools/Tips
Appendix
Products Overview
Chassis: 4503-E, 4506-E, 4507R+E, 4510R+E
6 Gbps per slot
• Classic supervisors
• Classic line cards
• e.g., SupV-10GE, 45xx line card
24 Gbps per slot
• -E chassis support 12.2(31)SGA6 onward
• Sup6-E, Sup6L-E and 46xx line card
• 4507R-E, 4510R-E
48 Gbps per slot
• +E chassis support 12.2(53)SG4 onward
• Sup7-E, Sup7L-E, 47xx line card
• 4507R+E, 4510R+E, 4503-E, 4506-E
See the appendix for supervisor, line card, and chassis product and compatibility details.
Products Overview
Intelligent Supervisors
• Supervisor Engine 7-E, 7L-E, 6-E, 6L-E, V-10GE, V, IV, II-Plus-10GE, II-Plus-TS, II-Plus
Transparent Line Cards
• Wire-rate, oversubscribed, PoE
• 10/100, 10/100/1000, GE, 10GE
• Various physical media front panel ports
• Dedicated per-slot bandwidth to supervisor
Switching ASICs
• Packet Processor
• Forwarding Engine
Specialized Hardware
• TCAMs (1) for ACLs, QoS, L3 forwarding
• NetFlow (2) (NFE) for statistics gathering
[Figure: block diagram with Front Panel Ports, Line Card Stub ASICs, Supervisor, Packet Processor, Shared Packet Memory, Forwarding Engine, TCAMs (1), NFE (2), CPU]
(1) Ternary Content Addressable Memory
(2) Optional for Supervisor IV and V. Integrated in Supervisor V-10GE, 7-E, 7L-E
Troubleshooting Method: General Recommendations
• Design with intent
  – ideally, create a deterministic network
  – engineers, not traffic, should control the network
• Baseline, monitor against baseline, alarm and/or adjust
  – problems are solved faster when knowns can be eliminated
• Characterize issues quickly with a plan
Troubleshooting Method: Method
1. Define Problem
2. Gather Facts
3. Consider Possibilities
4. Create Action Plan
5. Execute Action Plan
6. Observe Results
Documentation
Symptoms? System Messages? User Input? When? Frequency? Impact? Scope?
• You need a good understanding of what the system looks like when it is healthy
• Further information and examples are in the troubleshooting section
Want to learn more? Check out CCNP Practical Studies: Troubleshooting by Donna Harrington and CCNP TSHOOT 642-832 Official Certification Guide by Kevin Wallace.
Troubleshooting Method: Method
Category          Possible Cause
Config/Design     Mis-configuration
                  Reaching Capacity
Traffic           DOS Attack
                  Traffic Pattern Change
                  Bad peer/server
Software Issue    Software Limitation
                  Bug
Hardware Issue    Hardware Limitation
                  Failed Hardware
                  Transient Hardware Issue
Troubleshooting Method: Method
• What needs to be done to isolate each potential root cause?
• Make a change, measure results, roll back the change if the problem persists
• Problem solved? If not, continue the action plan
Troubleshooting Method: Before you dig deep
Top-down approach
  – Hardware generally does what it’s told to do
  – Before you troubleshoot the platform, rule out the usual suspects
End-to-end
  • Compare traffic at endpoints
  • Keep standard methods/tools for loss measurement handy (e.g., Iperf)
Security (802.1x, DAI, DHCP snooping/relay, IPSG, Port Security, PACL)
  • Port security issues
  • Actions are not always sent to syslog
  • Restrict modes may use CPU
Common Issues
  • Security features: RACL, VACL, unicast RPF, intermediary stateful inspection
  • L2: spanning-tree topology, IGMP snooping
  • L3 unicast: reachability, peer adjacency
  • L3 multicast: RPF, L3 path construction (RP), IGMP groups
Troubleshooting Method: Caution
debug and show platform commands to follow
Excessive debug output to console may disable switch
show platform commands are intended for in-depth troubleshooting
Use debug and show platform commands only when advised by TAC
show platform CLIs are not officially supported IOS commands
Not all commands apply to all platforms.
– Some are IOS-XE specific (Supervisor 7-E, 7L-E and 4500X)
System Resources: CPU
• Runs IOS/IOS-XE processes
• Runs 4500 platform-specific processes
• Sends/Receives control traffic
• Software-switches packets that can’t be hardware-switched
• Elevated CPU means the CPU is in use; by itself it does not impact the data plane
• Baseline is important
Troubleshooting CPU from “show process cpu”
[Flowchart, reconstructed as steps]
1. CPU higher than baseline?
2. IOS-XE only: is iosd use high? Check with: sh proc cpu detail process iosd
3. Is the high CPU in an IOS process or a Cat4k process?
   – IOS process: troubleshoot features related to the process / open TAC SR
   – Cat4k process: show platform health
4. Is the high CPU traffic driven? (K*CpuMan Review)
   – No: see Document ID: 65591 on http://www.cisco.com for more details
   – Yes: show platform cpu packet stat
5. Can the traffic be identified?
   – Yes: stop / alter traffic source, open TAC SR if more detail needed
   – No: monitor session 1 source cpu, OR debug platform packet all buffer followed by show platform cpu packet buffer
Troubleshooting CPU: Narrowing Down Process
switch# show process cpu sort
Core 0: CPU utilization for five seconds: 99%; one minute: 16%; five minutes: 7%
Core 1: CPU utilization for five seconds: 3%; one minute: 69%; five minutes: 33%
  PID Runtime(ms)  Invoked uSecs  5Sec  1Min  5Min TTY Process
 8590     3186391 38863326   176 51.20 42.52 20.34   0 iosd
…
11969     3138594 13447334    23  0.08  0.07  0.05   0 ffm
 8448      207801 20750735    10  0.04  0.14  0.27   0 cli_agent
10684      428406 20858613    20  0.04  0.01  0.01   0 licensed
11241     3603017 26001138   138  0.04  0.04  0.04   0 cpumemd
switch# show proc cpu detail process iosd sort
Core 0: CPU utilization for five seconds: 99%; one minute: 62%; five minutes: 22%
Core 1: CPU utilization for five seconds: 2%; one minute: 38%; five minutes: 43%
  PID T C   TID Runtime(ms) Invoked uSecs  5Sec  1Min  5Min TTY Process
                                            (%)   (%)   (%)
 8590 L             3346604 3886415   176 51.12 50.36 32.75   0 iosd
 8590 L 0  8590     3561989 2098956     0 49.88 49.04 30.82   0 iosd
 8590 L 1 12314     4076156 1787406     0  1.24  1.32  1.91   0 iosd
 8590 L 0 12315        3425   52685     0  0.00  0.02  0.06   0 iosd
   24 I              376348  695349     0 77.00 75.77 43.55   0   ARP Input
   85 I              534349 8127080     0 18.77 18.77 12.66   0   Cat4k Mgmt HiPri
    7 I             2083841 1110797     0  1.11  0.33  0.22   0   Check heaps
   86 I              744497 5671481     0  1.11  1.22  2.22   0   Cat4k Mgmt LoPri
• Dual core: utilization is reported per core
• The first output lists IOS-XE processes
• In the detail output, traditional IOS processes are indented
• Cat4k Mgmt HiPri / LoPri are Catalyst-4k specific management processes
Troubleshooting CPU: Packet-Driven CPU
switch# show platform health
…
                 %CPU   %CPU   RunTimeMax    Priority  Average %CPU  Total
                 Target Actual Target Actual   Fg   Bg 5Sec Min Hour  CPU
K5CpuMan Review   30.00  70.81     30     17  100  500   91  66    9  19:17
…
Switch# show platform cpu packet statistics
…
Packets Dropped by Packet Queue
Queue                  Total           5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Ip Option                     10715071    118803     71866     15919          0
…
(config)# monitor session 1 source cpu rx
(config)# monitor session 1 destination interface Gi1/48
• K5CpuMan is over target (actual 70.81% against a target of 30.00%)
• Recent flood of packets with IP Options (not HW routable)
• If a port is available, get a full capture from the CPU
Troubleshooting CPU: SPAN not available?
switch# debug platform packet all buffer
platform packet debugging is on
Switch# show platform cpu packet buffered
Total Received Packets Buffered: 1024
-------------------------------------
Index 0: 3 days 23:23:18:54927 - RxVlan: 1006, RxPort: Gi1/1
Priority: Normal, Tag: No Tag, Event: 11, Flags: 0x40, Size: 64
Eth: Src 00:00:0B:00:00:00 Dst 00:22:90:E0:D6:FF Type/Len 0x0800
Ip: ver:IpVersion4 len:24 tos:0 totLen:46 id:0 fragOffset:0 ttl:64 proto:tcp
    src: 10.10.10.100 dst: 172.16.100.100 hasIpOptions firstFragment lastFragment
Remaining data:
 0: 0x0 0x64 0x0 0x64 0x0 0x0 0x0 0x0 0x0 0x0
10: 0x0 0x0 0x50 0x0 0x0 0x0 0x8A 0x37 0x0 0x0
20: 0x0 0x1 0xB5 0x77 0x6A 0x7E
• This debug does not require significant CPU overhead
• Be sure to use “buffer” and not “log”
• Newer versions provide a human-readable event
• On older versions, decode the event code with:
switch# show platform software cpu events | i Code|11
CPU Event   Code  PE-Q
Ip Option     11    17
Troubleshooting CPU: Common Punt Reasons
Common Cause: Recommended Solution
• Same interface forwarding: no ip redirects, or alter topology
• ACL logging: disable ACL logging, use ACL matching stats or NetFlow
• ACL deny causing switch to send ICMP unreachable: no ip unreachables (2)
• Forwarding/Feature exception (out of TCAM/adj space): reduce TCAM usage, resize TCAM region (TCAM2/3)
• SW-supported feature (e.g., GRE): disable the feature or reduce the amount of traffic
• IP packets with TTL<2, IP options: disable the offending traffic, regulate source with Control Plane Policing (1)
• Unexpected control/data traffic: Control Plane Policing (1)
(1) CoPP supported on all legacy supervisors starting 12.2(31)SG, SUP6-E/6L-E/4900M/4948E on 12.2(50)SG, all Sup7E/7L-E/4500X
(2) Must be configured on all the L3 interfaces of the switch
System Resources: Memory
• Leak vs Large Usage
  – Large usage goes away when the condition is no longer present
  – A leak never decreases
• Establish baseline
• Collect multiple iterations over recorded interval
• Correlate increase with any known activity
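The leak versus large-usage distinction above can be sketched as a small check over recorded "Holding" values. This is only an illustration: the function name and the idea of marking the reading at which the suspected trigger was removed are assumptions, and the two sample values are the Auth Manager holdings from the Large Usage example in this deck.

```python
def classify_memory_trend(holding_samples, condition_cleared_index):
    """Classify a series of 'Holding' byte counts for one process.

    holding_samples: readings taken over a recorded interval.
    condition_cleared_index: index of the first reading taken after the
    suspected trigger (for example, active sessions) was removed.
    """
    before = holding_samples[:condition_cleared_index]
    after = holding_samples[condition_cleared_index:]
    if not before or not after:
        raise ValueError("need readings both before and after the change")
    # Large usage: memory is released once the condition goes away.
    if min(after) < max(before):
        return "large usage"
    # Leak: holdings never decrease, even with the trigger removed.
    return "possible leak"


# Holdings before and after shutting the ports (hypothetical use of the helper).
print(classify_memory_trend([837216, 514088], 1))
```

A "possible leak" result is only a hint; the deck's advice to collect multiple iterations and correlate with known activity still applies.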
Troubleshooting Memory: Large Usage
switch# sh authentication session | count Runn
Number of lines which match regexp = 239
switch# sh proc mem detail proc iosd sort | i Hold|Auth Manager
PID TTY Allocated  Freed Holding Getbufs Retbufs Process
113   0    870624 125992  837216       0       0 Auth Manager
switch(config)# int ra gi 1/1 - 48 , gi 2/1 - 48 , gi 3/1 - 48 , gi 4/1 - 48
switch(config-if-range)# shut
switch(config-if-range)# int ra gi 7/1 - 48 , gi 8/1 - 48 , gi 9/1 - 48 , gi 10/1 - 48
switch(config-if-range)# shut
switch(config-if-range)# end
switch# sh authentication session | count Runn
Number of lines which match regexp = 0
switch# sh proc mem detail proc iosd sort | i Auth Manager
147   0   1434488 601760  514088       0       0 Auth Manager
Holding fell from 837216 to 514088 bytes once the sessions were gone: about 300 KB was not leaked, simply used
Troubleshooting Memory
switch# show proc mem sort
System memory : 2011604K total, 765920K used, 1245684K free, 85548K kernel reserved
Lowest(b) : 710864896
PID   Text  Data   Stack Dynamic RSS    Total   Process
10137 69308 800424 88    236     958000 1017272 iosd
5498  1140  233600 88    2492    40332  309140  ffm
switch# show proc mem detail proc iosd sort
Processor Pool Total: 805306368 Used: 645097888 Free: 160208480
I/O Pool Total: 20971520 Used: 361576 Free: 20609944
Critical Pool Total: 4087852 Used: 40 Free: 4087812
Critical Pool Total: 106460 Used: 40 Free: 106420
PID TTY Allocated  Freed     Holding   Getbufs  Retbufs Process
153 0   1461539184 749742680 307884712 14266252 0       Auth Manager
0   0   304511544  14111208  272960272 0        0       *Init*
185 0   887586464  301222848 31368752  0        0       CDP Protocol
switch# show proc mem detail proc iosd task 153
Process ID: 153
Process Name: Auth Manager
Total Memory Held: 307882352 bytes
Processor memory Holding = 307882352 bytes
pc = 0x16FCD45C, size = 291258544, count = 4441
pc = 0x16FCF828, size = 9378512, count = 143
• Auth Manager is holding too much
• Collect the process memory breakdown for TAC
For Classic IOS, use:
• show process mem sort
• show process mem <pid>
System Resources: TCAM
• Check TCAM usage for ACLs, security, L3 routes, PBR, DHCP snooping, IPSG, WCCPv2
%C4K_HWACLMAN-4-ACLHWPROGERR: Input VOIP_FROM_CE_IPv6 - hardware TCAM limit, qos being disabled on relevant interface
%C4K_HWACLMAN-4-ACLHWPROGERR: Input Security: 101 - hardware TCAM limit, some packet processing will be software switched
%C4K_HWACLMAN-4-ACLHWPROGERRREASON: Input(75/Normal, 1/Normal) Invalid Acl-based Feature - hardware TCAM policers exceeded
Monitoring TCAM
switch# show platform hardware acl statistics utilization brief
CAM Utilization Statistics
--------------------------
                           Used        Free          Total
                           --------------------------------
Input Security (160)       42 (2  %)   2006 (98 %)   2048
Input Security (320)       66 (3  %)   1982 (97 %)   2048
Input Qos (160)            15 (0  %)   2033 (100%)   2048
Input Qos (320)            14 (0  %)   2034 (100%)   2048
Input Forwarding (160)      2 (0  %)   2046 (100%)   2048
Input Unallocated (160)     0 (0  %)  55296 (100%)  55296
switch# show platform hardware qos policer utilization
-------------------------------------------
Policer utilization summary:
Direction  Assigned        Used          Free
-------------------------------------------
Input      2048 ( 12.5%)   4 (  0.1%)    2044 ( 99.8%)
Output     2048 ( 12.5%)   1 (  0.0%)    2047 ( 99.9%)
Free       12288( 75.0%)   0 (  0.0%)    12288(100.0%)
Low utilization in both outputs
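For baselining, rows of the utilization output can be checked against a threshold with a short script. This is a minimal sketch under assumptions: the row layout is taken from the sample output in this deck and may differ between releases, and the function name and threshold are hypothetical.

```python
import re

# Matches rows such as: "Input Security (160)  42 (2  %)  2006 (98 %)  2048"
ROW = re.compile(
    r"^\s*(?P<region>.+?\(\d+\))\s+(?P<used>\d+)\s+\(\s*\d+\s*%\)"
    r"\s+(?P<free>\d+)\s+\(\s*\d+\s*%\)\s+(?P<total>\d+)\s*$"
)


def tcam_regions_over(output, threshold_pct):
    """Return (region, used, total) for rows at or above threshold_pct used."""
    hot = []
    for line in output.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip headers, separators, and anything unrecognized
        used, total = int(m["used"]), int(m["total"])
        if total and 100.0 * used / total >= threshold_pct:
            hot.append((m["region"], used, total))
    return hot


SAMPLE = """\
Input Security (160)      42 (2  %)    2006 (98 %)    2048
Input Security (320)      66 (3  %)    1982 (97 %)    2048
Input Unallocated (160)    0 (0  %)   55296 (100%)   55296
"""
print(tcam_regions_over(SAMPLE, 3.0))
```

The percentage is recomputed from the used and total columns rather than read from the rounded value in parentheses.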
System Resources: Queue Memory
%C4K_HWPORTMAN-3-TXQUEALLOCFAILED: Failed to allocate the needed queue entries for Gi6/13
• Queue memory is reserved for each line card; exceeding the reservation eats into the global pool
• When the global pool is exhausted, the message above appears
• Options:
  • decrease queue depths on a per port basis
  • combine classes under the same queue
Monitoring Queue Memory
Entry                                    Sup6-E/6L-E/7L-E        Sup7E
Total queue memory                       512K                    1M
Free Reserve: global pool                100K                    100K
CPU, recirc, drop queues                 20K                     40K
Queue entries per slot (1)               x = 400K / nSlots (2)   x = 860K / nSlots
Queue entries per port on a line card    y = x / nPorts (3)      y = x / nPorts
Queue entries per class transmit queue   z = y / nTxQs (4)       z = y / nTxQs
(1) In a redundant chassis, two supervisor slots are treated as one
(2) nSlots: number of slots
(3) nPorts: number of ports in a line card
(4) nTxQs: number of transmit queues in use
Monitoring Queue Memory
switch# show platform software qm
Drop port Tx Queue allocations (Size: 8184, Base: 0x019008)
Tx Queue allocations for recirc ports (Size: 24576, Base: 0x01D1D0)
CPU Subport Tx Queue allocations (TotalSize: 8656)
…
Superport Tx Queue space distribution
-------------------------------------
Superport Slot Percent Base Addr End Addr Entries
--------- ---- ------- --------- -------- ------
4         1    10      0x047ED8  0x04C858 18841
5         1    10      0x04C878  0x0511F8 18841
6         1    10      0x051218  0x055B98 18841
7         1    10      0x055BB8  0x05A538 18841
8         0    10      0x0231D0  0x027B50 18841
9         0    10      0x027B70  0x02C4F0 18841
10        0    10      0x02C510  0x030E90 18841
11        0    10      0x030EB0  0x035830 18841
…
40        1    10      0x05A558  0x05EED8 18841
41        1    10      0x05EEF8  0x063878 18841
42        1    10      0x063898  0x068218 18841
43        1    10      0x068238  0x06CBB8 18841
• The first three lines are the Drop, Recirc, and CPU reservations
• 18841 * 8 = 150728 QM entries available for physical slot 2 (the eight superports listed with Slot 1)
• 150728 / 48 = 3140 entries/port
• >3140 entries will eat into the global pool
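The per-port arithmetic from this example can be written out as a short calculation. This is a sketch only: the function names are hypothetical, integer division is an assumption about rounding, and the figure of 8 transmit queues in use is an illustrative value, not taken from the output.

```python
def entries_per_port(entries_per_superport, superports_in_slot, ports_on_card):
    """Return (slot total, per-port share) of queue memory entries."""
    slot_total = entries_per_superport * superports_in_slot
    return slot_total, slot_total // ports_on_card


def entries_per_queue(per_port, tx_queues_in_use):
    """Split a port's share evenly across the transmit queues in use."""
    return per_port // tx_queues_in_use


# Values from the example: 18841 entries per superport, 8 superports, 48 ports.
slot_total, per_port = entries_per_port(18841, 8, 48)
print(slot_total, per_port)            # 150728 3140
print(entries_per_queue(per_port, 8))  # 392, assuming 8 tx queues in use
```

A per-port queue configuration that asks for more than the per-port share is what starts drawing on the global pool.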
Troubleshooting System Resources Commands
Purpose, followed by CLI:
• List IOS process CPU % on IOS-XE
    show proc cpu detail process iosd sort
• Monitor Cat4k platform CPU statistics
    show platform health
    show platform cpu packet statistics
• SPAN packets to/from CPU
    monitor session 1 source cpu
    monitor session 1 destination interface <int>
• Enable/monitor Cat4k CPU buffer
    debug platform packet all buffer
    show platform cpu packet buffered
• Display process memory and buffer holdings
    show proc mem sort
    show process mem <pid>
    show buffers
• Display process memory and buffer holdings on IOS-XE
    show proc mem detail proc iosd sort
    show proc mem detail proc iosd task <pid>
    show buffers detailed process iosd
• Display Cat4k ACL and policer usage
    show platform hardware acl statistics utilization brief
    show platform hardware qos policer utilization
• Display Cat4k queue memory usage
    show platform software qm
Troubleshooting Packet Loss / Path: Why is any packet sent to port(s), to CPU, or dropped?
Losing packets on the 4k without a clue why?
1. Collect “show tech” and iterations of the commands below
2. Step through the platform
   a. Identify counters outside of baseline, find an explanation based on counter meaning
   b. Identify unexpected platform programming, work upwards
• Incrementing counters are most useful
• Some counters are normal
• Baseline data is useful
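Since incrementing counters are the most useful signal, two iterations of the same command can be compared programmatically. This is a minimal sketch: the function name is hypothetical, parsing the CLI output into dictionaries is left out, and the first snapshot below is invented for illustration (only the counter names come from this deck).

```python
def incrementing_counters(first, second):
    """Return counters that grew between two snapshots, with their deltas."""
    return {
        name: second[name] - first[name]
        for name in first
        if name in second and second[name] > first[name]
    }


# Hypothetical snapshots taken a few minutes apart.
snap1 = {"Rx-No-Pkt-Buff": 206000, "RxPauseFrames": 0, "CrcAlign-Err": 23736730}
snap2 = {"Rx-No-Pkt-Buff": 206658, "RxPauseFrames": 0, "CrcAlign-Err": 23736730}
print(incrementing_counters(snap1, snap2))
```

A large but static counter (such as the CRC value here) drops out of the result, which helps separate historical events from a problem that is still occurring.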
Areas Of Investigation
• HW-based checks, queue/buffer failure (PHY, stub, packet processor, forwarding engine)
    show interfaces <int> counters all
    show platform hardware interf <int> statis
    show platform software interf <int> statis
    show platform software interf <int> stub statis
    show platform software interf <int> stub cts statis all
    show platform hardware ret rrq
    show platform software drop-port
• CPU queues (CPU controller)
    show platform cpu packet driver
    show platform cpu packet statistics
• STP (L2 lookup)
    show platform hardware stp vlan <vlan>
• L3 entries (forwarding lookup)
    show platform hardware ip route [ipv4|ipv6] network <net> <mask>
    show platform hardware ip route [ipv4|ipv6] host <ip or group>
• ACL (input classification, output classification)
    show access-list <*acl>
    show platform hardware acl input entries static
    show platform hardware acl [input|output] entries interface <int> all
    show platform hardware acl [input|output] entries vlan <vlan> all
    show platform hardware acl [input|output] actions <action>
• L2 entries, floodsets (L2 lookup)
    show plat hard mac add <mac>
    show plat hard ret chain index <index>
    show platform hardware floodset vlan <vlan>
* Ensure HW statistics are enabled (see ACL section)
Troubleshooting Packet Loss / Path: PHY and Stub ASIC
[Figure: Front Panel Ports, Line Card Stub ASICs, Supervisor]
Layer 1 issues
Malformed frames/packets
Oversubscription
Flow-control
Storm-control
Troubleshooting Packet Loss / Path: Layer 1 Issues
• Match speed and duplex
• Isolate bad hardware using known good hardware
• Specific to end device? Patch/line cord? Front panel port? Linecard?
• Exclude patch panel if possible
• Peer misbehaving? Sniff wire for malformed frames
switch# show interfaces g5/5 count errors | exclude \ 0\ *0\ *0\ *0
Port   CrcAlign-Err Dropped-Bad-Pkts Collisions Symbol-Err
Gi5/5      23736730                0          0          0
Port   Undersize Oversize Fragments Jabbers
Port   Single-Col Multi-Col Late-Col Excess-Col
Port   Deferred-Col False-Car Carri-Sen Sequence-Err
See Appendix for Error descriptions
Troubleshooting Packet Loss / Path: Layer 1 Issues
switch# show platform software interface gigabitEthernet 1/1 stub statistics
XgstubMan(0:N-0)Port( 1 ) Rx Stats:
…
OverrunPackets : 0
AlignmentErrorPackets : 0
FcsErrorPackets : 0
SymbolErrorPackets : 0
InvalidOversizePackets : 0
Ipv4HdrChecksumErrorPackets : 0
Ipv4HdrErrorPackets : 0
Ipv6HdrErrorPackets : 0
…
switch# show platform software interface gigabitEthernet 1/1 statistics
Superport8(Gi1/1-6) Non-Zero Software Statistics
…
RxSequenceErrors : 255
RxSymbolErrors : 255
• Note: counters may increment during plug / unplug
• Platform commands can narrow down stub ASIC vs packet processor
Troubleshooting Packet Loss / Path: Layer 1 Issues
Monitor for link flap via syslog (configurable globally or per-interface):
(config)# logging event link-status global
(config-if)# logging event link-status
Get the total number of flaps since switch boot and compare with switch uptime:
switch# show platform software interface all | inc downs:|PimPhyport
…
GalGlmPort(0:N/21), Active? : true, PimPhyport Name : Gi1/22, EpmPortMan Name : EpmPortMan(0:N/21)
Name( EpmPortMan(0:N/21) ), PimPhyport name( Gi1/22 ) #link downs: 41712
Run the following command twice, use the second results, and decode the standard 802.3 registers:
switch# show platform software interface gi1/1 mii
…
0x00 ControlReg                   0x1140
0x01 StatusReg                    0x79C9
…
0x04 AutoNegAdvReg                0x01E1
0x05 AutoNegLinkPartnerAbilityReg 0x0000
0x06 AutoNegExpansionReg          0x0064
0x07 AutoNegNextPageTransmitReg   0x2001
…
0x09 1000BaseTControlReg          0x0F00
0x0A 1000BaseTStatusReg           0x0000
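As an illustration of decoding the standard registers, the low bits of the status register (register 0x01) can be unpacked as below. The bit positions follow the IEEE 802.3 Clause 22 status register; the function and dictionary names are hypothetical. The link status bit latches low, which is consistent with the advice to read twice and use the second result.

```python
# Low-order bits of the IEEE 802.3 Clause 22 status register (register 1).
STATUS_BITS = {
    "extended_capability": 0,
    "jabber_detect": 1,
    "link_up": 2,
    "autoneg_ability": 3,
    "remote_fault": 4,
    "autoneg_complete": 5,
}


def decode_status(reg):
    """Return a dict of flag name -> bool for the given 16-bit register value."""
    return {name: bool((reg >> bit) & 1) for name, bit in STATUS_BITS.items()}


# StatusReg value from the sample output above.
flags = decode_status(0x79C9)
print(flags["link_up"], flags["autoneg_complete"], flags["autoneg_ability"])
```

For the sample value, the link is reported down and auto-negotiation has not completed, which fits the all-zero link partner ability register in the same output.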
Troubleshooting Packet Loss / Path: Oversubscription: stub/supervisor port buffers
• Completely even traffic flow does not occur in the real world
  – 2:1 1Gbps != (real world) 500 Mbps x 2 ports
  – 2:1 10Gbps != (real world) 5Gbps x 2 ports
• Ingress traffic on oversubscribed ports: control on the peer device
• Egress oversubscription: consider multi-path
[Figure: traffic levels marked max, avg, min]
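The point about uneven traffic can be illustrated with a toy calculation: two ports share a 1 Gbps path at 2:1, each averages 500 Mbps, yet traffic is still lost because the peaks coincide. This is a deliberately simplified sketch that ignores buffering; the function name and the per-interval rates are made up for illustration.

```python
def dropped_mbps(offered_per_interval, uplink_mbps):
    """Excess offered load per interval, assuming no buffering at all."""
    return [max(0, sum(rates) - uplink_mbps) for rates in offered_per_interval]


# (port A, port B) offered load in Mbps for two equal-length intervals.
intervals = [(900, 800), (100, 200)]
averages = [sum(port) / len(intervals) for port in zip(*intervals)]
print(averages)                        # [500.0, 500.0]
print(dropped_mbps(intervals, 1000))   # [700, 0]
```

Real hardware absorbs part of such a burst in port buffers, so actual loss depends on burst length and buffer depth.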
Troubleshooting Packet Loss / Path: Flow control
• switch may send pause toward end-device if rx buffer passes high watermark
• stub will pause toward supervisor if end-device signals pause
[Figure: Front Panel Ports, Stub ASICs, Packet Processor; pause at steps 1 and 2, drops at step 3]
1. Device sends pause to stub
2. Stub sends pause to packet processor
3. Packet processor pauses tx-queue
Troubleshooting Packet Loss / Path: Tx Oversubscription and Flow Control
switch# show interfaces g2/47 counters detail | begin Drops
Port    Tx-Drops-Queue-5 Tx-Drops-Queue-6 Tx-Drops-Queue-7 Tx-Drops-Queue-8
Gi2/47                 0                0                0         37748571
switch# show interfaces g2/47 counters detail | begin RxPause
Port    Rx-No-Pkt-Buff RxPauseFrames TxPauseFrames PauseFramesDrop
Gi2/47               0           130             0               0
• Tx oversubscription will result in tx-queue drops
• Pause frames from a peer will stop tx-queue processing
• Queue 8 is the default queue when no QoS is configured
Troubleshooting Packet Loss / Path: Rx Oversubscription
switch# show interface gi1/13 | include overrun
0 input errors, 0 CRC, 0 frame, 86432 overrun, 0 ignored
switch# show interface gi1/13 counter all | begin Rx-No
Port    Rx-No-Pkt-Buff RxPauseFrames TxPauseFrames PauseFramesDrop
Gi1/13          206658             0             0               0
switch# show platform software interface g1/13 stub stat | in Overrun
OverrunPackets : 206658 (look for Rx Stats)
• RxFifo stub overrun will be seen during Rx oversubscription
• Packet buffer depletion can also cause Rx-No-Pkt-Buff
Troubleshooting Packet Loss / Path: Packet Processor
[Figure: Line Card, Supervisor, Packet Processor, Shared Packet Memory]
Central packet memory exhaustion
• Deep transmit queues
• Egress oversubscription (example: SPAN)
• Jumbo frames
%C4K_SWITCHINGENGINEMAN-4-IPPLLCINTERRUPTFREELISTBELOWHIPRIORITYTHRESHOLD: IPP LLC freelistBelowHiPriorityThreshold interrupt FreeListCount: 2058, lowestFreeCellCnt: 0
Has anyone seen a longer log message?
Troubleshooting Packet Loss / Path: Oversubscription: packet memory exhaustion
Deep buffers and congestion
• limited gain (temporary buffering)
• switch-global expense (ingress and egress)
1. Deep egress queue fills
2. Packet memory consumed
3. Packet memory unavailable for ingress
[Figure: Packet Processor and Shared Packet Memory (Full), with drops; steps 1 to 3 marked]
Troubleshooting Packet Loss / Path: Oversubscription: packet memory exhaustion
Reduced buffers during congestion
• limited expense (smaller threshold on given interface)
• large gain (no packet memory exhaustion)
Other solutions:
• even out packet port distribution
• egress policers
[Figure: Packet Processor and Shared Packet Memory, with a restricted queue and drops]
Troubleshooting Packet Loss / Path: Packet memory: keeping the FreeList healthy
switch# show platform hardware interface all | include FreeListCount
FreeListCount : 64268
switch# show platform hardware interface all | include FreeListCount
FreeListCount : 62100
• 64K*280 Byte cells in Sup6E, Sup6L-E
• 128K*256 Byte cells in Sup7E, Sup7L-E
• A drop in FreeList will accompany the IPP log message
1. Locate interfaces tail dropping
switch# show interfaces g2/47 counters detail | begin Drops
Port    Tx-Drops-Queue-5 Tx-Drops-Queue-6 Tx-Drops-Queue-7 Tx-Drops-Queue-8
Gi2/47                 0                0                0         37748571
2. Reduce tx-queue size
(config)# policy-map egress_queue_limit
 class class-default
  queue-limit 500
OR
3. Modify default queue size
(config)# hw-module system max-queue-limit <value>
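To put the FreeListCount readings in context, the cell figures above can be turned into a total size and a free percentage. This is a sketch under assumptions: K is taken to mean 1024, FreeListCount is assumed to count free cells, and the sample readings are assumed to come from a 64K-cell supervisor; the names are hypothetical.

```python
# (number of cells, bytes per cell) from the slide, assuming K = 1024.
CELLS = {
    "Sup6-E/6L-E": (64 * 1024, 280),
    "Sup7-E/7L-E": (128 * 1024, 256),
}


def packet_memory_bytes(supervisor):
    """Total shared packet memory implied by the cell count and cell size."""
    cells, cell_size = CELLS[supervisor]
    return cells * cell_size


def free_percent(supervisor, free_list_count):
    """Share of packet memory cells still on the free list."""
    cells, _ = CELLS[supervisor]
    return 100.0 * free_list_count / cells


print(packet_memory_bytes("Sup6-E/6L-E"))           # 18350080
print(packet_memory_bytes("Sup7-E/7L-E"))           # 33554432
print(round(free_percent("Sup6-E/6L-E", 62100), 1)) # 94.8
```

Tracking the free percentage over several iterations gives a baseline to compare against when the IPP log message appears.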
Troubleshooting Packet Loss / Path: Forwarding ASIC
[Figure: Line Card, Supervisor, Forwarding Engine, TCAMs, NFE, CPU]
• Stepping through forwarding ASIC stages
• Identifying packet destiny
  – Punt?
  – Drop?
  – Forward to where?
  – Replicate to where?
• Working backwards from ASIC counters
Packet Loss / Path: Forwarding ASIC
Location (Purpose): Most Common Platform Troubleshooting Need
• IM (Input mapping): Vlan re-mapping
• L2 (L2 lookup): Layer 2 destination
• IC (Input classification): ACLs (especially static ACL, which evaluate *all* traffic). For custom ACL, IOS-level CLI typically all that is needed
• NF (Netflow): Platform troubleshooting not commonly required
• IP (Input policing): IOS-level policer counters typically all that is needed
• FL (Forwarding lookup): L3, Multicast replication
• OC (Output classification): IOS-level CLI typically all that is needed
• OP (Output policing): IOS-level policer counters typically all that is needed
• OM (Output mapping, replication): Vlan re-mapping. Replication counters useful in very high density scenarios
• QM (Queueing): Tx-queue programming
Packet Loss / Path: Input Mapping
Forwarding ASIC stages: IM, L2, IC, NF, IP, FL, OC, OP, OM, QM
• Physical / aggregate port mapping
• Vlan mapping
switch# show platform mapping ports
Interface Superport Subport CompactSubportId PortSet Phyport Aggport  PimPhyport
Gi1/1     8         1       20               2       13      8        0
…
Gi7/48    35        4       210              8       402     Po1(417) 367
switch# show platform hardware portvlan-map-table interface gigabitEthernet 1/1
Aggport( 8 ):
----- PortVlanDirectTable -----
VlanId FwdVlanId SrcMissCtrl      TxDropEn VlanTagStripEnOnTx
0      0         SrcMissCopyToCpu False    False
…
----- PortVlanHashTable -----
Index PartialAggport VlanId FwdVlanId Dir SrcMissCtrl      TxDropEn VlanTagStripEnOnTx
1568  8              100    200       Rx  SrcMissCopyToCpu -        False
3188  8              100    200       Tx  -                False    False
• All ports on an Etherchannel share an Aggport
• Vlan mapping in use (VlanId 100, FwdVlanId 200)
• Mapping information is used in many platform CLI outputs
Packet Loss / Path: Input Mapping / L2 Lookup
• Confirm if routing features are enabled on a vlan
switch# show platform hardware rxvlan-map-table vlan 902
Vlan 902:
l2LookupId: 902
srcMissIgnored: 0
ipv4UnicastEn: 1
ipv4MulticastEn: 1
ipv6UnicastEn: 0
ipv6MulticastEn: 0
…
switch# show int vl 902 | i SVI
Hardware is Ethernet SVI, address is 001e.f73f.f5bf (bia 001e.f73f.f5bf)
switch# show mac address-table vlan 902 | i 001e.f73f.f5bf
902 001e.f73f.f5bf static ip,ipx,assigned,other Switch
switch# show plat hard mac add 001e.f73f.f5bf vlan 902
…
Index Mac Address    Vlan  Type       SinglePort/RetIndex/AdjIndex
----- -------------- ----- ---------- ----------------------------
63248 001E.F73F.F5BF 902   SinglePort Cpu aggport(4) ND RouterAddr
• IPv4 unicast and multicast routing enabled
• SVI MAC present in MAC table (for unicast routing)
• Note: all SVIs use the same MAC address on the 4k
Packet Loss / Path: L2 Lookup
• STP state check
switch# show span int gi 7/48 state | i VLAN0002
VLAN0002 forwarding
switch# show platform hardware stp vlan 2 | i Gi7/48
Gi7/48 (375) Forwarding
• SA Learning
switch(config)# no mac address-table learning vlan 100
switch# show platform hardware rxvlan-map-table vlan 100 | i srcMiss
srcMissIgnored: 1
(no copies will be sent to CPU for MAC source address learning)
switch# show mac add int gi 1/46 | i 902
902 0000.0500.0000 dynamic ip,ipx,assigned,other GigabitEthernet1/46
902 ffff.ffff.ffff system Gi1/46,Gi7/48,Switch
switch# show plat hard mac add 0000.0500.0000 | i 0500|Index
Index Mac Address    Vlan Type       SinglePort/RetIndex/AdjIndex
27760 0000.0500.0000 902  SinglePort Gi1/46(53) ND SrcOrDst F
(HW matches SW)
Packet Loss / Path: L2 Lookup
• SA Lookup: port security
switch# show run int gi 3/19
…
interface GigabitEthernet3/19
 switchport access vlan 172
 switchport mode access
 switchport port-security
 spanning-tree portfast
switch# show platform hardware mac vl 172
Flags are:
----------
D - Drop
ND - Do not drop
Index Mac Address    Vlan  Type       SinglePort/RetIndex/AdjIndex
----- -------------- ----- ---------- ----------------------------
2640  0017.9543.EA7F 172   SinglePort Gi3/19(74) ND SrcOrDst
49300 0017.9543.EA7F 172   SinglePort WildcardAggport D SrcOrDst
Traffic sourced from this MAC on any port other than Gi3/19 will be dropped on vlan 172
Packet Loss / Path: L2 Lookup • DA Lookup: private vlan example
switch# show run int gi 3/7 interface GigabitEthernet3/7 switchport private-vlan host-association 100 200 switchport mode private-vlan host spanning-tree portfast end switch# show platform hardware mac add c89c.1d53.612d Flags are: ---------- D - Drop ND - Do not drop Index Mac Address Vlan Type SinglePort/RetIndex/AdjIndex ----- -------------- ----- ---------- ---------------------------- 11700 C89C.1D53.612D 200 SinglePort Gi3/7(62) ND SrcOrDst 46352 C89C.1D53.612D 100 SinglePort Gi3/7(62) ND SrcOrDst 51376 C89C.1D53.612D 200 SinglePort Drop aggport(8190) D SrcOrDst
Traffic toward C89C.1D53.612D on vlan 200 (isolated vlan) will reach the drop port instead
Note: Index order is not lookup order
Packet Loss / Path: L2 Lookup • DA Lookup: multicast, broadcast
switch# show mac add multi vlan 902 | i 0100.5e01.0101
902 0100.5e01.0101 igmp Gi1/46,Switch
switch# show plat hard mac add 0100.5e01.0101 | i 0100.5E01.0101|Index
Index Mac Address Vlan Type SinglePort/RetIndex/AdjIndex
20224 0100.5E01.0101 902 Ret 104444
switch# show plat hard ret chain index 104444
RetIndex 104444
RetWordIndex: 522220 Link: 1048575(0xFFFFF) FieldsCnt: 1
SuppressRxVlanBridging: true
Vlan: 902 BridgeOnly: N Gi1/46(53)
switch# show platform hardware floodset vlan 902
Vlan 902:
Unicast Floodset: FloodToCpu: - RetIndex: 902
Gi1/46(53) Po1(417)
…
unknown unicasts will be flooded to these ports
Multicast traffic to 0100.5e01.0101 replicated here, unless overridden by L3/ACL
Note: since 15.0(2)SG / 3.2.0SG, broadcast uses a per-vlan ffff.ffff.ffff entry instead of a floodset
Packet Loss / Path: L2 vs L3 vs ACL What HW programming will direct the packet?
switch# show platform hardware ip fwdsel summary
L2Value == other (port/RET) (0):
         IC
L3    0    1    2    3
 0    l2   ic   ic   ic
 1    l3   ic   ic   ic
 2    l3   l3   ic   ic
 3    l3   l3   l3   ic
Fwdsel relevant to ACL (ic) only when there is a redirect action
Example:
L3 entry present, FwdSel=2
ACL redirect entry present, FwdSel=2
Winner = ACL (ic)
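The fwdsel table can be read as a simple two-key lookup. The sketch below transcribes the table from the command output for the "L2Value == other" case and reproduces the slide's example. The names FWDSEL_TABLE and fwdsel_winner are illustrative only and are not part of any Cisco tool; as noted above, the ACL column only matters when the ACL entry has a redirect action.

```python
# Transcription of the "fwdsel summary" table for L2Value == other.
# Keys are the L3 entry's FwdSel; tuple positions are the ACL (ic) entry's FwdSel.
FWDSEL_TABLE = {
    0: ("l2", "ic", "ic", "ic"),
    1: ("l3", "ic", "ic", "ic"),
    2: ("l3", "l3", "ic", "ic"),
    3: ("l3", "l3", "l3", "ic"),
}


def fwdsel_winner(l3_fwdsel: int, ic_fwdsel: int) -> str:
    """Return which lookup result directs the packet: 'l2', 'l3' or 'ic'."""
    return FWDSEL_TABLE[l3_fwdsel][ic_fwdsel]


# Slide example: L3 entry with FwdSel=2 versus ACL redirect entry with FwdSel=2.
print(fwdsel_winner(2, 2))  # ic
```

Read this way, the ACL wins ties whenever its FwdSel is non-zero, L3 wins only with a strictly higher FwdSel, and L2 wins only when both values are 0.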
[Figure: lookup precedence. L3 entry / ACL entry (order depends on “fwdsel”) > L2 entry > floodset]
Packet Loss / Path: Input Classification • SVI and ACL statistics require hardware resources • Not enabled by default
switch# show run
…
interface Vlan902
 ip address 92.92.92.1 255.255.255.0
 counter
…
ip access-list extended deny
 deny ip any any
 hardware statistics
…
switch# show platform hardware vlan statistic summary
Region Name             First  Last   First  LastUsed  Entries  Entries
                        Block  Block  Entry  Entry     Used     Free
Size 2 Counters Region  0      510    0      0         1        2043
Size 4 Counters Region  511    1022   2044   -         0        2048
VlanStatsTable Programming Complete: Yes
Enable hardware counters
Ensure resources are available
Packet Loss / Path: Input Classification • Any ACL-based ingress classification (security, QoS, PBR) • ACL examples: local multicast sources, static ACL, PBR, PACL
switch# show platform hardware acl input entries vlan 902 all
…
Opcode : 40000 / 40000
IP Src : 92.92.92.0 / 255.255.255.0
IP Dst : 224.0.0.0 / 240.0.0.0
…
ActIdx: 249 StatsIdx: 0 FwdIdx: (Cpu, Cpu: true, CpuEvent: 1, Port: 6)
switch# show platform hardware acl input actions 249
…
Idx: 249 … FwdSel: 2 … L3Action: 2
• Installed automatically when PIM is enabled on the SVI
• Matches local sources > TTL=1
• Redirects to CPU for S,G setup (if not overridden by L3 entry)
• Compare FwdSel with L3 entries
• L3Action: (0 = permit, 1 = drop, 2 = redirect)
Packet Loss / Path: Input Classification • ACL examples: local multicast sources, static ACL, PBR, PACL
switch# show platform hardware acl input entries static
…
CamIndex  Entry Type  Active  Apply QoS  Hit Count
--------  ----------  ------  ---------  ---------
2         IgmpToCpu   Y       N/A        14237 (estimate)
…
switch# show platform hardware acl input entries start 2 end 2 all
…
IP Src : 0.0.0.0 / 0.0.0.0
IP Dst : 224.0.0.0 / 240.0.0.0
IP Protocol : igmp / IpProtocolMask
…
ActIdx: 252 StatsIdx: 0 FwdIdx: (Cpu, Cpu: true, CpuEvent: 1, Port: 3)
switch# show platform hardware acl input actions 252
…
FwdSel: 3 L2Action: 2
• Watch for increment
• Hit does not mean packet count
IGMP sent to 224/4 will go to CPU if FwdSel wins over L3
L2Action: (0 = permit, 1 = drop, 2 = redirect)
Packet Loss / Path: Input Classification • ACL examples: local multicast sources, static ACL, PBR, PACL
switch# show platform hardware acl input entries vlan 901 all
…
IP Src : 1.1.1.1 / 255.255.255.255
IP Dst : 0.0.0.0 / 0.0.0.0
…
ActIdx: 244 StatsIdx: 0 FwdIdx: (Adj, Adj: 8)
switch# show platform hardware acl input actions 244
…
FwdSel: 2 … L3Action: 2
switch# show platform hardware ip adjacency entry 8
000008: vlan: 192 port: Po1 (417) size: 1 ifaId: 20
 fwdCtrl: 5 cpucode: 3 sifact4: FwdToCpu sifact6: FwdToCpu
 sa: 00:1E:F7:3F:F5:BF da: 00:0C:29:6D:1A:ED
 rwFmt: Unicast packets: 0 bytes: 0
Packets sourced from 1.1.1.1/32 will be redirected to adjacency 8 (Po1) if FwdSel wins over L3
Note: PBR ACLs are removed if adjacency becomes unavailable
Packet Loss / Path: Input Classification • ACL examples: local multicast sources, static ACL, PBR, PACL • Note: packets classified as non-IP, IPv4, IPv6 (cannot MAC ACL on an IP packet)
switch# show ip access deny
Extended IP access list deny
 10 deny ip any any (1056 matches)
switch# show ip int gi 1/2
Inbound access list is deny
switch# show plat hard acl inp entr int gi 1/2 all
…
IP Src : 0.0.0.0 / 0.0.0.0
IP Dst : 0.0.0.0 / 0.0.0.0
IP Protocol : IpProtocolNull / IpProtocolNull
…
ActIdx: 254 StatsIdx: 0 FwdIdx: (None, rep: 0)
switch# show plat hard acl inp act 254
…
FwdSel: 0 … L2Action: 1
All IPv4 traffic will be dropped
Fwdsel doesn’t matter
L2Action: (0 = permit, 1 = drop, 2 = redirect)
Packet Loss / Path: Input Classification / Policing • Order of operations
flow record microflow
 match ipv4 source address
class-map match-all microflow
 match flow record microflow
policy-map ingress
 class voice-signalling
  set dscp cs3
  police cir 32000 bc 8000
   conform-action transmit
   exceed-action set-dscp-transmit cs1
   exceed-action set-cos-transmit 1
 class microflow
  police cir 100000
   conform-action transmit
   exceed-action drop
 class class-default
  set dscp default
  set cos 0
Configuration callouts:
• Classification
• Unconditional Marking
• Normal policer
• Conditional Marking
• Microflow policing: Flexible Netflow, class-map matching FNF, policer
Order of operations: Ingress Classification > Ingress Policing > Ingress Marking Unconditional > Ingress Marking Conditional > Forwarding
Packet Loss / Path: Input Classification / Policing
• Monitoring ingress QoS
switch# show policy-map interface gigabitEthernet 1/46
GigabitEthernet1/46
 Service-policy input: ingress
  Class-map: voice-signalling (match-all)
   28283457437 packets
   Match: dscp ef (46)
   QoS Set
    dscp cs3
   police:
    cir 32000 bps, bc 8000 bytes
    conformed 76128704 bytes; actions:
     transmit
    exceeded 1810581188160 bytes; actions:
     set-dscp-transmit cs1
     set-cos-transmit 1
    conformed 32000 bps, exceed 761238000 bps
Class-map stats are shared across interfaces with the same policy map
• Ensure counters increment
• Classification displays using the packet counts
• Policing displays using bytes
Packet Loss / Path: Forwarding Lookup • L3 unicast destination lookups, multicast (*,G) / (S,G) lookups, urpf lookups
switch# show ip route 192.168.200.200
Routing entry for 192.168.200.0/24
 Known via "static", distance 1, metric 0
 Routing Descriptor Blocks:
 * 192.168.100.100
   Route metric is 0, traffic share count is 1
switch# show ip arp | i 192.168.100.100
Internet 192.168.100.100 0 000c.296d.1aed ARPA Vlan192
switch# show mac address dynamic | i 000c.296d.1aed
192 000c.296d.1aed dynamic ip,ipx,assigned,other Port-channel1
switch# show platform hardware ip route ipv4 network 192.168.200.0 255.255.255.0
Block: 0 En: true EntryMap: LSB Width: 80-Bit Type: Dst
…
000022: v4 192.168.200.0/24 --> vrf: Global Routing Table (0)
 adjStats: true fwdSel: 2 mrpf: 0 (None) fwdIdx: 0 ts: 0
 adjIndex: 8 vlan: 192 port: Po1 (417)
 fwdCtrl: 5 cpucode: 3 sifact4: FwdToCpu sifact6: FwdToCpu
 sa: 00:1E:F7:3F:F5:BF da: 00:0C:29:6D:1A:ED
Remember: unicast traffic won’t be destination-routed unless:
• routing is enabled on the vlan
• traffic is sent to L3 MAC
• FwdSel of route wins over ACL
Packet Loss / Path: Forwarding Lookup • L3 unicast destination lookups, multicast (*,G) / (S,G) lookups, urpf lookups • In general: (S,G) > vlan multicast source ACL > (*,G)
switch# show ip mroute 239.1.1.1 91.91.91.100
…
(91.91.91.100, 239.1.1.1), 00:08:11/00:01:32, flags: JT
 Incoming interface: Vlan901, RPF nbr 0.0.0.0
 Outgoing interface list:
  Vlan902, Forward/Sparse, 00:07:49/00:02:53
switch# show platform hardware ip route ipv4 host 239.1.1.1
…
008194: v4 91.91.91.100/32 239.1.1.1/32 --> vrf: Global Routing Table (0)
 adjStats: true fwdSel: 3 mrpf: 901 (FwdToCpu) fwdIdx: 0 ts: 0
 retIndex: 49150 retTs: 0
 Vlan: 901 BridgeOnly: Y Gi1/46(53)
 Vlan: 901 BridgeOnly: Y Gi7/1(328)
 Vlan: 901 BridgeOnly: Y Po1(417)
 Vlan: 902 BridgeOnly: N Gi1/46(53)
BridgeOnly = Y: packet will be bridged (to Gi1/46 vlan 901)
BridgeOnly = N: packet will be routed (to Gi1/46 vlan 902)
Packets matching the (S,G) that do NOT ingress on the mrpf vlan fail the RPF check and are punted to the CPU
Packet Loss / Path: Forwarding Lookup Quiz scenario: • Switch configured for multicast routing, sparse mode • No RP address is configured • A local multicast source starts
Why does the new source stream to the CPU?
Answer:
• Vlan local source ACL punts traffic to the CPU
• No S,G is ever created to override the ACL (via fwdsel)
Packet Loss / Path: Forwarding Lookup • L3 unicast destination lookups, multicast (*,G) / (S,G) lookups, urpf lookups
switch# show run int vl 901
interface Vlan901
 ip address 91.91.91.1 255.255.255.0
 ip verify unicast source reachable-via rx allow-default
switch# show platform hardware ip route ipv4 network 91.91.91.0 255.255.255.0
…
Block: 3 En: true EntryMap: LSB Width: 80-Bit Type: Src
…
012333: v4 91.91.91.0/24 * --> vrf: Global Routing Table (0)
 defaultRoute: false rpfVlan: 901 (Drop) ts: 0
Routed traffic sourced from 91.91.91.0/24 that fails the RPF check (i.e., does not ingress on vlan 901) will be dropped
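As a rough illustration of what the "Src" entry above expresses, the sketch below models the check as a source-prefix-to-VLAN table. RPF_TABLE and urpf_strict_drop are hypothetical names for this example; the real check is done in hardware, and the behaviour for sources with no entry (including the allow-default case) is deliberately not modelled.

```python
import ipaddress

# Hypothetical model of the hardware "Src" entry shown above:
# source prefix -> VLAN the packet must arrive on (rpfVlan).
RPF_TABLE = {ipaddress.ip_network("91.91.91.0/24"): 901}


def urpf_strict_drop(src_ip: str, ingress_vlan: int) -> bool:
    """Return True if a routed packet would be dropped by the RPF check."""
    src = ipaddress.ip_address(src_ip)
    for prefix, rpf_vlan in RPF_TABLE.items():
        if src in prefix:
            # Entry found: drop unless the packet arrived on the RPF VLAN.
            return ingress_vlan != rpf_vlan
    return False  # no entry in this toy table: behaviour not modelled here


print(urpf_strict_drop("91.91.91.100", 901))  # False
print(urpf_strict_drop("91.91.91.100", 902))  # True
```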
Packet Loss / Path: Output Classification, Policing, Mapping
• ACL-based output classification: security, QoS
• ACL and policer CLI the same (change input -> output)
• Mapping behavior shown in ingress mapping CLI
• STP state is not checked in HW
  • entries/floodsets simply don’t include ports in blocked state
• Packets for replication enqueued into replication queue
switch# show platform hardware ret rrq
…
ReasonQueue is not empty
ReasonHead: 0xDEB ReasonTail: 0xDD7
DataQueue is not empty
DataHead: 0xDE2 DataTail: 0xDD7
Prefull drops: 171477
Over Threshold drops: 0
Control, lookup queues are in use
Drops have occurred due to reaching first drop threshold
Packet Loss / Path: Output Classification / Policing • Order of operations
policy-map egress
 class voice
  set dscp ef
  set cos 5
  priority
  police cir percent 33
 class voice-control
  set dscp af31
  set cos 3
  bandwidth remaining percent 5
 class class-default
  dbl
Configuration callouts: Classification, Marking, Queuing, Policing
Note: dbl, shape, bandwidth, queue-limit and priority commands are all queuing commands
MQC for port-channels:
• Policy with queuing actions – only physical ports
• Policy with non-queuing actions – only port channel
Order of operations: Output Classification > Output Policing > Output Marking Unconditional > Output Marking Conditional > Queuing
Packet Loss / Path: Output Classification / Policing
• Monitoring egress QoS
switch# show policy-map int g1/36 output
GigabitEthernet1/36
 Service-policy output: AutoQos-VoIP-Output-Policy
  Class-map: AutoQos-VoIP-Bearer-QosGroup (match-all)
   625530530 packets
   Match: qos-group 46
   QoS Set
    ip dscp ef
    cos 5
   priority queue:
    Transmit: 32344068480 Bytes, Queue Full Drops: 0 Packets
   police:
    cir 33 %
    cir 330000000 bps, bc 10312500 bytes
    conformed Packet count - n/a, 32335870400 bytes; actions:
     transmit
    exceeded Packet count - n/a, 7813435520 bytes; actions:
     drop
    conformed 325185000 bps, exceed 97368000 bps
Class-map stats are shared across interfaces with the same policy map
• Ensure counters increment
• Classification display using the packet counts
• Policing display using bytes
• Queue full drops are in packets
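The policer figures in the output above can be cross-checked with a little arithmetic. This is a minimal sketch that assumes Gi1/36 runs at 1 Gbps; the helper names are made up for illustration. The 250 ms burst interval is simply what the displayed bc works out to, not a value stated in the session.

```python
def cir_from_percent(percent: int, port_speed_bps: int) -> int:
    """Convert 'police cir percent N' into bits per second for a given port speed."""
    return port_speed_bps * percent // 100


def burst_duration_ms(cir_bps: int, bc_bytes: int) -> float:
    """Time in milliseconds that a burst of bc_bytes represents at cir_bps."""
    return bc_bytes * 8 * 1000 / cir_bps


cir = cir_from_percent(33, 1_000_000_000)  # assumption: 1 Gbps port
print(cir)                                 # 330000000
print(burst_duration_ms(cir, 10_312_500))  # 250.0
```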
Packet Loss / Path: Output Queuing • DBL processing (if packet is not scheduled for drop) • Descriptor enqueued in queue memory
switch# show platform hardware interface gigabitEthernet 1/1 tx-queue
…
Phyport TxQ Head Tail Pre Empty Num BaseAddr Size Shape-Ok Empty Packets TxQ Subport
-------------------------------------------------------------------------------
Gi1/1  0  0x0000  0x0000  True  0  0x20D10  16    True  True
Gi1/1  1  0x0000  0x0000  True  0  0x00000  0     True  True
Gi1/1  2  0x0000  0x0000  True  0  0x00000  0     True  True
Gi1/1  3  0x0000  0x0000  True  0  0x00000  0     True  True
Gi1/1  4  0x0000  0x0000  True  0  0x00000  0     True  True
Gi1/1  5  0x0000  0x0000  True  0  0x00000  0     True  True
Gi1/1  6  0x0000  0x0000  True  0  0x00000  0     True  True
Gi1/1  7  0x0000  0x0000  True  0  0x20D20  3152  True  True
Reminder: SPAN copies are probably sent to different port / queues
Default queues configured
Currently empty
Packet Loss / Path: Output Queuing
policy-map egress_queueing
 class dscp32-48
  police cir 990000 conform-action transmit exceed-action drop
  priority
 class dscp0-15
  bandwidth 250000
  queue-limit 400
 class dscp16-31
  bandwidth 250000
  queue-limit 512
 class class-default
switch# show platform hardware interface g2/48 tx-queue
…
Phyport TxQ Head Tail Pre Empty Num BaseAddr Size Shape-Ok Empty Packets TxQ Subport
-------------------------------------------------------------------------------
Gi2/48  0  0x0000  0x0000  True   0     0x5ECE8  352   True  False
Gi2/48  1  0x0000  0x0000  True   0     0x00000  0     True  False
Gi2/48  2  0x0000  0x0000  True   0     0x00000  0     True  False
Gi2/48  3  0x0000  0x0000  True   0     0x00000  0     True  False
Gi2/48  4  0x0000  0x0000  True   0     0x00000  0     True  False
Gi2/48  5  0x0000  0x0000  True   0     0x5E958  512   True  False
Gi2/48  6  0x0000  0x0000  True   0     0x5EB58  400   True  False
Gi2/48  7  0x008A  0x0088  False  1421  0x5EE48  1520  True  False
Tx Q  Class
0     dscp32-48
5     dscp16-31
6     dscp0-15
7     dscp49-63, class-default
Low priority queues can be starved, policer recommended
Last queue is default queue
In this example, it is non-empty
First and last appear where expected, middle reversed
Packet Loss / Path: ASIC Drop Categories
Common Drop Event Reason: Typical Description
BridgeToRxPortDrop: received in a vlan with no other ports, replicated to a floodset/entry where ingress port was a member
DblDrop: packets dropped by DBL (including DBL on CPU ports)
InpL2AclDrop, InpL3AclDrop, OutL2AclDrop, OutL3AclDrop: packets denied by ACL
rplErrDrop: broadcast/multicast packets dropped while being replicated, many normal reasons to increment, including: rpf failure, floodset containing drop port, packets replicated to the CPU but also bridged to a floodset/entry containing the CPU
SptDrop: spanning-tree drop; packets dropped because a port is not in a forwarding state
SrcHitDrop: dropped at source learning stage; example: static MAC drop entry
TxQueFullDrop: a tx port is oversubscribed
• show platform software drop-port shows global ASIC drop events (not per interface)
• these counters are frequently expected
• baseline and/or high packet rate very useful
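Because these counters are global and often increment for normal reasons, comparing two snapshots is usually more telling than a single reading. The sketch below shows one way to turn two snapshots into per-second rates. It assumes the counter values have already been collected into dictionaries; the function name and the sample values are made up for illustration.

```python
def counter_deltas(baseline: dict, current: dict, interval_s: float) -> dict:
    """Per-counter increase per second between two snapshots of drop counters."""
    return {
        name: (current[name] - baseline.get(name, 0)) / interval_s
        for name in current
    }


# Made-up counter values, taken 10 seconds apart.
before = {"TxQueFullDrop": 1000, "SptDrop": 50}
after = {"TxQueFullDrop": 6000, "SptDrop": 50}
print(counter_deltas(before, after, 10.0))  # {'TxQueFullDrop': 500.0, 'SptDrop': 0.0}
```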
Packet Loss / Path: CPU Queues
switch# show plat cpu pack driv
Forerunner Packet Engine 1.83 (0)
Receive Queues: received packets summary
Qu  Capac  Guara  CurPo  Unpro  Accum  Kept  BperP  Packets
2   2512   112    610    0      2      2     73     610
58  512    256    37     12     5      511   216    591103
Receive Queues: dropped packets summary
Qu  Total Packets  Drop No Cell  Drop Overrun  Drop Underrun
58  591103         43623295103   0             0
Transmit Queues
Qu  PosAdd  Pendng  Packets     Bytes
0   595     0       8633668179  663318795241
1   863     0       5315423     363150782
• High “Kept” indicates high rate of traffic
• Incrementing “Drop No Cell” indicates queue oversubscription
However, combine high “Kept” with:
• CurPo does not increment
• Drop No Cell does increment
… queue 58 is stuck!
• Check for transient flooding / loss versus stuck queue
• Decode queue meaning with show platform software cpu events
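The stuck-versus-oversubscribed distinction above can be expressed as a comparison of two samples of the same receive queue. The following is a sketch only: QueueSample and classify_queue are hypothetical names, and the 0.9 threshold for "high Kept" is an assumption, since the session only says "high".

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    cur_po: int        # "CurPo" column
    kept: int          # "Kept" column
    capacity: int      # "Capac" column
    drop_no_cell: int  # "Drop No Cell" column


def classify_queue(prev: QueueSample, curr: QueueSample,
                   high_kept_ratio: float = 0.9) -> str:
    """Classify a CPU receive queue from two consecutive samples."""
    dropping = curr.drop_no_cell > prev.drop_no_cell
    high_kept = curr.kept >= high_kept_ratio * curr.capacity  # assumed threshold
    moving = curr.cur_po != prev.cur_po
    if high_kept and dropping and not moving:
        return "stuck"
    if dropping:
        return "oversubscribed"
    return "ok"


# Queue 58 values from the output above, with made-up drop counters.
first = QueueSample(cur_po=37, kept=511, capacity=512, drop_no_cell=100)
second = QueueSample(cur_po=37, kept=511, capacity=512, drop_no_cell=200)
print(classify_queue(first, second))  # stuck
```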
Troubleshooting VSS
• Differences
• VSL Health
• Packet Path
[Figure: three-tier topology. Core: Core Switch 1, Core Switch 2. Distribution: VSS member 1 and VSS member 2, joined by the VSL. Access: Access Switch 1, Access Switch 2, Access Switch 3.]
Troubleshooting VSS: Tips and Differences
• Available on Sup7E/4500X (ipbase or better), Sup7L-E (entservices or better)
• No quad-sup SSO, but you can use in-chassis standby (ICS) uplinks
• Configure VSS before installing ICS
• ICS must remain in rommon
• Split-brain detection uses ePAgP
• MEC policers are applied independently (e.g., 100 Mbps = 100 @ active, 100 @ standby)
• No QoS groups
• Not currently supported: Smart Install, line cards prior to 46xx, custom VSL QoS
Troubleshooting VSS: VSL Health
switch# show redundancy | i Current
Current Processor Information :
Current Software state = ACTIVE
Current Software state = STANDBY HOT
switch# show switch virtual
Executing the command on VSS member switch role = VSS Active, id = 1
Switch mode : Virtual Switch
Virtual switch domain number : 100
Local switch number : 1
Local switch operational role: Virtual Switch Active
Peer switch number : 2
Peer switch operational role : Virtual Switch Standby
Executing the command on VSS member switch role = VSS Standby, id = 2
Switch mode : Virtual Switch
Virtual switch domain number : 100
Local switch number : 2
Local switch operational role: Virtual Switch Standby
Peer switch number : 1
Peer switch operational role : Virtual Switch Active
Chassis SSO is established
VSS is functioning
Troubleshooting VSS: VSL Health
switch# show switch virtual link port-channel | i Po
Group Port-channel Protocol Ports
10    Po10(SU)     -        Te1/3/1(P) Te1/3/2(P)
20    Po20(SU)     -        Te2/3/1(P) Te2/3/2(P)
Group Port-channel Protocol Ports
10    Po10(SU)     -        Te1/3/1(P) Te1/3/2(P)
20    Po20(SU)     -        Te2/3/1(P) Te2/3/2(P)
switch# show policy-map int te1/3/2 | i Class|drops
Class-map: VSL-MGMT-PACKETS (match-any)
 (queue depth/total drops) 0/0
Class-map: VSL-L2-CONTROL-PACKETS (match-any)
 (queue depth/total drops) 0/0
Class-map: VSL-L3-CONTROL-PACKETS (match-any)
 (queue depth/total drops) 0/6
Class-map: VSL-VOICE-VIDEO-TRAFFIC (match-any)
 (queue depth/total drops) 0/0
Class-map: VSL-SIGNALING-NETWORK-MGMT (match-any)
 (queue depth/total drops) 0/0
Class-map: VSL-MULTIMEDIA-TRAFFIC (match-any)
 (queue depth/total drops) 0/0
Class-map: VSL-DATA-PACKETS (match-any)
 (queue depth/total drops) 0/491
Class-map: class-default (match-any)
 (queue depth/total drops) 0/37
VSL members bundled
• Watch for non-zero queue depth or incrementing drops on control queues
• Drops on non-control queues? Increase VSL links/speed
Troubleshooting VSS: Packet Path
switch# show platform hardware floodset vlan 97
…
Executing the command on VSS member switch role = VSS Active, id = 1
Vlan 97:
Unicast Floodset: FloodToCpu: - RetIndex: 97
Gi1/5/69(236) Alternate VSL aggport(1528)
…
…
Ipv4 Multicast Floodset: FloodToCpu: N RetIndex: 16481
Gi1/5/69(236) Po10(842)
Executing the command on VSS member switch role = VSS Standby, id = 2
Vlan 97:
Unicast Floodset: FloodToCpu: - RetIndex: 97
Alternate VSL aggport(1528) Gi2/1/1(420) Gi2/7/38(777)
…
Ipv4 Multicast Floodset: FloodToCpu: N RetIndex: 16481
Gi2/1/1(420) Gi2/7/38(777) Po20(1108)
• VSS virtual data path visible in platform programming
• Reflected in all packet path programming
If traffic needs to cross chassis, VSL aggport, VSL Po, or CPU must be used
Troubleshooting PoE: Power Supply and Linecards
switch# show environment status
<snip>
Supervisor Led Color : Green
Module 1 Status Led Color : Green
Module 2 Status Led Color : Green
PoE Led Color : Green
switch# show power detail
Power                                  Fan      Inline
Supply  Model No          Type         Status   Sensor   Status
------  ----------------  ---------  -----------  -------  -------
PS1     PWR-C45-4200ACV   AC 4200W     good     good     good
PS1-1                     110V         good
PS1-2                     110V         good
PS2
                          Watts Used of System Power(12V)
Mod  Model                budgeted  instantaneous  peak    out of reset  in reset
---- -------------------  --------  -------------  ------  ------------  --------
1    WS-X4648-RJ45V-E     92        --             --      92            10
2    WS-X4548-GB-RJ45V    60        --             --      60            25
PoE is operational on the line card
If not good, check power supply LEDs
Linecards are fully powered
Troubleshooting PoE: Analyze Power Budget
switch# show power detail
Power Summary                   Maximum
 (in Watts)              Used   Available
----------------------   ----   ---------
System Power (12V)       847    1360
Inline Power (-50V)      6      1580
Backplane Power (3.3V)   40     40
----------------------   ----   ---------
Total                    893 (not to exceed Total Maximum Available = 2100)
                          Inline Power Admin   Inline Power Oper
Mod  Model                PS      Device       PS      Device     Efficiency
---- -------------------  --------  -------------  ------  ------------  --------
1    WS-X4648-RJ45V-E     7       6            9       8          93
2    WS-X4548-GB-RJ45V    0       0            17      15         89
Total                     7       6            26      23
PoE Allocated
Inline power available. If not, this log would be seen:
%ILPOWER-5-ILPOWER_POWER_DENY: Interface <interface>: inline power denied
• Switch will allocate highest power level requested by the phone
• Catalyst 4500 power allocation rules:
  • Power line cards before IP phones
  • Prefer static over auto power
Cisco Power Calculator: http://tools.cisco.com/cpc/launch.jsp
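The figures in the power detail output can be sanity-checked with simple arithmetic. In the sketch below, the relation between the "PS" and "Device" columns (device watts divided by efficiency, rounded up) is an inference that happens to match every sample row; it is not stated in the session. The function names are illustrative.

```python
def ps_watts(device_watts: int, efficiency_pct: int) -> int:
    """Supply-side watts for a device load, rounded up (inferred relation)."""
    return -(-device_watts * 100 // efficiency_pct)  # integer ceiling division


def inline_power_remaining(max_available_w: int, used_w: int) -> int:
    """Inline power still available for new powered devices."""
    return max_available_w - used_w


print(ps_watts(6, 93), ps_watts(8, 93), ps_watts(15, 89))  # 7 9 17
print(inline_power_remaining(1580, 6))                     # 1574
print(847 + 6 + 40)                                        # 893, the Total row
```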
Troubleshooting PoE: Linecard Status
switch# show module
Chassis Type : WS-C4510R-E
Power consumed by backplane : 40 Watts
Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1    48  10/100/1000BaseT POE E Series          WS-X4648-RJ45V-E   JAE1329EAVL
 2    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V  JAE10244L7P
 4    18  10GE (X2), 1000BaseX (SFP)             WS-X4606-X2-E      JAE12021FMP
 5     6  Sup 6-E 10GE (X2), 1000BaseX (SFP)     WS-X45-SUP6-E      JAE1223KL3G
 6     6  Sup 6-E 10GE (X2), 1000BaseX (SFP)     WS-X45-SUP6-E      JAE12460E61
 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 0024.1446.2d93 to 0024.1446.2dc2 1.0                               Ok
 2 0018.1958.cf70 to 0018.1958.cf9f 3.3                               Ok
 4 001d.4573.0ada to 001d.4573.0aeb 1.0                               Ok
 5 0022.90e0.d6c0 to 0022.90e0.d6c5 1.1 12.2(44r)SG  12.2(53)SG1      Ok
 6 0022.90e0.d6c6 to 0022.90e0.d6cb 1.2 12.2(44r)SG  12.2(53)SG1      Ok
If not Ok, try resetting after executing all troubleshooting steps:
hw-module module <module> reset
Other status includes: Faulty, Authfail, Offline, PwrOver, PwrMax, PwrDeny. See Appendix for details.
Troubleshooting PoE: Devices Drawing Too Much
(config-if)# power inline police
switch#
%INLINEPOWEROVERDRAWN: Inline powered device connected on port Gi2/2 exceeded its policed threshold.
ERR_DISABLE: inline-power error detected on Gi2/2, putting Gi2/2 in err-disable state
switch# show power inline police g2/2
Available:1580(w) Used:77(w) Remaining:1503(w)
Interface Admin  Oper       Admin      Oper       Cutoff Oper
          State  State      Police     Police     Power  Power
--------- ------ ---------- ---------- ---------- ------ -----
Gi2/2     auto   errdisable errdisable overdrawn  0.0    0.0
(config-if)# power inline static max 20000
• Policing available from 12.2(50)SG
• For phones that rarely draw more than allowed, configure static power
Troubleshooting PoE: CDP / LLDP Negotiation
(config)# lldp run
(config)# int gi 3/1
(config-if)# lldp tlv-select power-management
Cat 4K Feature                     Release
LLDP 802.1ab                       12.2(44)SG
LLDP 802.3at PoE+ TLV, LLDP-MED    12.2(54)SG
Power Negotiation can occur via CDP, LLDP 802.3at or LLDP-MED
Switch "locks" to first protocol packet (CDP or LLDP) that has the power negotiation TLV
LLDP 802.3at power negotiation TLV overrides the LLDP-MED power negotiation TLV
Recommendation: disable all but the desired power negotiation protocol on the switch interface and the peer
Troubleshooting PoE: Verify Data, Collect Debugs
Change connections
– Record results of different line card, port, cable, end device
Is this a PoE issue or a PoE and data issue? – Disconnect phone, and connect non-PoE device
Configure “power inline never” on the port – Verify the link comes up
Re-enable power
Collect additional debugs
switch# show platform chassis module <id>
switch# debug interface g1/48
Condition 1 set
switch# debug ilpower powerman
(disconnect PD, connect PD, collect debugs)
switch# undebug all
All possible debugging has been turned off
switch# undebug interface g1/48
Power device (PD)/phone not powering up at all?
‒ Confirm the device is IEEE compliant, check with vendor
‒ Validate with 3rd party PD testers
‒ Device capacitance or impedance as per IEEE?
When PoE is enabled on a port, auto MDIX is disabled. Make sure you use the correct cable type. See the note in the Catalyst 4500 configuration guide.
Troubleshooting PoE: Analyze Power Allocation
Line Card           PoE per Line Card   PoE per Port
WS-X4748-UPOE+E     1440 W              60 W
WS-X4748-RJ45V+E    1440 W              30 W
WS-X4648-RJ45V+E    750 W               30 W
WS-X4548-RJ45V+     1050 W              30 W
WS-X4648-RJ45V-E    750 W               20 W
WS-X4548-GB-RJ45V   750 W               15.4 W
WS-X4524-GB-RJ45V   750 W               15.4 W
WS-X4248-RJ45V      750 W               15.4 W
WS-X4248-RJ21V      750 W               15.4 W
WS-X4224-RJ45V      750 W               15.4 W
WS-X4148-RJ45V      750 W               7 W
WS-X4148-RJ21V      750 W               7 W
Does the PoE line card support enough power per port?
Does the PoE line card support enough power? (slots 3-10 pair limit in 4510)
Catalyst 4500 Line Cards Data Sheet: http://www.cisco.com/en/US/prod/collateral/modules/ps2710/ps5494/product_data_sheet0900aecd802109ea_ps4324_Products_Data_Sheet.html
IP Phone Data Sheets: http://www.cisco.com/en/US/products/hw/phones/ps379/products_data_sheets_list.html
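The two questions above come down to dividing the per-card budget by the per-port maximum. The sketch below does that for three rows of the table. It assumes 48 ports per card (as the model names suggest) and ignores chassis-level power supply limits; ports_at_full_power is a made-up helper name.

```python
# (PoE budget per line card in watts, maximum PoE per port in watts),
# copied from three rows of the table above.
LINE_CARDS = {
    "WS-X4748-UPOE+E": (1440, 60.0),
    "WS-X4648-RJ45V+E": (750, 30.0),
    "WS-X4548-GB-RJ45V": (750, 15.4),
}


def ports_at_full_power(card: str, ports: int = 48) -> int:
    """How many ports can draw the per-port maximum at the same time."""
    budget, per_port = LINE_CARDS[card]
    return min(ports, int(budget // per_port))


for name in LINE_CARDS:
    print(name, ports_at_full_power(name))
# WS-X4748-UPOE+E 24
# WS-X4648-RJ45V+E 25
# WS-X4548-GB-RJ45V 48
```

On these numbers, only the 15.4 W card can power all 48 ports at the per-port maximum simultaneously.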
Troubleshooting PoE Commands
Troubleshooting Steps: Commands
Check link debounce settings: show interfaces debounce
Check number of debounce events: show platform software interfaces mii | inc Debounce
Check Digital Optical Monitoring data: show interface <> transceiver detail
Verify PoE line card is online: show module
Verify inline power available and operational: show power detail
Verify the inline power status of the port: show power inline <interface> [detail]
Verify PoE line card supports enough power per port, per slot: Appendix table, line card datasheets
Verify phone is not drawing more power than it should: show power inline police <interface>
Verify power negotiation is successful: debug interface <interface>; debug ilpower powerman; undebug all; undebug interface <interface>
Gather various module specific debugs: show platform chassis module <id>
Troubleshooting Flexible Netflow Overview
• Flexible NetFlow (FNF) available on Sup7-E, Sup7L-E and 4500X-32
• Original netflow – src/dst IP, src/dst L4 port, protocol, TOS, and input interface
• Flexible netflow – user defined fields (supports L2, IPv4, IPv6)
• Support both v9 (flexible) and v5 (fixed tuple) export formats
• Uses
• Troubleshooting – profile for suspected patterns and port
• Network security – monitor and record network meta-data, spot new patterns
• Usage monitoring and billing
Troubleshooting FNF Export Flow stats not received at collector
• UDP export only, check for packet loss along path to collector
• Issue can be with the collector as well
• Confirm NetFlow export version matches the collector
• Note mandatory fields are required for v5 export
(config)# flow exporter flowexporter1
(config-flow-exporter)# destination 10.10.22.22
(config-flow-exporter)# export-protocol netflow-v5
(config-vlan-config)# ip flow monitor flowmonitor1 input
Warning: Exporter flowexporter1 could not be activated because the following fields are mandatory:
 ipv4 source address
 ipv4 destination address
 transport source-port
 transport destination-port
 ipv4 protocol
Troubleshooting FNF Export
• Flow stats may be lost if there are more flows than permitted in the monitor cache
• Constant cache aging on flow monitors can also drive CPU higher
switch# show flow monitor ipv4fm cache
Cache type: Normal
Cache size: 4096
Current entries: 3891
High Watermark: 4096
Flows added: 12288
Flows aged: 8397
 - Active timeout ( 1800 secs) 0
 - Inactive timeout ( 15 secs) 0
 - Event aged 0
 - Watermark aged 599
 - Emergency aged 7798
(config-if)# no ip flow monitor ipv4fm input
(config-if)# exit
(config)# flow monitor ipv4fm
(config)# cache entries 64000
(config)# int gi 1/46
(config-if)# ip flow monitor ipv4fm input
switch# show flow monitor ipv4fm cache
Cache type: Normal
Cache size: 64000
Current entries: 32768
High Watermark: 32768
Flows added: 32768
…
 - Emergency aged 0
Tune cache size to match flow flux
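One way to read the first cache output is to look at how much of the aging was forced rather than timer-driven. The sketch below uses the counters from that output. Treating a high emergency-aged share as a sign of an undersized cache is this example's interpretation, and the function name is illustrative.

```python
def emergency_aged_share(flows_aged: int, emergency_aged: int) -> float:
    """Fraction of aged flows that were removed by emergency aging."""
    return emergency_aged / flows_aged if flows_aged else 0.0


# Counters from the 4096-entry cache output above.
flows_added, flows_aged, current_entries = 12288, 8397, 3891
watermark_aged, emergency_aged = 599, 7798

print(flows_added - flows_aged == current_entries)   # True
print(watermark_aged + emergency_aged == flows_aged)  # True
print(round(emergency_aged_share(flows_aged, emergency_aged), 3))  # 0.929
```

Here every aged flow was pushed out by watermark or emergency aging and none by the active or inactive timers, which is consistent with a cache that is too small for the flow rate.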
Troubleshooting Monitoring FNF Collisions
• Limit cache entries
• If cache limit is already reached and hash table is full, scope of monitoring will need to be adjusted
switch# show platform hardware flow table utilization
…
Buckets w/ X    Bucket Count      Used Entry Count
Used Entries    (% of Buckets)    (% of Entries)
------------    ---------------   ----------------
 0              0 ( 0.0)          0 ( 0.0)
 1              0 ( 0.0)          0 ( 0.0)
 2              0 ( 0.0)          0 ( 0.0)
 3              0 ( 0.0)          0 ( 0.0)
…
14              0 ( 0.0)          0 ( 0.0)
15              1 ( 0.0)          15 ( 0.0)
16              8191 ( 99.9)      131056 ( 99.9)
Total Used      8192 (100.0)      131071 ( 99.9)
Total Free      N/A               1 ( 0.0)
Unaccounted packets:
 User configured flow monitor cache limit reached: 4419746531
 IPv6 entry table full: 0
 Hash Collosions: 176000251
Flow Hash Table Buckets 8K
Entries per bucket 16
Total hash table entries 128K
Approx. total usable space 108K
%C4K_HWFLOWMAN-5-FLOWUNACCOUNTEDPACKETS: Flow stats for 46444030 packets are not accounted due to hardware hash collisions or full hardware flow table
All 16-entry buckets are full = constant collisions
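The capacity figures on this slide follow from simple arithmetic on the bucket histogram. The Python sketch below reproduces them. The function names are hypothetical, and the histogram holds only the two non-zero rows of the utilization output (one bucket with 15 entries, 8191 buckets with 16).

```python
BUCKETS = 8 * 1024        # "Flow Hash Table Buckets 8K"
ENTRIES_PER_BUCKET = 16   # "Entries per bucket 16"


def total_entries(buckets=BUCKETS, per_bucket=ENTRIES_PER_BUCKET):
    """Raw capacity of the hash table (the slide's 128K)."""
    return buckets * per_bucket


def used_entries(histogram):
    """histogram maps 'entries used in a bucket' -> 'number of such buckets'."""
    return sum(used * count for used, count in histogram.items())


def full_bucket_ratio(histogram, per_bucket=ENTRIES_PER_BUCKET):
    """Fraction of buckets with no free entry left."""
    bucket_count = sum(histogram.values())
    if bucket_count == 0:
        return 0.0
    return histogram.get(per_bucket, 0) / bucket_count


# Non-zero rows from the utilization output on the slide.
slide_histogram = {15: 1, 16: 8191}

capacity = total_entries()
used = used_entries(slide_histogram)
print(capacity, used, capacity - used)
print(full_bucket_ratio(slide_histogram))
```

The result matches the slide: 131072 total entries, 131071 used, 1 free. Nearly every bucket is full, so a new flow that hashes to one of them has nowhere to go.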
Agenda
Products Overview
Troubleshooting – Method – Packet path / loss – VSS – PoE – System Resources – Netflow
Tools/Tips
Appendix
Tools: Wireshark
Wireshark Best Practices
Do not display directly to console without a buffer, file or a duration limit
Write to PCAP file on storage, display on switch or using laptop Wireshark GUI
Only the core filter is implemented in hardware as ACLs. Use a restricted filter to avoid high CPU
Available on Sup7E, Sup7L-E, 4500X
Onboard full packet capture, filter, decode / display
Up to 8 instances supported
Tools: Wireshark
[Figure: Wireshark capture path. Labels: Forwarding Engine, IOS-XE, Core Filter, Capture Filter, Ring Buffer, File, Display Filter, Console]
switch# monitor capture mycap int gi 1/46 in match ipv4 protocol tcp 10.1.1.1/32 any file location bootflash:mycap.pcap limit duration 3
switch# monitor capture mycap start
*Apr 15 17:56:24.291: %BUFCAP-6-ENABLE: Capture Point mycap enabled.
*Apr 15 17:56:27.720: %BUFCAP-6-DISABLE_ASYNC: Capture Point mycap disabled. Reason : Wireshark session ended
switch# show monitor capture file bootflash:mycap.pcap display-filter "ip.ttl == 100"
1 0.000000 10.1.1.1 -> 91.91.91.100 TCP [TCP ZeroWindow] 0 > 0 [<None>] Seq=1 Win=0 Len=2
Tools: Wireshark
Troubleshooting Steps              Commands
Create a monitor                   monitor capture mycap <interface | vlan | control-plane>
Add core filter                    monitor capture mycap [access-list <acl> | match <in-line match CLI>]
Display monitor details            show monitor capture
Start/stop a monitor session       monitor capture mycap start | stop
Display a pcap file                show monitor capture file <filename>
Display a pcap file in detail      show monitor capture file <filename> detailed
Display a pcap file with filter    show monitor capture file <filename> display-filter "filter-detail"
Check if wireshark is running      show proc cpu | inc dumpcap
Tools: Embedded Event Manager
Extremely versatile tool for monitoring, automating, working around issues
(a) What do I want to detect? (b) What do I want to do after that?
Collect process CPU usage when CPU is high:
event manager applet high-cpu
event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.10.1 get-type exact entry-op ge entry-val "80" poll-interval 10
action 1.0 syslog msg "HIGH_CPU! CPU is at: $_snmp_oid_val"
action 2.0 cli command "enable"
action 2.1 cli command "show process cpu | redirect bootflash:cpu.txt"
action 2.2 cli command "configure terminal"
action 2.3 cli command "event manager scheduler suspend"
%HA_EM-6-LOG: TEST: HIGH_CPU! CPU is at: 99

Bring an interface down when it flaps too frequently:
event manager applet interface-flapping
event syslog pattern ".*UPDOWN.*GigabitEthernet1/1.*" occurs 4
action 1.0 syslog msg "GigabitEthernet Interface 1/1 changed state 4 times"
action 2.0 cli command "enable"
action 2.2 cli command "configure terminal"
action 2.3 cli command "interface GigabitEthernet1/1"
action 2.4 cli command "shutdown"
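The syslog pattern in the interface-flapping applet can be tried offline before it is deployed. The Python sketch below counts matching lines the way `occurs 4` does. It assumes EEM's regular expression behaves like Python's `re` for this simple pattern, and the sample syslog lines and the stricter pattern are illustrative assumptions, not part of the session material.

```python
import re

# Pattern and threshold copied from the interface-flapping applet.
APPLET_PATTERN = re.compile(r".*UPDOWN.*GigabitEthernet1/1.*")
OCCURS = 4

# Hypothetical tighter pattern: the comma after the interface name in
# UPDOWN messages keeps GigabitEthernet1/10 and similar ports out.
STRICT_PATTERN = re.compile(r"UPDOWN.*GigabitEthernet1/1,")


def count_matches(lines, pattern):
    """Number of syslog lines the pattern matches."""
    return sum(1 for line in lines if pattern.search(line))


def would_trigger(lines, pattern, occurs=OCCURS):
    """True once the pattern has matched at least 'occurs' lines."""
    return count_matches(lines, pattern) >= occurs


sample_log = [
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/1, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/1, changed state to up",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/10, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/10, changed state to up",
]

print(count_matches(sample_log, APPLET_PATTERN))
print(count_matches(sample_log, STRICT_PATTERN))
```

In this sample the applet's pattern also matches the GigabitEthernet1/10 lines because of the trailing `.*`, so port 1/1 could be shut down for another port's flaps. Anchoring on the character that follows the interface name is one way to narrow it.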
Embedded Event Manager / Netflow Integration
1. Packets with TTL=1 sent to the switch (TTL=1 streams can cause high CPU)
2. NetFlow Engine collects the flow, capturing the TTL value:
switch# sh runn flow record ttl
match ipv4 ttl
match ipv4 protocol
match ipv4 source address
match ipv4 destination address
collect counter bytes
collect counter packets
collect timestamp sys-uptime first
collect timestamp sys-uptime last
switch# sh runn flow monitor ttl
Current configuration:
flow monitor ttl
record ttl
cache timeout active 40
switch# sh runn int gi 6/1
no switchport
ip flow monitor ttl input
ip address 10.10.10.2 255.255.255.254
Check: show flow monitor ttl cache format record for IP TTL: 1
3. EEM triggers a syslog when flow is detected:
switch(config)# event manager applet ttl
event nf monitor-name "ttl" event-type create event1 entry-value "2" field ipv4 ttl entry-op lt
action 1.0 syslog msg "Flow Monitor $_nf_monitor_name reported Low TTL for $_nf_source_address $_nf_dest_address"
%HA_EM-6-LOG: ttl: Flow Monitor ttl reported Low TTL for 10.10.10.3 10.10.10.4
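The trigger condition in the `ttl` applet (a newly created flow whose `ipv4 ttl` field is less than 2) can be sketched in Python to show the logic. This only models the comparison and the message format. The flow dictionaries and the function name are assumptions for the example; the real event is evaluated on the switch.

```python
def low_ttl_messages(flows, monitor_name="ttl", threshold=2):
    """Build the syslog text for each flow whose TTL is below the threshold.

    Mirrors 'entry-value "2" ... entry-op lt' from the applet: a flow
    matches when ttl < 2, which in practice means TTL 0 or 1.
    """
    messages = []
    for flow in flows:
        if flow["ttl"] < threshold:
            messages.append(
                f"Flow Monitor {monitor_name} reported Low TTL "
                f"for {flow['src']} {flow['dst']}"
            )
    return messages


# Hypothetical flow cache entries; the first mirrors the slide's log message.
example_flows = [
    {"src": "10.10.10.3", "dst": "10.10.10.4", "ttl": 1},
    {"src": "10.10.10.3", "dst": "10.10.10.5", "ttl": 64},
]

for message in low_ttl_messages(example_flows):
    print(message)
```

Only the TTL=1 flow produces a message, matching the `%HA_EM-6-LOG` line shown on the slide.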
Tips: Crashes
Enhanced crashdump features in 15.0(2)SG2 / 3.2.2SG and higher
exception coredump highly recommended on IOS-XE
Classic IOS full core in 15.1(1)SG2 onwards
On IOS-XE, collect all files in crashinfo: and kinfo:
Tips: Miscellaneous
Enable NTP to troubleshoot across switches
Include date and time for debug and log messages
service timestamps [debug, log] msec localtime show-timezone
Automatically output time and CPU utilization with each command (exec mode)
terminal exec prompt timestamp
When logging the console, add comments and prefix with “!” to avoid error messages
switch#!!! show module after peer reload
switch# show module
Tips: Make Life Easier
Search Bug Toolkit for known issues
Output Interpreter to decode command output
System Message Guide for mitigation recommendations
Smart Call Home in 12.2(52)SG
Catalyst 4000 Troubleshooting TechNotes
Catalyst 4500 Configuration Guide and Release Notes
NetPro discussion groups on http://www.cisco.com
Tips: Platform Control Plane Enhancements

Common Drop Event: Control Packet Data Plane QoS
First Available: 12.2(54)SG
Reason: Per-interface QoS policies can drop control packets

Common Drop Event: Control Packet Enhancements
First Available: 15.0(2)SG / 3.2.0SG
Reason:
• Many static ACLs matching control traffic removed
• CPU now included in special control floodsets on a per-vlan basis
• access-list hardware capture mode now controls only IGMP ACLs

Common Drop Event: CPU queue rate limits
First Available: 15.1(1)SG / 3.3.0SG
Reason:
• DBL (per-flow rate limits) are applied to some CPU queues
• Improved areas include: port security / dot1x violate mode, non-RPF multicast (fast drop)
• Drops appear as DblDrop in show platform software drop-port
• show platform software ip mfib fastdrop deprecated