2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Cisco 4500 Troubleshooting guide

Page 2: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Cisco Catalyst 4500 Series Switches BRKCRS-3142

Page 3: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

© 2013 Cisco and/or its affiliates. All rights reserved. BRKCRS-3142 Cisco Public

Session Goals

At the end of this session, you should be able to:

Understand system resources and monitor their usage

Identify all areas of packet loss

Trace hardware packet path

Make use of newer tools

This content is based on questions we see in the field. Feedback is welcome!

Page 4: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Agenda

Products Overview

Troubleshooting
– Method
– System Resources
– Packet path / loss
– VSS
– PoE
– Netflow

Tools/Tips

Appendix

Page 5: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Products Overview

Chassis: 4503-E, 4506-E, 4507R+E, 4510R+E

6 Gbps per slot
• Classic supervisors
• Classic line cards
• e.g., SupV-10GE, 45xx line card

24 Gbps per slot
• -E chassis support 12.2(31)SGA6 onward
• Sup6-E, Sup6L-E and 46xx line card
• 4507R-E, 4510R-E

48 Gbps per slot
• +E chassis support 12.2(53)SG4 onward
• Sup7-E, Sup7L-E, 47xx line card
• 4507R+E, 4510R+E, 4503-E, 4506-E

See the appendix for supervisor, line card, and chassis product and compatibility details.

Page 6: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Products Overview

Intelligent Supervisors
• Supervisor Engine 7-E, 7L-E, 6-E, 6L-E, V-10GE, V, IV, II-Plus-10GE, II-Plus-TS, II-Plus

Transparent Line Cards
• Wire-rate, oversubscribed, PoE
• 10/100, 10/100/1000, GE, 10GE
• Various physical media front panel ports
• Dedicated per-slot bandwidth to supervisor

Switching ASICs
• Packet Processor
• Forwarding Engine

Specialized Hardware
• TCAMs (1) for ACLs, QoS, L3 forwarding
• NetFlow (2) (NFE) for statistics gathering

1. Ternary Content Addressable Memory
2. Optional for Supervisor IV and V. Integrated in Supervisor V-10GE, 7-E, 7L-E

[Diagram labels: Front Panel Ports, Line Card Stub ASICs, Supervisor, Packet Processor, Shared Packet Memory, Forwarding Engine, TCAMs (1), NFE (2), CPU]

Page 8: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Method: General Recommendations

Design with intent
– ideally, create a deterministic network
– engineers, not traffic, should control the network

Baseline, monitor against baseline, alarm and/or adjust
– problems are solved faster when knowns can be eliminated

Characterize issues quickly with a plan

Page 9: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Method: Method

1. Define Problem
2. Gather Facts
3. Consider Possibilities
4. Create Action Plan
5. Execute Action Plan
6. Observe Results

Documentation (accompanies all steps)

Symptoms? System Messages? User Input? When? Frequency? Impact? Scope?

• You need a good understanding of what the system looks like when it is healthy
• Further information and examples are in the troubleshooting section

Want to learn more? Check out:
• CCNP Practical Studies: Troubleshooting by Donna Harrington
• CCNP TSHOOT 642-832 Official Certification Guide by Kevin Wallace

Page 10: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Method: Method

Category         Possible Cause
---------------  ------------------------
Config/Design    Mis-configuration
                 Reaching Capacity
Traffic          DoS Attack
                 Traffic Pattern Change
                 Bad peer/server
Software Issue   Software Limitation
                 Bug
Hardware Issue   Hardware Limitation
                 Failed Hardware
                 Transient Hardware Issue

Page 11: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Method: Method

• What needs to be done to isolate each potential root cause?
• Make a change, measure results, roll back the change if the problem persists
• Problem solved? If not, continue the action plan

Page 12: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Method: Before you dig deep

Top-down approach
– Hardware generally does what it’s told to do
– Before you troubleshoot the platform, rule out the usual suspects

End-to-end
• Compare traffic at endpoints
• Keep standard methods/tools for loss measurement handy (Iperf)

Security
• Port security issues
• Actions are not always sent to syslog
• Restrict modes may use CPU

Common Issues
• Security features: 802.1x, DAI, DHCP snooping/relay, IPSG, Port Security, PACL, RACL, VACL, unicast RPF, intermediary stateful inspection
• L2: spanning-tree topology, IGMP snooping
• L3 unicast: reachability, peer adjacency
• L3 multicast: rpf, L3 path construction (RP), IGMP groups

Page 13: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Method Caution

debug and show platform commands to follow

Excessive debug output to console may disable switch

show platform commands are intended for in-depth troubleshooting

Use debug and show platform commands only when advised by TAC

show platform CLIs are not officially supported IOS commands

Not all commands apply to all platforms.

– Some are IOS-XE specific (Supervisor 7-E, 7L-E and 4500X)

Page 15: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

System Resources CPU

• Runs IOS/IOS-XE processes

• Runs 4500 platform-specific processes

• Sends/Receives control traffic

• Software-switches packets that can’t be hardware-switched

• Elevated CPU = in-use CPU; it does not impact the data plane

• Baseline is important

Page 16: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting CPU from “show process cpu”

[Flowchart, rendered as steps]

1. CPU higher than baseline?
2. IOS-XE only (on IOS, continue at step 3): high iosd use?
   – Yes: sh proc cpu detail process iosd
   – No: troubleshoot features related to the process / open TAC SR
3. High CPU in IOS process or Cat4k process?
   – ios: troubleshoot features related to the process / open TAC SR
   – cat4k: show platform health
4. High CPU traffic driven? (K*CpuMan Review)
   – No: troubleshoot features related to the process / open TAC SR
   – Yes: show platform cpu packet stat
5. Can the traffic be identified?
   – Yes: stop / alter traffic source, open TAC SR if more detail needed
   – No: monitor session 1 source cpu OR debug platform packet all buffer, then show platform cpu packet buffer

Reference Document ID: 65591 on http://www.cisco.com for more details

Page 17: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting CPU: Narrowing Down Process

switch# show process cpu sort
Core 0: CPU utilization for five seconds: 99%; one minute: 16%; five minutes: 7%
Core 1: CPU utilization for five seconds: 3%; one minute: 69%; five minutes: 33%
PID    Runtime(ms)  Invoked   uSecs  5Sec   1Min   5Min   TTY  Process
8590   3186391      38863326  176    51.20  42.52  20.34  0    iosd
…
11969  3138594      13447334  23     0.08   0.07   0.05   0    ffm
8448   207801       20750735  10     0.04   0.14   0.27   0    cli_agent
10684  428406       20858613  20     0.04   0.01   0.01   0    licensed
11241  3603017      26001138  138    0.04   0.04   0.04   0    cpumemd

switch# show proc cpu detail process iosd sort
Core 0: CPU utilization for five seconds: 99%; one minute: 62%; five minutes: 22%
Core 1: CPU utilization for five seconds: 2%; one minute: 38%; five minutes: 43%
PID   T  C  TID    Runtime(ms)  Invoked  uSecs  5Sec   1Min   5Min   TTY  Process
                                                (%)    (%)    (%)
8590  L            3346604      3886415  176    51.12  50.36  32.75  0    iosd
8590  L  0  8590   3561989      2098956  0      49.88  49.04  30.82  0    iosd
8590  L  1  12314  4076156      1787406  0      1.24   1.32   1.91   0    iosd
8590  L  0  12315  3425         52685    0      0.00   0.02   0.06   0    iosd
24    I            376348       695349   0      77.00  75.77  43.55  0      ARP Input
85    I            534349       8127080  0      18.77  18.77  12.66  0      Cat4k Mgmt HiPri
7     I            2083841      1110797  0      1.11   0.33   0.22   0      Check heaps
86    I            744497       5671481  0      1.11   1.22   2.22   0      Cat4k Mgmt LoPri

Notes:
• Dual core: utilization is reported per core
• The first output lists IOS-XE processes
• Traditional IOS processes are indented
• Cat4k Mgmt HiPri / LoPri are Catalyst-4k specific management processes
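The narrowing-down step above can be partly automated when many snapshots are collected. The following is a minimal sketch, assuming the output has been saved as text with the column order shown on the slide; the function names and the 5% threshold are illustrative choices, not part of any Cisco tool.

```python
import re

# Sample rows copied from the "show process cpu sort" output on the slide.
SAMPLE = """\
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
8590 3186391 38863326 176 51.20 42.52 20.34 0 iosd
11969 3138594 13447334 23 0.08 0.07 0.05 0 ffm
8448 207801 20750735 10 0.04 0.14 0.27 0 cli_agent
"""

# Columns: PID, Runtime(ms), Invoked, uSecs, 5Sec, 1Min, 5Min, TTY, Process
ROW = re.compile(
    r"^\s*(\d+)\s+\d+\s+\d+\s+\d+\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+\d+\s+(.+?)\s*$"
)


def parse_cpu_rows(text):
    """Return one dict per process row; header and summary lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        pid, five_sec, one_min, five_min, name = match.groups()
        rows.append({
            "pid": int(pid),
            "five_sec": float(five_sec),
            "one_min": float(one_min),
            "five_min": float(five_min),
            "process": name,
        })
    return rows


def above_threshold(rows, percent=5.0):
    """Processes whose 5-second CPU is at or above `percent`, busiest first."""
    hot = [row for row in rows if row["five_sec"] >= percent]
    return sorted(hot, key=lambda row: row["five_sec"], reverse=True)


if __name__ == "__main__":
    for row in above_threshold(parse_cpu_rows(SAMPLE)):
        print(row["pid"], row["process"], row["five_sec"])
```

Comparing the result against a stored baseline, rather than a fixed threshold, would follow the session's advice more closely.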

Page 18: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting CPU: Packet-Driven CPU

switch# show platform health
…
                 %CPU    %CPU    RunTimeMax      Priority   Average %CPU     Total
                 Target  Actual  Target  Actual  Fg   Bg    5Sec  Min  Hour  CPU
K5CpuMan Review  30.00   70.81   30      17      100  500   91    66   9     19:17
…

K5CpuMan Over Target

Switch# show platform cpu packet statistics
…
Packets Dropped by Packet Queue
Queue                  Total           5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Ip Option              10715071        118803    71866     15919     0
…

Recent flood of packets with IP Options (not HW routable)

(config)# monitor session 1 source cpu rx
(config)# monitor session 1 destination interface Gi1/48

If port is available, get a full capture from CPU
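The "over target" check on the K5CpuMan Review row can be sketched in code. This is an illustration only: it assumes the row layout shown above (process name, then %CPU Target, then %CPU Actual), and the helper names are hypothetical.

```python
def _is_number(token):
    """True if the token parses as a float."""
    try:
        float(token)
    except ValueError:
        return False
    return True


def parse_health_row(line):
    """Split a 'show platform health' row into (name, target %, actual %).

    Assumes the process name is every token before the first numeric token,
    and that the first two numeric tokens are %CPU Target and %CPU Actual.
    """
    tokens = line.split()
    first_numeric = next(i for i, tok in enumerate(tokens) if _is_number(tok))
    name = " ".join(tokens[:first_numeric])
    target = float(tokens[first_numeric])
    actual = float(tokens[first_numeric + 1])
    return name, target, actual


def over_target(line):
    """True when the actual CPU share exceeds the target share."""
    _, target, actual = parse_health_row(line)
    return actual > target


# Row copied from the slide.
ROW = "K5CpuMan Review 30.00 70.81 30 17 100 500 91 66 9 19:17"

if __name__ == "__main__":
    print(parse_health_row(ROW), over_target(ROW))
```

A row that is over target is only a pointer toward packet-driven CPU; the packet statistics and capture steps on the slide are still needed to identify the traffic.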

Page 19: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting CPU: SPAN not available?

switch# debug platform packet all buffer
platform packet debugging is on
Switch# show platform cpu packet buffered
Total Received Packets Buffered: 1024
-------------------------------------
Index 0: 3 days 23:23:18:54927 - RxVlan: 1006, RxPort: Gi1/1
Priority: Normal, Tag: No Tag, Event: 11, Flags: 0x40, Size: 64
Eth: Src 00:00:0B:00:00:00 Dst 00:22:90:E0:D6:FF Type/Len 0x0800
Ip: ver:IpVersion4 len:24 tos:0 totLen:46 id:0 fragOffset:0 ttl:64 proto:tcp
src: 10.10.10.100 dst: 172.16.100.100 hasIpOptions firstFragment lastFragment
Remaining data:
 0: 0x0 0x64 0x0 0x64 0x0 0x0 0x0 0x0 0x0 0x0
10: 0x0 0x0 0x50 0x0 0x0 0x0 0x8A 0x37 0x0 0x0
20: 0x0 0x1 0xB5 0x77 0x6A 0x7E

• This debug does not require significant CPU overhead
• Be sure to use “buffer” and not “log”

Newer versions provide a human-readable event. Decode on older versions with:
switch# show platform software cpu events | i Code|11

CPU Event Code PE-Q

1 2 Ip Option 11 17

Page 20: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting CPU: Common Punt Reasons

Common Cause                                          Recommended Solution
----------------------------------------------------  ----------------------------------------------------
Same interface forwarding                             no ip redirects, or alter topology
ACL logging                                           disable ACL logging, use ACL matching stats or netflow
ACL deny causing switch to send ICMP unreachable      no ip unreachables (2)
Forwarding/Feature exception (out of TCAM/adj space)  reduce TCAM usage, resize TCAM region (TCAM2/3)
SW-supported feature (e.g., GRE)                      disable the feature or reduce the amount of traffic
IP packets with TTL<2, IP options                     disable the offending traffic, regulate source with Control Plane Policing (1)
Unexpected control/data traffic                       Control Plane Policing (1)

1. CoPP supported on all legacy supervisors starting 12.2(31)SG, SUP6-E/6L-E/4900M/4948E on 12.2(50)SG, all Sup7E/7L-E/4500X
2. Must be configured on all the L3 interfaces of the switch

Page 21: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

System Resources Memory

• Leak vs Large Usage

• Large usage goes away when condition is no longer present

• Leak never decreases

• Establish baseline

• Collect multiple iterations over recorded interval

• Correlate increase with any known activity
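The leak-versus-large-usage distinction above can be applied to a series of "Holding" values collected over a recorded interval. The sketch below is a simplified illustration: the function name, the three-sample minimum, and the sample values in the usage lines are assumptions, and a real assessment should still correlate growth with known activity.

```python
def classify_holding(samples):
    """Classify a series of per-process 'Holding' byte counts.

    samples: values collected at regular intervals, oldest first.
    Returns "stable", "possible leak" (never decreases and grows),
    or "released" (memory was given back at least once, which points
    to large usage rather than a leak).
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples to see a trend")
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    if all(delta == 0 for delta in deltas):
        return "stable"
    if all(delta >= 0 for delta in deltas):
        return "possible leak"
    return "released"


if __name__ == "__main__":
    # Hypothetical sample series, for illustration only.
    print(classify_holding([500_000, 650_000, 800_000]))
    print(classify_holding([500_000, 840_000, 510_000]))
```

A "possible leak" result is only a prompt to collect more iterations; a process that is still ramping up under load looks the same over a short window.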

Page 22: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Memory: Large Usage

switch# sh authentication session | count Runn
Number of lines which match regexp = 239
switch# sh proc mem detail proc iosd sort | i Hold|Auth Manager
PID  TTY  Allocated  Freed   Holding  Getbufs  Retbufs  Process
113  0    870624     125992  837216   0        0        Auth Manager
switch(config)# int ra gi 1/1 - 48 , gi 2/1 - 48 , gi 3/1 - 48 , gi 4/1 - 48
switch(config-if-range)# shut
switch(config-if-range)# int ra gi 7/1 - 48 , gi 8/1 - 48 , gi 9/1 - 48 , gi 10/1 - 48
switch(config-if-range)# shut
switch(config-if-range)# end
switch# sh authentication session | count Runn
Number of lines which match regexp = 0
switch# sh proc mem detail proc iosd sort | i Auth Manager
147  0    1434488    601760  514088   0        0        Auth Manager

Roughly 300 KB was not leaked, simply used (Holding dropped from 837216 to 514088 bytes once the sessions were gone)

Page 23: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Memory

switch# show proc mem sort
System memory : 2011604K total, 765920K used, 1245684K free, 85548K kernel reserved
Lowest(b) : 710864896
PID    Text   Data    Stack  Dynamic  RSS     Total    Process
10137  69308  800424  88     236      958000  1017272  iosd
5498   1140   233600  88     2492     40332   309140   ffm

switch# show proc mem detail proc iosd sort
Processor Pool Total: 805306368 Used: 645097888 Free: 160208480
I/O Pool Total: 20971520 Used: 361576 Free: 20609944
Critical Pool Total: 4087852 Used: 40 Free: 4087812
Critical Pool Total: 106460 Used: 40 Free: 106420
PID  TTY  Allocated   Freed      Holding    Getbufs   Retbufs  Process
153  0    1461539184  749742680  307884712  14266252  0        Auth Manager
0    0    304511544   14111208   272960272  0         0        *Init*
185  0    887586464   301222848  31368752   0         0        CDP Protocol

Auth Manager holding too much

switch# show proc mem detail proc iosd task 153
Process ID: 153
Process Name: Auth Manager
Total Memory Held: 307882352 bytes
Processor memory Holding = 307882352 bytes
pc = 0x16FCD45C, size = 291258544, count = 4441
pc = 0x16FCF828, size = 9378512, count = 143

Collect process memory breakdown for TAC

For Classic IOS, use:
• show process mem sort
• show process mem <pid>

Page 24: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

System Resources TCAM

• Check TCAM usage for ACLs, security, L3 routes, PBR, DHCP Snoop, IPSG, WCCPv2

%C4K_HWACLMAN-4-ACLHWPROGERR: Input VOIP_FROM_CE_IPv6 - hardware TCAM limit, qos being disabled on relevant interface

%C4K_HWACLMAN-4-ACLHWPROGERR: Input Security: 101 - hardware TCAM limit, some packet processing will be software switched

%C4K_HWACLMAN-4-ACLHWPROGERRREASON: Input(75/Normal, 1/Normal) Invalid Acl-based Feature - hardware TCAM policers exceeded

Page 25: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Monitoring TCAM

switch# show platform hardware acl statistics utilization brief
CAM Utilization Statistics
--------------------------
                          Used        Free           Total
                          --------------------------------
Input Security (160)      42 (2 %)    2006 (98 %)    2048
Input Security (320)      66 (3 %)    1982 (97 %)    2048
Input Qos (160)           15 (0 %)    2033 (100%)    2048
Input Qos (320)           14 (0 %)    2034 (100%)    2048
Input Forwarding (160)    2 (0 %)     2046 (100%)    2048
Input Unallocated (160)   0 (0 %)     55296 (100%)   55296

switch# show platform hardware qos policer utilization
-------------------------------------------
Policer utilization summary:
Direction   Assigned         Used         Free
-------------------------------------------
Input       2048 ( 12.5%)    4 ( 0.1%)    2044 ( 99.8%)
Output      2048 ( 12.5%)    1 ( 0.0%)    2047 ( 99.9%)
Free        12288( 75.0%)    0 ( 0.0%)    12288(100.0%)

Low utilization
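For periodic monitoring, the utilization rows above can be parsed and compared against an alert threshold. This is a minimal sketch, assuming the row layout shown on the slide; the 80% threshold and the function names are illustrative assumptions rather than Cisco guidance.

```python
import re

# Rows copied from the "utilization brief" output on the slide.
SAMPLE = """\
Input Security (160) 42 (2 %) 2006 (98 %) 2048
Input Security (320) 66 (3 %) 1982 (97 %) 2048
Input Qos (160) 15 (0 %) 2033 (100%) 2048
Input Unallocated (160) 0 (0 %) 55296 (100%) 55296
"""

# Region name ending in "(width)", then used (pct), free (pct), total.
ROW = re.compile(
    r"^\s*(?P<region>.+?\(\d+\))\s+(?P<used>\d+)\s*\(\s*\d+\s*%\)"
    r"\s+(?P<free>\d+)\s*\(\s*\d+\s*%\)\s+(?P<total>\d+)\s*$"
)


def parse_tcam_rows(text):
    """Return (region, used, total) tuples for every utilization row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match["region"], int(match["used"]), int(match["total"])))
    return rows


def regions_over(rows, percent=80.0):
    """Regions whose used/total ratio is at or above `percent`."""
    return [
        region for region, used, total in rows
        if total and 100.0 * used / total >= percent
    ]


if __name__ == "__main__":
    rows = parse_tcam_rows(SAMPLE)
    print(rows)
    print(regions_over(rows))
```

Recomputing the percentage from the raw counts avoids relying on the rounded values printed by the switch.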

Page 26: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

System Resources: Queue Memory

• Queue memory is reserved for each linecard; exceeding this reservation eats into the global pool

• When the global pool is exhausted, the following message appears:

%C4K_HWPORTMAN-3-TXQUEALLOCFAILED: Failed to allocate the needed queue entries for Gi6/13

• Options:
  • decrease queue depths on a per port basis
  • combine classes under the same queue

Page 27: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Monitoring Queue Memory

Entry                                    Sup6-E/6L-E/7L-E        Sup7E
---------------------------------------  ----------------------  -----------------
Total queue memory                       512K                    1M
Free Reserve: global pool                100K                    100K
CPU, recirc, drop queues                 20K                     40K
Queue entries per slot (1)               x = 400K / nSlots (2)   x = 860K / nSlots
Queue entries per port on a line card    y = x / nPorts (3)      y = x / nPorts
Queue entries per class transmit queue   z = y / nTxQs (4)       z = y / nTxQs

1. In a redundant chassis, two supervisor slots are treated as one
2. nSlots – number of Slots
3. nPorts – number of Ports in a line card
4. nTxQs – number of transmit queues in use
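The per-slot, per-port, and per-queue formulas in the table can be worked through in code. This is a sketch under stated assumptions: "400K" is read as 400 x 1024 entries, results are rounded down, and the 8-slot, 48-port, 8-queue figures are made-up inputs for illustration.

```python
def queue_entries(pool_entries, n_slots, n_ports, n_txqs):
    """Apply x = pool / nSlots, y = x / nPorts, z = y / nTxQs.

    pool_entries: entries shared out across slots (400K or 860K in the table).
    n_slots: number of slots (two supervisor slots count as one in a
             redundant chassis, per footnote 1).
    n_ports: number of ports on the line card.
    n_txqs:  number of transmit queues in use on the port.
    Integer division is an assumption; the slide does not state rounding.
    """
    per_slot = pool_entries // n_slots
    per_port = per_slot // n_ports
    per_queue = per_port // n_txqs
    return per_slot, per_port, per_queue


if __name__ == "__main__":
    # Assumption: K = 1024. Hypothetical chassis with 8 slots, 48-port card,
    # 8 transmit queues in use.
    x, y, z = queue_entries(400 * 1024, n_slots=8, n_ports=48, n_txqs=8)
    print(x, y, z)
```

Using fewer transmit queues per port leaves each queue a larger share, which is the reasoning behind the "combine classes under the same queue" option.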

Page 28: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Monitoring Queue Memory

switch# show platform software qm
Drop port Tx Queue allocations (Size: 8184, Base: 0x019008)
Tx Queue allocations for recirc ports (Size: 24576, Base: 0x01D1D0)
CPU Subport Tx Queue allocations (TotalSize: 8656)
…
Superport Tx Queue space distribution
-------------------------------------
Superport Slot Percent Base Addr End Addr Entries
--------- ---- ------- --------- -------- ------
4         1    10      0x047ED8  0x04C858 18841
5         1    10      0x04C878  0x0511F8 18841
6         1    10      0x051218  0x055B98 18841
7         1    10      0x055BB8  0x05A538 18841
8         0    10      0x0231D0  0x027B50 18841
9         0    10      0x027B70  0x02C4F0 18841
10        0    10      0x02C510  0x030E90 18841
11        0    10      0x030EB0  0x035830 18841
…
40        1    10      0x05A558  0x05EED8 18841
41        1    10      0x05EEF8  0x063878 18841
42        1    10      0x063898  0x068218 18841
43        1    10      0x068238  0x06CBB8 18841

• The first three lines are the Drop, Recirc, and CPU reservations
• 18841 * 8 QM entries available for physical slot 2
• 150728 / 48 = 3140 entries/port
• >3140 entries will eat into global pool
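The slot arithmetic on this slide (18841 x 8 = 150728 entries, then 150728 / 48 = 3140 per port) can be reproduced from the superport table. The sketch below assumes the six-column row layout shown in the output and that the "Slot" column is zero-based, so slot index 1 is physical slot 2 as the slide's note implies; the function names are illustrative.

```python
import re
from collections import defaultdict

# Superport rows copied from the slide (the elided rows are not reproduced).
SAMPLE = """\
4 1 10 0x047ED8 0x04C858 18841
5 1 10 0x04C878 0x0511F8 18841
6 1 10 0x051218 0x055B98 18841
7 1 10 0x055BB8 0x05A538 18841
8 0 10 0x0231D0 0x027B50 18841
9 0 10 0x027B70 0x02C4F0 18841
10 0 10 0x02C510 0x030E90 18841
11 0 10 0x030EB0 0x035830 18841
40 1 10 0x05A558 0x05EED8 18841
41 1 10 0x05EEF8 0x063878 18841
42 1 10 0x063898 0x068218 18841
43 1 10 0x068238 0x06CBB8 18841
"""

# Superport, Slot, Percent, Base Addr, End Addr, Entries
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+0x[0-9A-Fa-f]+\s+0x[0-9A-Fa-f]+\s+(\d+)\s*$"
)


def entries_per_slot(text):
    """Sum the Entries column for each value of the Slot column."""
    totals = defaultdict(int)
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            slot, entries = int(match.group(2)), int(match.group(4))
            totals[slot] += entries
    return dict(totals)


def entries_per_port(slot_entries, n_ports):
    """Reserved entries per port, rounded down."""
    return slot_entries // n_ports


if __name__ == "__main__":
    totals = entries_per_slot(SAMPLE)
    print(totals[1], entries_per_port(totals[1], 48))
```

A port configured with more queue entries than this per-port figure draws the excess from the global pool.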

Page 29: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting System Resources: Commands

Purpose, followed by CLI:

List IOS process CPU % on IOS-XE
  show proc cpu detail process iosd sort

Monitor Cat4k platform CPU statistics
  show platform health
  show platform cpu packet statistics

SPAN packets to/from CPU
  monitor session 1 source cpu
  monitor session 1 destination interface <int>

Enable/monitor Cat4k CPU buffer
  debug platform packet all buffer
  show platform cpu packet buffered

Display process memory and buffer holdings
  show proc mem sort
  show process mem <pid>
  show buffers

Display process memory and buffer holdings on IOS-XE
  show proc mem detail proc iosd sort
  show proc mem detail proc iosd task <pid>
  show buffers detailed process iosd

Display Cat4k ACL and policer usage
  show platform hardware acl statistics utilization brief
  show platform hardware qos policer utilization

Display Cat4k queue memory usage
  show platform software qm

Page 31: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path
Why is any packet sent to port(s), to CPU, or dropped?

Losing packets on the 4k without a clue why?

1. Collect “show tech” and iterations of the commands below
2. Step through the platform
   a. Identify counters outside of baseline, find an explanation based on counter meaning
   b. Identify unexpected platform programming, work upwards

• Incrementing counters are most useful
• Some non-zero counters are normal
• Baseline data is useful

Page 32: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Areas Of Investigation

HW-based checks, queue/buffer failure (PHY, stub, packet processor, forwarding engine)
  show interfaces <int> counters all
  show platform hardware interf <int> statis
  show platform software interf <int> statis
  show platform software interf <int> stub statis
  show platform software interf <int> stub cts statis all
  show platform hardware ret rrq
  show platform software drop-port

CPU queues (CPU controller)
  show platform cpu packet driver
  show platform cpu packet statistics

STP (L2 lookup)
  show platform hardware stp vlan <vlan>

L3 entries (forwarding lookup)
  show platform hardware ip route [ipv4|ipv6] network <net> <mask>
  show platform hardware ip route [ipv4|ipv6] host <ip or group>

ACL (input classification, output classification)
  show access-list <acl> (*)
  show platform hardware acl input entries static
  show platform hardware acl [input|output] entries interface <int> all
  show platform hardware acl [input|output] entries vlan <vlan> all
  show platform hardware acl [input|output] actions <action>

L2 entries, floodsets (L2 lookup)
  show plat hard mac add <mac>
  show plat hard ret chain index <index>
  show platform hardware floodset vlan <vlan>

(*) Ensure HW statistics are enabled (see ACL section)

Page 33: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: PHY and Stub ASIC

[Diagram labels: Front Panel Ports, Line Card Stub ASICs, Supervisor]

• Layer 1 issues
• Malformed frames/packets
• Oversubscription
• Flow-control
• Storm-control

Page 34: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Layer 1 Issues

• Match speed and duplex
• Isolate bad hardware using known good hardware
• Specific to end device? Patch/line cord? Front panel port? Linecard?
• Exclude patch panel if possible
• Peer misbehaving? Sniff wire for malformed frames

switch# show interfaces g5/5 count errors | exclude \ 0\ *0\ *0\ *0
Port    CrcAlign-Err  Dropped-Bad-Pkts  Collisions  Symbol-Err
Gi5/5   23736730      0                 0           0
Port    Undersize     Oversize          Fragments   Jabbers
Port    Single-Col    Multi-Col         Late-Col    Excess-Col
Port    Deferred-Col  False-Car         Carri-Sen   Sequence-Err

See Appendix for Error descriptions

Page 35: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Layer 1 Issues

switch# show platform software interface gigabitEthernet 1/1 stub statistics
XgstubMan(0:N-0)Port( 1 ) Rx Stats:
…
OverrunPackets : 0
AlignmentErrorPackets : 0
FcsErrorPackets : 0
SymbolErrorPackets : 0
InvalidOversizePackets : 0
Ipv4HdrChecksumErrorPackets : 0
Ipv4HdrErrorPackets : 0
Ipv6HdrErrorPackets : 0
…
switch# show platform software interface gigabitEthernet 1/1 statistics
Superport8(Gi1/1-6) Non-Zero Software Statistics
…
RxSequenceErrors : 255
RxSymbolErrors : 255

Note: counters may increment during plug / unplug

Platform commands can narrow down stub ASIC vs packet processor

Page 36: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Layer 1 Issues

Monitor for link flap via syslog (configurable globally or per-interface):
(config)# logging event link-status global
(config-if)# logging event link-status

Get total number of flaps since switch boot and compare with switch uptime:
switch# show platform software interface all | inc downs:|PimPhyport
…
GalGlmPort(0:N/21), Active? : true, PimPhyport Name : Gi1/22, EpmPortMan Name : EpmPortMan(0:N/21)
Name( EpmPortMan(0:N/21) ), PimPhyport name( Gi1/22 ) #link downs: 41712

This command should be run twice. Use the second results and decode the standard 802.3 registers:
switch# show platform software interface gi1/1 mii
…
0x00 ControlReg 0x1140
0x01 StatusReg 0x79C9
…
0x04 AutoNegAdvReg 0x01E1
0x05 AutoNegLinkPartnerAbilityReg 0x0000
0x06 AutoNegExpansionReg 0x0064
0x07 AutoNegNextPageTransmitReg 0x2001
…
0x09 1000BaseTControlReg 0x0F00
0x0A 1000BaseTStatusReg 0x0000
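Decoding the MII registers by hand is error-prone, so a small helper can list which bits are set. The sketch below covers only a subset of the IEEE 802.3 Clause 22 Control (0x00) and Status (0x01) register bits; the bit names are paraphrased from the standard and the function names are illustrative. The link-status bit is defined as latching low in the standard, which is probably why the slide asks for the second set of results.

```python
# Subset of IEEE 802.3 Clause 22 register bits (bit number -> meaning).
CONTROL_BITS = {
    15: "reset",
    14: "loopback",
    12: "auto-negotiation enable",
    11: "power down",
    10: "isolate",
    9: "restart auto-negotiation",
    8: "full duplex",
}

STATUS_BITS = {
    5: "auto-negotiation complete",
    4: "remote fault",
    3: "auto-negotiation ability",
    2: "link up",
    1: "jabber detect",
    0: "extended capability",
}


def decode_bits(value, bit_names):
    """Names of the bits set in `value`, highest bit first."""
    return [
        name
        for bit, name in sorted(bit_names.items(), reverse=True)
        if value & (1 << bit)
    ]


def manual_speed_mbps(control):
    """Speed selected by control bits 6 (MSB) and 13 (LSB).

    Only meaningful when auto-negotiation is disabled; returns None for
    the reserved combination.
    """
    msb = (control >> 6) & 1
    lsb = (control >> 13) & 1
    return {(0, 0): 10, (0, 1): 100, (1, 0): 1000}.get((msb, lsb))


if __name__ == "__main__":
    # Register values copied from the slide.
    print(decode_bits(0x1140, CONTROL_BITS), manual_speed_mbps(0x1140))
    print(decode_bits(0x79C9, STATUS_BITS))
```

For the values on the slide, the status register shows neither "link up" nor "auto-negotiation complete", which is consistent with the all-zero link partner ability register.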

Page 37: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Oversubscription (stub/supervisor port buffers)

Completely even traffic flow does not occur in the real world
– 2:1 1Gbps != (real world) 500 Mbps x 2 ports
– 2:1 10Gbps != (real world) 5Gbps x 2 ports

Ingress traffic on oversubscribed ports
– control on the peer device

Egress oversubscription
– consider multi-path

[Graph labels: max, avg, min]

Page 38: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Flow control

• switch may send pause toward end-device if rx buffer passes high watermark
• stub will pause toward supervisor if end-device signals pause

[Diagram labels: Front Panel Ports, Stub ASICs, Packet Processor, Pause, Drops]

1. Device sends pause to stub
2. Stub sends pause to packet processor
3. Packet processor pauses tx-queue

Page 39: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Tx Oversubscription and Flow Control

switch# show interfaces g2/47 counters detail | begin Drops
Port     Tx-Drops-Queue-5  Tx-Drops-Queue-6  Tx-Drops-Queue-7  Tx-Drops-Queue-8
Gi2/47   0                 0                 0                 37748571
switch# show interfaces g2/47 counters detail | begin RxPause
Port     Rx-No-Pkt-Buff  RxPauseFrames  TxPauseFrames  PauseFramesDrop
Gi2/47   0               130            0              0

• Tx oversubscription will result in tx-queue drops
• Pause frames from a peer will stop tx-queue processing
• Queue 8 is the default queue with no QoS configured

Page 40: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Rx Oversubscription

switch# show interface gi1/13 | include overrun
0 input errors, 0 CRC, 0 frame, 86432 overrun, 0 ignored
switch# show interface gi1/13 counter all | begin Rx-No
Port     Rx-No-Pkt-Buff  RxPauseFrames  TxPauseFrames  PauseFramesDrop
Gi1/13   206658          0              0              0
switch# show platform software interface g1/13 stub stat | in Overrun
OverrunPackets : 206658 (look for Rx Stats)

• RxFifo stub overrun will be seen during Rx oversubscription
• Packet buffer depletion can also cause Rx-No-Pkt-Buff

Page 41: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Packet Processor

[Diagram labels: Line Card, Supervisor, Packet Processor, Shared Packet Memory]

Central packet memory exhaustion
• Deep transmit queues
• Egress oversubscription (example: SPAN)
• Jumbo frames

%C4K_SWITCHINGENGINEMAN-4-IPPLLCINTERRUPTFREELISTBELOWHIPRIORITYTHRESHOLD: IPP LLC freelistBelowHiPriorityThreshold interrupt FreeListCount: 2058, lowestFreeCellCnt: 0

Has anyone seen a longer log message?

Page 42: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Oversubscription (packet memory exhaustion)

Deep buffers and congestion
• limited gain (temporary buffering)
• switch-global expense (ingress and egress)

1. Deep egress queue fills
2. Packet memory consumed
3. Packet memory unavailable for ingress

[Diagram labels: Packet Processor, Shared Packet Memory, Full, Drops]

Page 43: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Oversubscription (packet memory exhaustion)

Reduced buffers during congestion
• limited expense (smaller threshold on given interface)
• large gain (no packet memory exhaustion)

Other solutions:
• even out packet port distribution
• egress policers

[Diagram labels: Packet Processor, Shared Packet Memory, Restricted, Drops]

Page 44: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Packet memory (keeping the FreeList healthy)

switch# show platform hardware interface all | include FreeListCount
FreeListCount : 64268
switch# show platform hardware interface all | include FreeListCount
FreeListCount : 62100

• 64K * 280 Byte cells in Sup6E, Sup6L-E
• 128K * 256 Byte cells in Sup7E, Sup7L-E
• Drop in FreeList will accompany IPP log message

1. Locate interfaces tail dropping
switch# show interfaces g2/47 counters detail | begin Drops
Port     Tx-Drops-Queue-5  Tx-Drops-Queue-6  Tx-Drops-Queue-7  Tx-Drops-Queue-8
Gi2/47   0                 0                 0                 37748571

2. Reduce tx-queue size
(config)# policy-map egress_queue_limit
 class class-default
  queue-limit 500

OR

3. Modify default queue size
(config)# hw-module system max-queue-limit <value>
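The cell counts and FreeListCount readings above translate into bytes with simple arithmetic. The sketch below assumes "64K" and "128K" mean multiples of 1024 cells, and it applies the Sup6-E cell size to the two FreeListCount samples purely for illustration, since the slide does not say which supervisor produced them.

```python
def packet_memory_bytes(cells, cell_bytes):
    """Total shared packet memory for a given cell count and cell size."""
    return cells * cell_bytes


def freelist_drop(before, after, cell_bytes):
    """Cells and bytes consumed between two FreeListCount readings."""
    cells = before - after
    return cells, cells * cell_bytes


if __name__ == "__main__":
    K = 1024  # assumption: the slide's "K" is a binary kilo
    print(packet_memory_bytes(64 * K, 280))    # Sup6-E / Sup6L-E
    print(packet_memory_bytes(128 * K, 256))   # Sup7-E / Sup7L-E
    # FreeListCount readings copied from the slide; the 280-byte cell size
    # is an assumed pairing for the example.
    print(freelist_drop(64268, 62100, 280))
```

Tracking the drop between successive readings, rather than a single value, shows whether the free list is being drained during congestion.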

Page 45: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Troubleshooting Packet Loss / Path: Forwarding ASIC

[Diagram labels: Line Card, Supervisor, Forwarding Engine, TCAMs, NFE, CPU]

• Stepping through forwarding ASIC stages
• Identifying packet destiny
  – Punt?
  – Drop?
  – Forward to where?
  – Replicate to where?
• Working backwards from ASIC counters

Page 46: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

Packet Loss / Path: Forwarding ASIC Location Purpose Most Common Platform Troubleshooting Need

IM Input mapping Vlan re-mapping

L2 L2 lookup Layer 2 destination

IC Input classification ACLs (especially static ACL, which evaluate *all* traffic) For custom ACL, IOS-level CLI typically all that is needed

NF Netflow Platform troubleshooting not commonly required

IP Input policing IOS-level policer counters typically all that is needed

FL Forwarding lookup L3 Multicast replication

OC Output classification IOS-level CLI typically all that is needed

OP Output policing IOS-level policer counters typically all that is needed

OM Output mapping, replication

Vlan re-mapping Replication counters useful in very high density scenarios

QM Queueing Tx-queue programming

46

Packet Loss / Path: Input Mapping (IM)

• Physical / aggregate port mapping
• Vlan mapping

switch# show platform mapping ports
Interface  Superport  Subport  CompactSubportId  PortSet  Phyport  Aggport   PimPhyport
Gi1/1      8          1        20                2        13       8         0
…
Gi7/48     35         4        210               8        402      Po1(417)  367

All ports on an Etherchannel share an Aggport

switch# show platform hardware portvlan-map-table interface gigabitEthernet 1/1
Aggport( 8 ):
----- PortVlanDirectTable -----
VlanId  FwdVlanId  SrcMissCtrl       TxDropEn  VlanTagStripEnOnTx
0       0          SrcMissCopyToCpu  False     False
…
----- PortVlanHashTable -----
Index  PartialAggport  VlanId  FwdVlanId  Dir  SrcMissCtrl       TxDropEn  VlanTagStripEnOnTx
1568   8               100     200        Rx   SrcMissCopyToCpu  -         False
3188   8               100     200        Tx   -                 False     False

Vlan mapping in use

Mapping information is used in many platform CLI outputs

Packet Loss / Path: Input Mapping / L2 Lookup (IM, L2)

• Confirm if routing features are enabled on a vlan

switch# show platform hardware rxvlan-map-table vlan 902
Vlan 902:
 l2LookupId: 902
 srcMissIgnored: 0
 ipv4UnicastEn: 1
 ipv4MulticastEn: 1
 ipv6UnicastEn: 0
 ipv6MulticastEn: 0
…

IPv4 unicast and multicast routing enabled

switch# show int vl 902 | i SVI
 Hardware is Ethernet SVI, address is 001e.f73f.f5bf (bia 001e.f73f.f5bf)
switch# show mac address-table vlan 902 | i 001e.f73f.f5bf
 902  001e.f73f.f5bf  static  ip,ipx,assigned,other  Switch
switch# show plat hard mac add 001e.f73f.f5bf vlan 902
…
Index  Mac Address     Vlan   Type        SinglePort/RetIndex/AdjIndex
-----  --------------  -----  ----------  ----------------------------
63248  001E.F73F.F5BF  902    SinglePort  Cpu aggport(4) ND RouterAddr

SVI MAC present in MAC table (for unicast routing)

Note: all SVIs use the same MAC address on the Catalyst 4500

Packet Loss / Path: L2 Lookup (L2)

• STP state check

switch# show span int gi 7/48 state | i VLAN0002
VLAN0002  forwarding
switch# show platform hardware stp vlan 2 | i Gi7/48
Gi7/48 (375)  Forwarding

• SA Learning

switch(config)# no mac address-table learning vlan 100
switch# show platform hardware rxvlan-map-table vlan 100 | i srcMiss
 srcMissIgnored: 1

No copies will be sent to CPU for MAC source address learning

switch# show mac add int gi 1/46 | i 902
 902  0000.0500.0000  dynamic  ip,ipx,assigned,other  GigabitEthernet1/46
 902  ffff.ffff.ffff  system   Gi1/46,Gi7/48,Switch
switch# show plat hard mac add 0000.0500.0000 | i 0500|Index
Index  Mac Address     Vlan  Type        SinglePort/RetIndex/AdjIndex
27760  0000.0500.0000  902   SinglePort  Gi1/46(53) ND SrcOrDst F

HW matches SW

Packet Loss / Path: L2 Lookup (L2)

• SA Lookup: port security

switch# show run int gi 3/19
…
interface GigabitEthernet3/19
 switchport access vlan 172
 switchport mode access
 switchport port-security
 spanning-tree portfast
switch# show platform hardware mac vl 172
Flags are:
----------
 D - Drop
 ND - Do not drop
Index  Mac Address     Vlan   Type        SinglePort/RetIndex/AdjIndex
-----  --------------  -----  ----------  ----------------------------
2640   0017.9543.EA7F  172    SinglePort  Gi3/19(74) ND SrcOrDst
49300  0017.9543.EA7F  172    SinglePort  WildcardAggport D SrcOrDst

Traffic sourced from this MAC on any port other than Gi3/19 will be dropped on vlan 172
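The effect of the two port-security entries can be modelled as a source lookup in which an exact (MAC, vlan, port) entry is preferred over the WildcardAggport entry. This is a conceptual sketch only: the "exact match before wildcard" rule is an assumption made to reproduce the behaviour described on the slide, not a statement of how the ASIC orders its lookups, and all names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MacEntry:
    mac: str
    vlan: int
    port: Optional[str]  # None stands in for WildcardAggport
    drop: bool           # True = "D" flag, False = "ND" flag


# The two hardware entries shown for vlan 172.
TABLE = [
    MacEntry("0017.9543.EA7F", 172, "Gi3/19", drop=False),
    MacEntry("0017.9543.EA7F", 172, None, drop=True),
]


def source_lookup(table, mac, vlan, ingress_port):
    """Return 'forward', 'drop', or 'miss' for a frame's source MAC."""
    candidates = [e for e in table if e.mac == mac.upper() and e.vlan == vlan]
    # Assumed precedence: an entry for the exact ingress port wins.
    for entry in candidates:
        if entry.port == ingress_port:
            return "drop" if entry.drop else "forward"
    # Otherwise fall back to the wildcard entry, if any.
    for entry in candidates:
        if entry.port is None:
            return "drop" if entry.drop else "forward"
    return "miss"


if __name__ == "__main__":
    print(source_lookup(TABLE, "0017.9543.ea7f", 172, "Gi3/19"))
    print(source_lookup(TABLE, "0017.9543.ea7f", 172, "Gi3/20"))
```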

Packet Loss / Path: L2 Lookup (L2)

• DA Lookup: private vlan example

switch# show run int gi 3/7
interface GigabitEthernet3/7
 switchport private-vlan host-association 100 200
 switchport mode private-vlan host
 spanning-tree portfast
end
switch# show platform hardware mac add c89c.1d53.612d
Flags are:
----------
 D - Drop
 ND - Do not drop
Index  Mac Address     Vlan   Type        SinglePort/RetIndex/AdjIndex
-----  --------------  -----  ----------  ----------------------------
11700  C89C.1D53.612D  200    SinglePort  Gi3/7(62) ND SrcOrDst
46352  C89C.1D53.612D  100    SinglePort  Gi3/7(62) ND SrcOrDst
51376  C89C.1D53.612D  200    SinglePort  Drop aggport(8190) D SrcOrDst

Traffic toward C89C.1D53.612D on vlan 200 (isolated vlan) will reach the drop port instead

Note: Index order is not lookup order

Packet Loss / Path: L2 Lookup (L2)

• DA Lookup: multicast, broadcast

switch# show mac add multi vlan 902 | i 0100.5e01.0101
 902  0100.5e01.0101  igmp  Gi1/46,Switch
switch# show plat hard mac add 0100.5e01.0101 | i 0100.5E01.0101|Index
Index  Mac Address     Vlan  Type  SinglePort/RetIndex/AdjIndex
20224  0100.5E01.0101  902   Ret   104444
switch# show plat hard ret chain index 104444
RetIndex 104444 RetWordIndex: 522220 Link: 1048575(0xFFFFF) FieldsCnt: 1
 SuppressRxVlanBridging: true
 Vlan: 902 BridgeOnly: N Gi1/46(53)

Multicast traffic to 0100.5e01.0101 is replicated here, unless overridden by L3/ACL

switch# show platform hardware floodset vlan 902
Vlan 902:
 Unicast Floodset: FloodToCpu: - RetIndex: 902
  Gi1/46(53) Po1(417)
…

Unknown unicasts will be flooded to these ports

Note: since 15.0(2)SG / 3.2.0SG, broadcast is a per-vlan ffff.ffff.ffff entry instead of a floodset

Packet Loss / Path: L2 vs L3 vs ACL

What HW programming will direct the packet?

switch# show platform hardware ip fwdsel summary
L2Value == other (port/RET) (0):
         IC
L3    0   1   2   3
 0    l2  ic  ic  ic
 1    l3  ic  ic  ic
 2    l3  l3  ic  ic
 3    l3  l3  l3  ic

[Diagram: L3 Entry and ACL Entry (which one wins depends on “fwdsel”) > L2 entry > floodset]

Fwdsel is relevant to ACL (ic) only when there is a redirect action

Example:

• L3 entry present, FwdSel=2

• ACL redirect entry present, FwdSel=2

• Winner = ACL (ic)
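The fwdsel summary can be read as a lookup table indexed by the L3 entry's FwdSel (rows) and the ACL entry's FwdSel (columns). The sketch below encodes the table exactly as printed and checks it against a compact rule inferred from it; the row/column orientation and the rule itself are this editor's reading of the output, and the function names are illustrative.

```python
# Rows: FwdSel of the L3 entry (0-3). Columns: FwdSel of the ACL (ic) entry (0-3).
# Values copied from "show platform hardware ip fwdsel summary" for L2Value == other.
FWDSEL_WINNER = [
    # IC: 0     1     2     3
    ["l2", "ic", "ic", "ic"],  # L3 = 0
    ["l3", "ic", "ic", "ic"],  # L3 = 1
    ["l3", "l3", "ic", "ic"],  # L3 = 2
    ["l3", "l3", "l3", "ic"],  # L3 = 3
]


def fwdsel_winner(l3_fwdsel, ic_fwdsel):
    """Look up which programming directs the packet: 'l2', 'l3', or 'ic'."""
    return FWDSEL_WINNER[l3_fwdsel][ic_fwdsel]


def fwdsel_rule(l3_fwdsel, ic_fwdsel):
    """Inferred rule: a non-zero ACL FwdSel wins ties; otherwise L3, then L2."""
    if ic_fwdsel > 0 and ic_fwdsel >= l3_fwdsel:
        return "ic"
    if l3_fwdsel > 0:
        return "l3"
    return "l2"


if __name__ == "__main__":
    # The slide's example: L3 FwdSel=2 and ACL redirect FwdSel=2.
    print(fwdsel_winner(2, 2))
```

Comparing the FwdSel of an ACL action entry with the FwdSel of the matching route entry is then a single table lookup.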

Packet Loss / Path: Input Classification (IC)

• SVI and ACL statistics require hardware resources
• Not enabled by default

Enable hardware counters:

switch# show run
…
interface Vlan902
 ip address 92.92.92.1 255.255.255.0
 counter
…
ip access-list extended deny
 deny ip any any
 hardware statistics
…

Ensure resources are available:

switch# show platform hardware vlan statistic summary
Region Name             First  Last   First  LastUsed  Entries  Entries
                        Block  Block  Entry  Entry     Used     Free
Size 2 Counters Region  0      510    0      0         1        2043
Size 4 Counters Region  511    1022   2044   -         0        2048
VlanStatsTable Programming Complete: Yes

Packet Loss / Path: Input Classification (IC)

• Any ACL-based ingress classification (security, QoS, PBR)
• ACL examples: local multicast sources, static ACL, PBR, PACL

switch# show platform hardware acl input entries vlan 902 all
…
Opcode : 40000 / 40000
IP Src : 92.92.92.0 / 255.255.255.0
IP Dst : 224.0.0.0 / 240.0.0.0
…
ActIdx: 249 StatsIdx: 0 FwdIdx: (Cpu, Cpu: true, CpuEvent: 1, Port: 6)
switch# show platform hardware acl input actions 249
…
Idx: 249 … FwdSel: 2 … L3Action: 2

• Installed automatically when PIM is enabled on the SVI

• Matches local sources with TTL > 1

• Redirects to CPU for S,G setup (if not overridden by L3 entry)

• Compare FwdSel with L3 entries

• L3Action: (0 = permit, 1 = drop, 2 = redirect)

Packet Loss / Path: Input Classification (IC)

• ACL examples: local multicast sources, static ACL, PBR, PACL

switch# show platform hardware acl input entries static
…
CamIndex  Entry Type  Active  Apply QoS  Hit Count
--------  ----------  ------  ---------  ---------
2         IgmpToCpu   Y       N/A        14237 (estimate)
…

• Watch for increment

• Hit does not mean packet count

switch# show platform hardware acl input entries start 2 end 2 all
…
IP Src : 0.0.0.0 / 0.0.0.0
IP Dst : 224.0.0.0 / 240.0.0.0
IP Protocol : igmp / IpProtocolMask
…
ActIdx: 252 StatsIdx: 0 FwdIdx: (Cpu, Cpu: true, CpuEvent: 1, Port: 3)
switch# show platform hardware acl input actions 252
…
FwdSel: 3 L2Action: 2

IGMP sent to 224/4 will go to CPU if FwdSel wins over L3

L2Action: (0 = permit, 1 = drop, 2 = redirect)

Packet Loss / Path: Input Classification (IC)

• ACL examples: local multicast sources, static ACL, PBR, PACL

switch# show platform hardware acl input entries vlan 901 all
…
IP Src : 1.1.1.1 / 255.255.255.255
IP Dst : 0.0.0.0 / 0.0.0.0
…
ActIdx: 244 StatsIdx: 0 FwdIdx: (Adj, Adj: 8)
switch# show platform hardware acl input actions 244
…
FwdSel: 2 … L3Action: 2
switch# show platform hardware ip adjacency entry 8
000008: vlan: 192 port: Po1 (417) size: 1 ifaId: 20
 fwdCtrl: 5 cpucode: 3 sifact4: FwdToCpu sifact6: FwdToCpu
 sa: 00:1E:F7:3F:F5:BF da: 00:0C:29:6D:1A:ED
 rwFmt: Unicast packets: 0 bytes: 0

Packets sourced from 1.1.1.1/32 will be redirected to adjacency 8 (Po1) if FwdSel wins over L3

Note: PBR ACLs are removed if the adjacency becomes unavailable

Packet Loss / Path: Input Classification (IC)

• ACL examples: local multicast sources, static ACL, PBR, PACL
• Note: packets are classified as non-IP, IPv4, or IPv6 (a MAC ACL cannot match an IP packet)

switch# show ip access deny
Extended IP access list deny
 10 deny ip any any (1056 matches)
switch# show ip int gi 1/2
 Inbound access list is deny
switch# show plat hard acl inp entr int gi 1/2 all
…
IP Src : 0.0.0.0 / 0.0.0.0
IP Dst : 0.0.0.0 / 0.0.0.0
IP Protocol : IpProtocolNull / IpProtocolNull
…
ActIdx: 254 StatsIdx: 0 FwdIdx: (None, rep: 0)
switch# show plat hard acl inp act 254
…
FwdSel: 0 … L2Action: 1

All IPv4 traffic will be dropped

Fwdsel doesn’t matter

L2Action: (0 = permit, 1 = drop, 2 = redirect)

Packet Loss / Path: Input Classification / Policing (IC, IP)

• Order of operations

Ingress Classification -> Ingress Policing -> Ingress Marking (Unconditional) -> Ingress Marking (Conditional) -> Forwarding

flow record microflow
 match ipv4 source address
class-map match-all microflow
 match flow record microflow
policy-map ingress
 class voice-signalling
  set dscp cs3
  police cir 32000 bc 8000
   conform-action transmit
   exceed-action set-dscp-transmit cs1
   exceed-action set-cos-transmit 1
 class microflow
  police cir 100000
   conform-action transmit
   exceed-action drop
 class class-default
  set dscp default
  set cos 0

Callouts on the configuration:

• Classification: the class statements

• Unconditional Marking: set dscp / set cos

• Normal policer: police under class voice-signalling

• Conditional Marking: exceed-action set-dscp-transmit / set-cos-transmit

• Microflow policing: Flexible Netflow record, class-map matching the FNF record, policer

Packet Loss / Path: Input Classification / Policing (IC, IP)

• Monitoring ingress QoS

switch# show policy-map interface gigabitEthernet 1/46
 GigabitEthernet1/46
  Service-policy input: ingress
   Class-map: voice-signalling (match-all)
    28283457437 packets
    Match: dscp ef (46)
    QoS Set
     dscp cs3
    police:
      cir 32000 bps, bc 8000 bytes
     conformed 76128704 bytes; actions:
      transmit
     exceeded 1810581188160 bytes; actions:
      set-dscp-transmit cs1
      set-cos-transmit 1
     conformed 32000 bps, exceed 761238000 bps

Class-map stats are shared across interfaces with the same policy map

• Ensure counters increment

• Classification is displayed in packet counts

• Policing is displayed in bytes
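The conformed/exceeded byte counters follow from how a single-rate policer meters traffic. The sketch below is a generic token-bucket model using the slide's `cir 32000 bc 8000` values; it is not the Catalyst 4500 hardware implementation, and the class and method names are illustrative.

```python
class TokenBucketPolicer:
    """Generic single-rate, two-colour token bucket (illustrative model only)."""

    def __init__(self, cir_bps, bc_bytes):
        self.rate_bytes_per_s = cir_bps / 8  # CIR is configured in bits per second
        self.bc = bc_bytes                   # burst size = bucket depth in bytes
        self.tokens = float(bc_bytes)        # bucket starts full
        self.last = 0.0
        self.conformed_bytes = 0
        self.exceeded_bytes = 0

    def offer(self, now, size):
        """Meter one packet of `size` bytes arriving at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill at the committed rate, never beyond the burst size.
        self.tokens = min(self.bc, self.tokens + elapsed * self.rate_bytes_per_s)
        if size <= self.tokens:
            self.tokens -= size
            self.conformed_bytes += size
            return "conform"
        self.exceeded_bytes += size
        return "exceed"


if __name__ == "__main__":
    policer = TokenBucketPolicer(cir_bps=32000, bc_bytes=8000)
    burst = [policer.offer(0.0, 1000) for _ in range(10)]
    print(burst.count("conform"), "conform,", burst.count("exceed"), "exceed")
```

In this model the conformed rate can never exceed the CIR over time, which matches the slide's `conformed 32000 bps` against a much larger exceed rate.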

Packet Loss / Path: Forwarding Lookup (FL)

• L3 unicast destination lookups, multicast (*,G) / (S,G) lookups, urpf lookups

switch# show ip route 192.168.200.200
Routing entry for 192.168.200.0/24
 Known via "static", distance 1, metric 0
 Routing Descriptor Blocks:
 * 192.168.100.100
   Route metric is 0, traffic share count is 1
switch# show ip arp | i 192.168.100.100
Internet  192.168.100.100  0  000c.296d.1aed  ARPA  Vlan192
switch# show mac address dynamic | i 000c.296d.1aed
 192  000c.296d.1aed  dynamic  ip,ipx,assigned,other  Port-channel1
switch# show platform hardware ip route ipv4 network 192.168.200.0 255.255.255.0
Block: 0 En: true EntryMap: LSB Width: 80-Bit Type: Dst
…
000022: v4 192.168.200.0/24 --> vrf: Global Routing Table (0)
 adjStats: true fwdSel: 2 mrpf: 0 (None) fwdIdx: 0 ts: 0
 adjIndex: 8 vlan: 192 port: Po1 (417)
 fwdCtrl: 5 cpucode: 3 sifact4: FwdToCpu sifact6: FwdToCpu
 sa: 00:1E:F7:3F:F5:BF da: 00:0C:29:6D:1A:ED

Remember: unicast traffic won’t be destination-routed unless:
• routing is enabled on the vlan
• traffic is sent to the L3 MAC
• FwdSel of the route wins over ACL

Packet Loss / Path: Forwarding Lookup (FL)

• L3 unicast destination lookups, multicast (*,G) / (S,G) lookups, urpf lookups
• In general: (S,G) > vlan multicast source ACL > (*,G)

switch# show ip mroute 239.1.1.1 91.91.91.100
…
(91.91.91.100, 239.1.1.1), 00:08:11/00:01:32, flags: JT
 Incoming interface: Vlan901, RPF nbr 0.0.0.0
 Outgoing interface list:
  Vlan902, Forward/Sparse, 00:07:49/00:02:53
switch# show platform hardware ip route ipv4 host 239.1.1.1
…
008194: v4 91.91.91.100/32 239.1.1.1/32 --> vrf: Global Routing Table (0)
 adjStats: true fwdSel: 3 mrpf: 901 (FwdToCpu) fwdIdx: 0 ts: 0
 retIndex: 49150 retTs: 0
  Vlan: 901 BridgeOnly: Y Gi1/46(53)
  Vlan: 901 BridgeOnly: Y Gi7/1(328)
  Vlan: 901 BridgeOnly: Y Po1(417)
  Vlan: 902 BridgeOnly: N Gi1/46(53)

BridgeOnly = Y: packet will be bridged (to Gi1/46 vlan 901)
BridgeOnly = N: packet will be routed (to Gi1/46 vlan 902)

Packets matching the (S,G) that do NOT ingress on the mrpf vlan will fail the rpf check and be punted to the CPU

Packet Loss / Path: Forwarding Lookup (FL)

Quiz scenario:
• Switch configured for multicast routing, sparse mode
• No RP address is configured
• A local multicast source starts

Why does the new source stream to the CPU?

Answer:
• Vlan local source ACL punts traffic to the CPU
• No S,G is ever created to override the ACL (via fwdsel)

Packet Loss / Path: Forwarding Lookup (FL)

• L3 unicast destination lookups, multicast (*,G) / (S,G) lookups, urpf lookups

switch# show run int vl 901
interface Vlan901
 ip address 91.91.91.1 255.255.255.0
 ip verify unicast source reachable-via rx allow-default
switch# show platform hardware ip route ipv4 network 91.91.91.0 255.255.255.0
…
Block: 3 En: true EntryMap: LSB Width: 80-Bit Type: Src
…
012333: v4 91.91.91.0/24 * --> vrf: Global Routing Table (0)
 defaultRoute: false rpfVlan: 901 (Drop) ts: 0

Routed traffic sourced from 91.91.91.0/24 where RPF fails (i.e. it does not ingress vlan 901) will be dropped
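The source-route entry above amounts to a strict uRPF check: find the longest source prefix and compare its rpfVlan with the ingress vlan. A minimal sketch using the slide's single entry follows; the table layout, the function name, and the "no-entry" result for unmatched sources are assumptions for illustration (the slide does not show how `allow-default` traffic is handled).

```python
import ipaddress

# Source prefix -> rpfVlan, taken from the hardware route entry on the slide.
RPF_TABLE = {ipaddress.ip_network("91.91.91.0/24"): 901}


def urpf_strict(src_ip, ingress_vlan, table=RPF_TABLE):
    """Return 'pass', 'drop', or 'no-entry' for a routed packet's source address."""
    addr = ipaddress.ip_address(src_ip)
    matches = [net for net in table if addr in net]
    if not matches:
        return "no-entry"  # behaviour for unmatched sources is not modelled here
    best = max(matches, key=lambda net: net.prefixlen)  # longest-prefix match
    return "pass" if table[best] == ingress_vlan else "drop"


if __name__ == "__main__":
    print(urpf_strict("91.91.91.100", 901))
    print(urpf_strict("91.91.91.100", 902))
```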

Packet Loss / Path: Output Classification, Policing, Mapping (OC, OP, OM)

• ACL-based output classification: Security, QoS
• ACL and policer CLI are the same (change input -> output)
• Mapping behavior is shown in the ingress mapping CLI
• STP state is not checked in HW
 – entries/floodsets simply don’t include ports in blocked state
• Packets for replication are enqueued into the replication queue

switch# show platform hardware ret rrq
…
ReasonQueue is not empty ReasonHead: 0xDEB ReasonTail: 0xDD7
DataQueue is not empty DataHead: 0xDE2 DataTail: 0xDD7
Prefull drops: 171477
Over Threshold drops: 0

Control, lookup queues are in use

Drops have occurred due to reaching the first drop threshold

Packet Loss / Path: Output Classification / Policing (OC, OP)

• Order of operations

Output Classification -> Output Policing -> Output Marking (Unconditional) -> Output Marking (Conditional) -> Queuing

policy-map egress
 class voice
  set dscp ef
  set cos 5
  priority
  police cir percent 33
 class voice-control
  set dscp af31
  set cos 3
  bandwidth remaining percent 5
 class class-default
  dbl

Callouts on the configuration: Classification (class), Marking (set dscp / set cos), Queuing (priority, bandwidth, dbl), Policing (police)

Note: dbl, shape, bandwidth, queue-limit and priority commands are all queuing commands

MQC for port-channels:
• Policy with queuing actions: only physical ports
• Policy with non-queuing actions: only port channel

Packet Loss / Path: Output Classification / Policing (OC, OP)

• Monitoring egress QoS

switch# show policy-map int g1/36 output
 GigabitEthernet1/36
  Service-policy output: AutoQos-VoIP-Output-Policy
   Class-map: AutoQos-VoIP-Bearer-QosGroup (match-all)
    625530530 packets
    Match: qos-group 46
    QoS Set
     ip dscp ef
     cos 5
    priority queue:
      Transmit: 32344068480 Bytes, Queue Full Drops: 0 Packets
    police:
      cir 33 %
      cir 330000000 bps, bc 10312500 bytes
     conformed Packet count - n/a, 32335870400 bytes; actions:
      transmit
     exceeded Packet count - n/a, 7813435520 bytes; actions:
      drop
     conformed 325185000 bps, exceed 97368000 bps

Class-map stats are shared across interfaces with the same policy map

• Ensure counters increment

• Classification is displayed in packet counts

• Policing is displayed in bytes

• Queue full drops are in packets

Packet Loss / Path: Output Queuing (QM)

• DBL processing (if packet is not scheduled for drop)
• Descriptor enqueued in queue memory

switch# show platform hardware interface gigabitEthernet 1/1 tx-queue
…
Phyport  TxQ  Head    Tail    Pre    Num      BaseAddr  Size  Shape-Ok  Empty
                              Empty  Packets                            TxQ Subport
-------------------------------------------------------------------------------
Gi1/1    0    0x0000  0x0000  True   0        0x20D10   16    True      True
Gi1/1    1    0x0000  0x0000  True   0        0x00000   0     True      True
Gi1/1    2    0x0000  0x0000  True   0        0x00000   0     True      True
Gi1/1    3    0x0000  0x0000  True   0        0x00000   0     True      True
Gi1/1    4    0x0000  0x0000  True   0        0x00000   0     True      True
Gi1/1    5    0x0000  0x0000  True   0        0x00000   0     True      True
Gi1/1    6    0x0000  0x0000  True   0        0x00000   0     True      True
Gi1/1    7    0x0000  0x0000  True   0        0x20D20   3152  True      True

Default queues configured, currently empty

Reminder: SPAN copies are probably sent to different port / queues

Packet Loss / Path: Output Queuing (QM)

policy-map egress_queueing
 class dscp32-48
  police cir 990000 conform-action transmit exceed-action drop
  priority
 class dscp0-15
  bandwidth 250000
  queue-limit 400
 class dscp16-31
  bandwidth 250000
  queue-limit 512
 class class-default

switch# show platform hardware interface g2/48 tx-queue
…
Phyport  TxQ  Head    Tail    Pre    Num      BaseAddr  Size  Shape-Ok  Empty
                              Empty  Packets                            TxQ Subport
-------------------------------------------------------------------------------
Gi2/48   0    0x0000  0x0000  True   0        0x5ECE8   352   True      False
Gi2/48   1    0x0000  0x0000  True   0        0x00000   0     True      False
Gi2/48   2    0x0000  0x0000  True   0        0x00000   0     True      False
Gi2/48   3    0x0000  0x0000  True   0        0x00000   0     True      False
Gi2/48   4    0x0000  0x0000  True   0        0x00000   0     True      False
Gi2/48   5    0x0000  0x0000  True   0        0x5E958   512   True      False
Gi2/48   6    0x0000  0x0000  True   0        0x5EB58   400   True      False
Gi2/48   7    0x008A  0x0088  False  1421     0x5EE48   1520  True      False

TxQ  Class
0    dscp32-48
5    dscp16-31
6    dscp0-15
7    dscp49-63, class-default

• Low priority queues can be starved; a policer is recommended

• Last queue is the default queue; in this example, it is non-empty

• First and last queues appear where expected, the middle ones are reversed
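When many ports are involved, the tx-queue table is easier to scan programmatically. The sketch below parses rows in the format shown on the slide and lists queues that currently hold packets; the column positions (sixth field = packets queued, eighth = queue size) follow this editor's reading of the two-line header, and the function names are illustrative.

```python
SAMPLE = """\
Gi2/48 0 0x0000 0x0000 True 0 0x5ECE8 352 True False
Gi2/48 5 0x0000 0x0000 True 0 0x5E958 512 True False
Gi2/48 6 0x0000 0x0000 True 0 0x5EB58 400 True False
Gi2/48 7 0x008A 0x0088 False 1421 0x5EE48 1520 True False
"""


def parse_tx_queues(text):
    """Parse tx-queue rows into dicts; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10 or not fields[1].isdigit():
            continue
        rows.append({
            "port": fields[0],
            "txq": int(fields[1]),
            "packets": int(fields[5]),
            "size": int(fields[7]),
        })
    return rows


def busy_queues(rows):
    """Return (port, txq, packets, size) for every queue holding packets."""
    return [(r["port"], r["txq"], r["packets"], r["size"])
            for r in rows if r["packets"] > 0]


if __name__ == "__main__":
    for port, txq, packets, size in busy_queues(parse_tx_queues(SAMPLE)):
        print(f"{port} TxQ {txq}: {packets} of {size} entries in use")
```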

Packet Loss / Path: ASIC Drop Categories

Common Drop Event Reason: Typical Description

BridgeToRxPortDrop
 received in a vlan with no other ports; replicated to a floodset/entry where ingress port was a member

DblDrop
 packets dropped by DBL (including DBL on CPU ports)

InpL2AclDrop, InpL3AclDrop, OutL2AclDrop, OutL3AclDrop
 packets denied by ACL

rplErrDrop
 broadcast/multicast packets dropped while being replicated; many normal reasons to increment, including: rpf failure, floodset containing drop port, packets replicated to the CPU but also bridged to a floodset/entry containing the CPU

SptDrop
 spanning-tree drop; packets dropped because a port is not in a forwarding state

SrcHitDrop
 dropped at source learning stage; example: static MAC drop entry

TxQueFullDrop
 a tx port is oversubscribed

• show platform software drop-port shows global ASIC drop events (not per interface)
• these counters are frequently expected to increment
• a baseline and/or a high packet rate is very useful
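Because these counters are global and often non-zero, comparing two snapshots is more telling than a single reading. The sketch below diffs two counter snapshots held as dictionaries; the counter values are made up for illustration and no particular output format of `show platform software drop-port` is assumed.

```python
def incrementing(baseline, current):
    """Return {counter: delta} for every counter that grew since the baseline."""
    return {
        name: value - baseline.get(name, 0)
        for name, value in current.items()
        if value > baseline.get(name, 0)
    }


def drops_per_second(deltas, interval_s):
    """Convert deltas into average rates over the sampling interval."""
    return {name: delta / interval_s for name, delta in deltas.items()}


if __name__ == "__main__":
    # Hypothetical snapshots taken 10 seconds apart.
    before = {"TxQueFullDrop": 100, "SptDrop": 5, "DblDrop": 0}
    after = {"TxQueFullDrop": 350, "SptDrop": 5, "DblDrop": 0}
    deltas = incrementing(before, after)
    print(deltas, drops_per_second(deltas, 10))
```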

Packet Loss / Path: CPU Queues

switch# show plat cpu pack driv
Forerunner Packet Engine 1.83 (0)
Receive Queues: received packets summary
Qu  Capac  Guara  CurPo  Unpro  Accum  Kept  BperP  Packets
2   2512   112    610    0      2      2     73     610
58  512    256    37     12     5      511   216    591103
Receive Queues: dropped packets summary
Qu  Total Packets  Drop No Cell  Drop Overrun  Drop Underrun
58  591103         43623295103   0             0
Transmit Queues
Qu  PosAdd  Pendng  Packets     Bytes
0   595     0       8633668179  663318795241
1   863     0       5315423     363150782

• High “Kept” indicates a high rate of traffic
• Incrementing “Drop No Cell” indicates queue oversubscription

However, combine high “Kept” with:
• CurPo does not increment
• Drop No Cell does increment
… queue 58 is stuck!

• Check for transient flooding / loss versus stuck queue
• Decode queue meaning with show platform software cpu events
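The stuck-queue test above compares two samples of the same receive queue. A minimal sketch of that decision follows; the 90% threshold for "high Kept" and the second-sample values are assumptions chosen for illustration, and the class and function names are not part of any Cisco tool.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    capacity: int      # "Capac" column
    cur_pos: int       # "CurPo" column
    kept: int          # "Kept" column
    drop_no_cell: int  # "Drop No Cell" column


def classify(prev, curr, high_kept_ratio=0.9):
    """Classify a CPU receive queue from two consecutive samples."""
    kept_high = curr.kept >= high_kept_ratio * curr.capacity
    dropping = curr.drop_no_cell > prev.drop_no_cell
    advancing = curr.cur_pos != prev.cur_pos
    if kept_high and dropping and not advancing:
        return "stuck"
    if dropping:
        return "oversubscribed"
    if kept_high:
        return "busy"
    return "ok"


if __name__ == "__main__":
    # First sample is queue 58 from the slide; the second sample is hypothetical.
    first = QueueSample(512, 37, 511, 43623295103)
    second = QueueSample(512, 37, 511, 43623295200)
    print(classify(first, second))
```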

Agenda

Products Overview

Troubleshooting
 – Method
 – System Resources
 – Packet path / loss
 – VSS
 – PoE
 – Netflow

Tools/Tips

Appendix

Troubleshooting VSS

[Diagram: Core layer (Core Switch 1, Core Switch 2); Distribution layer (VSS member 1 and VSS member 2, connected by the VSL); Access layer (Access Switch 1, Access Switch 2, Access Switch 3)]

• Differences

• VSL Health

• Packet Path

Troubleshooting VSS: Tips and Differences

• Available on Sup7-E/4500-X (ipbase or better), Sup7L-E (entservices or better)

• No quad-sup SSO, but you can use in-chassis standby (ICS) uplinks

• Configure VSS before installing ICS

• ICS must remain in rommon

• Split-brain detection uses enhanced PAgP (ePAgP)

• MEC policers are applied independently (e.g. 100 Mbps = 100 @ active, 100 @ standby)

• No QoS groups

• Not currently supported: Smart Install, line cards prior to 46xx, custom VSL QoS

Troubleshooting VSS: VSL Health

switch# show redundancy | i Current
Current Processor Information :
 Current Software state = ACTIVE
 Current Software state = STANDBY HOT

Chassis SSO is established

switch# show switch virtual
Executing the command on VSS member switch role = VSS Active, id = 1
Switch mode : Virtual Switch
Virtual switch domain number : 100
Local switch number : 1
Local switch operational role: Virtual Switch Active
Peer switch number : 2
Peer switch operational role : Virtual Switch Standby

Executing the command on VSS member switch role = VSS Standby, id = 2
Switch mode : Virtual Switch
Virtual switch domain number : 100
Local switch number : 2
Local switch operational role: Virtual Switch Standby
Peer switch number : 1
Peer switch operational role : Virtual Switch Active

VSS is functioning

Troubleshooting VSS: VSL Health

switch# show switch virtual link port-channel | i Po
Group  Port-channel  Protocol  Ports
10     Po10(SU)      -         Te1/3/1(P) Te1/3/2(P)
20     Po20(SU)      -         Te2/3/1(P) Te2/3/2(P)
Group  Port-channel  Protocol  Ports
10     Po10(SU)      -         Te1/3/1(P) Te1/3/2(P)
20     Po20(SU)      -         Te2/3/1(P) Te2/3/2(P)

VSL members bundled

switch# show policy-map int te1/3/2 | i Class|drops
 Class-map: VSL-MGMT-PACKETS (match-any)
  (queue depth/total drops) 0/0
 Class-map: VSL-L2-CONTROL-PACKETS (match-any)
  (queue depth/total drops) 0/0
 Class-map: VSL-L3-CONTROL-PACKETS (match-any)
  (queue depth/total drops) 0/6
 Class-map: VSL-VOICE-VIDEO-TRAFFIC (match-any)
  (queue depth/total drops) 0/0
 Class-map: VSL-SIGNALING-NETWORK-MGMT (match-any)
  (queue depth/total drops) 0/0
 Class-map: VSL-MULTIMEDIA-TRAFFIC (match-any)
  (queue depth/total drops) 0/0
 Class-map: VSL-DATA-PACKETS (match-any)
  (queue depth/total drops) 0/491
 Class-map: class-default (match-any)
  (queue depth/total drops) 0/37

• Watch for non-zero queue depth or incrementing drops on control queues

• Drops on non-control queues? Increase VSL links/speed
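The per-class drop counters on a VSL member can be pulled out of the filtered `show policy-map` output with a small parser. The sketch below uses an excerpt of the output above; treating any class whose name contains "CONTROL" as a control queue is an assumption made for illustration, and the function names are not part of any Cisco tool.

```python
import re

SAMPLE = """\
Class-map: VSL-L2-CONTROL-PACKETS (match-any)
(queue depth/total drops) 0/0
Class-map: VSL-L3-CONTROL-PACKETS (match-any)
(queue depth/total drops) 0/6
Class-map: VSL-DATA-PACKETS (match-any)
(queue depth/total drops) 0/491
Class-map: class-default (match-any)
(queue depth/total drops) 0/37
"""


def parse_vsl_queues(text):
    """Return {class_name: (queue_depth, total_drops)}."""
    result = {}
    current = None
    for line in text.splitlines():
        m = re.search(r"Class-map:\s+(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.search(r"\(queue depth/total drops\)\s+(\d+)/(\d+)", line)
        if m and current is not None:
            result[current] = (int(m.group(1)), int(m.group(2)))
            current = None
    return result


def control_queues_with_issues(queues):
    """Control classes (name contains 'CONTROL') with queue depth or drops."""
    return [name for name, (depth, drops) in queues.items()
            if "CONTROL" in name and (depth > 0 or drops > 0)]


if __name__ == "__main__":
    print(control_queues_with_issues(parse_vsl_queues(SAMPLE)))
```

Running the parser on two captures and comparing the drop totals shows whether the control-queue drops are still incrementing.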

Troubleshooting VSS: Packet Path

switch# show platform hardware floodset vlan 97
…
Executing the command on VSS member switch role = VSS Active, id = 1
Vlan 97:
 Unicast Floodset: FloodToCpu: - RetIndex: 97
  Gi1/5/69(236) Alternate VSL aggport(1528)
…
 Ipv4 Multicast Floodset: FloodToCpu: N RetIndex: 16481
  Gi1/5/69(236) Po10(842)

Executing the command on VSS member switch role = VSS Standby, id = 2
Vlan 97:
 Unicast Floodset: FloodToCpu: - RetIndex: 97
  Alternate VSL aggport(1528) Gi2/1/1(420) Gi2/7/38(777)
…
 Ipv4 Multicast Floodset: FloodToCpu: N RetIndex: 16481
  Gi2/1/1(420) Gi2/7/38(777) Po20(1108)

• VSS virtual data path is visible in platform programming

• Reflected in all packet path programming

If traffic needs to cross chassis, the VSL aggport, VSL Po, or CPU must be used

Agenda

Products Overview

Troubleshooting
 – Method
 – System Resources
 – Packet path / loss
 – VSS
 – PoE
 – Netflow

Tools/Tips

Appendix

Troubleshooting PoE: Power Supply and Linecards

switch# show environment status
<snip>
Supervisor Led Color      : Green
Module 1 Status Led Color : Green
Module 2 Status Led Color : Green    PoE Led Color : Green

PoE is operational on the line card

switch# show power detail
Power                                         Fan      Inline
Supply  Model No          Type       Status   Sensor   Status
------  ----------------  ---------  -------  -------  -------
PS1     PWR-C45-4200ACV   AC 4200W   good     good     good
PS1-1                     110V       good
PS1-2                     110V       good
PS2

If not good, check power supply LEDs

                          Watts Used of System Power(12V)
Mod   Model               budgeted  instantaneous  peak    out of reset  in reset
----  ------------------  --------  -------------  ------  ------------  --------
1     WS-X4648-RJ45V-E    92        --             --      92            10
2     WS-X4548-GB-RJ45V   60        --             --      60            25

Linecards are fully powered

Troubleshooting PoE: Analyze Power Budget

switch# show power detail
Power Summary                      Maximum
 (in Watts)              Used      Available
----------------------   ----      ---------
System Power (12V)       847       1360
Inline Power (-50V)      6         1580
Backplane Power (3.3V)   40        40
----------------------   ----      ---------
Total                    893       (not to exceed Total Maximum Available = 2100)

                          Inline Power Admin   Inline Power Oper
Mod   Model               PS      Device       PS      Device     Efficiency
----  ------------------  ------  -----------  ------  ---------  ----------
1     WS-X4648-RJ45V-E    7       6            9       8          93
2     WS-X4548-GB-RJ45V   0       0            17      15         89
Total                     7       6            26      23

PoE allocated: see the Inline Power Admin / Oper columns

Inline power is available. If not, this log would be seen:

%ILPOWER-5-ILPOWER_POWER_DENY: Interface <interface>: inline power denied

• Switch will allocate the highest power level requested by the phone
• Catalyst 4500 power allocation rules:
 – Power line cards before IP phones
 – Prefer static over auto power

Cisco Power Calculator: http://tools.cisco.com/cpc/launch.jsp
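The power summary can be sanity-checked with a few sums. The sketch below uses the Used and Maximum Available figures from the output above and checks the total against the stated ceiling of 2100 W; the dictionary layout and function names are illustrative.

```python
# Values in watts, copied from the "Power Summary" section of show power detail.
USED = {"System Power (12V)": 847, "Inline Power (-50V)": 6, "Backplane Power (3.3V)": 40}
AVAILABLE = {"System Power (12V)": 1360, "Inline Power (-50V)": 1580, "Backplane Power (3.3V)": 40}
TOTAL_MAXIMUM_AVAILABLE = 2100


def total_used(used):
    """Sum of all budgeted power in watts."""
    return sum(used.values())


def headroom(used, available):
    """Remaining watts per power rail."""
    return {rail: available[rail] - used[rail] for rail in used}


def within_budget(used, total_max):
    """True when the summed usage does not exceed the chassis-wide ceiling."""
    return total_used(used) <= total_max


if __name__ == "__main__":
    print(total_used(USED), "W used of", TOTAL_MAXIMUM_AVAILABLE)
    print(headroom(USED, AVAILABLE))
    print("within budget:", within_budget(USED, TOTAL_MAXIMUM_AVAILABLE))
```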

Page 81: 2013_usa_pdf_BRKCRS-3142_Troubleshooting Cisco Catalyst 4500 Series Switches

© 2013 Cisco and/or its affiliates. All rights reserved. BRKCRS-3142 Cisco Public

Troubleshooting PoE: Linecard Status

switch# show module
Chassis Type : WS-C4510R-E
Power consumed by backplane : 40 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1    48  10/100/1000BaseT POE E Series          WS-X4648-RJ45V-E   JAE1329EAVL
 2    48  10/100/1000BaseT (RJ45)V, Cisco/IEEE   WS-X4548-GB-RJ45V  JAE10244L7P
 4    18  10GE (X2), 1000BaseX (SFP)             WS-X4606-X2-E      JAE12021FMP
 5     6  Sup 6-E 10GE (X2), 1000BaseX (SFP)     WS-X45-SUP6-E      JAE1223KL3G
 6     6  Sup 6-E 10GE (X2), 1000BaseX (SFP)     WS-X45-SUP6-E      JAE12460E61

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 0024.1446.2d93 to 0024.1446.2dc2 1.0                               Ok
 2 0018.1958.cf70 to 0018.1958.cf9f 3.3                               Ok
 4 001d.4573.0ada to 001d.4573.0aeb 1.0                               Ok
 5 0022.90e0.d6c0 to 0022.90e0.d6c5 1.1 12.2(44r)SG  12.2(53)SG1      Ok
 6 0022.90e0.d6c6 to 0022.90e0.d6cb 1.2 12.2(44r)SG  12.2(53)SG1      Ok

If not Ok, try resetting after executing all troubleshooting steps:

hw-module module <module> reset

Other statuses include: Faulty, Authfail, Offline, PwrOver, PwrMax, PwrDeny. See Appendix for details.


Troubleshooting PoE: Devices Drawing Too Much

(config-if)# power inline police

switch#
%INLINEPOWEROVERDRAWN: Inline powered device connected on port Gi2/2 exceeded its policed threshold.
ERR_DISABLE: inline-power error detected on Gi2/2, putting Gi2/2 in err-disable state

switch# show power inline police g2/2
Available:1580(w)  Used:77(w)  Remaining:1503(w)

Interface Admin  Oper       Admin      Oper       Cutoff Oper
          State  State      Police     Police     Power  Power
--------- ------ ---------- ---------- ---------- ------ -----
Gi2/2     auto   errdisable errdisable overdrawn  0.0    0.0

(config-if)# power inline static max 20000

• Policing available from 12.2(50)SG

• For phones that rarely draw more than allowed, configure static power


Troubleshooting PoE: CDP / LLDP Negotiation

(config)# lldp run
(config)# int gi 3/1
(config-if)# lldp tlv-select power-management

Cat 4K Feature                     Release
LLDP 802.1ab                       12.2(44)SG
LLDP 802.3at PoE+ TLV, LLDP-MED    12.2(54)SG

Power Negotiation can occur via CDP, LLDP 802.3at or LLDP-MED

Switch "locks" to first protocol packet (CDP or LLDP) that has the power negotiation TLV

LLDP 802.3at power negotiation TLV overrides the LLDP-MED power negotiation TLV

Recommendation: disable all but the desired power negotiation protocol on both the switch interface and the peer


Troubleshooting PoE: Verify Data, Collect Debugs

Change connections
– Record results of different line card, port, cable, end device

Is this a PoE issue or a PoE and data issue?
– Disconnect phone, and connect non-PoE device

Configure “power inline never” on the port
– Verify the link comes up

Re-enable power

Collect additional debugs

switch# show platform chassis module <id>
switch# debug interface g1/48
Condition 1 set
switch# debug ilpower powerman
(disconnect PD, connect PD, collect debugs)
switch# undebug all
All possible debugging has been turned off
switch# undebug interface g1/48

Powered device (PD)/phone not powering up at all?

‒ Confirm the device is IEEE compliant, check with vendor

‒ Validate with 3rd party PD testers

‒ Device capacitance or impedance as per IEEE?

When PoE is enabled on a port, auto MDIX is disabled, so make sure you use the correct cable type. See the note in the Catalyst 4500 configuration guide.


Troubleshooting PoE: Analyze Power Allocation

Line Card            PoE per Line Card   PoE per Port
WS-X4748-UPOE+E      1440 W              60 W
WS-X4748-RJ45V+E     1440 W              30 W
WS-X4648-RJ45V+E     750 W               30 W
WS-X4548-RJ45V+      1050 W              30 W
WS-X4648-RJ45V-E     750 W               20 W
WS-X4548-GB-RJ45V    750 W               15.4 W
WS-X4524-GB-RJ45V    750 W               15.4 W
WS-X4248-RJ45V       750 W               15.4 W
WS-X4248-RJ21V       750 W               15.4 W
WS-X4224-RJ45V       750 W               15.4 W
WS-X4148-RJ45V       750 W               7 W
WS-X4148-RJ21V       750 W               7 W

Does the PoE line card support enough power per port?

Does the PoE line card support enough power? (slots 3-10 pair limit in 4510)

Catalyst 4500 Line Cards Data Sheet: http://www.cisco.com/en/US/prod/collateral/modules/ps2710/ps5494/product_data_sheet0900aecd802109ea_ps4324_Products_Data_Sheet.html

IP Phone Data Sheets: http://www.cisco.com/en/US/products/hw/phones/ps379/products_data_sheets_list.html
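The two questions on this slide can be turned into a quick lookup. The following Python sketch is only an illustration under stated assumptions: the helper names `supports_device` and `max_ports_at` are hypothetical, the limits are copied from the table above, and the calculation ignores efficiency losses and the switch's real allocation rules.

```python
# (PoE per line card, PoE per port) in watts, copied from the slide's table.
LINE_CARDS = {
    "WS-X4748-UPOE+E": (1440, 60.0),
    "WS-X4748-RJ45V+E": (1440, 30.0),
    "WS-X4648-RJ45V+E": (750, 30.0),
    "WS-X4548-RJ45V+": (1050, 30.0),
    "WS-X4648-RJ45V-E": (750, 20.0),
    "WS-X4548-GB-RJ45V": (750, 15.4),
    "WS-X4524-GB-RJ45V": (750, 15.4),
    "WS-X4248-RJ45V": (750, 15.4),
    "WS-X4248-RJ21V": (750, 15.4),
    "WS-X4224-RJ45V": (750, 15.4),
    "WS-X4148-RJ45V": (750, 7.0),
    "WS-X4148-RJ21V": (750, 7.0),
}


def supports_device(model, device_watts):
    """True if a single port on this card can supply device_watts."""
    _, per_port = LINE_CARDS[model]
    return device_watts <= per_port


def max_ports_at(model, device_watts, ports):
    """Rough count of ports that can draw device_watts within the card limit.

    `ports` is the card's port count, supplied by the caller.
    """
    per_card, _ = LINE_CARDS[model]
    if not supports_device(model, device_watts):
        return 0
    return min(ports, int(per_card // device_watts))


print(supports_device("WS-X4548-GB-RJ45V", 20.0))
print(max_ports_at("WS-X4648-RJ45V-E", 20.0, ports=48))
```

For example, a 20 W device is beyond the 15.4 W per-port limit of the WS-X4548-GB-RJ45V, and a 48-port WS-X4648-RJ45V-E cannot feed 20 W on every port at once because 48 x 20 W exceeds its 750 W card limit.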


Troubleshooting PoE Commands

Troubleshooting Step                                    Command
Check link debounce settings                            show interfaces debounce
Check number of debounce events                         show platform software interfaces mii | inc Debounce
Check Digital Optical Monitoring data                   show interface <> transceiver detail
Verify PoE line card is online                          show module
Verify inline power available and operational           show power detail
Verify the inline power status of the port              show power inline <interface> [detail]
Verify PoE line card supports enough power              Appendix table, line card datasheets
  per port, per slot
Verify phone is not drawing more power than it should   show power inline police <interface>
Verify power negotiation is successful                  debug interface <interface>
                                                        debug ilpower powerman
                                                        undebug all
                                                        undebug interface <interface>
Gather various module specific debugs                   show platform chassis module <id>




Troubleshooting Flexible Netflow Overview

• Flexible NetFlow (FnF) is available on Sup7-E, Sup7L-E and 4500X-32

• Original NetFlow – src/dst IP, src/dst L4 port, protocol, TOS, and input interface

• Flexible NetFlow – user-defined fields (supports L2, IPv4, IPv6)

• Supports both v9 (flexible) and v5 (fixed tuple) export formats

• Uses

• Troubleshooting – profile for suspected patterns and port

• Network security – monitor and record network meta-data, spot new patterns

• Usage monitoring and billing


Troubleshooting FNF Export

Flow stats not received at collector

• UDP export only, check for packet loss along path to collector

• Issue can be with the collector as well

• Confirm NetFlow export version matches the collector

• Note mandatory fields are required for v5 export

(config)# flow exporter flowexporter1
(config-flow-exporter)# destination 10.10.22.22
(config-flow-exporter)# export-protocol netflow-v5
(config-vlan-config)# ip flow monitor flowmonitor1 input
Warning: Exporter flowexporter1 could not be activated because the following fields are mandatory:
  ipv4 source address
  ipv4 destination address
  transport source-port
  transport destination-port
  ipv4 protocol


Troubleshooting FNF Export

• Flow stats may be lost if there are more flows than permitted in the monitor cache

• Constant cache aging on flow monitors can also drive CPU higher

switch# show flow monitor ipv4fm cache
  Cache type:                        Normal
  Cache size:                        4096
  Current entries:                   3891
  High Watermark:                    4096
  Flows added:                       12288
  Flows aged:                        8397
    - Active timeout   ( 1800 secs)  0
    - Inactive timeout (   15 secs)  0
    - Event aged                     0
    - Watermark aged                 599
    - Emergency aged                 7798

(config-if)# no ip flow monitor ipv4fm input
(config-if)# exit
(config)# flow monitor ipv4fm
(config-flow-monitor)# cache entries 64000
(config-flow-monitor)# int gi 1/46
(config-if)# ip flow monitor ipv4fm input

switch# show flow monitor ipv4fm cache
  Cache type:                        Normal
  Cache size:                        64000
  Current entries:                   32768
  High Watermark:                    32768
  Flows added:                       32768
  …
    - Emergency aged                 0

Tune cache size to match flow flux
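When checking many monitors, the counters that matter here can be pulled out of the cache output with a short script. This is a minimal Python sketch, assuming the output format shown on this slide; the helper names `counter` and `diagnose` are hypothetical, and reading a non-zero "Emergency aged" count as a sign of an undersized cache follows the slide's before/after comparison rather than any documented threshold.

```python
import re

# Sample text copied from the first `show flow monitor ipv4fm cache` output.
SAMPLE = """\
Cache type: Normal
Cache size: 4096
Current entries: 3891
High Watermark: 4096
Flows added: 12288
Flows aged: 8397
- Active timeout ( 1800 secs) 0
- Inactive timeout ( 15 secs) 0
- Event aged 0
- Watermark aged 599
- Emergency aged 7798
"""


def counter(text, label):
    """Return the integer that follows `label` in the output, or None."""
    match = re.search(rf"{re.escape(label)}:?\s+(\d+)", text)
    return int(match.group(1)) if match else None


def diagnose(text):
    """List hints that the monitor cache may be too small."""
    findings = []
    if counter(text, "High Watermark") >= counter(text, "Cache size"):
        findings.append("cache has been completely full")
    emergency = counter(text, "Emergency aged")
    if emergency:
        findings.append(f"{emergency} flows were emergency aged")
    return findings


print(diagnose(SAMPLE))
```

Run against the second output on this slide (cache size 64000, high watermark 32768, emergency aged 0), the same check would report nothing.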


Troubleshooting Monitoring FNF Collisions

• Limit cache entries

• If cache limit is already reached and hash table is full, scope of monitoring will need to be adjusted

switch# show platform hardware flow table utilization
…
Buckets w/ X    Bucket Count       Used Entry Count
Used Entries    (% of Buckets)     (% of Entries)
------------    ---------------    ----------------
 0                  0 (  0.0)            0 (  0.0)
 1                  0 (  0.0)            0 (  0.0)
 2                  0 (  0.0)            0 (  0.0)
 3                  0 (  0.0)            0 (  0.0)
…
14                  0 (  0.0)            0 (  0.0)
15                  1 (  0.0)           15 (  0.0)
16               8191 ( 99.9)       131056 ( 99.9)
Total Used       8192 (100.0)       131071 ( 99.9)
Total Free        N/A                    1 (  0.0)

Unaccounted packets:
  User configured flow monitor cache limit reached: 4419746531
  IPv6 entry table full: 0
  Hash Collosions: 176000251

Flow Hash Table
  Buckets                      8K
  Entries per bucket           16
  Total hash table entries     128K
  Approx. total usable space   108K

%C4K_HWFLOWMAN-5-FLOWUNACCOUNTEDPACKETS: Flow stats for 46444030 packets are not accounted due to hardware hash collisions or full hardware flow table

All 16-entry buckets are full = constant collisions
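The figures on this slide are consistent with each other, which a few lines of arithmetic confirm. This Python sketch only re-derives numbers already shown above; the variable names are illustrative.

```python
# Hash table geometry from the slide: 8K buckets, 16 entries per bucket.
BUCKETS = 8 * 1024
ENTRIES_PER_BUCKET = 16
total_entries = BUCKETS * ENTRIES_PER_BUCKET  # 128K entries

# Non-empty rows of the utilization table: {used entries per bucket: bucket count}.
bucket_histogram = {15: 1, 16: 8191}

used_buckets = sum(bucket_histogram.values())
used_entries = sum(used * count for used, count in bucket_histogram.items())
free_entries = total_entries - used_entries
full_bucket_share = bucket_histogram[16] / BUCKETS

print(total_entries, used_buckets, used_entries, free_entries)
print(f"{full_bucket_share:.1%} of buckets are completely full")
```

Every bucket but one already holds 16 entries, so a new flow that hashes into any full bucket cannot be installed, which is why the collision counter keeps climbing.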




Tools: Wireshark

Wireshark Best Practices

Do not display directly to console without a buffer, file or a duration limit

Write to PCAP file on storage, display on switch or using laptop Wireshark GUI

Only the core filter is implemented in hardware as ACLs. Use a restricted filter to avoid high CPU

Available on Sup7-E, Sup7L-E, 4500X

Onboard full packet capture, filter, decode / display

Up to 8 instances supported


Tools: Wireshark

[Figure: Wireshark capture path. Labels: Forwarding Engine, Core Filter, IOS-XE, Capture Filter, Ring Buffer, File, Display Filter, Console]

switch# monitor capture mycap int gi 1/46 in match ipv4 protocol tcp 10.1.1.1/32 any file location bootflash:mycap.pcap limit duration 3
switch# monitor capture mycap start
*Apr 15 17:56:24.291: %BUFCAP-6-ENABLE: Capture Point mycap enabled.
*Apr 15 17:56:27.720: %BUFCAP-6-DISABLE_ASYNC: Capture Point mycap disabled. Reason : Wireshark session ended
switch# show monitor capture file bootflash:mycap.pcap display-filter "ip.ttl == 100"
  1   0.000000   10.1.1.1 -> 91.91.91.100 TCP [TCP ZeroWindow] 0 > 0 [<None>] Seq=1 Win=0 Len=2


Tools: Wireshark

Troubleshooting Step              Command
Create a monitor                  monitor capture mycap <interface | vlan | control-plane>
Add core filter                   monitor capture mycap [access-list <acl> | match <in-line match CLI>]
Display monitor details           show monitor capture
Start/stop a monitor session      monitor capture mycap start | stop
Display a pcap file               show monitor capture file <filename>
Display a pcap file in detail     show monitor capture file <filename> detailed
Display a pcap file with filter   show monitor capture file <filename> display-filter "filter-detail"
Check if wireshark is running     show proc cpu | inc dumpcap


Tools: Embedded Event Manager

Extremely versatile tool for monitoring, automating, working around issues

(a) What do I want to detect? (b) What do I want to do after that?

event manager applet high-cpu
 event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.10.1 get-type exact entry-op ge entry-val "80" poll-interval 10
 action 1.0 syslog msg "HIGH_CPU! CPU is at: $_snmp_oid_val"
 action 2.0 cli command "enable"
 action 2.1 cli command "show process cpu | redirect bootflash:cpu.txt"
 action 2.2 cli command "configure terminal"
 action 2.3 cli command "event manager scheduler suspend"

%HA_EM-6-LOG: TEST: HIGH_CPU! CPU is at: 99

event manager applet interface-flapping
 event syslog pattern ".*UPDOWN.*GigabitEthernet1/1.*" occurs 4
 action 1.0 syslog msg "GigabitEthernet Interface 1/1 changed state 4 times"
 action 2.0 cli command "enable"
 action 2.2 cli command "configure terminal"
 action 2.3 cli command "interface GigabitEthernet1/1"
 action 2.4 cli command "shutdown"

Collect process CPU usage when CPU is high

Bring an interface down when it flaps too frequently
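Before deploying a syslog-triggered applet, it helps to check what the pattern really matches. The Python sketch below approximates the `interface-flapping` trigger with Python's `re` module; it is an assumption that EEM's regular expression engine behaves the same way for this simple pattern, the log lines are made-up samples in the usual `%LINK-3-UPDOWN` style, and `first_trigger` is a hypothetical helper that simply counts matches (it does not model any EEM timing window).

```python
import re

# Pattern and count copied from the interface-flapping applet.
PATTERN = re.compile(r".*UPDOWN.*GigabitEthernet1/1.*")
OCCURS = 4

# Made-up sample messages for illustration only.
LOG = [
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/1, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/1, changed state to up",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/2, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/10, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/1, changed state to down",
]


def first_trigger(messages, pattern=PATTERN, occurs=OCCURS):
    """Return the index of the message that completes `occurs` matches, or None."""
    count = 0
    for index, message in enumerate(messages):
        if pattern.match(message):
            count += 1
            if count == occurs:
                return index
    return None


matched = [m for m in LOG if PATTERN.match(m)]
print(len(matched), first_trigger(LOG))
```

In this approximation the message for GigabitEthernet1/10 also matches, because `GigabitEthernet1/1.*` accepts any trailing characters. If that is not intended, a tighter pattern such as one ending in `GigabitEthernet1/1,` may be worth testing on the switch.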


Embedded Event Manager / Netflow Integration

1. Packets with TTL=1 sent to the switch (TTL=1 streams can cause high CPU)

2. NetFlow Engine collects the flow capturing the TTL value:

switch# sh runn flow record ttl
 match ipv4 ttl
 match ipv4 protocol
 match ipv4 source address
 match ipv4 destination address
 collect counter bytes
 collect counter packets
 collect timestamp sys-uptime first
 collect timestamp sys-uptime last

switch# sh runn flow monitor ttl
Current configuration:
flow monitor ttl
 record ttl
 cache timeout active 40

switch# sh runn int gi 6/1
 no switchport
 ip flow monitor ttl input
 ip address 10.10.10.2 255.255.255.254

3. EEM triggers a syslog when flow is detected:

switch(config)# event manager applet ttl
 event nf monitor-name "ttl" event-type create event1 entry-value "2" field ipv4 ttl entry-op lt
 action 1.0 syslog msg "Flow Monitor $_nf_monitor_name reported Low TTL for $_nf_source_address $_nf_dest_address"

%HA_EM-6-LOG: ttl: Flow Monitor ttl reported Low TTL for 10.10.10.3 10.10.10.4

Check: show flow monitor ttl cache format record for IP TTL: 1


Tips: Crashes

Enhanced crashdump features in 15.0(2)SG2 / 3.2.2SG and higher

exception coredump highly recommended on IOS-XE

Classic IOS full core in 15.1(1)SG2 onwards

On IOS-XE, collect all files in crashinfo: and kinfo:


Tips: Miscellaneous

Enable NTP to troubleshoot across switches

Include date and time for debug and log messages

service timestamps {debug | log} datetime msec localtime show-timezone

Automatically output time and CPU utilization with each command (exec mode)

terminal exec prompt timestamp

When logging the console, add comments and prefix with “!” to avoid error messages

switch#!!! show module after peer reload

switch# show module


Tips: Make Life Easier

Search Bug Toolkit for known issues

Output Interpreter to decode command output

System Message Guide for mitigation recommendations

Smart Call Home in 12.2(52)SG

Catalyst 4000 Troubleshooting TechNotes

Catalyst 4500 Configuration Guide and Release Notes

NetPro discussion groups on http://www.cisco.com


Tips: Platform Control Plane Enhancements

Common Drop Event (First Available): Reason

Control Packet Data Plane QoS (12.2(54)SG)
• Per-interface QoS policies can drop control packets

Control Packet Enhancements (15.0(2)SG / 3.2.0SG)
• Many static ACLs matching control traffic removed
• CPU now included in special control floodsets on a per-VLAN basis
• access-list hardware capture mode now controls only IGMP ACLs

CPU queue rate limits (15.1(1)SG / 3.3.0SG)
• DBL (per-flow rate limits) are applied to some CPU queues
• Improved areas include:
  • port security / dot1x violate mode
  • non-RPF multicast (fast drop)
• Drops appear as DblDrop in show platform software drop-port
• show platform software ip mfib fastdrop deprecated
