Troubleshooting Cisco Catalyst 3850 and 3650 Series Switches
Shashank Singh, Technical Leader, Cisco Services
BRKCRS-3146
© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public
Cisco Spark
Questions? Use Cisco Spark to chat with the speaker after the session
1. Find this session in the Cisco Live Mobile App
2. Click “Join the Discussion”
3. Install Spark or go directly to the space
4. Enter messages/questions in the space
How
Cisco Spark spaces will be available until July 3, 2017.
cs.co/ciscolivebot#BRKCRS-3146
Your Instructor Today…
Shashank Singh, Technical Leader, Cisco Services
Email: [email protected]: @shashankcisco
Shashank is a Technical Leader with the Routing and Switching Technical Leadership team in San Jose, CA, and has extensive experience in troubleshooting the Catalyst line of products, including Catalyst 3850/3650 series switches.
Shashank works as an escalation point for Cisco TAC and partners with engineering teams to solve some of the most complex customer problems pertaining to Cisco switches.
Prior to this role, Shashank worked as a TAC engineer for over five years, troubleshooting switching products and technologies. Shashank has a software development background from his previous role as a software developer at General Electric.
• Product Overview
• Image Management
• Troubleshooting Memory & CPU
• Troubleshooting Stack & High Availability
• Troubleshooting Hardware Forwarding
• Troubleshooting QoS
• Glimpse of future – IOS XE 16.X
• Summary
Agenda
Key switch components
Baselining & Anomaly Detection
Tools and Techniques
Product Overview
In this section, you will learn about ...
• Overview of Catalyst 3850/3650 switch
• IOS-XE architecture
• Supported uplink modules
Catalyst 3850 Switch
Built on Cisco’s Innovative “UADP” ASIC
480 Gbps Stacking Bandwidth
MACsec 128 and 256-bit encryption
MPLS
IEEE 802.3bz 2.5/5Gbps Ethernet
80 Gbps Uplink Bandwidth
Stackpower
Line Rate on All Ports
SGT/SGACL
DNA
POE+ & UPoE
FRU Fans, Power Supplies
Granular QoS/Flexible NetFlow
Catalyst 3650 Switch
MPLS
40 Gbps Uplink Bandwidth
Line Rate on All Ports
FRU Fans
Granular QoS/Flexible NetFlow
Modular 160 Gbps 9 member Stack
SGT/SGACL
POE+ & UPoE
Fixed 1G/10G Uplinks
IEEE 802.3bz 2.5/5Gbps Ethernet
New Front-End Power Supplies
The foundation for full wired and wireless convergence on a single platform.
Campus Fabric
Cable Type   1G   2.5G   5G   10G
Cat5e        ●    ●      ●    NOT SUPPORTED
Cat6         ●    ●      ●    ● (55m)
Cat6a        ●    ●      ●    ● (100m)
Catalyst 3850 Multigigabit Ethernet Switches: Why is it Needed?
Can an mGig port work at 100 Mbps if the end device cannot work at a higher speed?
                             3850 48-port                         3850 24-port
# mGig ports                 12 mGig ports                        24 mGig ports
Advanced port capabilities   UPOE, EEE, MACsec on ALL ports       UPOE, EEE, MACsec on ALL ports
New high-speed uplinks       New 2x40G and 8x10G uplink support   New 2x40G and 8x10G uplink support
• 802.11ac-2 (3.5Gbps), maintain switch to AP reach at higher speeds (future
proof for higher speeds)
• Infrastructure investment protection
• Auto-negotiation of cable type of speeds supported
• Brownfield deployments can leverage existing Cat5e extending ROI and
support mGig at 2.5G and 5G speeds at a distance of 100m
• Greenfield deployments with Cat6a will support 10G but can also now
support mGig at 2.5G and 5G speeds at a distance of 100m
Yes, an mGig port can work at 100 Mbps. Use auto-negotiation instead of hard-coding the speed to 100.
Cisco IOS Evolution: Same Look & Feel, More Powerful Architecture
[Figure: three software stacks shown side by side: IOS, IOS XE 3.6.X/3.7.X, and IOS XE 16.x. Each stack is built on Common Infrastructure / HA, Management Interface, Module Drivers, and a kernel (Linux Kernel for IOS XE). Blocks labeled in the IOS XE stacks: IOSd (Features, Components), IOSd Blob, IOS Sub Systems, Hosted Apps (WCM, Wireshark), LXC*, SMD, Crimson DB.]
Modular IOS is broken into subsystems within IOSd for future bug patching functionality.
LXC = Linux Containers
WCM = Wireless Client Manager
SMD = Session Manager Daemon
Crimson Database stores the operational state of all operating system features in a consistent format.
Recommended IOS XE Release
Suggested Cisco IOS XE Software Releases for Cisco Catalyst 3850 & 3650 Switches
Benefits of running recommended release?
• Evaluated by Cisco for longevity & stability.
• Optimizations, critical fixes & hardening.
Do 3850 Multigigabit Ethernet Switches run IOS XE version 3.6.X?
No. Multigigabit Ethernet variants run IOS XE versions 3.7.X or 16.X.
If accidentally booted on 3.6.X, the switch will remain at the bootloader prompt and will let you boot the correct image from flash or a USB stick.
In this section, you will learn about ...
• 3850/3650 Image naming convention
• Packages in the image
• Install vs. bundle boot
• Password recovery
Image Management
IOS XE Image Naming Convention
cat3k_caa-universalk9.SPA.03.06.06.E.152-2.E6.bin
  cat3k_caa     Converged Access Switch
  universalk9   Universal License
  SPA           S - Digitally Signed, P - Production, A - Key Version
  03.06.06.E    IOS-XE Version
  152-2.E6      IOSd Version
cat3k_caa-universalk9.16.03.03.SPA.bin
  universalk9   Universal License
  16.03.03      Unified Denali version
  SPA           S - Digitally Signed, P - Production, A - Key Version
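The naming convention above can be sketched as a small parser. This is only an illustration built from the two example file names on the slide; the function name, the field names, and the regular expressions are assumptions, and other image names may not fit these patterns.

```python
import re

# Hypothetical patterns derived only from the two example names on the slide.
LEGACY_3X = re.compile(
    r"^(?P<platform>[a-z0-9]+_[a-z0-9]+)-(?P<feature_set>[a-z0-9]+)\."
    r"(?P<signing>[A-Z]{3})\."
    r"(?P<xe_version>\d{2}\.\d{2}\.\d{2}\.[A-Z]+)\."
    r"(?P<iosd_version>[0-9A-Za-z.\-]+)\.bin$"
)
DENALI_16X = re.compile(
    r"^(?P<platform>[a-z0-9]+_[a-z0-9]+)-(?P<feature_set>[a-z0-9]+)\."
    r"(?P<xe_version>\d{2}\.\d{2}\.\d{2})\."
    r"(?P<signing>[A-Z]{3})\.bin$"
)


def parse_image_name(name):
    """Split an image file name into its labeled fields (illustrative only)."""
    for style, pattern in (("3.x", LEGACY_3X), ("16.x", DENALI_16X)):
        match = pattern.match(name)
        if match:
            return {"style": style, **match.groupdict()}
    raise ValueError(f"unrecognized image name: {name}")


if __name__ == "__main__":
    print(parse_image_name("cat3k_caa-universalk9.SPA.03.06.06.E.152-2.E6.bin"))
    print(parse_image_name("cat3k_caa-universalk9.16.03.03.SPA.bin"))
```

Note that the 16.x name carries a single unified version, so the sketch returns no separate IOSd version for it.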
Booting IOS XE Software: What is new?
Install Boot (default mode)
• Packages are installed on flash
• Supports AP image pre-download
• No additional memory requirement
• Image must be installed in flash:
• request platform software package expand
• boot flash:packages.conf
Bundle Boot
• Packages are expanded in RAM
• No AP image pre-download
• Additional memory required
• Image can be booted from flash:, usbflash: or tftp:
• boot flash:cat3k_caa-universalk9.16.03.03.SPA.bin
Known issues with Install mode
• CSCuu10600: "%Signature verification failed" during IOS-XE upgrade, stack/INSTALL
• CSCuw82216: Catalyst3850: Upgrade in install mode corrupts the flash - EXT2-fs error
• These two issues are fixed in our recommended release. Workaround: upgrade via USB flash or switch to Bundle mode.
3850# show version
Switch Ports Model SW Version SW Image Mode
------ ----- ----- ---------- -------------------- ----
1 56 WS-C3850-48P 16.3.3 CAT3K_CAA-UNIVERSALK9 INSTALL
Configuration register is 0x102
3850# show boot
---------------------------
Switch 1
---------------------------
Current Boot Variables:
BOOT variable = flash:packages.conf;
Boot Variables on next reload:
BOOT variable = flash:packages.conf;
Manual Boot = no
Enable Break = no
Install Boot mode
Boot Variable to be used during next reload
From Install mode to bundle mode … End
3850(config)# no boot system switch all
3850(config)# boot system switch all flash:cat3k_caa-universalk9.16.03.03.SPA.bin
3850(config)# do write mem
Building configuration...
Compressed configuration from 5100 bytes to 2737 bytes[OK]
3850# reload
Reload command is being issued on Active unit, this will reload the whole stack
Proceed with reload? [confirm]
<Snip> ..
3850# show version
Cisco IOS Software [Denali], Catalyst L3 Switch Software (CAT3K_CAA-UNIVERSALK9-M), Version 16.3.3, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright
<snip> ..
Switch Ports Model SW Version SW Image Mode
------ ----- ----- ---------- ---------- ----
1 56 WS-C3850-48P 16.3.3 CAT3K_CAA-UNIVERSALK9 BUNDLE
Configuration register is 0x102
Modify the Boot Statement
Bundle Mode
3850/3650 Password Recovery
Password recovery on 3x50 does NOT follow the 3750 family procedure.
Power cycle the switch and hold the Mode button until the status LED turns amber; this brings you to the Boot Loader prompt (Switch:).
Switch: flash_init
Switch: SWITCH_IGNORE_STARTUP_CFG=1
Switch: SWITCH_DISABLE_PASSWORD_RECOVERY=0
Switch: boot
--- System Configuration Dialog ---
Would you like to enter the initial configuration dialog? [yes/no]: no
Press RETURN to get started!
Switch> enable
Switch#
Initialize flash and Boot Variables
Boot the Switch
Skip Initial Config and go to enable (No password required)
3850/3650 Password recovery - End
Switch# configure terminal
Switch(config)# no enable password
Switch(config)# no enable secret
Switch(config)# enable secret <New Password>
Switch(config)# no system ignore startupconfig switch all
Switch(config)# system disable password recovery switch all
Switch(config)# end
Switch# write memory “or” copy running-config startup-config
Remove and Change Password
Re-enable reading startup config and disable password recovery
Save Changes
Troubleshooting Memory & CPU
In this section you will learn about…
• 3850/3650 CPU complex and CPU Punt Path
• Capturing packets punted to CPU
• Troubleshooting high CPU utilization
• Troubleshooting memory Utilization
IOS XE Punt & Inject Infrastructure
Supports
• Dual IP stack - IOSd + LFTS (for other Linux processes)
• Multiple Control Plane sources - IOSd, WCM, SANet (SMD)
• Different Data Plane forwarding systems - QFP, UADP, etc.
• Platform-specific driver hook-up to LSMPI to relay packets from/to the Data Plane for different hardware complexes.
Definitions
Punt: Ingress control packets are intercepted by Data Plane and sent to the Control Plane (CPU) for processing
Inject: Control Plane (CPU) generated protocol packets are sent to the Data Plane to egress out on IO interface(s)
LSMPI: Linux Shared Memory Punt Interface – transport between Data Plane and Control Plane
LFTS: Linux Forwarding Transport Service - Linux application socket level interface opened for application tracking.
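[Figure: Control Plane sources (IOSd, WCM, SANET (SMD)) sit above LSMPI and LFTS, which connect through platform shims to the Data Plane: PI FED shim to the UADP ASIC (Catalyst 3850/3650), PI QFP shim to QFP (ASR1k), and PI ASIC shim to the ASIC (ASR900).]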
3850/3650 CPU Complex
[Figure: CPU complex block diagram. Components shown:]
• Cavium 6230: 800 MHz, 4 core CPU, 2MB L2 Cache
• UADP 1 and UADP 2
• USB/RJ-45 Console
• 10/100/1000 RJ-45 Ethernet Mgmt
• 4GB DDR3 w/ ECC (DDR3 - 1333)
• FPGA for Stack Power
• FPGA for PHY, LED, etc.
• RTC
• ACT II
• 2GB Flash
• 64MB Bootloader
Interconnects labeled in the figure: SGMII, UART, PCIe, I2C, Boot Bus
CPU Utilization
Why should I be concerned about high CPU utilization?
It is very important to protect the control plane for network stability, as resources (CPU, memory, and buffers) are shared by control plane and data plane traffic (sent to the CPU for further processing).
What are the usual symptoms of high CPU usage?
• Control plane instability, e.g., OSPF flap
• Reduced switching / forwarding performance
• Slow response to Telnet / SSH
• SNMP poll miss
At what percentage level should I start troubleshooting?
It depends on the nature and level of the traffic. It is essential to find a baseline CPU usage during normal working conditions and to start troubleshooting when usage goes above a specific threshold.
E.g., baseline CPU usage is 25%. Start troubleshooting when the CPU usage is consistently at 50% or above.
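The baseline rule above can be sketched in a few lines of Python. This is a minimal sketch, not a Cisco tool: the function name is hypothetical, and reading "consistently" as three consecutive polled samples is an assumption that should be tuned to the polling interval in use.

```python
def cpu_needs_troubleshooting(samples, threshold=50.0, min_consecutive=3):
    """Return True when CPU stays at or above `threshold` for
    `min_consecutive` samples in a row.

    samples: CPU utilization percentages, oldest first.
    min_consecutive: assumed meaning of "consistently" (not from the slide).
    """
    run = 0
    for pct in samples:
        run = run + 1 if pct >= threshold else 0
        if run >= min_consecutive:
            return True
    return False


if __name__ == "__main__":
    print(cpu_needs_troubleshooting([25, 30, 55, 52, 60]))  # sustained high usage
    print(cpu_needs_troubleshooting([25, 70, 30, 65, 20]))  # isolated spikes only
```

Counting consecutive samples keeps a single short spike from triggering an investigation.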
Why should packets be sent to CPU?
Common Cause                                            Recommended Solution
Same interface forwarding                               Change design, use “no ip redirects”
ACL logging                                             Disable ACL logging
ACL deny causing switch to send ICMP unreachable        no ip unreachables
Forwarding/Feature exception (out of TCAM/adj space)    Reduce TCAM usage
SW-supported feature                                    Disable the feature or reduce the amount of traffic
IP packets with TTL<2 or options                        Disable the offending traffic
Broadcast Storm                                         Fix STP loop, disable traffic
Unexpected control/data traffic                         Control Plane Policing (CoPP), Deny ACL
Software Bug                                            Open a Service Request
show platform software fed switch active punt cause summary
Statistics for all causes
Cause Cause Info Rcvd Dropped
------------------------------------------------------------------------------
7 ARP request or response 498132 0
21 RP<->QFP keepalive 79 0
show process cpu sort | ex 0.00
CPU utilization for five seconds: 67%/48%; one minute: 17%; five minutes: 4%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
34 1719 964 1783 24.63% 1.98% 0.40% 0 ARP Input
65 73 256 285 0.55% 0.06% 0.01% 0 Net Background
72 4472 523 8550 0.07% 5.31% 1.37% 0 IOSD ipc task
194 98 1913 51 0.07% 0.07% 0.02% 0 IP ARP Retry Age
211 34 462 73 0.07% 0.03% 0.01% 0 UDLD
Troubleshooting High CPU
Punt Cause
CPU utilization for IOSd processes only on IOS XE 16.X. For the entire system, check: show processes cpu platform sorted
Biggest consumer process is ARP.
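When this output is collected repeatedly, a short script can pull out the headline numbers. The sketch below is an illustration only: the function name and regular expressions are assumptions fitted to the sample output on this slide. In IOS-style output the first five-second figure is total utilization and the second is the share spent at interrupt level.

```python
import re

# Two process rows copied from the slide; the rest are omitted for brevity.
SAMPLE = """CPU utilization for five seconds: 67%/48%; one minute: 17%; five minutes: 4%
 PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
 34 1719 964 1783 24.63% 1.98% 0.40% 0 ARP Input
 65 73 256 285 0.55% 0.06% 0.01% 0 Net Background
"""

HEADER = re.compile(r"five seconds: (\d+)%/(\d+)%")
PROCESS = re.compile(
    r"^\s*(\d+)\s+\d+\s+\d+\s+\d+\s+([\d.]+)%\s+[\d.]+%\s+[\d.]+%\s+\d+\s+(.+)$"
)


def summarize_cpu(output):
    """Return total and interrupt CPU plus the busiest process over 5 seconds."""
    header = HEADER.search(output)
    if header is None:
        raise ValueError("CPU utilization header not found")
    processes = []
    for line in output.splitlines():
        row = PROCESS.match(line)
        if row:
            processes.append((row.group(3).strip(), float(row.group(2))))
    top = max(processes, key=lambda item: item[1]) if processes else None
    return {
        "total": int(header.group(1)),
        "interrupt": int(header.group(2)),
        "top": top,
    }


if __name__ == "__main__":
    print(summarize_cpu(SAMPLE))
```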
show process cpu platform history 1min
1 minutes ago, CPU utilization: 64%
2 minutes ago, CPU utilization: 66%
3 minutes ago, CPU utilization: 67%
4 minutes ago, CPU utilization: 64%
5 minutes ago, CPU utilization: 64%
6 minutes ago, CPU utilization: 66%
7 minutes ago, CPU utilization: 66%
8 minutes ago, CPU utilization: 64%
9 minutes ago, CPU utilization: 66%
10 minutes ago, CPU utilization: 72%
11 minutes ago, CPU utilization: 0%
show controllers cpu-interface
queue retrieved dropped invalid hol-block
-------------------------------------------------------------------------
Routing Protocol 3427 0 0 0
L2 Protocol 32117 0 0 0
sw forwarding 0 0 0 0
broadcast 552 0 0 0
icmp 0 0 0 0
icmp redirect 0 0 0 0
Troubleshooting High CPU: Identifying busy CPU queues
Annotations: Display Interval (1min), CPU went high 10 min ago, Queue Name, Queue-wise accounting
Dig Deeper on 16.3.X - High CPU
Commands                                                                         Troubleshooting Step
show platform hardware fed switch active qos queue stats internal cpu policer    Check if CoPP is enabled or not for each queue, and drops
show platform software fed switch active punt cause summary                      Possible causes and packets received/dropped
show platform software fed switch active inject cause summary                    Reasons for injection at FED process
show processes cpu extended                                                      Show extended CPU usage report of last 5 seconds for IOS(d) process
show processes cpu platform sorted                                               Show CPU usage per IOS-XE process
show proc cpu sorted                                                             Show sorted output based on percentage of usage for IOS(d) processes
Equivalent Commands on IOS XE 3.6.X and 3.7.X: Troubleshooting CPU utilization
Troubleshooting Steps                              Commands
Check CPU usage on IOS threads                     show process cpu detailed process iosd [sorted]
Check CPU usage on platform dependent processes    show process cpu detailed process {fed | platform_mgr | stack-mgr | ha_mgr | eicored…}
Check traffic on the RX and TX CPU queues          show platform punt client, show platform punt tx
Check details of CPU queues                        show platform punt statistics port-asic 0 cpuq 0 direction {rx | tx}
Embedded Wireshark: Overview
• Allows packet data to be captured at various points in the packet processing path: flowing through, to, and from the Catalyst 3850/3650 switch.
• Requires IPBase or IPServices license.
• No need to have physical access to the switch or a separate computer (unlike SPAN).
[Figure: C3850 with a capture point (Interface / Control-plane / VLAN) at Gi1/0/1; captured data goes to a buffer or bootflash: and can be exported to a TFTP server.]
• During a Wireshark packet capture, hardware forwarding happens concurrently.
• Capture can be saved and viewed on switch itself, or can be exported as a .pcap file to be viewed
on a computer.
Catalyst 3850 Series Switch High CPU Usage Troubleshoot
Embedded Wireshark
3850#monitor capture my_cap match any control-plane both filter any
3850#monitor capture my_cap start
Started capture point : my_cap
3850#ping 192.168.1.11
Sending 5, 100-byte ICMP Echos to 192.168.1.11, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5)
3850#monitor capture my_cap stop
Stopped capture point : my_cap
Attach wireshark to control-plane
Start capture
Ping switch IP address,
packet goes to CPU
Stop the capture
CPU Packet Capture
3850#show monitor capture my_cap buffer brief
----------------------------------------------------------------------------
# size timestamp source destination dscp protocol
----------------------------------------------------------------------------
0 0 0.000000 10.154.66.69 -> 172.16.94.195 0 BE TCP
1 0 0.005004 172.16.94.195 -> 10.154.66.69 48 CS6 TCP
--snip--
37 0 5.152036 172.16.94.195 -> 10.154.66.69 48 CS6 TCP
40 0 5.153043 192.168.1.11 -> 192.168.1.10 0 BE ICMP
41 0 5.153043 10.154.66.69 -> 172.16.94.195 0 BE TCP
43 0 5.155042 10.154.66.69 -> 172.16.94.195 0 BE TCP
44 0 5.158048 192.168.1.10 -> 192.168.1.11 0 BE ICMP
45 0 5.159040 172.16.94.195 -> 10.154.66.69 48 CS6 TCP
3850#monitor capture my_cap clear
Captured data will be deleted [clear]?[confirm]
cleared buffer : my_cap
3850#no monitor capture my_cap
View capture
Clear buffer
Stop capture
Embedded Wireshark: CPU Packet Capture
3850#monitor capture my_cap export location usbflash0:my_cap.pcap
40 0 5.153043 192.168.1.11 -> 192.168.1.10 0 BE ICMP
41 0 5.153043 10.154.66.69 -> 172.16.94.195 0 BE TCP
43 0 5.155042 10.154.66.69 -> 172.16.94.195 0 BE TCP
44 0 5.158048 192.168.1.10 -> 192.168.1.11 0 BE ICMP
45 0 5.159040 172.16.94.195 -> 10.154.66.69 48 CS6 TCP
Export to USB Disk or TFTP
You can view the exported file in Wireshark
Memory Utilization (RAM)
Why should I be concerned about high memory utilization?
It is very important to have enough free memory to support features and network convergence events that require transient memory.
What are the usual symptoms of high memory usage?
• Memory utilization of process(es) keeps increasing
• System runs out of buffers and software packet forwarding stops
• Memory allocation failures are reported
• System crashes after reporting out of memory
At what percentage level should I start troubleshooting?
It depends on the nature and level of feature configuration on the switch. It is essential to find a baseline memory usage during normal working conditions and to start troubleshooting when usage goes above a specific threshold.
E.g., baseline memory usage is 40%. Start troubleshooting when memory usage goes above 70% and keeps increasing without any new configuration being added.
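The memory rule combines two conditions: usage above a threshold and a steady upward trend. A minimal sketch, assuming samples are collected at a fixed interval while the configuration is unchanged; the function name and the strict "every sample higher than the last" reading of "keeps increasing" are assumptions.

```python
def memory_needs_troubleshooting(samples, threshold=70.0):
    """Return True when every sample is above `threshold` and each sample
    is higher than the one before it.

    samples: memory-used percentages, oldest first, taken with no
    configuration changes in between (assumed).
    """
    if len(samples) < 2:
        return False
    above = all(pct > threshold for pct in samples)
    rising = all(later > earlier for earlier, later in zip(samples, samples[1:]))
    return above and rising


if __name__ == "__main__":
    print(memory_needs_troubleshooting([71, 73, 76, 80]))  # high and climbing
    print(memory_needs_troubleshooting([72, 72, 72]))      # high but stable
```

A high but flat reading may simply reflect a large configuration, which is why the sketch requires both conditions.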
Memory Utilization (RAM): Why is memory utilization high?
Common Cause                                                               Recommended Solution
Extensive Config                                                           Reduce configuration to supported scale
Excessive memory allocated to trace buffers                                Reset trace buffers to default sizes (Set trace control <> buffer default)
DoS Attack/Punted traffic causing buffer depletion                         Identify packets and block them using an ACL
Protocol flaps/re-convergence causing high transient memory utilization    Identify reason for network instability
Memory Leak caused by software bug                                         Open a Service Request
Memory Utilization Alarms
Committed memory > 95% for warnings
*May 30 19:55:33.384 PDT: %PLATFORM-4-ELEMENT_WARNING: Switch 1 R0/0: smand: 4/RP/0: Committed Memory value 96% exceeds warning level 95%
Committed memory > 99% for critical errors
show platform software status control-processor brief
Load Average
Slot Status 1-Min 5-Min 15-Min
1-RP0 Healthy 0.40 0.40 0.35
Memory (kB)
Slot Status Total Used (Pct) Free (Pct) Committed (Pct)
1-RP0 Healthy 3958028 2544852 (64%) 1413176 (36%) 3288040 (83%)
CPU Utilization
Slot CPU User System Nice Idle IRQ SIRQ IOwait
1-RP0 0 2.40 0.50 0.00 97.10 0.00 0.00 0.00
1 1.80 0.30 0.00 97.90 0.00 0.00 0.00
2 1.10 0.40 0.00 98.50 0.00 0.00 0.00
3 7.50 0.10 0.00 92.39 0.00 0.00 0.00
4 0.90 0.20 0.00 98.90 0.00 0.00 0.00
5 3.00 0.90 0.00 96.10 0.00 0.00 0.00
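The alarm thresholds can be applied to the Memory (kB) row of this output with a short script. This is a sketch under assumptions: the regular expression is fitted to the single sample row shown here, and the function name and return shape are hypothetical.

```python
import re

# Matches one "Memory (kB)" data row as shown in the sample output above.
MEMORY_ROW = re.compile(
    r"^\s*(?P<slot>\S+)\s+(?P<status>\S+)\s+(?P<total>\d+)\s+"
    r"(?P<used>\d+)\s+\((?P<used_pct>\d+)%\)\s+"
    r"(?P<free>\d+)\s+\((?P<free_pct>\d+)%\)\s+"
    r"(?P<committed>\d+)\s+\((?P<committed_pct>\d+)%\)"
)


def committed_memory_level(row, warning=95, critical=99):
    """Classify a memory row as 'ok', 'warning' (>95%) or 'critical' (>99%)."""
    match = MEMORY_ROW.match(row)
    if match is None:
        raise ValueError("not a Memory (kB) row")
    pct = int(match.group("committed_pct"))
    if pct > critical:
        level = "critical"
    elif pct > warning:
        level = "warning"
    else:
        level = "ok"
    return match.group("slot"), pct, level


if __name__ == "__main__":
    row = "1-RP0 Healthy 3958028 2544852 (64%) 1413176 (36%) 3288040 (83%)"
    print(committed_memory_level(row))
```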
show proc memory sorted
Processor Pool Total: 886295488 Used: 345992688 Free: 540302800
lsmpi_io Pool Total: 6295128 Used: 6294296 Free: 832
PID TTY Allocated Freed Holding Getbufs Retbufs Process
289 0 362372784 67944416 268093424 3748898 67 HTTP CORE
73 0 32768232 1322824 31831480 0 637980 IOSD ipc task
164 0 8710232 415696 5673776 0 0 SNMP MA SA
413 0 3929984 5680 3981304 849828 0 EEM ED Syslog
0 0 0 0 3545664 0 0 *MallocLite*
1 0 1686544 6944 1724600 0 0 Chunk Manager
425 0 1521744 33024 1533720 0 0 EEM Server
0 0 5948784 4810448 726936 17522495 0 *Dead*
4 0 2691272 600360 637760 0 0 RF Slave Main Th
414 0 390160 5680 441480 72316 0 EEM ED Generic
29 0 422296 896 404464 0 0 IPC Seat RX Cont
388 0 310248 1616 377632 0 0 Crypto IKEv2
314 0 304960 1640 373320 0 0 DHCP Client
205 0 631128 340616 265928 0 0 mDNS
Troubleshooting Memory Utilization: Which process is holding most memory?
Total Memory
Process: HTTP CORE in this case
show memory allocating-process totals
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor FF9B4FD010 886295488 945983712 40311776 536270200 537368132
lsmpi_io FF9ACBE1A8 6295128 6294304 824 824 412
Allocator PC Summary for: Processor
PC Total Count Name
0xAAB3404678 193360840 93301 HTTP CORE
0xAAAE38D944 27808280 530 *Init*
0xAAAE403E0C 27047776 981 DynCmd object c
0xAAB0573440 21658208 7713 *Packet Header*
0xAAB35CDDB0 20989080 454 XOS_MEM_UTILS
0xAAB0573498 19068568 7484 *Packet Data*
Troubleshooting Memory Utilization - End
Drill down deeper: is a process not releasing memory?
Is count increasing continuously?
Memory leak due to HTTP CORE process?
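Answering "is count increasing continuously?" means comparing several snapshots of the allocator summary taken over time. A minimal sketch: the function name is hypothetical, each snapshot is assumed to be a mapping of allocator name to Count, and the later snapshot values in the usage example are made up for illustration (only the first snapshot comes from the output above).

```python
def growing_allocators(snapshots):
    """Return allocator names whose Count strictly increases in every snapshot.

    snapshots: list of {allocator_name: count} dicts, oldest first.
    """
    if len(snapshots) < 2:
        return []
    # Only consider allocators present in every snapshot.
    names = set(snapshots[0])
    for snap in snapshots[1:]:
        names &= set(snap)
    suspects = []
    for name in sorted(names):
        counts = [snap[name] for snap in snapshots]
        if all(later > earlier for earlier, later in zip(counts, counts[1:])):
            suspects.append(name)
    return suspects


if __name__ == "__main__":
    first = {"HTTP CORE": 93301, "*Init*": 530, "*Packet Header*": 7713}
    # Hypothetical later readings, invented for this example.
    second = {"HTTP CORE": 95100, "*Init*": 530, "*Packet Header*": 7700}
    third = {"HTTP CORE": 97420, "*Init*": 530, "*Packet Header*": 7713}
    print(growing_allocators([first, second, third]))
```

A steadily growing count is only a hint of a leak; the slide's recommendation for a suspected software bug is to open a Service Request.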
top - 20:22:02 up 2:13, 0 users, load average: 0.44, 0.29, 0.31
Tasks: 274 total, 2 running, 272 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.8%us, 0.4%sy, 0.0%ni, 96.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3958028k total, 2934896k used, 1023132k free, 151096k buffers
Swap: 0k total, 0k used, 0k free, 1006324k cached
*** Delay time Not changed ***
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17682 root 20 0 2606m 394m 175m S 12 10.2 16:28.87 fed main event
31884 root 20 0 245m 52m 43m S 2 1.4 3:05.98 repm
30028 root 20 0 1755m 639m 250m S 2 16.5 3:42.94 linux_iosd-imag
15231 root 20 0 1440m 175m 155m S 2 4.5 2:48.06 sif_mgr
31884 root 20 0 245m 52m 43m S 2 1.4 3:05.86 repm
1 root 20 0 8324 4352 2180 S 0 0.1 0:02.64 systemd
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3850#terminal terminal-type xterm
3850#monitor platform software process RP active
CPU & Memory Utilization: Real-time monitoring
Top processes inside IOS XE
Percent of CPU utilized
Percent of memory utilized
IOS-XE Processes
Equivalent Commands on IOS XE 3.6.X and 3.7.X: Troubleshooting memory utilization
Troubleshooting Steps                          Commands
Check memory usage on system                   show processes memory sorted
Check memory usage of a particular process     show processes memory detailed process fed
Check memory usage of IOSd                     show processes memory detailed process iosd
Check allocators of memory within IOSd         show memory detailed process iosd allocating-process totals
Troubleshooting Stack & High Availability
In this section you will learn about…
• 3850/3650 Stacking Architecture
- Stacking Show commands
- Troubleshooting failure to form a stack
• 3850/3650 HA Architecture
- Election of Active and Standby
- Show commands for checking HA states
3850 StackWise-480 Overview
• 3850 StackWise-480 is a new generation of Catalyst 3850 stacking
• 240Gbps of bandwidth (120Gbps TX & 120Gbps RX per connector)
• Similar to previous stacking implementations, ring redundancy is achieved via ring-wrap capabilities provided in hardware
• NOT backward compatible with currently fielded stacking technologies, most notably StackWise Plus.
Which Stacking Technology? StackWise-480
Stack: Cables and Components
Catalyst 3850 | StackWise-480: 3 lengths of cable (0.5, 1, and 3 meters)
Catalyst 3650 | StackWise-160: 1 ring in 3650 vs 3 rings in 3850
Can I connect Stackwise-480 cables on 3850-XS?
No, 3850-48XS does not support Stackwise-480. It’s
a high end model with 640G switching capacity &
supports Stackwise Virtual using front panel ports.
• 6 rings in total
• 3 rings go East
• 3 rings go West
• Each ring is 40G
• Total Stack BW = 240G
• With Spatial Reuse = 480G
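The bandwidth figures in the list above follow from simple arithmetic, sketched here. The constant and function names are illustrative; the numbers are the ones stated on the slide.

```python
RINGS_PER_DIRECTION = 3   # 3 rings go East, 3 rings go West
DIRECTIONS = 2
RING_GBPS = 40            # each ring is 40G


def stack_bandwidth_gbps(spatial_reuse=False):
    """Total StackWise-480 bandwidth in Gbps, per the slide's figures."""
    raw = RINGS_PER_DIRECTION * DIRECTIONS * RING_GBPS  # 6 rings x 40G
    return raw * 2 if spatial_reuse else raw            # spatial reuse doubles it


if __name__ == "__main__":
    print(stack_bandwidth_gbps())                    # raw stack bandwidth
    print(stack_bandwidth_gbps(spatial_reuse=True))  # with spatial reuse
```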
How many stack rings are in my stack?
[Figure: Stack Interface of UADP ASIC, assuming 4 x 24-port 3850 switches]
Packets are segmented/reassembled in HW (256 byte segments)
Understanding Spatial Reuse: Doubling the capacity of my stack
Destination Stripping: Packet travels ½ the rings. Taken out of stack by destination.
[Figure: assuming 4 x 24-port 3850 switches, numbered 1 to 4, connected in a ring]
What is the status of my stack?
show switch detail
Switch/Stack Mac Address : 6400.f124.df80 - Local Mac Address
Mac persistency wait time: Indefinite
H/W Current
Switch# Role Mac Address Priority Version State
------------------------------------------------------------
*1 Active 6400.f124.df80 10 0 Ready
2 Standby 6400.f124.de80 1 0 Ready
Priority, followed by MAC Address determines
which switch gets elected as Active.
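The two tie-breakers named above can be sketched as a sort key. This is a simplification, not the full election logic: it assumes the highest priority wins and the lowest MAC address breaks ties, and it ignores any other factors the platform may consider (such as an already-running Active). The function name is hypothetical.

```python
def elect_active(switches):
    """Pick the Active switch number from (number, priority, mac) tuples.

    Assumed rule: highest priority first, then lowest MAC address.
    MAC addresses use the dotted format shown in `show switch detail`.
    """
    def sort_key(switch):
        _number, priority, mac = switch
        return (-priority, int(mac.replace(".", ""), 16))

    return min(switches, key=sort_key)[0]


if __name__ == "__main__":
    stack = [
        (1, 10, "6400.f124.df80"),
        (2, 1, "6400.f124.de80"),
    ]
    print(elect_active(stack))  # switch 1 wins on priority despite the higher MAC
```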
show switch stack-ports summary
Sw#/Port# Port Status Neighbor Cable Length Link OK Link Active Sync OK #Changes to LinkOK In Loopback
------------------------------------------------------------------------------------------------------------
1/1 OK 2 50cm Yes Yes Yes 0 No
1/2 OK 2 Unknown Yes Yes Yes 0 No
2/1 OK 1 100cm Yes Yes Yes 1 No
2/2 OK 1 50cm Yes Yes Yes 1 No
show platform hardware authentication status
Mainboard Authentication: Passed
FRU Authentication: Passed
Stack Cable A Authentication: Failed << Corrupt EEPROM?
Stack Cable B Authentication: Passed
show platform software sif switch 1 r0 exceptions
SIF INT : SIFEXCEPTIONINTERRUPTA1_SIFRAC5PMARECEIVEFIFOSPILL3_FIELD_IDX
Occurred count: 1
What happens when image versions mismatch?
• If switches are in version mismatch state, they will not stack.
• If versions do not match, upgrade the standby/member switch to the Active’s version.
show switch
Switch# Role Mac Address Priority Version State
---------------------------------------------------------------------------
*1 Active 6400.f125.1480 1 V01 Ready
2 Standby 6400.f125.2680 1 V01 Ready
3 Member 6400.f125.2500 1 0 V-Mismatch
4 Member 6400.f125.2480 1 0 V-Mismatch
3850(config)# software auto-upgrade enable
Any newly added member automatically upgraded.
Reload only new switch
What happens when there is License Mismatch?
Member switch will not stack.
[Figure: stack with Active (A) and Standby (S); three switches run IP Base and one runs IP Services]
license right-to-use deactivate ipservices
license right-to-use activate ipbase acceptEULA
Reload switch
What happens when there is License Mismatch?
show license right-to-use slot 1
Slot# License name Type Count Period left
----------------------------------------------------------
1 ipbase permanent N/A Lifetime
1 lanbase permanent N/A Lifetime
1 apcount adder 4 Lifetime
show license right-to-use mismatch
Slot# License Name Adder AP Count Base AP Count
---------------------------------------------------------------
3 ipservices 0 0
Lanbase license is permanent on Master
Find out member with mismatch
HA Redundancy on 3850/3650
[Figure: Active (A) and Standby (S) switches, each with Interfaces, L2 Control, L3 Control, QoS, and Wireless feature blocks]
Feature State is synced between Active and Standby Member in stack.
Feature States are inactive on Standby Member.
HA: Roles and Definitions
Route Processor Domain – a set of SW processes (e.g. IOSd, WCM) that implement the centralized Active and Standby portions of the stack control plane
Line Card Domain – a set of SW processes (e.g. FED, Platform Manager) that implement the distributed Line Card portions of the stack control plane
Infra Domain – Support SW for the RP and LC Domains
Active Switch – supports the Active RP Domain, a LC Domain and Infra Domain
Standby Switch – supports the Standby RP Domain, a LC Domain and Infra Domain
Member Switch – supports a LC Domain and Infra Domain
Election – assigning roles or functions within the stack
Catalyst 3850/3650: HA State Machine
[Figure: stack members with their RP, LC, and Infra domains; Active (A) and Standby (S) switches; 2min timer]
• Active starts Route Processor (RP) Domain (IOSd, WCM, etc.) locally
• Programs hardware on all Line Card (LC) Domains
• Traffic resumes once hardware is programmed
• Starts 2min Timer to elect Standby in parallel
• Active elects Standby
• Standby starts RP Domain locally
• Starts Bulk Sync with Active RP
• Standby reaches “Standby Hot”
show switch
Switch/Stack Mac Address : 2037.06cf.0e80 - Local Mac Address
H/W Current
Switch# Role Mac Address Priority Version State
------------------------------------------------------------
*1 Active 2037.06cf.0e80 10 PP Ready
2 Standby 2037.06cf.3380 8 PP Ready
3 Member 2037.06cf.1400 6 PP Ready
4 Member 2037.06cf.3000 4 PP Ready
Stateful Switchover Redundancy (SSO)
MAC address doesn’t change for stack duration
Standby
Active
show redundancy states
my state = 13 –ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Unit ID = 2
Redundancy Mode (Operational) = SSO
Redundancy Mode (Configured) = SSO
Redundancy State = SSO
Communications = Up
client count = 76
client_notification_TMR = 360000 milliseconds
keep_alive TMR = 9000 milliseconds
“STANDBY HOT” is the terminal peer state for SSO. If “peer state” is stuck in any other state for more than 10 minutes, open a service request with TAC.
If Communications is not Up, there might be a problem with stack connectivity. Check the stack cable.
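The two checks above can be scripted against captured output. The following is a minimal Python sketch, assuming the `show redundancy states` text has already been collected as a string; the function names and the sample are illustrative and not part of any Cisco tool.

```python
def parse_redundancy_states(text):
    """Collect the 'key = value' pairs from 'show redundancy states' output."""
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            fields[key.strip().lower()] = value.strip()
    return fields


def check_sso_health(text):
    """Return a list of problems; an empty list means both checks passed."""
    fields = parse_redundancy_states(text)
    problems = []
    peer = fields.get("peer state", "")
    if "STANDBY HOT" not in peer.upper():
        problems.append("peer state is %r, expected STANDBY HOT" % peer)
    if fields.get("communications", "").lower() != "up":
        problems.append("communications is not Up; check stack cabling")
    return problems


# Sample taken from the slide output (abbreviated).
SAMPLE = """\
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Redundancy Mode (Operational) = SSO
Redundancy State = SSO
Communications = Up
"""

print(check_sso_health(SAMPLE) or "SSO looks healthy")
```

A check like this only reports the state at one instant; the 10-minute guidance above still requires re-running it over time.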
• Product Overview
• Image Management
• Troubleshooting Memory & CPU
• Troubleshooting Stack & High Availability
• Troubleshooting Hardware Forwarding
• Troubleshooting QoS
• Glimpse of future – IOS XE 16.X
• Summary
Agenda
Troubleshooting Hardware Forwarding
In this section, you will learn about ...
• TCAM (Ternary Content Addressable Memory)
• Unicast Forwarding – Layer 2
• Unicast Forwarding – Layer 3
• Multicast Forwarding
Ternary Content Addressable Memory
• Features that need packet forwarding at line rate program entries in TCAM
• TCAM is partitioned in several banks and regions
• Features use a Hash Table Manager (HTM) to select and configure region
• Entries wrongly programmed in TCAM will lead to wrong or unexpected
forwarding decisions
TCAM on Catalyst 3850/3650
What features are using the TCAM?
Establish a Baseline
show platform hardware fed switch 1 fwd-asic resource tcam utilization
CAM Utilization for ASIC# 0
Table                                        Max Values    Used Values
--------------------------------------------------------------------------
Unicast MAC addresses                        32768/512     82/22
Directly or indirectly connected routes      32768/8192    7/89
IGMP and Multicast groups                    8192/512      0/16
Security Access Control Entries              3072          173
QoS Access Control Entries                   2816          52
Netflow ACEs                                 1024          15
Input Microflow policer ACEs                 256           7
Output Microflow policer ACEs                256           7
Control Plane Entries                        512           187
Policy Based Routing ACEs                    1024          9
<Snip>
Reading the output: the Table column lists the features, Max Values is maximum # entries / maximum # masks, and Used Values is the current usage. The output shown is for Asic 0 (24 ports per Asic).
On IOS XE versions 3.6.X and 3.7.X, check: show platform tcam utilization asic all
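To turn the baseline into something comparable over time, the utilization table can be reduced to a percentage per feature. This is a minimal Python sketch, assuming the output has been saved as text in the layout shown on the slide; the regular expression, function names and the 80% default threshold are illustrative choices, not Cisco recommendations.

```python
import re

# feature name, max entries (optionally /masks), used entries (optionally /masks)
ROW = re.compile(r"^\s*(.+?)\s+(\d+)(?:/(\d+))?\s+(\d+)(?:/(\d+))?\s*$")


def tcam_usage(text):
    """Map each feature name to the percentage of its TCAM entries in use."""
    usage = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # headers, separators and <Snip> lines
        name, max_entries, _, used_entries, _ = match.groups()
        if int(max_entries) > 0:
            usage[name] = 100.0 * int(used_entries) / int(max_entries)
    return usage


def over_threshold(usage, percent=80.0):
    """Return the features whose entry usage exceeds the given percentage."""
    return sorted(name for name, pct in usage.items() if pct > percent)


# Rows copied from the slide output (abbreviated).
SAMPLE = """\
CAM Utilization for ASIC# 0
Table                                        Max Values    Used Values
--------------------------------------------------------------------------
Unicast MAC addresses                        32768/512     82/22
Directly or indirectly connected routes      32768/8192    7/89
Security Access Control Entries              3072          173
Control Plane Entries                        512           187
"""

for feature, pct in tcam_usage(SAMPLE).items():
    print("%-42s %6.2f%%" % (feature, pct))
```

Only the entry counts are compared here; the mask counts after the slash are parsed but ignored.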
Mac Address Learning
How does learning happen?
• Catalyst 3850/3650 support up to 32000 MAC addresses in TCAM
• Hardware assisted software learning
• Port ASIC learns MAC Address and puts it into a Learning Cache (Mac Address Table Manager, MATM)
• Forwarding Engine Driver (FED) reads MATM Table and programs entry in TCAM
Unicast Forwarding – Layer 2
show mac address-table address 501c.bf66.0b48
Mac Address Table
-------------------------------------------
Vlan Mac Address Type Ports
---- ----------- -------- -----
1 501c.bf66.0b48 DYNAMIC Gi1/0/1
Total Mac Addresses for this criterion: 1
show platform software matm switch ?
<1-9> Switch number
active Active instance
standby Standby instance
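Verifying Mac Address
• show mac address-table: software MAC address table
• show platform software matm switch ?: look at the MAC Address Table Manager on which stack switch?
[Topology: host 10.10.10.2 (MAC 501c.bf66.0b48, Vlan 1) on Gi1/0/1 of a 3850 (10.10.10.1, Vlan 1); the 3850 is acting as layer 2 switch for vlan 1]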
Unicast Forwarding – Layer 2
Verifying Mac Address
Any errors with programming MAC address?
show platform software object-manager switch active f0 statistics
Forwarding Manager Asynchronous Object Manager Statistics
Object update: Pending-issue: 0, Pending-acknowledgement: 0
Batch begin: Pending-issue: 0, Pending-acknowledgement: 0
Batch end: Pending-issue: 0, Pending-acknowledgement: 0
Command: Pending-acknowledgement: 0
MAC address check in FED – software
show platform software fed switch active matm macTable vlan 1
VLAN MAC Type Seq# macHandle siHandle diHandle *a_time *e_time ports
501c.bf66.0b47 0X8002 0 0xffcc735968 0xffcc726978 0x97 0 0 Vlan1
501c.bf66.0b48 0X101 3 0xffcc7022f8 0xffcc702168 0xf096 0 0 Gi1/0/1
Total Mac number of addresses:: 2
*a_time=aging_time(secs) *e_time=total_elapsed_time(secs)
MAC address check in FED – hardware
show platform hardware fed switch active matm macTable vlan 1
HEAD: MAC address 501c.bf66.0b48 in VLAN 1
KEY: vlan 3, mac 0x501cbf660b48, l3_if 0, gpn 150, epoch 15, static 0, flood_en 1, vlan_lead_wless_flood_en 3, client_home_asic 0
MASK: vlan 0, mac 0x0, l3_if 0, gpn 0, epoch 0, static 0, flood_en 0, vlan_lead_wless_flood_en 0,
Unicast Forwarding – Layer 2 - End
The Meaning of Type & Sequence Number
• A MAC Address is aged out only on the switch where it is first learned
• Other switches learn through Notifications
show platform software fed switch standby matm macTable vlan 1
VLAN MAC Type Seq# macHandle siHandle diHandle *a_time *e_time ports
501c.bf66.0b47 0X8002 0 0xffc0703b28 0xffc0736908 0x97 0 0 Vlan1
501c.bf66.0b48 0X1 3 0xffc073d498 0xffc073d308 0xf096 300 46 Gi1/0/1
Type 0x1 means 501c.bf66.0b48 is learnt on standby switch through notification and cannot be aged out on this switch.
show platform software fed switch active matm macTable vlan 1
VLAN MAC Type Seq# macHandle siHandle diHandle *a_time *e_time ports
501c.bf66.0b47 0X8002 0 0xffcc735968 0xffcc726978 0x97 0 0 Vlan1
501c.bf66.0b48 0X101 3 0xffcc7022f8 0xffcc702168 0xf096 0 0 Gi1/0/1
Type 0x101 means 501c.bf66.0b48 is a dynamic entry on active switch that will age on this switch.
If the sequence number keeps changing frequently, it indicates MAC re-learning.
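The sequence-number hint lends itself to a simple before/after comparison. Below is a minimal Python sketch, assuming two captures of the `matm macTable` output taken some time apart; the parsing regex and function names are illustrative, and the second capture is an invented example with a changed Seq# rather than real switch output.

```python
import re

# MAC address, Type (hex) and Seq# as they appear in each macTable row
ENTRY = re.compile(
    r"([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+(0x[0-9a-f]+)\s+(\d+)\b",
    re.IGNORECASE,
)


def parse_mac_table(text):
    """Map each MAC address to its (type, sequence number)."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:
            mac, mac_type, seq = match.groups()
            table[mac.lower()] = (mac_type.lower(), int(seq))
    return table


def relearned(before_text, after_text):
    """Return MACs present in both captures whose Seq# changed."""
    before = parse_mac_table(before_text)
    after = parse_mac_table(after_text)
    return sorted(
        mac for mac in before
        if mac in after and before[mac][1] != after[mac][1]
    )


BEFORE = """\
VLAN MAC Type Seq# macHandle siHandle diHandle *a_time *e_time ports
501c.bf66.0b47 0X8002 0 0xffcc735968 0xffcc726978 0x97 0 0 Vlan1
501c.bf66.0b48 0X101 3 0xffcc7022f8 0xffcc702168 0xf096 0 0 Gi1/0/1
"""

# Hypothetical later capture: same entries, but Seq# for .0b48 has moved.
AFTER = BEFORE.replace("0X101 3 ", "0X101 7 ")

print(relearned(BEFORE, AFTER))
```

A single changed sequence number is not a fault by itself; the slide's concern is a number that keeps changing frequently, so repeated captures are needed.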
3850#show ip cef 50.50.50.50 detail
50.50.50.50/32, epoch 3, flags [attached]
Adj source: IP adj out of Vlan500, addr 50.50.50.50 FFAFC4ABC0
Dependent covered prefix type adjfib, cover 50.50.50.0/24
attached to Vlan500
3850#show adjacency 50.50.50.50 detail
Protocol Interface Address
IP Vlan500 50.50.50.50(8)
0 packets, 0 bytes
epoch 0
sourced in sev-epoch 0
Encap length 14
80E01D24AC50E4AA5D9933D00800
L2 destination address byte offset 0
3850#show interface vlan 500 | in bia
Hardware is Ethernet SVI, address is e4aa.5d99.33d0 (bia e4aa.5d99.33d0)
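Unicast Forwarding – Layer 3
3850 acting as router
[Topology: 3850 with ports Gig2/0/47 and Gig2/0/48; addresses 172.16.1.1, 172.16.1.2, 50.50.50.50 and 50.50.50.51]
Rewrite Info: the 14-byte encap string in the adjacency (80E01D24AC50E4AA5D9933D00800) is the Ethernet header written onto forwarded packets: destination MAC, source MAC, EtherType.
Cross check source MAC: bytes 7 to 12 of the encap string (e4aa.5d99.33d0) should match the address of interface Vlan500.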
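The rewrite string from `show adjacency ... detail` is a plain Ethernet II header, so it can be split into its fields mechanically. A minimal Python sketch follows; the function name is illustrative, and the SVI address is the one shown in the `show interface vlan 500` output.

```python
def decode_encap(hex_string):
    """Split a 14-byte Ethernet rewrite string into dst MAC, src MAC, EtherType."""
    raw = bytes.fromhex(hex_string)
    if len(raw) != 14:
        raise ValueError("expected a 14-byte Ethernet header")

    def dotted(mac_bytes):
        # Format six bytes in Cisco dotted notation, e.g. e4aa.5d99.33d0
        digits = mac_bytes.hex()
        return ".".join(digits[i:i + 4] for i in (0, 4, 8))

    return {
        "dst_mac": dotted(raw[0:6]),
        "src_mac": dotted(raw[6:12]),
        "ethertype": "0x%04x" % int.from_bytes(raw[12:14], "big"),
    }


header = decode_encap("80E01D24AC50E4AA5D9933D00800")
svi_mac = "e4aa.5d99.33d0"  # from 'show interface vlan 500 | in bia'

print(header)
print("source MAC matches SVI:", header["src_mac"] == svi_mac)
```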
3850#show platform software ip switch active R0 cef prefix 50.50.50.50/32 detail
Forwarding Table
50.50.50.50/32 -> OBJ_ADJACENCY (10), urpf: 11
Connected Interface: 115
Prefix Flags: Directly L2 attached
OM handle: 0x805ce088
3850#show platform software adjacency switch active R0 index 10
Adjacency id: 0xa (10)
Interface: Vlan500, IF index: 115, Link Type: MCP_LINK_IP
Encap: 80:e0:1d:24:ac:50:e4:aa:5d:99:33:d0:8:0
Encap Length: 14, Encap Type: MCP_ET_ARPA, MTU: 1500
Flags: no-l3-inject
Incomplete behavior type: None
Fixup: unknown
Fixup_Flags_2: unknown
Nexthop addr: 50.50.50.50
IP FRR MCP_ADJ_IPFRR_NONE 0
OM handle: 0x805cda30
Unicast Forwarding – Layer 3 - End
3850 acting as router
Adjacency index 10 comes from OBJ_ADJACENCY (10) in the previous output.
Rewrite Info: the Encap field (80:e0:1d:24:ac:50:e4:aa:5d:99:33:d0:8:0).
Cross check next hop: Nexthop addr should be 50.50.50.50.
3850-1#show ip mroute 239.1.1.1 10.33.33.33
(10.33.33.33, 239.1.1.1), 1d04h/00:01:47, flags: JT
Incoming interface: Vlan33, RPF nbr 0.0.0.0
Outgoing interface list:
Vlan77, Forward/Sparse, 1d02h/00:02:47
Multicast Forwarding
Ingress vlan 33
Egress Vlan 77
3850-1#show ip mfib 239.1.1.1 10.33.33.33 verbose
(10.33.33.33,239.1.1.1) Flags: K HW DDE
0xB OIF-IC count: 0, OIF-A count: 1
SW Forwarding: 7/0/1278/0, Other: 0/0/0
HW Forwarding: 10334626/99/1278/988, Other: 0/0/0
Vlan33 Flags: RA A MA
Vlan77 Flags: RF F NS
CEF: Adjacency with MAC: 01005E010101B07D47E147F30800
Multicast Rewrite Info: the CEF adjacency MAC string
Vlan33 (flags RA A MA): MRIB Accept, MFIB Accept
Vlan77 (flags RF F NS): MRIB Forward, MFIB Forward
Drops? Check for RPF Failure, OIF Null etc
show platform hardware fed switch 1 fwd-asic counters tla RWE drop
RweDropCount on Asic 0
[0] dropCount 0x00000000
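The multicast rewrite string in the `show ip mfib` output can be cross-checked against the group address, because an IPv4 multicast group maps to a MAC by copying its low 23 bits into the 01:00:5E prefix. A minimal Python sketch, with an illustrative function name:

```python
import ipaddress


def multicast_mac(group):
    """Return the Ethernet MAC (12 hex digits) for an IPv4 multicast group."""
    address = ipaddress.IPv4Address(group)
    if not address.is_multicast:
        raise ValueError("%s is not an IPv4 multicast address" % group)
    low_23_bits = int(address) & 0x7FFFFF
    return "%012X" % (0x01005E000000 | low_23_bits)


# Adjacency string copied from the 'show ip mfib ... verbose' output.
adjacency = "01005E010101B07D47E147F30800"

expected = multicast_mac("239.1.1.1")
print(expected, adjacency.startswith(expected))
```

Because only 23 bits are copied, several groups share one MAC (224.129.1.1 maps to the same value), so a match confirms consistency rather than uniqueness.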
[Topology: sender 10.33.33.33 in Vlan 33 sends to multicast group 239.1.1.1 through 3850-1 (3850 with PIM, acting as mcast router). 3850-1 connects over Vlan 77 (Gig1/0/2 to Gig1/0/1) to 3850-2 (3850 with IGMP Snooping, acting as switch). Multicast receivers 10.77.77.76 and 10.77.77.77 are attached to 3850-2 on Gig1/0/47 and Gig1/0/48.]
3850-2#show platform software fed switch active ip igmp snooping vlan 77
Vlan 77
---------
Snoop Enabled : On
Flood Mode : Off
I-Mrouter : Off
Oper State : Up
STP TCN Flood : Off
Routing Enabled : Off
PIM Enabled : Off
<...snip...>
==============================================================
Mrouter PortQ :
If 0x8 GigabitEthernet1/0/1
Flood PortQ :
If 0x8 GigabitEthernet1/0/1
If 0xa GigabitEthernet1/0/47
If 0xa GigabitEthernet1/0/48
3850-2#show ip igmp snooping groups vlan 77
Vlan Group Type Version Port List
-------------------------------------------
77 239.1.1.1 igmp v2 Gi1/0/47, Gi1/0/48
Multicast Forwarding - End
Egress Vlan 77
Mrouter Port: GigabitEthernet1/0/1
Layer 2 ports with multicast receivers: Gi1/0/47, Gi1/0/48
Are 3850 Stack members capable of forwarding
multicast coming in locally?
Yes! Stack members have forwarding information for
both layer 2 and layer 3 multicast and can forward
traffic to local egress ports or stack ports as needed.
ASIC level drops and exceptions
show platform hardware fed switch 1 fwd-asic drops exceptions
Run the command multiple times to check for incrementing counts
Equivalent Commands on IOS XE 3.6.X and 3.7.X
Troubleshooting unicast forwarding
Troubleshooting Steps               Commands
Check TCAM utilization              show platform tcam utilization asic all
Check hardware MAC address table    show platform matm macTable vlan #
                                    show platform matm <H.H.H> vlan #
Check ip route in hardware          show platform ip route switch X
                                    show platform ip route summary
Check adjacency in hardware         show platform ip adjacency switch X
Check ASIC level drops              show platform fwd-asic drops exceptions
• Product Overview
• Image Management
• Troubleshooting Memory & CPU
• Troubleshooting Stack & High Availability
• Troubleshooting Hardware Forwarding
• QoS Implementation and Troubleshooting
• Glimpse of future – IOS XE 16.X
• Summary
Agenda
QoS Implementation and Troubleshooting
In this section, you will learn about ...
• QoS implementation on Catalyst 3850/3650
• QoS troubleshooting examples
QoS – What’s New with Catalyst 3850/3650
Wired
• Modular QoS based CLI (MQC)
  • Alignment with 4500E series (Sup6, Sup7)
  • Class-based Queuing, Policing, Shaping, Marking
• More Queues
  • Up to 2P6Q3T queuing capabilities
  • Standard 3750X provides 1P3Q3T
  • Not limited to 2 queue-sets
• Flexible MQC Provisioning abstracts queuing hardware
Wireless
• Granular QoS control at the wireless edge
  • Tunnel termination allows customers to provide QoS treatment per SSID and per client, and common treatment of wired and wireless traffic throughout the network
• Enhanced Bandwidth Management
  • Approximate Fair Drop (AFD) Bandwidth Management ensures fairness at Client, SSID and Radio levels for NRT traffic
• Wireless Specific Interface Control
  • Policing capabilities Per-SSID, Per-Client upstream and downstream
  • AAA support for dynamic Client based QoS and Security policies
• Per SSID Bandwidth Management
policy-map PER-PORT-POLICING
 class VOIP
  set dscp ef
  police 128000 conform-action transmit exceed-action drop
 class VIDEO
  set dscp cs4
  police 384000 conform-action transmit exceed-action drop
 class SIGNALING
  set dscp cs3
  police 32000 conform-action transmit exceed-action drop
 class TRANSACTIONAL-DATA
  set dscp af21
 class class-default
  set dscp default
QoS – What’s New with Catalyst 3850/3650
Default Behavior Change
• 3750: with “mls qos” enabled at global level, all ports are untrusted and the DSCP/precedence/CoS of incoming packets is reset to 0
• 3750: “mls qos trust” is needed at the interface level to change the trust mode
• 3850: ports are trusted by default; DSCP/precedence/CoS values are retained
3750 MLS QoS vs. 3850 MQC QoS
3750 to 3850/3650 QoS conversion
                    3750                                       3850
Basic Structure     MLS                                        MQC
Global Config       Support mls qos;                           No mls qos support;
                    support some of MQC at ingress             support MQC [class-map, policy-map]
Interface Config    Support mls qos config and some of         Attach the policy to the interface
                    MQC cli at ingress
Port Ingress        Classification/Policing/Marking/Queuing    Classification/Policing/Marking
Port Egress         Queuing                                    Classification/Policing/Marking/Queuing
SVI Ingress         Classification/Policing/Marking            Classification/Marking
SVI Egress          None                                       Classification/Marking
QoS Example: Verify default trust mode on 3850
[Topology: 3560-1 (192.168.30.1) Gig0/7 to 3850 Gig2/0/7; 3850 Gig2/0/5 to 3560-2 (192.168.30.2) Gig0/5. All interfaces are switchport mode trunk, with no explicit QoS config.]
AF11 = DSCP 10 = TOS 40
3560-1# ping 192.168.30.2 repeat 5 tos 40
3560-2# show access-lists QOS
Extended IP access list QOS
    10 permit icmp host 192.168.30.1 host 192.168.30.2 dscp af11 (5 matches)
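The equality AF11 = DSCP 10 = TOS 40 follows from two standard rules: an AFxy class has DSCP value 8x + 2y, and the DSCP occupies the upper six bits of the ToS byte. A minimal Python sketch with illustrative helper names:

```python
def af_to_dscp(name):
    """Convert an Assured Forwarding name such as 'af11' to its DSCP value."""
    name = name.lower()
    if len(name) != 4 or not name.startswith("af"):
        raise ValueError("expected a name like 'af11'")
    af_class, drop_precedence = int(name[2]), int(name[3])
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF class is 1-4 and drop precedence is 1-3")
    return 8 * af_class + 2 * drop_precedence


def dscp_to_tos(dscp):
    """DSCP sits in the top six bits of the ToS byte."""
    return dscp << 2


def tos_to_dscp(tos):
    """Recover the DSCP from a ToS byte, ignoring the two low-order bits."""
    return tos >> 2


dscp = af_to_dscp("af11")
print("af11 -> DSCP %d -> TOS %d" % (dscp, dscp_to_tos(dscp)))
```

This is why the ping in the example uses `tos 40` to produce packets that match `dscp af11` in the access list.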
QoS Example: Marking of packets to af11
[Topology: 3560-1 (192.168.30.1) Gig0/7 to 3850 Gig2/0/7; 3850 Gig2/0/5 to 3560-2 (192.168.30.2) Gig0/5]
Packets leave 3560-1 with TOS = 0; the 3850 marks them AF11 = DSCP 10 = TOS 40
ip access-list extended TEST
 permit icmp host 192.168.30.1 host 192.168.30.2
class-map QOS
 match access-group name TEST
policy-map MARK-AF11
 class QOS
  set dscp af11
interface gig2/0/7
 service-policy input MARK-AF11
3560-1# ping 192.168.30.2 repeat 5
3850#show platform software fed switch 2 qos policy target status
Loc Interface IIF-ID Dir State:(cfg,opr) Policy
--- ------------ ---------------- --- --------------- ---------------
GigabitEthernet1/0/1 0x00000000000008 OUT VALID,SET_INHW QoS
3560-2#show access-lists QOS
10 permit icmp host 192.168.30.1 host
192.168.30.2 dscp af11 (5 matches)
3850#show platform hardware fed switch 2 qos dscp-cos counters interface gigabitEthernet 2/0/7 | in DSCP0
Ingress DSCP0 5 0
Egress DSCP0 0 0
3850#show platform hardware fed switch 2 qos dscp-cos counters interface gigabitEthernet 2/0/5 | in DSCP10
Ingress DSCP10 0 0
Egress DSCP10 5 0
QoS – Case Study
Problem - LACP Portchannel does not come up
[Topology: 3850-1 Gig2/0/1 and Gig2/0/2 connect to 3850-2 Gig1/0/1 and Gig1/0/2]
3850-1:
interface range GigabitEthernet2/0/1-2
 switchport mode trunk
 channel-protocol lacp
 channel-group 1 mode active
 service-policy output WIRED_EGRESS_QOS
!
Policy Map WIRED_EGRESS_QOS
 Class DSCP_VOICE
  priority level 1
 Class DSCP_CALL_SIGNALING
  bandwidth remaining 20 (%)
  queue-buffers ratio 20
 Class class-default
  bandwidth remaining 80 (%)
  queue-buffers ratio 80
3850-2:
interface range GigabitEthernet1/0/1-2
 switchport mode trunk
 channel-protocol lacp
 channel-group 1 mode active
%EC-5-L3DONTBNDL2: Gi1/0/1 suspended: LACP currently not enabled on the remote port.
%EC-5-L3DONTBNDL2: Gi1/0/2 suspended: LACP currently not enabled on the remote port.
QoS – Case Study - End
Solution - LACP Portchannel does not come up
3850-1#show platform hardware fed switch 2 qos queue config interface Gi 2/0/1
DATA Port:21 GPN:1 AFD:Disabled QoSMap:1 HW Queues: 168 - 175
DrainFast:Disabled PortSoftStart:2 - 1440
----------------------------------------------------------
DTS Hardmax Softmax PortSMin GlblSMin PortStEnd
--- -------- -------- -------- --------- ---------
0 1 4 0 5 0 5 0 0 0 4 1920
1 1 4 0 8 240 7 160 3 60 4 1920
2 1 4 0 9 960 8 640 4 240 4 1920
3 1 4 0 5 0 5 0 0 0 4 1920
4 1 4 0 5 0 5 0 0 0 4 1920
5 1 4 0 5 0 5 0 0 0 4 1920
6 1 4 0 5 0 5 0 0 0 4 1920
7 1 4 0 5 0 5 0 0 0 4 1920
3850-1#show platform hardware fed switch 2 qos queue stats Gi 2/0/1
-------------------------------
Queue Buffers Enqueue-TH0 Enqueue-TH1 Enqueue-TH2
----- ------- ----------- ----------- -----------
0 0 0 0 0
1 0 0 0 452
2 0 0 0 37645
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
7 0 0 0 0
-------------------------------
Queue Drop-TH0 Drop-TH1 Drop-TH2 SBufDrop QebDrop
----- ----------- ----------- ----------- ----------- -----------
0 0 0 9393 0 0
1 0 0 0 0 0
2 0 0 0 0 0
LACP PDUs dropped: queue 0 shows 9393 drops under Drop-TH2.
Root cause
LACP PDUs egress through the priority class, DSCP_VOICE. With the current configuration, zero queue-buffers are assigned to the priority class, so all traffic in this class is dropped.
Solution
Assign queue-buffers to Class DSCP_VOICE.
Policy Map WIRED_EGRESS_QOS
Class DSCP_VOICE
priority level 1
queue-buffers ratio 5
Class DSCP_CALL_SIGNALING
bandwidth remaining 20 (%)
queue-buffers ratio 15
Class class-default
bandwidth remaining 80 (%)
queue-buffers ratio 80
On IOS XE 3.6.X and 3.7.X, check: show platform qos queue stats interface
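The root cause in this case study can be caught before deployment by auditing the policy text. The sketch below is a minimal Python example, assuming the policy is available in the `Policy Map` / `Class` layout shown on the slides; the function name is hypothetical, and the rule it applies (a priority class needs an explicit queue-buffers ratio) is taken from this case study rather than from a general platform guarantee.

```python
def audit_queue_buffers(policy_text):
    """Return (priority classes without queue-buffers, sum of all ratios)."""
    classes = {}
    current = None
    for raw_line in policy_text.splitlines():
        line = raw_line.strip()
        lower = line.lower()
        if lower.startswith("class "):
            current = line.split(None, 1)[1]
            classes[current] = {"priority": False, "ratio": None}
        elif current and lower.startswith("priority"):
            classes[current]["priority"] = True
        elif current and lower.startswith("queue-buffers ratio"):
            classes[current]["ratio"] = int(line.split()[-1])
    starved = [name for name, c in classes.items()
               if c["priority"] and not c["ratio"]]
    total = sum(c["ratio"] or 0 for c in classes.values())
    return starved, total


PROBLEM = """\
Policy Map WIRED_EGRESS_QOS
 Class DSCP_VOICE
  priority level 1
 Class DSCP_CALL_SIGNALING
  bandwidth remaining 20 (%)
  queue-buffers ratio 20
 Class class-default
  bandwidth remaining 80 (%)
  queue-buffers ratio 80
"""

FIXED = """\
Policy Map WIRED_EGRESS_QOS
 Class DSCP_VOICE
  priority level 1
  queue-buffers ratio 5
 Class DSCP_CALL_SIGNALING
  bandwidth remaining 20 (%)
  queue-buffers ratio 15
 Class class-default
  bandwidth remaining 80 (%)
  queue-buffers ratio 80
"""

print(audit_queue_buffers(PROBLEM))
print(audit_queue_buffers(FIXED))
```

In the problem policy the ratios of the other two classes already add up to 100, which is consistent with nothing being left for the priority class.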
• Product Overview
• Image Management
• Troubleshooting Memory & CPU
• Troubleshooting Stack & High Availability
• Troubleshooting Hardware Forwarding
• QoS Implementation and Troubleshooting
• A glimpse at the future, IOS XE 16.X
• Summary
Agenda
IOS XE 16.X – Powering Next-Gen Networks
One Single Release for all Enterprise Platforms
[Figure: Unified Software Stack (IOS-XE 16.x) shared by Switches, Wireless and Routers. Layers shown: Manageability (CLI, SNMP, APIC-EM, Prime Infra., WebUI, SD-Access); Operating System (IoX); Platform ASICs/CPU (UADP, CPP)]
WebUI - IOS XE 16.3.X
http://172.16.94.216/webui/
• 172.16.94.216 is an IP address configured on the 3850
• Privilege 15 is for monitoring & configuration
• Privilege 1-14 (or omit privilege option) = monitoring only
config terminal
username <name> privilege 15 password <pass>
ip http server
ip http authentication local
Guest Shell – IOS XE Denali 16.5.X
Guest Shell is a Linux container providing a standard Linux
environment for a user to run scripts/applications via Python
3850-Denali#config t
Enter configuration commands, one per line. End with CNTL/Z.
3850-Denali(config)#iox
3850-Denali(config)#exit
3850-Denali#guestshell enable
Management Interface will be selected if configured
Please wait for completion
Guestshell enabled successfully
3850-Denali#guestshell run bash
[guestshell@guestshell ~]$
[guestshell@guestshell ~]$ exit
exit
3850-Denali#guestshell run python flash:script_name.py ?
LINE <cr>
3850-Denali#
3850-Denali#guestshell run python
Python 2.7.5 (default, Jun 17 2014, 18:11:42)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more
information.
>>>
Also Supported…
• ZTP – Zero Touch Provisioning can retrieve a Python
script via DHCP at boot time
• EEM – Use Embedded Event Manager to trigger a
Python script in response to an event
DMI = Data Model Interface = Netconf/Yang interface
PnP = Plug N Play = Zero Touch provisioning
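Notes on the commands above:
• iox: starts the Virtual Services Manager
• guestshell enable: takes ~2 minutes
• guestshell run bash: create a Linux shell to run Linux commands
• guestshell run python flash:script_name.py: run a Python script
• guestshell run python: start an interactive Python interpreter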
IOSd Event-trace - IOS XE Denali
3850#show monitor event-trace ?
adjacency Adjacency Events
all-traces Show all the event traces
arp ARP Events
c3pl Show group traces
cce Show group traces
cef Show CEF traces
cfd Crypto Fault Detection event trace
cfm Show group traces
checkpoint "Checkpoint debug"
continuous Display components which have
cpu-report display cpu-report
crypto Crypto traces
cts cts
datainteg Data integrity events
dmvpn DMVPN traces
eigrp Show EIGRP traces
epm Show group traces
fhrp Show FHRP traces
flexvpn FlexVPN event trace
flow Flow traces
hw-api HW-API Events
ifnum Show group traces
interprocess Interprocess event trace
ipv6 IPv6
link_oam Show group traces
lisp Show group traces
3850#show monitor event-trace arp all
*Apr 10 17:15:39.817: REPOP ADJ:
*Apr 10 17:15:40.418: IF ADDR: IF: GigabitEthernet0/0
*Apr 10 17:15:40.418: IF ADDR: IF: GigabitEthernet0/0
*Apr 10 17:15:41.565: FLUSH:
*Apr 10 17:15:41.798: IF UP: IF: Port-channel100
*Apr 10 17:15:41.842: ADD ENTRY: Link: IP A: 3.3.3.10 IF: Port-channel100 Mode: Interface
*Apr 10 17:15:41.877: ADD ENTRY: Link: IP VRF: Mgmt-vrf A: 172.16.94.216 IF: GigabitEthernet0/0 Mode: Interface
*Apr 10 17:15:41.879: IF DOWN: IF: Port-channel100
*Apr 10 17:15:41.879: IF ADDR: IF: Port-channel100
--snip--
Flight recorder – Refined list of messages that are too low level for a syslog
Vision – Faster Troubleshooting
Contextual Troubleshooting isolates network issues faster
Challenges:
• User unsure of which process/feature to debug
• User ends up enabling debugs for all flows
Answer:
• Radioactive Tracing helps Conditional Logging across Features & Processes
• Traces Path Quickly
[Figure: Administrator asks “Where are the calls getting dropped?”; Cisco Support replies “Try turning on traces for Feature X, Process Y …”]
Radioactive Tracing – Denali 16.3.X
3850# debug platform condition mac 0017.59BE.3A32
Enable granular debugging on a MAC address (0017.59BE.3A32 here) across CPU and process boundaries
3850#show debug
<snip>
Conditional Debug Global State: Start
Conditions Direction
------------------------------|---------------------------
MAC Address 0017.59BE.3A32 N/A
Verify condition is set and
started for a given Mac Address
01/27 11:48:14.082 [dot1x] [17810]: UUID: 9800000000067, : [0017.59BE.3A32: Gi2/0/14] New client detected, sending session start event for 0017.59BE.3A32
01/27 11:48:14.082 [sadb] [17810]: UUID: 9800000000067, : match record by attr:ATTR_DB, attr type:42
01/27 11:48:14.082 [sadb-attr] [17810]: UUID: 9800000000067, : No record found for aaa_type: 42, data: 0017.59be.3a32
01/27 11:48:14.082 [auth-mgr] [17810]: UUID: 9800000000067, : [0000.0000.0000:unknown] Record not found for attr_type 42
01/27 11:48:14.082 [auth-mgr] [17810]: UUID: 9800000000067, : [0017.59BE.3A32: Gi2/0/14] Session Start event called with conn_hdl 6, vlan: 0, identity: 0x7600051d
Traces automatically generated
Conditional Debug and Radioactive Tracing
• Product Overview
• Image Management
• Troubleshooting Memory & CPU
• Troubleshooting Stack & High Availability
• Troubleshooting Hardware Forwarding
• QoS Implementation and Troubleshooting
• A glimpse at the future, IOS XE 16.X
• Summary
Agenda
Summary
• Do you have a better understanding of:
• Key components of Catalyst 3850/3650 hardware and IOS XE
• How to baseline switch and detect anomalies
• Troubleshooting tools and techniques at your disposal
• Would you like to see:
• More/Less of any particular topic
• More topics
• Longer session
Recommended Material
• BRKARC-3438 - Cisco Catalyst 3850 and 3650 Series Switching Architecture
• BRKCRS-3300 - IOS XE : Enabling the Digital Network Architecture
• BRKCRS-2813 - DNA Campus Fabric - Monitoring and Troubleshooting
• BRKCRS-2810 - DNA Campus Fabric Automation - A Look Under the Hood
• LTRCRS-2810 - DNA Campus Fabric Automation- Hands-on Lab
• Cisco Unified Access
• Cisco Digital Network Architecture (DNA)
Closing Statement
Deploy Catalyst 3850/3650
with Confidence
Thank you