Troubleshooting Inter-PLC Communication Latency Problem and Understanding Allen-Bradley Producer-Consumer Model
Preliminary Report
Johnny Tang
2/7/2005
Problem Description
[Network diagram. PLCs: DTL_Vac:PLC3, DTL_Vac:PLC4, DTL_Vac:PLC5, DTL_Vac:PLC6 (Tank 3 to Tank 6; valves SGV4, SGV5, SGV6). Edge switches: Lin_Ics:NetSw1c1, Lin_Ics:NetSw1d1, Lin_Ics:NetSw1b1 (ports Fa0/17, Fa0/34, Fa0/34, Fa0/23). Other switches: Lin_Ics:NetSw11 (ports Gio/5, Gio/6, Gio/7); Ics:CoreSw1, Ics:CoreSw2.]
• Operations reported that SGV4 and SGV6 were closing with no vacuum pressure trip indication
• In the SNS DTL vacuum control system, each tank vacuum system is controlled by one Allen-Bradley PLC; producer-consumer real-time IO tags are used to pass the upstream/downstream tank vacuum pressure status between the PLCs.
• Based on the vacuum pressure status, each PLC determines whether its SGV should be in the closed position to protect its tank vacuum.
• To be fail-safe, a PLC assumes the worst case if it senses that its producer-consumer tag for real-time IO is not connected to the remote tag -> close the valve
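The decision rule in the last two bullets can be sketched as follows. This is a minimal Python illustration, not the PLC logic itself; the names `NeighborStatus` and `should_close_valve` are hypothetical, and the real implementation is PLC code that the slides do not show.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class NeighborStatus:
    """What one PLC knows about a neighbouring tank (hypothetical structure)."""
    connected: bool   # True while the consumed tag is connected to the remote tag
    vacuum_ok: bool   # last vacuum pressure status received from the neighbour


def should_close_valve(neighbors: Iterable[NeighborStatus]) -> bool:
    """Return True if the valve should be closed to protect the tank vacuum."""
    for n in neighbors:
        if not n.connected:
            # Fail-safe: without a live connection, assume the worst case.
            return True
        if not n.vacuum_ok:
            return True
    return False


upstream = NeighborStatus(connected=True, vacuum_ok=True)
downstream = NeighborStatus(connected=False, vacuum_ok=True)
print(should_close_valve([upstream, downstream]))  # True
```

In this sketch a lost connection closes the valve even though the last reported pressure status was good, which matches the reported symptom of valves closing with no vacuum pressure trip indication.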
Initial Diagnosis: Is this latency occurring only between two PLCs that are not on the same edge switch?
[Switch models labeled in the network diagram: Catalyst 2950 (three switches), Catalyst 3550, Catalyst 6500.]
Initial Diagnosis
• Moved PLC4 onto PLC3’s edge switch and verified that the latency between PLC3 and PLC4 disappeared; it then occurred between PLC4 and PLC5 instead
SOME BACKGROUND INFO
• On Ethernet, the producer-consumer model is implemented using multicast. Multicast traffic floods all ports on a switch unless the switch has the IGMP (Internet Group Management Protocol) snooping feature. This feature enables a switch to forward multicast packets only to the ports where they will be consumed.
• Ethernet modules that are not members of a specific multicast group can then avoid processing packets that they do not need to process
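The difference between flooding and IGMP snooping described above can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: the `Switch` class and the port numbers are hypothetical, and a real snooping switch also handles router ports, queries and membership timeouts, which are left out here. The group address is the one that appears in the captures.

```python
class Switch:
    """Toy model of multicast forwarding on one switch (hypothetical)."""

    def __init__(self, ports, igmp_snooping):
        self.ports = set(ports)
        self.igmp_snooping = igmp_snooping
        self.members = {}  # group address -> set of ports that joined it

    def join(self, port, group):
        """Record an IGMP membership report seen on a port."""
        self.members.setdefault(group, set()).add(port)

    def egress_ports(self, ingress_port, group):
        """Ports that a multicast frame for `group` is sent out of."""
        if self.igmp_snooping:
            targets = self.members.get(group, set())  # only joined ports
        else:
            targets = self.ports                       # flood every port
        return sorted(targets - {ingress_port})


GROUP = "239.192.67.65"

flooding = Switch(ports=[1, 2, 3, 4], igmp_snooping=False)
flooding.join(2, GROUP)
print(flooding.egress_ports(1, GROUP))  # [2, 3, 4]

snooping = Switch(ports=[1, 2, 3, 4], igmp_snooping=True)
snooping.join(2, GROUP)
print(snooping.egress_ports(1, GROUP))  # [2]
```

With snooping enabled, the modules on ports 3 and 4 never receive the frame, so they have nothing to discard.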
Further Diagnosis: Is the multicast traffic being delayed?
Further Diagnosis Setup
[Network diagram: the same PLCs, tanks, valves and switches as in the earlier diagram, with Hub1, Hub2, a Producer-sniffer and a Consumer-sniffer added.]
Objectives of Further Diagnosis
• Understand the inter-PLC communication patterns and timing at the data packet level
• Gather information for Allen-Bradley and Cisco tech support
• Two network analyzers are configured: Producer-sniffer and Consumer-sniffer
• Debug monitoring is set up on three Cisco switches during the event periods
Acknowledgement: Susan Hicks and Jimie Duncan of the ORNL Network Group – a great deal of experience in network sniffing and data frame analysis; knowledge of network switch configuration for multicast forwarding and IGMP snooping setup
Observation 1: From Consumer-sniffer Point of View
• The Consumer-sniffer sees PLC3 (.18), the producer, sending its multicast data to (.65) and (.64), each every 100 ms, up to the IP Data Frame 0x8C01
• 500 ms after the IP Data Frame 0x8C01, PLC4 (.19) sends PLC3 (.18) a Data Request via EIP/CIP unicast (because PLC4 did not see PLC3 producing)
Is PLC3's multicast being blocked, or has PLC3 stopped producing?
Observation 2: From Producer-sniffer Point of View
• The Producer-sniffer sees PLC3 (.18), the producer, continuing to send its multicast data to (.65) and (.64), each every 100 ms, for another 5 frames after the IP Data Frame 0x8C01
• After that, it sees PLC4 (.19) send PLC3 (.18) a Data Request via EIP/CIP unicast
The consumer (PLC4) did not see its producer’s (PLC3’s) multicast for 10 data frames before the delay event: the multicast was blocked!
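The comparison behind Observations 1 and 2 amounts to finding frames that the Producer-sniffer captured but the Consumer-sniffer did not. A minimal Python sketch of that comparison follows. It assumes that each captured frame can be identified by a value such as 0x8C01 exported from the analyzer; the function name and every identifier other than 0x8C01 are made up for illustration.

```python
def frames_lost_in_transit(producer_side_ids, consumer_side_ids):
    """Return IDs seen at the producer side but missing at the consumer side.

    Order is preserved so the result shows where the loss starts.
    Assumes IDs are unique within the compared window (no wraparound).
    """
    seen_by_consumer = set(consumer_side_ids)
    return [fid for fid in producer_side_ids if fid not in seen_by_consumer]


# Hypothetical capture excerpts; only 0x8C01 comes from the report.
producer_sniffer = [0x8BFF, 0x8C00, 0x8C01, 0x8C02, 0x8C03]
consumer_sniffer = [0x8BFF, 0x8C00, 0x8C01]

lost = frames_lost_in_transit(producer_sniffer, consumer_sniffer)
print([hex(fid) for fid in lost])  # ['0x8c02', '0x8c03']
```

A non-empty result means the producer kept sending and the frames were dropped somewhere between the two capture points, which is the conclusion the slides reach.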
Observation 3: From Consumer-sniffer Point of View
• The querier (the router 172.31.72.3) uses a different MAC address (00-0B-46-39-5D-84) when it sends an IGMP membership QUERY to the PLC4 multicast addresses (239.192.67.65 and .64). This happens only when the delay event occurs.
Observation 4: From Switch Debug Monitoring Point of View
The multicast routers query specific multicast addresses before those addresses are dropped from the cache, or “pruned”, to keep multicast traffic to a minimum (validated by Cisco)
Recommendations from Rockwell Software
• Use f/w 3.004.2 on the 1756-ENBT for IGMP v2 support
• Multicast for EtherNet/IP includes a TTL (Time-to-Live) parameter in the IP layer, which is currently hard-coded to 1 in the firmware. When a router processes a packet with TTL=1, it decrements the TTL value and then, because it is 0, drops the packet. This means a TTL of 1 restricts I/O or produced tags to a single subnet (VLAN)
• Allen-Bradley uses a 2950 switch as the mid-layer and could not reproduce our problem; the 3550 is a layer 2/3 switch; please verify the configuration of that switch with Cisco
• If a consumer does not see its producer’s multicast for 4 times the defined RPI (Requested Packet Interval), it drops its multicast connection and re-initiates its DATA request -> Lesson learned on fail-safe consideration: what is the worst case for a 100 ms RPI tag?
• The Allen-Bradley EtherNet/IP module may have a problem if there is more than one querier in the subnet
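Two of the rules above are simple enough to write down: the router's handling of TTL and the consumer's 4 x RPI timeout. The Python sketch below only restates those rules as given; the function names are hypothetical, and the timeout figure is an estimate from the stated rule rather than a measured value.

```python
def router_forwards(ttl: int) -> bool:
    """A router decrements TTL and drops the packet if the result is 0."""
    return ttl - 1 > 0


def consumer_timeout_ms(rpi_ms: float, multiplier: int = 4) -> float:
    """Time without producer data after which the consumer drops the connection."""
    return multiplier * rpi_ms


print(router_forwards(1))        # False: a TTL=1 packet never leaves its subnet
print(consumer_timeout_ms(100))  # 400: a 100 ms RPI tag times out after 400 ms
```

Under the 4 x RPI rule, a 100 ms RPI tag therefore goes roughly 400 ms without data before the connection is dropped. The Consumer-sniffer capture showed a Data Request about 500 ms after the last frame seen, so the sketch should be read as a lower-bound estimate.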
The problem still existed even after upgrading the 1756-ENBT to the new firmware; we therefore moved our focus to the 3550 multilayer switch
Recommendations from Cisco
• Remove multicast routing from the core switches to simplify the environment, unless it is necessary
• Configure the IGMP querier function on the core switch
• Use Cisco IOS Release 12.2(25)SEA for the Catalyst 3550 Multilayer Switch (see the release notes for details):
– Release Notes for the Catalyst 3550 Multilayer Switch, Cisco IOS Release 12.2(25)SEA
– Documentation Updates for the Catalyst 3550 Switches, Cisco IOS Release 12.2(25)SEA
The problem was resolved after the Cisco recommendations were applied
Recap – Lessons Learned on Understanding Allen-Bradley Producer-Consumer and Multicast
• If a mix of fast and slow consumers scans one produced tag, the fastest consume rate dictates the rate for the producer – keep them all at the same rate
• TTL (Time-to-Live) is hard-coded to 1; no multicast routing for now – keep the application in one subnet for time-critical interlocks
• 5000 is the current max pps (packets per second) for each interface (1756-ENBT or 1794-AENT)
• The producer generates a multicast packet once every RPI; each consumer generates a unicast packet every RPI
• For each producer, an entire block of 32 addresses is reserved whether all the addresses are used or not
• One multicast address for all discrete modules; one multicast address for each of the 2 analog modules
• The consumer times out after 4 times the RPI of its produced tag if the connection is lost – Fail-Safe Consideration!
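The packet-rate figures in the recap can be combined into a rough load estimate for a producer's interface. The Python sketch below assumes that the 5000 pps limit counts both the multicast packets the producer sends and the unicast packets it receives from its consumers; the slides do not state this, and the function names are hypothetical.

```python
MAX_PPS_PER_INTERFACE = 5000  # limit quoted in the recap


def producer_interface_pps(rpi_ms: float, n_consumers: int) -> float:
    """Estimated packets per second at the producer for one produced tag."""
    per_second = 1000.0 / rpi_ms           # RPI periods per second
    sent = per_second                      # one multicast packet per RPI
    received = n_consumers * per_second    # one unicast packet per consumer per RPI
    return sent + received


def within_budget(total_pps: float) -> bool:
    """Check an estimated load against the quoted interface limit."""
    return total_pps <= MAX_PPS_PER_INTERFACE


load = producer_interface_pps(rpi_ms=100, n_consumers=2)
print(load)                 # 30.0
print(within_budget(load))  # True
```

A single 100 ms tag with two consumers is far below the limit. The estimate becomes more useful when it is summed over all tags and I/O connections that share one interface.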
Follow-up Work
• Test a single IGMP querier configuration in the redundant router environment
• Upgrade the f/w on all 3550 switches where multicast applications are used
• A proposed Lessons Learned Workshop
– Core, Closet and Edge Switch Configurations on multicast, IGMP and QoS in support of applications for Real-Time IO via Ethernet (by Susan Hicks)
– Network Analysis and Filtering Techniques – What to look for among millions of data packets (by Jimie Duncan)