Troubleshooting Inter-PLC Communication Latency Problem and Understanding Allen ...fiber.hardfree.net/2011/open_data/PLC-PLC-multicast-fail... · 2011-01-20 · Troubleshooting Inter-PLC

Troubleshooting Inter-PLC Communication Latency Problem and Understanding Allen-Bradley Producer-Consumer Model

Preliminary Report

Johnny Tang

2/7/2005

Problem Description

[Figure: network diagram showing PLCs DTL_Vac:PLC3, DTL_Vac:PLC4, DTL_Vac:PLC5 and DTL_Vac:PLC6; Tank 3, Tank 4, Tank 5 and Tank 6; valves SGV4, SGV5 and SGV6; switches Lin_Ics:NetSw1c1, Lin_Ics:NetSw1d1 and Lin_Ics:NetSw1b1 with ports Fa0/17, Fa0/34, Fa0/34 and Fa0/23; Lin_Ics:NetSw11 with ports Gi0/5, Gi0/6 and Gi0/7; core switches Ics:CoreSw1 and Ics:CoreSw2.]

• Operations reported that SGV4 and SGV6 close with no vacuum pressure trip indication.

• In the SNS DTL vacuum control system, each tank vacuum system is controlled by one Allen-Bradley PLC; producer-consumer real-time IO tags pass the upstream/downstream tank vacuum pressure status between the PLCs.

• Based on the vacuum pressure status, each PLC determines whether its SGV should be closed to protect its tank vacuum.

• For fail-safe reasons, a PLC assumes the worst case if it senses that its producer-consumer tag for real-time IO is not connected with the remote tag -> close the valve

Initial Diagnosis: Is this latency only occurring between two PLCs that are not on the same edge switch?

[Figure: Initial Diagnosis network diagram with switch models labeled: three Catalyst 2950, one Catalyst 3550 and one Catalyst 6500; ports Fa0/17, Fa0/34, Fa0/34 and Fa0/23; SGV4, SGV5, Tank 6 and SGV6.]

• Moved PLC4 onto PLC3’s edge switch; verified that the latency between PLC3 and PLC4 disappeared, but it now occurs between PLC4 and PLC5 instead

SOME BACKGROUND INFO


• On Ethernet, the producer-consumer model is accomplished using multicast. Multicast traffic floods all ports on a switch unless the switch has the IGMP (Internet Group Management Protocol) snooping feature. This feature enables a switch to forward multicast packets only to the ports where they will be consumed.

• Ethernet modules that are not members of a specific multicast group can then avoid processing packets that they do not need to process
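The difference between flooding and IGMP snooping can be sketched with a toy model. This is only an illustration of the forwarding decision described above, not how a Catalyst switch is implemented: the class `ToySwitch`, its methods and the port names `p1` to `p4` are hypothetical, and a real switch also forwards multicast to router ports, which is ignored here.

```python
class ToySwitch:
    """Toy layer-2 switch that models only the multicast forwarding decision."""

    def __init__(self, ports, igmp_snooping):
        self.ports = set(ports)
        self.igmp_snooping = igmp_snooping
        # group address -> set of ports that reported membership
        self.members = {}

    def membership_report(self, port, group):
        """Record an IGMP membership report seen on a port."""
        self.members.setdefault(group, set()).add(port)

    def egress_ports(self, ingress_port, group):
        """Return the ports a multicast frame for `group` is sent out of."""
        if not self.igmp_snooping:
            # Without snooping, multicast is flooded like broadcast.
            return self.ports - {ingress_port}
        # With snooping, only ports with a known consumer receive the frame.
        return self.members.get(group, set()) - {ingress_port}


GROUP = "239.192.67.65"          # multicast address taken from the capture
PORTS = ["p1", "p2", "p3", "p4"]  # hypothetical port names

snooping = ToySwitch(PORTS, igmp_snooping=True)
snooping.membership_report("p2", GROUP)   # consumer on p2 joins the group
flooding = ToySwitch(PORTS, igmp_snooping=False)

print(sorted(snooping.egress_ports("p1", GROUP)))
print(sorted(flooding.egress_ports("p1", GROUP)))
```

In this model, a port whose membership is missing from the table receives nothing, which is the kind of failure the later observations point to: the producer keeps sending, but the consumer's port no longer receives the group.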

Further Diagnosis: Is the multicast being delayed?

[Figure: Further Diagnosis Setup. Network diagram with Hub1, Hub2, Producer-sniffer and Consumer-sniffer; ports Fa0/17, Fa0/34, Fa0/34 and Fa0/23; SGV4, SGV5, Tank 6 and SGV6; Lin_Ics:NetSw11 with ports Gi0/5, Gi0/6 and Gi0/7.]

Objectives of Further Diagnosis

• Understand the inter-PLC communication patterns and timing at the data packet level


• Gather information for Allen-Bradley and Cisco tech support

• Two network analyzers are configured: Producer-sniffer and Consumer-sniffer

• Debugging is set up on three Cisco switches during the event periods

Acknowledgement: Susan Hicks and Jimie Duncan of the ORNL Network Group, for their great deal of experience in network sniffing and data frame analysis, and their knowledge of network switch configuration for multicast forwarding and IGMP snooping setup

Observation 1: From Consumer-sniffer Point of View

• Consumer-sniffer sees PLC3 (.18), the producer, sending its multicast data to (.65) and (.64), one frame to each every 100 ms, until the IP data frame 0x8C01

• After a 500 ms delay from the IP data frame 0x8C01, PLC4 (.19) sends PLC3 (.18) a Data Request via EIP/CIP unicast (because PLC4 didn’t see PLC3 producing)

Is PLC3’s multicast being blocked, or has PLC3 stopped producing?

Observation 2: From Producer-sniffer Point of View

• Producer-sniffer sees PLC3 (.18), the producer, continuing to send its multicast data to (.65) and (.64) for another 5 frames each, one every 100 ms, after the IP data frame 0x8C01

• After that, it sees PLC4 (.19) send PLC3 (.18) a Data Request via EIP/CIP unicast

The consumer (PLC4) didn’t see its producer’s (PLC3’s) multicast for 10 data frames before the delay event: the multicast was blocked!
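The comparison behind this conclusion can be sketched as a set difference between the two captures. This is a minimal illustration, not the analysis tooling actually used: the function name `frames_lost_in_transit` is hypothetical, and the capture contents are made-up placeholders (a sequence number plus the destination group suffix) chosen only to mirror the 5 frames per group and the 100 ms interval described above.

```python
def frames_lost_in_transit(producer_capture, consumer_capture):
    """Return frames seen on the producer side but not on the consumer side.

    Each frame is a hashable key; the producer-side order is preserved.
    """
    seen_by_consumer = set(consumer_capture)
    return [frame for frame in producer_capture if frame not in seen_by_consumer]


GROUPS = (".65", ".64")  # the two multicast destinations in the capture
RPI_MS = 100             # one frame per group every 100 ms

# Placeholder captures: sequence 0 is the last frame both sniffers saw,
# sequences 1..5 appear only in the producer-side capture.
producer_capture = [(seq, group) for seq in range(6) for group in GROUPS]
consumer_capture = [(0, group) for group in GROUPS]

lost = frames_lost_in_transit(producer_capture, consumer_capture)
gap_ms = len({seq for seq, _ in lost}) * RPI_MS

print(len(lost), "frames missing at the consumer,", gap_ms, "ms without data")
```

With these placeholder values the sketch reports 10 missing frames and a 500 ms gap, matching the figures in the two observations.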

Observation 3: From Consumer-sniffer Point of View

• A different MAC address (00-0B-46-39-5D-84) appears for the querier (the router 172.31.72.3) when it sends an IGMP membership QUERY to the PLC4 multicast addresses (239.192.67.65 and .64); this happens only when the delay event happens.

Observation 4: From Switch Debug Monitoring Point of View

The multicast routers query specific multicast addresses before those addresses are dropped from the cache, or “pruned”, to keep multicast traffic to a minimum (validated by Cisco)

Recommendations from Rockwell Software

• Use firmware 3.004.2 on the 1756-ENBT for IGMP v2 support

• Multicast for EtherNet/IP includes a TTL (Time-to-Live) parameter in the IP layer, which is currently hard-coded to 1 in the firmware. When a router processes a packet with TTL=1, it decrements the TTL value and then, because the result is 0, drops the packet. This means a TTL of 1 restricts I/O or produced tags to a single subnet (VLAN)

• Allen-Bradley uses a 2950 switch as the mid-layer and could not reproduce our problem; the 3550 is a layer 2/3 switch; please verify the configuration of that switch with Cisco

• If a consumer doesn’t see its producer’s multicast for 4 times the defined RPI (Requested Packet Interval), it drops its multicast connection and re-initiates its DATA request -> Lesson learned on fail-safe consideration: what is the worst case for a 100 ms RPI tag?

• The Allen-Bradley EtherNet/IP module has a potential problem if there is more than one querier in the subnet
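The TTL behavior described in this list can be sketched in a few lines. This is a simplified model of the router rule only (decrement, then drop at zero); the function name `router_forward` is hypothetical, and layer-2 switching, which does not touch the TTL, is not modeled.

```python
def router_forward(ttl):
    """Return the TTL of the forwarded packet, or None if the router drops it."""
    remaining = ttl - 1
    if remaining <= 0:
        return None  # TTL exhausted: the packet never leaves the subnet
    return remaining


print(router_forward(1))   # EtherNet/IP multicast with the hard-coded TTL of 1
print(router_forward(64))  # a typical routed packet, shown for contrast
```

Under this rule a produced tag sent with TTL=1 cannot cross any router, so producer and consumer have to sit in the same subnet (VLAN).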

The problem still exists even after upgrading the 1756-ENBT to the new firmware; we therefore move our focus to the 3550 multilayer switch

Recommendations from Cisco

• Remove multicast routing from the core switches to simplify the environment unless it is necessary

• Configure the IGMP querier function on the core switch

• Use Cisco IOS Release 12.2(25)SEA for the Catalyst 3550 Multilayer Switch (see the release notes for details):

– Release Notes for the Catalyst 3550 Multilayer Switch, Cisco IOS Release 12.2(25)SEA

– Documentation Updates for the Catalyst 3550 Switches, Cisco IOS Release 12.2(25)SEA

The problem has been resolved after the Cisco recommendations were applied

Recap: Lessons Learned on Understanding Allen-Bradley Producer-Consumer and Multicast

• If a mix of fast and slow consumers scans one produced tag, the fastest consume rate dictates the rate for the producer; keep them all at the same rate

• TTL (Time-to-Live) is hard-coded to 1; no multicast routing for now; keep the application in one subnet for time-critical interlocks

• 5000 is the current maximum pps (packets per second) for each interface (1756-ENBT or 1794-AENT)

• The producer generates a multicast packet once every RPI; each consumer generates a unicast packet every RPI

• For each producer, an entire block of 32 addresses is reserved whether all the addresses are used or not

• One multicast address for all discrete modules; one multicast address for each of the 2 analog modules

• The consumer will time out after 4 times the RPI of its produced tag if the connection is lost: fail-safe consideration!
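Two of the figures above lend themselves to a quick calculation: the connection timeout of 4 times the RPI, and the packet rate one produced tag places on the producer's interface. The sketch below is a rough estimate under a stated assumption, namely that the only traffic counted is one multicast packet per RPI from the producer plus one unicast packet per RPI from each consumer; the function names are hypothetical.

```python
MAX_PPS = 5000          # per-interface limit quoted above (1756-ENBT or 1794-AENT)
TIMEOUT_MULTIPLIER = 4  # the consumer times out after 4 x RPI


def connection_timeout_ms(rpi_ms):
    """Time without producer data before the consumer drops the connection."""
    return TIMEOUT_MULTIPLIER * rpi_ms


def producer_interface_pps(rpi_ms, consumers):
    """Packets per second at the producer's interface for one produced tag.

    Assumes one multicast packet out and one unicast packet in per consumer,
    each once per RPI.
    """
    packets_per_rpi = 1 + consumers
    return packets_per_rpi * 1000 / rpi_ms


print(connection_timeout_ms(100))                # 100 ms RPI tag
print(producer_interface_pps(100, consumers=2))  # one tag, two consumers
print(producer_interface_pps(100, consumers=2) <= MAX_PPS)
```

For a 100 ms RPI tag this gives a 400 ms window in which a consumer keeps acting on stale data before the fail-safe logic can react, which is the worst case to design the interlock around.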

Follow-up Work

• Test a single IGMP querier configuration in a redundant router environment

• Upgrade the firmware on all 3550 switches where multicast applications are used

• A proposed Lessons Learned Workshop

– Core, Closet and Edge Switch Configurations on multicast, IGMP and QoS in support of applications for Real-Time IO via Ethernet (by Susan Hicks)

– Network Analysis and Filtering Techniques: what to look for among millions of data packets (by Jimie Duncan)
