172254

7/29/2019 172254


Lucent Technologies - Proprietary
This document contains proprietary information of Lucent Technologies and is not to be disclosed or used except in accordance with applicable agreements.

Copyright © 2000 Lucent Technologies
Unpublished and Not for Publication

All rights Reserved

Issue 16.0
December 2000

401-661-045

Flexent™/AUTOPLEX®

Wireless Networks
Executive Cellular Processor (ECP)

Release 16.0
Common Network Interface (CNI)
Ring Maintenance


Copyright © 2000 Lucent Technologies

All Rights Reserved

This material is protected by the copyright laws of the United States and other countries. It may not be reproduced, distributed, or altered in any fashion by any entity including other Lucent Technologies business units or divisions without the expressed written consent of the Customer Training and Information Products Department.

Notice
Every effort was made to ensure that the information in this document was complete and accurate at the time of printing. However, information is subject to change.

Federal Communications Commission Statement (FCC) Notification and Repair Information
NOTE: This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to Part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio frequency energy, and if not installed and used in accordance with the instruction manual, may cause harmful interference to radio communications.

Operation of this equipment in a residential area is likely to cause harmful interference, in which case the user will be required to correct the interference at his/her own expense.

Security Statement
In rare instances, unauthorized individuals make connections to the telecommunications network through the use of remote access features.

In such event, applicable tariffs require that the customer pay all network charges for traffic. Lucent Technologies cannot be responsible for such charges and will not make any allowance or give any credit for charges that result from unauthorized access.

Trademarks
5ESS is a registered trademark of Lucent Technologies.
AUTOPLEX is a registered trademark of Lucent Technologies.
AutoPACE is a registered trademark of Lucent Technologies.
BILLDATS is a registered trademark of Lucent Technologies.
DEFINITY is a registered trademark of Lucent Technologies.
DOS Windows is a trademark of Sun Microsystems, Inc.
Informix is a registered trademark of Informix Software, Inc.
Intel is a registered trademark of the Intel Corporation.
Motorola is a registered trademark of the Motorola Corporation.
Paradyne is a trademark of Paradyne Corporation.
Sun is a trademark of Sun Microsystems, Inc.
Solaris is a trademark of Sun Microsystems, Inc.
SPARC is a trademark of Sun Microsystems, Inc.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Ltd.

Other trademarks may appear in this document as well. They are marked on first usage.



Contents

About This Document
  - Purpose
  - Reasons for Reissue
  - Intended Audience
  - How to Use This Document
  - Conventions Used
  - Product Safety Labels
  - How to Order Documentation
  - How to Comment on This Document

1 Overview of the CNI Ring
  - DSN/CSN/ICN Hardware Descriptions
  - CDN Hardware Description
      CDN
      CDN-I
      CDN-II
      CDN-IIx
      CDN-III
  - RPCN Hardware Description
  - Direct Link Node Hardware Description
  - SS7 Node Hardware Description
  - EIN Ethernet Interface Node
  - CNI Integrity Process Descriptions
  - Error Analysis and Recovery Process
  - Automatic Ring Recovery Process
  - Node Audit Capability
  - Ring Audit Capability
  - RPCN Token Audit
  - CNI Safety Net Capability
      Inhibiting CNI Safety Net
      Allowing CNI Safety Net Feature
  - General Maintenance
      Daily Activity Recommendation
      Faulty Node Recovery Strategy
      Routine Diagnostics
  - Fault Descriptions


      RAC Parity/Format Error
      Unexplained Loss of Token
      SRC Match
      RAC Output Parity Error
      General RAC Error Detected
      Node Audit Failure
      Interframe Buffer Parity Error
      Read Format Error
      Write Format Error
  - Emergency Maintenance
      Ring Down Recovery
      Rolling CNI Initializations
      Global CDN Recovery
      Single CDN Recovery

2 Description of the Ring Subsystem
  - General
  - Operation of the Ring
  - Ring Nodes
      Ring Peripheral Controller Nodes
      Basic IMS User Nodes
      Direct Link Nodes (DLN)
      Call Processor/Data Base Nodes (CDN)
      Interframe Buffers
  - Node Names and Addresses
  - Ring Message Format
  - Reconfigurations
      Node Quarantine
      Node Isolation
      The Ring Config Module
  - Initializations
      Level-3 IMS Initializations (FPI and Boot)
      Level-4 IMS Initializations (FPI and Boot)
  - Audits
      Central Node Control Audit (AUD CNC)
      Node State Audit (AUD NODEST)
      Node Audit


3 Ring Maintenance
  - Overview
  - Automatic Ring Maintenance
      EAR or Ring Recovery
      ARR or Deferrable Node Recovery
  - Manual Ring Maintenance
      Ring Maintenance Interfaces
      Ring Diagnostics
      Guide to Critical Ring Maintenance
  - Examples of Ring Maintenance
      Responses to Single, Ring-Related Faults
      Responses to Multiple, Ring-Related Faults

4 Ring and Ring Node Maintenance Procedures
  - Introduction
  - Ring Fault Conditions and Maintenance Approach
      Ring Node Out-of-Service
      Single-Ring Node Isolation
      Multiple-Ring Node Isolation
      Ring Down
  - Ring Generic Access Package (RGRASP)
      Feature Definition
      Feature Description
      Software Impact
      Software Description
      User Profile
      Description of Feature Operation
      Equipment Configuration Data (ECD)
      Recent Change Procedures
      Measurement
      Network Management Impact
      Maintenance/Troubleshooting Impact
      Recording
      Output Messages
      Audits


      Critical Events
      Support Tools
      Related Documentation Cross-References

5 Ring Critical Events
  - Introduction
  - Critical Event Message Output
      Logging Critical Events
      Short Form CNCE Message
      Long Form CNCE Message
      Using the CHG:CEPARM Command
  - CNCE Descriptions

6 Diagnostic User's Guide
  - Introduction
  - Overview
      Diagnostics
      Hardware and Interfaces
      System Maintenance Interfaces
  - Performing Diagnostics
      Diagnostic Message Structure
      System Diagnostics
      Denied Diagnostic Requests
      Inhibiting Diagnostic Requests
      Diagnostic Aborts and Audits
  - Operating System Diagnostics

7 Equipment Handling Procedures
  - Introduction
  - Equipment Description and Handling Precautions
      Power Packs and Fusing Descriptions
      Fan and Filter Maintenance
  - Ring Node Circuit Pack Handling Precautions



      Ring Node Equipment Visual Indicators
      Removing Affected Equipment From Service
      UN122C and UN123B Combination Circuit Pack Installation
      Voice Frequency Link Hardware Equipment Replacement Procedures

A Ring Error Analysis and Recovery
  - Introduction
  - Data Structures
  - General Information
  - Blockage Error
  - Hard Ring Parity Errors
  - Orphan Byte Error
  - Soft Ring Parity Error
  - Interframe Buffer Parity Error
  - RAC Output Parity Error
  - Write Format Error
  - Read Format Error
  - Received Too Short Error
  - Read Inhibit Error
  - Excessive Ring Command Interrupts
  - Token Removed from Ring
  - Source Match Error
  - Miscellaneous RAC Problem
  - Unexpected Loss of Token
  - Checksum Audit Failure
  - Node Processor Parity Failure

B Ring Maintenance Reference Material
  - Ring Transport Errors
      Ring-Related Errors
      Node-Related Errors
      Errors Without Consequences
      Unexplained Loss of Token
  - Some IMS Input Messages


  - Setting the ECD Flag for Manual Ring Mode
  - ECD Values for Interframe Buffers



Figures

1 Overview of the CNI Ring
  1-1. RAC Parity/Format Error
  1-2. Unexplained Loss of Token
  1-3. SRC Match
  1-4. RAC Output Parity Error
  1-5. General RAC Error
  1-6. NAUD Failure
  1-7. Interframe Buffer Error
  1-8. Ring Down

  2-1. Conceptual Illustration of an IMS Ring
  2-2. A Ring Access Circuit on the IMS Ring
  2-3. Interframe Buffers
  2-4. IMS Message Format
  2-5. Illustration of an Isolated Ring
  2-6. Before (top) and After (bottom) Becoming a BISO or EISO Node

  3-1. A 1105 Display Page
  3-2. An 1106 Display Page
  3-3. Isolated RACs of BISO and EISO Nodes
  3-4. Manual Recovery - Method One
  3-5. Manual Recovery - Method Two

4 Ring and Ring Node Maintenance Procedures
  4-1. Ring OOS Normal
  4-2. Single Node Isolation



  4-3. New BISO Established
  4-4. Diagnosing EISO Node
  4-5. Two or More Faulty Nodes
  4-6. New BISO Node
  4-7. More Than One Faulty Node

  5-1. CNCE Messages

  6-1. General Format for Input/Output Messages



Tables

1 Overview of the CNI Ring

  3-1. Node Problems Mapped to Maintenance States and EAR Actions
  3-2. ARR Responses to Maintenance-States
  3-3. Output Messages that Report ARR Actions
  3-4. Alarms Associated with IMS Output Messages
  3-5. 1105-Page Symbols of Node Major States
  3-6. Circuit Pack LED States

4 Ring and Ring Node Maintenance Procedures

  5-1. CNCE Descriptions

  6-1. Discontinued Availability CP Listings
  6-2. DGN Message Input Variations
  6-3. OP:RING Input Message Variations
  6-4. IRN and IRN2 RPCN Node Diagnostic Phases
  6-5. IRN LN (LIN - E/SS7) Node Diagnostic Phases



  6-6. IRN LN (LI4S/SS7) Node Diagnostic Phases
  6-7. IRN DLNE Node Diagnostic Phases
  6-8. IRN2 DLN30 Node Diagnostic Phases
  6-9. IRN2 DLN60 Node Diagnostic Phases
  6-10. IRN CDN-I Diagnostic Phases
  6-11. IRN2 CDN-II/CDN-IIx Diagnostic Phases
  6-12. IRN2 CDN-III Diagnostic Phases
  6-13. IRN2 EIN Node Diagnostic Phases
  6-14. IRN MDL (CSN, DSN, ICN) Diagnostic Phases
  6-15. Discontinued Availability CP Listings
  6-16. IRN and IRN2 RPC Trouble Location CP List
  6-17. IRN LN (LIN-E/SS7) Trouble Location CP List
  6-18. IRN LN (LI4S/SS7) Trouble Location CP List
  6-19. IRN DLNE Trouble Location CP List
  6-20. IRN2 DLN30 Trouble Location CP List
  6-21. IRN2 DLN60 Trouble Location CP List
  6-22. IRN CDN-I Manual Trouble Location CP List
  6-23. IRN2 CDN-II/CDN-IIx Manual Trouble Location CP List
  6-24. IRN2 CDN-III Trouble Location CP List
  6-25. IRN2 EIN Node Trouble Location CP List
  6-26. IRN MDL (CSN, DSN, ICN) Trouble Location CP List
  6-27. Physical Node ID (Decimal Representation)
  6-28. Physical Node ID (Hexadecimal Representation)
  6-29. Physical Node Addresses (Decimal Representation)
  6-30. Physical Node Addresses (Hexadecimal Representation)

  7-1. Power Unit Index
  7-2. Ring Node Power Supply Index
  7-3. Hardware Version Values (with IFB)
  7-4. Hardware Version Values (No IFB)



A Ring Error Analysis and Recovery

B Ring Maintenance Reference Material
  B-1. Some Versions of the RST Input Message



About This Document

This chapter gives an overview of the contents, intended audience, and use of the Flexent™/AUTOPLEX® Wireless Network Systems Common Network Interface (CNI) Ring Maintenance manual.

Purpose

This guide gives you the instructions to maintain and troubleshoot the CNI Ring as used in a Flexent™/AUTOPLEX® wireless network.

NOTE: This document is not intended for use with the 5ESS® Digital Cellular Switch (DCS) component of a Flexent™/AUTOPLEX® wireless network. The 5ESS® DCS documentation should be used for ring maintenance.

Reasons for Reissue

Issue 16 is reissued for the following reasons:

- To correct erroneous information
- To revise any technical errors
- To make quality improvements


Intended Audience

The audience for this guide includes users who maintain the CNI ring. This may be the Lucent Technologies support personnel (CTSO) or the cellular provider's technicians.

How to Use This Document

This guide is organized as follows:

- Chapter 1: Overview of the CNI Ring
  Describes the components of a CNI ring.

- Chapter 2: Description of the Ring Subsystem
  Describes the ring subsystem.

- Chapter 3: Ring Maintenance
  Explains the maintenance philosophy behind the CNI ring.

- Chapter 4: Ring and Ring Node Maintenance Procedures
  Explains how to run the maintenance procedures for both the ring and the ring nodes.

- Chapter 5: Ring Critical Events
  Explains events that indicate abnormal behavior in the ring.

- Chapter 6: Diagnostic User's Guide
  Explains how to perform diagnostics on ring nodes for a CNI ring-based office.

- Chapter 7: Equipment Handling Procedures
  Describes how to handle equipment when replacing hardware on the CNI ring.

- Appendix A: Ring Error Analysis and Recovery
  Describes the ring error analysis and recovery procedures and mechanisms.

- Appendix B: Ring Maintenance Reference Material
  Contains reference material for maintaining the CNI ring.

- Glossary and Acronyms

- Index


Conventions Used

Specific typography is used in this guide to show actions or results.

- Commands you enter on the keyboard are shown in bold.
- Data screens or responses from the system are shown in constant width.
- Options for commands are shown in italics.
- Keys that must be pressed on your keyboard are shown in capitals, for example ENTER.

Product Safety Labels

Admonishments are strategically-placed reminders that assure safety of personnel, minimize service interruptions or loss of data, and minimize damage to equipment, products, or software. The types of admonishments used in this guide are listed below.

! DANGER:
Indicates the presence of a hazard that will cause death or severe personal injury if the hazard is not avoided.

! WARNING:
Indicates the presence of a hazard that can cause death or severe personal injury if the hazard is not avoided.

! CAUTION:
Indicates the presence of a hazard that will or can cause minor personal injury or property damage if the hazard is not avoided.

NOTE:
Notifies you that something needs special attention or consideration.


How to Order Documentation

The FLEXENT™/AUTOPLEX® Wireless Network Systems Customer Documentation Catalog (401-610-000) is a guide to all FLEXENT™/AUTOPLEX® Wireless Network Systems customer documents and includes document descriptions and ordering information.

To order FLEXENT™/AUTOPLEX® Wireless Network Systems documents, including documents on CD-ROM, and all other Lucent Technologies product documentation by phone, please use the following numbers:

Within the United States:

Voice: 1-888-LUCENT8 or 1-888-582-3688, prompt 1
FAX: 1-800-566-9568


Locations outside of the United States:

Australia and all European countries: (317) 322-6416
Asia Pacific and China: (317) 322-6411
North America (excluding U.S.) and all other countries: (317) 322-6646
FAX for all international customers: (317) 322-6699

Product documentation can be ordered by mail using this address:

Lucent Technologies Customer Information Center
Attention: Order Entry Section
2855 N. Franklin Road
P.O. Box 19901
Indianapolis, Indiana 46219
U.S.A.

To order documentation electronically, visit the Lucent Technologies Customer Information Center web site at:

http://www.cic.lucent.com

How to Comment on This Document

Lucent Technologies has endeavored to ensure that this document meets your needs. We are interested in your suggestions for improving the document. At the back of this document is a postage-paid comment card. Please complete the comment card and mail it to us at the preprinted address. If your copy of the document has no comment card, please specify the title of the document and mail your comments to

Lucent Technologies
1000 E. Warrenville Road
P.O. Box 3013
Naperville, Illinois 60566-7013
U.S.A.
Attn: Customer Training and Information Products Manager, Room 2V-120

or e-mail your comments to

[email protected]



1 Overview of the CNI Ring

Contents

  - DSN/CSN/ICN Hardware Descriptions
  - CDN Hardware Description
      CDN
      CDN-I
          Double Plate CDN-I
          Single Plate CDN-I
      CDN-II
      CDN-IIx
      CDN-III
  - RPCN Hardware Description
  - Direct Link Node Hardware Description
  - SS7 Node Hardware Description
  - CNI Integrity Process Descriptions
  - Error Analysis and Recovery Process
  - Automatic Ring Recovery Process
  - Node Audit Capability
  - Ring Audit Capability
  - RPCN Token Audit
  - CNI Safety Net Capability
      Inhibiting CNI Safety Net
      Allowing CNI Safety Net Feature
  - General Maintenance
      Daily Activity Recommendation
      Faulty Node Recovery Strategy


      Routine Diagnostics
  - Fault Descriptions
      RAC Parity/Format Error
          Cause
          Effect
          Craft Recovery Action
      Unexplained Loss of Token
          Effect
      SRC Match
          Cause
          Effect
      RAC Output Parity Error
          Cause
          Effect
      General RAC Error Detected
          Cause
          Effect
      Node Audit Failure
          Cause
          Effect
          Craft Recovery Action
      Interframe Buffer Parity Error
          Cause
          Effect
      Read Format Error
          Cause
          Effect
      Write Format Error
          Cause
          Effect
          Craft Recovery Action
  - Emergency Maintenance



      Ring Down Recovery
      Rolling CNI Initializations
      Global CDN Recovery
      Single CDN Recovery



The Common Network Interface (CNI) ring serves as the medium that connects the various cellular processors together. The following sections describe the basic hardware configuration of each type of processor.

DSN/CSN/ICN Hardware Descriptions

A Digital Switch Node (DSN) is the CNI node that is used to connect the Digital Cellular Switch (DCS) to the rest of the system via data links to the DSN.

A Cell Site Node (CSN) is the CNI node that is used to connect the cell sites to the rest of the system via data links to the CSN.

An Inter-Cellular Node (ICN) is the CNI node that is used to connect cellular systems together via data links to the ICN.

The basic difference between each of these three node types is the software that resides in each node. The hardware configuration for these nodes is identical.

In the Flexent/AUTOPLEX environment, each of these nodes is equipped with an Integrated Ring Node (IRN) circuit pack. This IRN board comes in several different microcode versions:

MC3F014A1 UN303
MC3F018A1 UN303B
MC3F026A1 UN303B
MC3F026A1B UN303C
MC3F026A1C UN304

All of these versions can be used in a CSN, DSN or ICN. The IRN board can be found in the Node Processor (NP) slot of each node.

A new circuit pack, the UN304/UN304B, has replaced the UN303 in many applications. When the UN304 is used, the node is called an IRN2. When the UN304B is used, the node is called the IRN2B. Unless specifically stated, the term IRN can apply to any of these circuit packs. When an IRN2B is used in a CSN, it is known as a CSN Enhanced (CSNE). Unless specified otherwise, all references to CSN can include the CSNE.

The memory data link (MDL) circuit pack handles the transfer of information between the data links and the node processor. A CSN can be equipped with two MDL boards (MDL0 and MDL1), with each MDL capable of handling four data links. DSNs and ICNs should be equipped with only one MDL board.

There are two types of MDL circuit packs: a TN1317 version and a TN1640 version. Either type can be used in a CSN, DSN or ICN. The TN1640 version provides additional message throughput and should be used in CSNs containing heavily loaded cell sites. See the System Capacity Monitoring and Engineering Guidelines, 401-610-009, for recommendations on how to assign CSN, DSN or ICN data links.

The data links coming into each of these node types connect to an 11A, 12A, 13A, or 13B adaptor board. The 11A adaptor board is used for RS232 connections, the 12A adaptor board is used for RS449 connections, and the 13A and 13B adaptor boards are used for V.35 connections. These adaptor boards are attached to the backplane of the CSN/DSN/ICN on the vertical slot location occupied by the MDL boards. Each adaptor board holds up to four data links and there is one adaptor board for each equipped MDL board.
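
The adaptor board and MDL figures above can be summarized as a small lookup. This is only an illustrative sketch: the table and function names are hypothetical and are not part of any CNI software; the values are taken from the text (four data links per MDL board, up to two MDL boards in a CSN, one in a DSN or ICN).

```python
# Hypothetical lookup tables built from the figures quoted in the text.
ADAPTOR_INTERFACE = {
    "11A": "RS232",
    "12A": "RS449",
    "13A": "V.35",
    "13B": "V.35",
}

LINKS_PER_MDL = 4                                  # each MDL handles four data links
MAX_MDL_BOARDS = {"CSN": 2, "DSN": 1, "ICN": 1}    # MDL0/MDL1 in a CSN, one board otherwise


def max_data_links(node_type):
    """Largest number of data links a node type can terminate."""
    return MAX_MDL_BOARDS[node_type] * LINKS_PER_MDL


def adaptors_for(interface):
    """Adaptor board codes that serve a given electrical interface."""
    return sorted(b for b, i in ADAPTOR_INTERFACE.items() if i == interface)


if __name__ == "__main__":
    print(max_data_links("CSN"))
    print(adaptors_for("V.35"))
```

Because there is one adaptor board per equipped MDL board, the number of adaptor boards in a node equals the number of MDL boards in this model.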

CDN Hardware Description

A Call Processor/Data Base Node (CDN) is the CNI node which handles the call processing functions of the FLEXENT™/AUTOPLEX® Wireless Network Systems. A CDN is basically a two-part unit consisting of a node and Ring Application Processor (RAP) unit. The following versions of CDNs may be found in existing systems:

- CDN
- CDN-I [sometimes referred to as a Standard Multi-Application Real Time (SMART) Node (SN)]
- CDN-II [sometimes referred to as a Turbo CDN (TCDN)]
- CDN-IIx
- CDN-III.

Unless specified otherwise, references to CDN in this document apply to any of these versions.

CDN

The original CDN used a double-plate RAP with 2-Mbyte memory boards. A double plate CDN occupies two horizontal mounting plate locations in a CNI frame.

The CCC and CCS pair can be either a UN237 and UN236 pair or a UN625 and UN626 pair. They must be a matched pair. That is, a UN2XX series CCC/CCS board is not compatible with a UN6XX series CCC/CCS board.

The MASC board can be either a UN95 board or a UN295 board. There can be up to four MASC boards in the FLEXENT/AUTOPLEX environment (MASC0 - MASC3).

The MASA boards are always TN56 boards. Each TN56 board provides 2 Mbytes of memory, and there can be up to eight MASA boards per MASC memory group.

The NPI board is always a TN1349 board.

CDN-I

In the FLEXENT/AUTOPLEX environment, the node is always equipped with an IRN circuit pack. Only two of the three possible microcode versions are approved for use in a CDN-I. The approved versions are:

MC3F018A1 UN303B
MC3F026A1 UN303B

The RAP portion of a CDN-I is a 3B15-based computer. The basic functional components that make up this unit are a central controller cache (CCC) board, a central controller support (CCS) board, a main store controller (MASC) board, the main store array (MASA) memory boards, and a node processor interface (NPI) board.

A CDN-I comes in two different versions commonly referred to as double plate or single plate CDN-I.


Double Plate CDN-I

A double plate CDN-I occupies two horizontal mounting plate locations in a CNI frame.

The CCC and CCS pair can be either a UN237 and UN236 pair or a UN625 and UN626 pair. They must be a matched pair. That is, a UN2XX series CCC/CCS board is not compatible with a UN6XX series CCC/CCS board.

The MASC board can be either a UN95 board or a UN295 board. There can be up to four MASC boards in the FLEXENT/AUTOPLEX environment (MASC0 - MASC3).

The MASA boards are always TN56 boards. Each TN56 board provides 2 Mbytes of memory, and there can be up to eight MASA boards per MASC memory group.

The NPI board is always a TN1349 board.

Single Plate CDN-I

A single plate CDN-I only occupies one horizontal mounting plate location in a CNI frame. This space reduction is due to the replacement of the 2-Mbyte TN56 MASA boards with TN1398 MASA boards. The TN1398 boards provide 16 Mbytes of memory per board, and there can be up to eight MASA boards in the unit.

The CCC and CCS pair must be a UN625 and UN626 pair.

The MASC board must be a UN507 board.

The same NPI board (TN1349) is used in the single plate CDN-I as in the double plate CDN-I.

CDN-II

The CDN-II is a Turbo CDN node type. The CDN-II is composed of an IRN2, an 80386-based NP, and an AP30’ (prime) attached processor (AP). The AP30’ is a 68030-based processor board with 80 Mbytes of local memory (16 Mbytes on the base board and an additional 64 Mbytes of zig-zag in-line package (ZIP) memory on a mezzanine board).






CDN-IIx

The CDN-IIx is a modified Turbo CDN node type. The CDN-IIx is composed of an IRN2, an 80386-based NP, and a modified AP30’ attached processor. The modified AP30’ is a 68030-based processor board with 16 Mbytes of local memory on the base board and from 64 to 256 Mbytes on a mezzanine board. The additional memory comes from two to eight 32-Mbyte serial in-line memory modules (SIMM).

Unless otherwise specified, any reference to CDN-II applies to both the CDN-II and CDN-IIx.
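
The memory figures quoted for the different CDN versions follow from simple arithmetic, sketched below. The constant and function names are hypothetical, and the totals are derived here from the per-board figures in the text rather than quoted from it.

```python
# Per-board sizes quoted in the text, in Mbytes.
TN56_MBYTES = 2          # MASA board in the CDN and double plate CDN-I
TN1398_MBYTES = 16       # MASA board in the single plate CDN-I
MAX_MASA_BOARDS = 8      # per MASC memory group (double plate) or per unit (single plate)
AP30_BASE_MBYTES = 16    # base board memory of the AP30' and modified AP30'
SIMM_MBYTES = 32         # each SIMM on the CDN-IIx mezzanine board


def masa_capacity(board_mbytes, boards=MAX_MASA_BOARDS):
    """Memory provided by a set of identical MASA boards."""
    if not 1 <= boards <= MAX_MASA_BOARDS:
        raise ValueError("expected 1 to 8 MASA boards")
    return board_mbytes * boards


def cdn_ii_memory():
    """CDN-II: 16 Mbytes on the base board plus 64 Mbytes of ZIP memory."""
    return AP30_BASE_MBYTES + 64


def cdn_iix_memory(simms):
    """CDN-IIx: base board memory plus two to eight 32-Mbyte SIMMs."""
    if not 2 <= simms <= 8:
        raise ValueError("expected 2 to 8 SIMMs")
    return AP30_BASE_MBYTES + SIMM_MBYTES * simms


if __name__ == "__main__":
    print(masa_capacity(TN56_MBYTES))      # one fully populated TN56 group
    print(masa_capacity(TN1398_MBYTES))    # fully populated single plate CDN-I
    print(cdn_ii_memory())
    print(cdn_iix_memory(2), cdn_iix_memory(8))
```

Under these figures a fully populated TN56 memory group provides 16 Mbytes, which a single TN1398 board matches; this is consistent with the space reduction described for the single plate CDN-I.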

CDN-III

The CDN-III is an improved CDN that may be used to upgrade CDN-II or CDN-IIx type nodes. The CDN-III consists of an IRN2 node core and AP60 attached processor (TN2523), providing greater processing and memory capacity than previous CDNs. The AP60 uses an MC68LC060 processor.

RPCN Hardware Description

The Ring Peripheral Controller Node (RPCN) is the unit which provides the interface between the ring and the ECP. In the FLEXENT/AUTOPLEX environment, the ring is always equipped with two RPCNs. The IRN board is located in the NP slot of the RPCN. The microcode versions approved for use in an RPCN are:

MC3F026A1 UN303B
MC3F026A1 UN304

! CAUTION:
Never use MC3F014A1 or MC3F018A1 microcode versions in an RPCN. Doing so could seriously hinder the ring's ability to perform automatic fault recovery tasks.

The RPCN can also be equipped with an IRN2 or IRN2B board, the UN304 or UN304B. This board is also located in the NP slot of the RPCN.

The RPCN has a duplex dual serial bus selector (DDSBS) which basically terminates the ECP's connection to the ring. This board is a TN69B and has a connection from the RPCN to each Control Unit (CU) of the ECP (CU0, CU1).


The RPCN also contains a 3B Interface (3BI) board which serves as the interface between the DDSBS and the NP of the RPCN. This board is a TN914.

Direct Link Node Hardware Description

A Direct Link Node (DLN) is basically an RPCN equipped with an attached processor (AP), with respect to its hardware configuration, but has a different task to perform in the FLEXENT/AUTOPLEX environment. The function performed by a DLN is to route the data link message traffic between cellular systems.

The DLN is used to route messages into and out of the FLEXENT/AUTOPLEX systems for both X.25 and SS7 types of intersystem networking. FLEXENT/AUTOPLEX currently supports three types of DLNs: the DLNE, the DLN30, and the DLN60.

- The DLNE has IRNB, AP30, 3BI, and DDSBS boards.
- The DLN30 replaces the IRNB board with an IRN2B to provide increased performance and higher reliability.
- The DLN60 provides more processing power and memory than previous types of DLNs. The DLN60 uses an IRN2 node core with an AP60 attached processor. The DLN60 does not have a 3B21D computer interface.

SS7 Node Hardware Description

The SS7 nodes are used to interface with the Signal Transfer Points (STP). In the FLEXENT/AUTOPLEX environment, SS7 nodes are always equipped with an IRN circuit pack. All three IRN microcode versions are approved for use in an SS7 node.

An SS7 node is also equipped with a Link Interface board. This board handles one data link from the FLEXENT/AUTOPLEX system to the STP. The LI board can be either a TN916 (MC3F003A1) or a TN1316.
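
The microcode approvals stated for each node type in this chapter can be collected into one check, sketched below. The table and function names are hypothetical. The sketch assumes that "all three IRN microcode versions" means MC3F014A1, MC3F018A1 and MC3F026A1, and it does not model the suffixed variants (MC3F026A1B, MC3F026A1C) or the circuit pack codes.

```python
# Assumption: the three base IRN microcode versions named in this chapter.
ALL_THREE = frozenset({"MC3F014A1", "MC3F018A1", "MC3F026A1"})

# Hypothetical approval table assembled from the per-node statements in the text.
APPROVED_MICROCODE = {
    "CSN": ALL_THREE,
    "DSN": ALL_THREE,
    "ICN": ALL_THREE,
    "SS7": ALL_THREE,
    "CDN-I": frozenset({"MC3F018A1", "MC3F026A1"}),
    "RPCN": frozenset({"MC3F026A1"}),
}


def is_approved(node_type, microcode):
    """Return True if the microcode version is approved for the node type."""
    if node_type not in APPROVED_MICROCODE:
        raise ValueError(f"unknown node type: {node_type}")
    return microcode in APPROVED_MICROCODE[node_type]


if __name__ == "__main__":
    for code in sorted(ALL_THREE):
        print(code, "RPCN approved:", is_approved("RPCN", code))
```

A check of this kind reflects the CAUTION in the RPCN section: only MC3F026A1 passes for an RPCN in this model.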

EIN Ethernet Interface Node

The Ethernet Interface Node (EIN) is an Interprocess Message Switch (IMS) user node on the Common Network Interface (CNI) ring. The EIN provides access through the Ethernet from the ring to the Application Processor (AP). CNI provides the capability to transport data from the EIN to the AP and vice versa over the Ethernet.

The EIN hardware consists of the following:

- Integrated Ring Node (IRN) 2 (IRN2) circuit pack (CP), UN304B (MC3F024AIB)
- EIN Link Interface (ELI) CP, TN4016
- Paddleboard, 9822EB
- Cable ED3F064-37 G80.

CNI Integrity Process Descriptions

This section describes the various software processes responsible for monitoring the CNI ring to verify that it is functioning properly.

Error Analysis and Recovery Process

CNI provides an Error Analysis and Recovery process (EAR) which is responsible for analyzing error reports from the ring and determining the probable cause of the fault. Once the cause of the fault is determined, automatic corrective action is taken. This corrective action could be as simple as restoring the ring to its original configuration (no recovery action was necessary) or could result in nodes being removed from service and left in the isolated state.

Automatic Ring Recovery Process

CNI provides an Automatic Ring Recovery (ARR) process which is responsible for automatically restoring nodes which have been removed from service by the EAR process. CNI also provides an Application Specified Unconditional Restore (ASUR) process that allows the application to specify the manner in which ARR is to restore an out-of-service node (conditional or unconditional restore).

In the FLEXENT/AUTOPLEX environment, a node that is removed from service will be unconditionally restored (no diagnostics performed) if this is the first time the node has been removed in the last hour. The only exception to this rule is in the event that EAR suspects the ring interface circuitry of the IRN board may be faulty. In this case, the node will be left in the isolated state until diagnostics are performed and the node passes phase 1 and phase 2. This is necessary to ensure the stability of the ring. Restoring a node unconditionally that is in the ring interface faulty state could result in faults being generated which seriously threaten the performance of the CNI ring.


If this is the second time a node has been removed from service by EAR in the past hour, ARR diagnoses the node and restores it only if it passes all diagnostic phases.

If this is the third time a node has been removed from service by EAR in the past hour, the node is left in the out-of-service state. The node remains in this state until craft takes the appropriate recovery action to restore it to service.
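The restore rules above can be summarized as a small decision function. This is only an illustrative sketch in Python, not CNI code: the names `RestoreAction` and `arr_action` are hypothetical, and treating a fourth or later removal the same as the third is an assumption the text does not state.

```python
from enum import Enum


class RestoreAction(Enum):
    UNCONDITIONAL = "restore without diagnostics"
    DIAGNOSE_FIRST = "diagnose, restore only if diagnostics pass"
    LEAVE_OUT_OF_SERVICE = "leave out of service until craft acts"


def arr_action(removals_in_last_hour: int,
               ring_interface_suspect: bool = False) -> RestoreAction:
    """Pick the ARR behavior described in the text (hypothetical helper)."""
    if removals_in_last_hour < 1:
        raise ValueError("node has not been removed from service")
    # Third removal in the hour: node stays out of service.
    # (Assumption: later removals are handled the same way.)
    if removals_in_last_hour >= 3:
        return RestoreAction.LEAVE_OUT_OF_SERVICE
    # Second removal, or a suspected faulty ring interface on the first
    # removal: diagnostics must pass before the node is restored.
    if removals_in_last_hour == 2 or ring_interface_suspect:
        return RestoreAction.DIAGNOSE_FIRST
    return RestoreAction.UNCONDITIONAL
```

For example, `arr_action(1)` selects the unconditional restore, while `arr_action(1, ring_interface_suspect=True)` requires diagnostics first.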

Node Audit Capability

The Node Audit feature is a CNI process responsible for ensuring that nodes which are in the active state are functioning properly and are capable of communicating with the ring. The Node Audit does this by periodically sending a message from the ECP destined for a node, followed by a chaser message. This chaser message is not destined for any particular node. Its purpose is to circulate around the ring undisturbed and return to the node audit process.

When the link node receives this audit request, it should respond by sending a reply message back to the ECP. If the ECP receives the reply message, all is well. If the reply is lost, but the chaser message arrives at the ECP as expected, then another audit message is sent to the node. If this reply is also lost, the node is assumed to be in an insane state and will be removed from service. If the first reply message was lost and the chaser message did not arrive at the ECP as expected, this implies a possible RPCN or ring problem. This is discussed in the “Ring Audit Capability” section of this chapter.
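The audit outcomes just described can be sketched as a decision function. This is a hedged illustration in Python; `node_audit_verdict` and its return strings are hypothetical names, and treating a successful second reply as "ok" is an assumption, since the text only describes the case where the second reply is lost.

```python
def node_audit_verdict(first_reply: bool,
                       chaser_returned: bool,
                       second_reply: bool = False) -> str:
    """Classify one node audit cycle (illustrative only)."""
    if first_reply:
        return "ok"
    if not chaser_returned:
        # Reply and chaser both lost: possible RPCN or ring problem,
        # which is the Ring Audit's responsibility.
        return "ring_audit"
    # Reply lost but chaser returned: a second audit message is sent.
    # Assumption: a successful second reply clears the node.
    return "ok" if second_reply else "remove_node"
```

A lost first reply with a returned chaser and a lost second reply yields `"remove_node"`, matching the "insane state" case in the text.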

Ring Audit Capability

The Ring Audit feature is a CNI process based on the Node Audit process. The Ring Audit verifies the message communication path from the ECP to the ring. This task is performed by monitoring the results of the chaser message sent out by the Node Audit Capability.

If a chaser message is lost, another chaser will be sent through the other RPCN. If this test is successful, then the RPCN which was first tested is assumed to be faulty and is removed from service.

If the second chaser message is also lost, or the other RPCN is already out of service, a Level 3 EAR is invoked in an attempt to isolate and correct the possible ring/RPCN trouble.
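As a rough sketch of the two outcomes above, the following Python function decides what happens once a first chaser has been lost. The function name and return strings are hypothetical and only mirror the wording of this section.

```python
def ring_audit_verdict(other_rpcn_in_service: bool,
                       second_chaser_returned: bool) -> str:
    """Decide the action after the first chaser message was lost."""
    # The retry is only possible when the other RPCN is in service.
    if other_rpcn_in_service and second_chaser_returned:
        return "remove_first_rpcn"
    # Second chaser lost, or no other RPCN available to retry through.
    return "level_3_ear"
```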

RPCN Token Audit

The RPCN Token Audit Capability is a CNI process that ensures a token message is circulating around the ring at all times. Since a node must possess the token message in order to write to the ring, it is critical that this message be present.

The audit is performed by periodically forcing the RPCN to exercise its ring write circuitry, thus forcing it to read the token message. If a special timer fires within the RPCN before the token is detected, the token is assumed to be lost and the RPCN sends a lost token report to the EAR process in the ECP.

The EAR process then reports an unexplained loss of token. A token tracking audit is then run in an attempt to discover where the token was lost. The EAR process then initiates a Level 0 restart in an attempt to return the ring to service. If this restart is unsuccessful, EAR escalates to a Level 3 ring recovery.

CNI Safety Net Capability

The CNI Safety Net Capability is a FLEXENT/AUTOPLEX process whose sole purpose is to verify that the CNI ring is up and functional. When Safety Net detects a problem with the ring, it responds by requesting a CNI Level 3 initialization or a CNI Level 4 initialization, depending on the severity of the problem.

Safety Net checks the integrity of the ring every 60 seconds by sending a message from the ECP to a different node each time. If the message is returned to the ECP by the node, then all is well. If the message is not returned to the ECP, Safety Net increments a counter and begins repeating this process, cutting the interval from 60 seconds to 10. If the failed message counter reaches its maximum error threshold (eight at present time), a Level 3 CNI initialization will be requested to restore the communication path to the CNI ring.

The CNI Safety Net also ensures that the system has a minimum of one active CDN. If Safety Net detects that all CDNs are out of service, an SI24 Defensive Check Failure Assert message is printed on the ROP. This will repeat every minute for four additional minutes (five total messages). On the sixth SI24, a CNI Level 4 Initialization will be initiated. The Safety Net will then turn itself off for 90 minutes. It should be noted that if Safety Net detects all CDNs are out of service, it will first check to see if a CDN is in the process of being restored. If so, it will allow that CDN to come up rather than begin a CNI initialization.
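The ring-check timing described above (60-second interval, dropping to 10 seconds after a failure, Level 3 request at eight failures) can be illustrated with a small simulation. This is a sketch only: `simulate_safety_net` is a hypothetical name, and resetting the counter and interval after a successful check is an assumption, because the text does not say what happens when a message succeeds after earlier failures.

```python
from typing import Iterable, Tuple

NORMAL_INTERVAL_S = 60   # interval while checks succeed
FAST_INTERVAL_S = 10     # interval after a failed check
MAX_FAILURES = 8         # threshold quoted in the text


def simulate_safety_net(results: Iterable[bool]) -> Tuple[int, bool]:
    """Return (seconds elapsed, whether a Level 3 init was requested).

    Each item in `results` says whether the check message came back.
    """
    elapsed = 0
    failures = 0
    interval = NORMAL_INTERVAL_S
    for returned in results:
        elapsed += interval
        if returned:
            # Assumption: a good check clears the counter and slows down.
            failures = 0
            interval = NORMAL_INTERVAL_S
        else:
            failures += 1
            interval = FAST_INTERVAL_S
            if failures >= MAX_FAILURES:
                return elapsed, True
    return elapsed, False
```

Under these assumptions, eight consecutive failures are reached 130 seconds after the previous check: one 60-second interval followed by seven 10-second intervals.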


Inhibiting CNI Safety Net

At times, it may be necessary to inhibit (turn off) the CNI Safety Net feature. This need may arise due to a fault existing in the ring that prevents the system from being recovered via a CNI Level 4 initialization. Safety Net would continue to request CNI Level 4 initializations, getting in the way of craft attempts to clear the fault from the ring.

The Safety Net feature can be easily inhibited from the Emergency Action Interface (EAI) page on the MCRT. Once on this page:

• Enter a 42 poke command.

• Enter i (inhibit) for the parameter value.

• Next, a 50 initialization is required to set the flag in ECP memory.

Once Safety Net has been inhibited, it will remain in this state until a 54 initialization occurs or the inhibit flag is cleared from the EAI page (see following section). Whenever Safety Net is inhibited, it is critical that craft personnel remember to turn the feature back on once the source of the fault has been cleared. Failure to do so could result in an extended outage which Safety Net may have avoided.

Allowing CNI Safety Net Feature

The CNI Safety Net feature is always turned on at boot (54) time and remains this way unless inhibited from the EAI page. Once the feature is inhibited, it will remain in this state until craft resets the inhibit flag.

To turn the Safety Net feature back on, once again go to the EAI page and:

• Enter a 42 poke command.

• Enter a (allow) for the parameter value to allow the feature to function.

• Next, a 50 initialization is required to clear the inhibit flag in ECP memory.

General Maintenance

This section provides craft with information which could assist in identifying potentially faulty hardware before the problem is serious enough to cause a ring outage.

Also included in this section are descriptions of common CNI ring faults and the steps necessary to correct the situation.

Daily Activity Recommendation

The most important tool available to craft to prevent a serious ring event is the daily history of ring maintenance activity. This information is critical given the FLEXENT/AUTOPLEX strategy for recovering faulty nodes. Quite often, a faulty node will be removed from service and restored so quickly that craft is unaware the fault ever occurred. This recovery strategy will be briefly discussed in the next section.

The history of recent ring maintenance activity is kept in the RPTERR1 log file located in the /etc/log directory. This file should be inspected daily for the occurrence of ring faults. The UNIX command ls -l RPTERR1 will provide the date and time of the last entry to this log file. If this time stamp indicates recent ring activity, the log file should be examined to determine the nature of the activity.

When this log file reaches its maximum allowable size, it is moved to RPTERR0 and a fresh RPTERR1 log file is started.

Recent ring activity could be the result of routine RPCN midnight diagnostics or the result of a ring fault. If the activity is determined to be a ring fault, locate the ring fault in the “Fault Descriptions” section of this chapter for assistance in correcting the situation.
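The daily time-stamp check can also be scripted. The sketch below is in Python and assumes it runs somewhere with read access to the log file or a copy of it; the function name `log_recently_written` and the 24-hour default window are illustrative choices, not part of the documented procedure.

```python
import os
import time
from typing import Optional


def log_recently_written(path: str,
                         window_hours: float = 24.0,
                         now: Optional[float] = None) -> bool:
    """True if the file at `path` was modified within the window.

    This mirrors reading the time stamp from `ls -l RPTERR1`.
    """
    current = time.time() if now is None else now
    age_seconds = current - os.path.getmtime(path)
    return age_seconds <= window_hours * 3600
```

A call such as `log_recently_written("/etc/log/RPTERR1")` would then indicate whether the log deserves a closer look that day.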

Faulty Node Recovery Strategy

Usually when a node is automatically removed from service, it is due to a transient fault. This fault could be either a hardware glitch, or a software fault which causes the node to basically shut down operation. Many of these transient faults can be corrected by reinitializing the node. The only way for the node to request this is to refuse to accept messages from the ring. Once this happens, messages destined for the node will be returned to the sender. When the sending node receives this message, it reports this to the ECP and the ECP removes the node from service.

Once the node is removed, it is up to ARR to restore the node to service. As mentioned in the “Automatic Ring Recovery Process” section, the first time a node is removed from service within a 60-minute interval, it will be restored unconditionally (no diagnostics performed). This is due to the transient nature of most faults. If it was a one-time event, the node will probably be ATP if diagnostics are performed. Given this, it is more important to get the node back into service as quickly as possible than to take the additional time to diagnose the node on the first fault. If a second fault occurs within an hour, the node will be diagnosed.

However, at times a node may contain questionable hardware which may only result in the node being faulted a couple of times a day or even less frequently. It is this borderline hardware that makes it imperative for craft to understand the importance of monitoring the daily activity in the RPTERR1 log file mentioned earlier. If a persistent fault is detected, craft intervention may be necessary to isolate the source of the problem.


Routine Diagnostics

Given the ring’s ability to detect and report suspected faulty hardware, it is not recommended that diagnostics be performed on every node around the ring. However, it is recommended that RPCNs, CDNs and DLNs be taken down at least once a month (weekly if possible) and diagnosed. These nodes have been selected for preventive maintenance due to both their importance to system performance, and the extended amount of time it takes to diagnose and restore these nodes should a fault occur.

While CSNs, DSNs, ICNs and SS7 are certainly important to the system, their loss does not seriously threaten system performance. Also, in the event one of these nodes is lost, the recovery time is minimal if this is the first fault.

NOTE: On the subject of performing routine diagnostics, it should be noted that there is a critical difference between a single plate and double plate (TN1398 or TN56 memory boards) CDN-I unit. Requesting diagnostics on a double plate CDN-I will result in the entire CDN-I being diagnosed. The same cannot be said of a single plate CDN-I. For a single plate CDN-I, craft MUST specify that demand phases 54 through 61 be executed. These phases are responsible for diagnosing the 16-Mbyte memory boards (one phase for each MASA board equipped). These memory diagnostics are done on a demand basis only due to the time required to complete memory diagnostics on the TN1398 circuit packs.

Fault Descriptions

This section describes various CNI ring faults. The output message associated with the fault is presented, followed by the cause of the fault, the effect the fault has on the ring, and the recovery action to clear the fault. For a more detailed description of possible faults, see Appendix A, Ring Error Analysis and Recovery.

In the following descriptions, the terms upstream node and downstream node will be used. These terms describe relative position of nodes and are based on the direction of data flow on the rings. Basically, any particular node will RECEIVE data from its upstream neighbor and will SEND data to its downstream neighbor. Since the data flows in opposite directions on the two rings, a node’s upstream neighbor on ring 1 is the downstream neighbor on ring 0 and its upstream neighbor on ring 0 is the downstream neighbor on ring 1. For example, with respect to ring 0, LN00-7’s upstream neighbor is LN00-6 and its downstream neighbor is LN00-8.
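The upstream/downstream relationship can be expressed as a short function. This Python sketch is deliberately simplified: it works on member numbers within a single group and ignores group boundaries and RPCNs, which the real ring topology must account for. The name `neighbors` is hypothetical.

```python
from typing import Tuple


def neighbors(member: int, ring: int) -> Tuple[int, int]:
    """Return (upstream, downstream) member numbers on the given ring.

    Simplification: same group only, no wrap-around, no RPCN handling.
    """
    if ring not in (0, 1):
        raise ValueError("ring must be 0 or 1")
    # On ring 0 data flows toward higher member numbers; ring 1 is reversed.
    step = 1 if ring == 0 else -1
    return member - step, member + step
```

For LN00-7, `neighbors(7, 0)` gives members 6 (upstream) and 8 (downstream), matching the example in the text, while `neighbors(7, 1)` swaps the two.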

RAC Parity/Format Error

The output message present on the ROP and the RPTERR1 log file for this fault is as follows:

REPT RING TRANSPORT ERR

RAC PARITY/FORMAT ERROR DETECTED, LN00 7 RAC 0.

X’00000000 X’FFFFFFFF X’03000008 X’00000380

X’00004000 X’00000300 (3121083924)

Cause

The reporting node, LN00-7 in this example, is reporting that its upstream neighbor on RAC 0 (LN00 6) tried to pass a bad message to it. This message is used to report both bad parity and an orphan byte failure. The effect and recovery action is the same regardless of which error type it is, so it is not necessary to determine which fault type it is from a craft perspective.
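When scanning a log for reports like the one above, the reporting node and RAC can be pulled out of the text line mechanically. The Python sketch below is an illustration only: the regular expression is inferred from the single example line shown in this section, so the exact field layout is an assumption, and `parse_rac_report` is a hypothetical helper.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the example line; the real format may vary.
_REPORT_RE = re.compile(
    r"^(?P<error>.+?) DETECTED, "
    r"LN(?P<group>\d{2}) (?P<member>\d+) RAC (?P<rac>[01])\.$"
)


def parse_rac_report(line: str) -> Optional[Dict[str, object]]:
    """Extract error text, node group/member, and RAC from a report line."""
    match = _REPORT_RE.match(line.strip())
    if match is None:
        return None
    return {
        "error": match.group("error"),
        "group": int(match.group("group")),
        "member": int(match.group("member")),
        "rac": int(match.group("rac")),
    }
```

Applied to the example line, this yields the reporting node LN00 member 7 on RAC 0, from which the upstream neighbor (LN00 6) follows.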

Effect

The node which had the bad message presented to it will refuse to accept the message. This will force the node offering the bad message to report ring blockage to EAR. EAR will attempt to reestablish normal ring communication by performing a Level 0 ring recovery. If this fails to correct the error condition, EAR will escalate to a Level 1 ring recovery which could result in nodes being removed and isolated.

Craft Recovery Action

The RPTERR1 log file should be examined to determine if this is the first instance of the fault. If this is a recurring fault, the node reporting the fault and the upstream neighbor node should be taken down and diagnosed.

If diagnostics do not find a problem with either node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspect nodes using the recommended contact cleaner.

NOTE: Miller Stevenson Company markets an aerosol form of the solvent-lubricant which is recommended (1.0 percent OS-124 in Freon TA) for use on CNI ring backplanes and circuit packs. This product is marketed as MS-181.

If the fault persists, replace packs in the following order:


1. If there is a pair of interframe buffer boards (IFB) between the node reporting the fault and the upstream neighbor, replace the IFB associated with the node reporting the problem.

2. If the fault persists, and IFBs are involved, replace the IFB in the node upstream of the node reporting the fault.

3. If the fault persists, replace the IRN board in the node upstream of the node reporting the problem.

4. If the fault persists, replace the IRN board in the node reporting the problem.

5. If the fault persists, and there are IFBs involved, there could be a cable problem. Call for assistance to isolate the source of the fault.

See Figure 1-1.

Figure 1-1. RAC Parity/Format Error

[Flowchart not reproduced. Box text is listed by chart; the connecting arrows are not recoverable.]

Chart 1
  RAC parity format error
  Examine UNIX file /etc/log/RPTERR1
  1st occurrence? (Y/N)
  Transient fault. Monitor /etc/log/RPTERR1 log file for several weeks. If fault returns, go to 1st occurrence "no" leg
  Run diagnostics on the faulted node and both neighbors
  ATP? (Y/N)
  Replace packs & diagnose as per TLP list
  ATP? (Y/N)
  Done
  Go to Chart 1A

Chart 1A
  IFB boards between reporting node and upstream neighbor? (Y/N)
  Replace IRN board in upstream neighbor
  Cleared? (Y/N)
  Replace IRN board in node reporting problem
  Cleared? (Y/N)
  Done
  Call for assistance
  Go to Chart 1B

Chart 1B
  Replace IFB in node reporting the fault
  Cleared? (Y/N)
  Replace IFB in node upstream of reporting node
  Cleared? (Y/N)
  Replace IRN in node reporting the fault
  Cleared? (Y/N)
  Possible cable problem. Call for assistance in swapping cables between rings
  Fault move? (Y/N)
  Bad cable. Configure cables so that the faulty cable is in RAC 1. Obtain new cable ASAP!
  Call for assistance
  Done

Note 1: If RAC 0 is implicated in the output message, the upstream neighbor is the lower node number (LN32-4 is upstream of LN32-5). If RAC 1 is implicated, the upstream neighbor is the higher node number (LN32-6 is upstream of LN32-5).

Note 2: RPCN32 is upstream of the last node in group 00 (or group 31 if equipped) on RAC 1 and downstream on RAC 0. RPCN00 is upstream of the last node in group 32 (or group 63 if equipped) on RAC 1 and downstream on RAC 0.

Note 3: If RPCN and it has no IRN, then replace the R0 board if RAC 0 implicated or R1 if RAC 1 implicated.

Unexplained Loss of Token


The output message present on the ROP and the RPTERR1 log file for this fault is as follows:

UNEXPLAINED LOSS OF TOKEN REPORTED ON RING 0.

Cause

This message occurs when an RPCN detects that the token is no longer circulating around the ring.

Effect

EAR will initiate a token tracking procedure in an attempt to determine where the token was last seen. If the procedure is successful, the following message will result:

REPT TOKEN TRACK

TOKEN WAS LOST BETWEEN LN63 1 AND LN63 6 ON RING: 0

X’00000000 X’3F63F104 X’00300001 X’40040001

There are several other versions of the message that could result, depending on the outcome of the token tracking procedure. Refer to the FLEXENT/AUTOPLEX Output Message Manual for these other versions.

EAR will attempt to reestablish normal ring communication by performing a Level 0 ring recovery. If this fails to correct the error condition, EAR will escalate the ring recovery to a Level 1 which could result in nodes being removed and isolated.


Craft Recovery Action

The RPTERR1 log file should be examined to determine if this is the first instance of the fault. If this is a recurring fault, and the token tracking report was successful, remove and diagnose the two nodes mentioned in the report. If the token tracking report was not successful, call for assistance.

If diagnostics do not find a problem with either node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspect nodes using the recommended contact cleaner.

NOTE: Miller Stevenson Company markets an aerosol form of the solvent-lubricant which is recommended (1.0 percent OS-124 in Freon TA) for use on CNI ring backplanes and circuit packs. This product is marketed as MS-181.

If the fault persists, start replacing circuit packs in the following order:

1. If there is a pair of interframe buffer boards (IFB) between the two nodes identified in the token tracking report, replace the IFB in one of the nodes.

2. If the fault persists, and IFBs are involved, replace the IFB in the other node identified in the token tracking report.

3. If the fault persists, replace the IRN board in one of the two nodes identified in the token tracking report.

4. If the fault persists, replace the IRN board in the other node identified in the token tracking report.

5. If the fault persists, call for assistance.



Figure 1-2. Unexplained Loss of Token

[Flowchart not reproduced. Box text is listed by chart; the connecting arrows are not recoverable.]

Chart 2
  Unexplained loss of token
  Examine ROP & UNIX file /etc/log/RPTERR1 for other occurrences
  1st occurrence? (Y/N)
  Transient fault. Monitor /etc/log/RPTERR1 log file for several weeks to see if fault returns
  Examine ROP & UNIX file /etc/log/RPTERR1 for token tracking report
  Report successful? (Y/N)
  Call for assistance
  Diagnose both nodes
  ATP? (Y/N)
  Replace packs & diagnose as per TLP list
  ATP? (Y/N)
  Replace IRN board in one of the nodes. If RPCN and it is not an IRN, then replace the R0 board if ring 0 is implicated or R1 if ring 1 is implicated
  Cleared? (Y/N)
  Done
  Go to Chart 2A

Chart 2A
  Replace IRN board in other node
  Cleared? (Y/N)
  IFB boards between suspect nodes? (Y/N)
  Replace one node's IFB
  Cleared? (Y/N)
  Replace other node's IFB
  Cleared? (Y/N)
  Possible cable problem. Call for assistance in swapping cables between rings
  Fault move? (Y/N)
  Bad cable. Configure cables so that the faulty cable is in RAC 1. Obtain new cable ASAP!
  Call for assistance
  Done

SRC Match


The output message present on the ROP and the RPTERR1 log file for this fault is as follows:


RMV LN33 7 RQSTD; SRC MATCH RPTD BY LN31 6

X’6FB015F4 X’352070B8 (2834204595)

Cause

An SRC match failure results when a node does not take a message from the CNI ring that was addressed to it. This message will eventually return to the source node, which will remove the message from the ring and report an SRC match to the ECP against the destination node.

Effect

As stated above, the message will eventually return to the source node. The source node will remove the message from the ring and report the SRC match to the EAR. This will always result in the destination node being removed from service. ARR will then restore the node to service either conditionally or unconditionally, depending on the frequency of the faults against this node.


Craft Recovery Action

An occasional SRC match, in itself, is normally not cause for concern. CNI integrity software running in the nodes at times detects situations that require the node to be reinitialized to clear the fault. The only means available for a node to request that it be reinitialized is to stop taking its messages from the ring, commonly referred to as panicking the node. By refusing to read its messages from the ring, the node is assured of being removed from service via the SRC match mechanism and restored via ARR.

When SRC matches are detected, the RPTERR1 log file should be examined to determine the frequency of the fault. If the fault is persistent, then there could be a hardware problem and the node should be diagnosed. If the node is a single plate CDN-I, demand phases 54 through 61 must be performed to completely test the main store memory.


If diagnostics do not find a problem with the node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspect node using the recommended contact cleaner.

NOTE: Miller Stevenson Company markets an aerosol form of the solvent-lubricant which is recommended (1.0 percent OS-124 in Freon TA) for use on CNI ring backplanes and circuit packs. This product is marketed as MS-181.

If the fault persists, replace circuit packs in the following order:

1. If the faults are occurring immediately after the node is restored to service, check the ECD (rcvecd) and the application database (apxrcv, iun form) to verify they are in sync with respect to the node type.

2. If the fault persists, replace the IRN circuit pack.

3. If the fault persists, replace the MDL boards one at a time, or replace the LLI board if the node is an SS7 node.

4. If the node is a CDN, check the RPTERR1 log file for the existence of a CDN panic message in the form of:

REPT COM100 TBLLN00 07 NADR: X’C07

Panic : Hardware

Local Bus Parity Error:

CCS0(lba=0x0):

CSRs=0x61100028,0x0

MASC0(lba=0x100000):

CSRs=0x422054,0x4c00b500

CCS 61100028

MASC 00422054

NPI 00000000

5. If a message similar to this appears, it is not necessarily a local bus parity error. Go directly to page 3 of Figure 1-3 for CDN assistance.

6. If the fault persists, or the panic message is not present for a CDN, call for assistance in clearing the fault.



Figure 1-3. SRC Match

[Flowchart not reproduced. Box text is listed by chart; the connecting arrows are not recoverable.]

Chart 3
  SRC match
  Examine UNIX file /etc/log/RPTERR1
  1st occurrence? (Y/N)
  Transient fault. Monitor /etc/log/RPTERR1 log file for several weeks to see if the fault returns
  Determine fault frequency by examining ROP or RPTERR1 log file
  Run diagnostics on the faulted node
  ATP? (Y/N)
  Replace packs & diagnose as per TLP list
  ATP? (Y/N)
  Fault occurs immediately after restoral? (Y/N)
  Check ECD to verify node type
  Check APXRCV DB to verify it agrees with ECD entry
  Agree? (Y/N)
  Correct any discrepancies and restore node
  Cleared? (Y/N)
  Done
  Go to Chart 3A

Chart 3A
  Replace IRN board in faulty node
  Cleared? (Y/N)
  Replace MDL 0 board
  Cleared? (Y/N)
  Replace MDL 1 board if equipped
  Cleared? (Y/N)
  Replace adaptor boards on node backplane
  Cleared? (Y/N)
  Is node a CDN? (Y/N)
  Go to Chart 3B
  Call for assistance
  Done

Chart 3B
  Check RPTERR1 error log for a PANIC: HARDWARE message for this CDN
  Present? (Y/N)
  Local bus parity error
  Double bit error
  Cache error
  NPI USEC timer change
  Unidentified SYS error
  Replace the CCS board
  Replace the CCC board
  Replace the NPI board
  Cleared? (Y/N)
  Go to Chart 3C
  Call for assistance
  Done

Chart 3C
  Go to next two pages for instructions on converting address in the panic message to a MASA board location
  Valid board number
  Replace suspected MASA board
  Cleared? (Y/N)
  TN56 memory boards? (Y/N)
  Insert two new TN56 boards in the first two MASA slots. If fault still exists, return original boards and slide new boards to the next slot. Continue until the two new boards have been tried in each MASA position
  Insert a new TN1398 board in the first MASA slot. If fault still exists, return original board and slide new board to the next slot. Continue until new board has been tried in each MASA slot
  Insert a new MASC board. If fault still exists, return the original board and slide the new board to the next MASC until new board has been tried in each MASC
  Starting at demand Phase 54, run one phase for each MASA board equipped (54-61)
  ATP? (Y/N)
  Replace boards & diagnose as per TLP list
  ATP? (Y/N)
  Cleared? (Y/N)
  Call for assistance
  Done

RAC Output Parity Error


The output message present on the ROP and the RPTERR1 log file for this fault is as follows:


RAC OUTPUT PARITY ERROR DETECTED, LN31 2 RAC 1.

X’00000000 X’00000000 X’03020002 X’00002280

X’00014000 X’00000300 (2923885816)

Cause

The node reporting the fault detected that it had attempted to write a message with bad parity to the ring.

Effect

The node which had the bad message presented to it will refuse to accept the message. This will force the node offering the bad message to report ring blockage to EAR. EAR will attempt to reestablish normal ring communication by performing a Level 0 ring recovery. As part of this recovery process, each node will reread the message that it had presented to the downstream neighbor. When doing this, the node reporting the fault detected that it had presented a message containing bad parity to its downstream neighbor.

If this fails to correct the error condition, EAR will escalate the ring recovery to a Level 1 which could result in nodes being removed and isolated.


Craft Recovery Action

The RPTERR1 log file should be examined to determine if this is the first instance of the fault. If this is a recurring fault, the node reporting the fault should be removed and diagnosed.

If diagnostics do not find a problem with the node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspect node using the recommended contact cleaner.

NOTE: Miller Stevenson Company markets an aerosol form of the solvent-lubricant which is recommended (1.0 percent OS-124 in Freon TA) for use on CNI ring backplanes and circuit packs. This product is marketed as MS-181.


1. Replace the IRN board in the node reporting the fault.


Figure 1-4. RAC Output Parity Error

[Flowchart not reproduced. Box text is listed; the connecting arrows are not recoverable.]

Chart 4
  RAC output parity error
  Examine ROP & UNIX file /etc/log/RPTERR1 for other occurrences
  1st occurrence? (Y/N)
  Transient fault. Monitor /etc/log/RPTERR1 log file for several weeks to see if fault returns
  Run diagnostics on the node reporting the fault
  ATP? (Y/N)
  Replace packs & diagnose as per TLP list
  ATP? (Y/N)
  Replace the IRN board in the node reporting the problem
  Cleared? (Y/N)
  Call for assistance
  Done

Note: If RPCN and it has no IRN, then replace the R0 board if ring 0 is implicated or R1 board if ring 1 is implicated.
General RAC Error Detected


The output message present on the ROP and the RPTERR1 log file for this fault is as follows:


GENERAL RAC ERROR DETECTED, LN63 1 RAC 0.

X’00000000 X’00000000 X’03018010 X’00000380

X’00000000 X’00000300 (2834204091)

Cause

This is a catch-all error type used to report unexpected node hardware or software conditions.

Effect

The node reporting the problem will not accept any data from the upstream neighbor node, thus forcing that node to report blockage.


Craft Recovery Action

The RPTERR1 log file should be examined to determine if this is the first instance of the fault. If this is a recurring fault, the node reporting the fault and its upstream neighbor should be removed from service and diagnosed.

If diagnostics do not find a problem with either node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspect nodes using the recommended contact cleaner.






1. Replace the IRN board in the node reporting the fault.

2. If the fault persists, replace the IRN board in the upstream neighbor.




Figure 1-5. General RAC Error

[Flowchart not reproduced. Box text is listed; the connecting arrows are not recoverable.]

Chart 5
  General RAC error
  Examine ROP & UNIX file /etc/log/RPTERR1 for other occurrences
  1st occurrence? (Y/N)
  Transient fault. Monitor /etc/log/RPTERR1 log file for several weeks to see if fault returns
  Run diagnostics on the node reporting the fault
  ATP? (Y/N)
  Replace packs & diagnose as per TLP list
  ATP? (Y/N)
  Replace the IRN board in the node reporting the problem
  Cleared? (Y/N)
  Replace the IRN in the upstream neighbor
  Cleared? (Y/N)
  Call for assistance
  Done

Note: If RAC 0 is implicated, the upstream neighbor is the lower node # (LN32-4 is upstream of LN32-5). If RAC 1 is implicated, the upstream neighbor is the higher node # (LN32-6 is upstream of LN32-5).

Note: If RPCN and it has no IRN, then replace the R0 board if ring 0 is implicated or R1 board if ring 1 is implicated.
Node Audit Failure


The output message present on the ROP and the RPTERR1 log file for this fault is as follows:


RMV LN32 4 RQSTD; NAUD FAILURE RPTD

X’6FB015F4 X’352070B8 (2834204595)

Cause

The Node Audit process has detected a node that is not responding to the node audit requests, but the rest of the ring seems to be functioning normally.

Effect

The node at fault will be removed from service.


Craft Recovery Action

The RPTERR1 log file should be examined to determine if this is the first instance of the fault. If this is a recurring fault, the node faulted should be removed and diagnosed.

If diagnostics do not find a problem with the node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspect node using the recommended contact cleaner.



NAUD failures can be caused by noisy data links on the node being removed from service. Before proceeding to replace circuit packs, first use the CMpfcnts tool to determine if there are questionable data links on the node being removed from service.



2. Replace one of the two MDL boards.

3. Replace the other MDL board, if equipped.




Figure 1-6. NAUD Failure

[Flowchart not reproduced. Box text is listed by chart; the connecting arrows are not recoverable.]

Chart 6
  NAUD failure
  Examine UNIX file /etc/log/RPTERR1
  1st occurrence? (Y/N)
  Transient fault, monitor RPTERR1 for several weeks to see if fault returns
  Diagnose faulty node
  ATP? (Y/N)
  Replace & diagnose packs as per TLP list
  ATP? (Y/N)
  This fault could be the result of noisy data links. Run CMpfcnts to identify possible problem links
  Familiar with CMpfcnts tool? (Y/N)
  Call for assistance
  Noisy links? (Y/N)
  Correct link problem and monitor node for several weeks
  Is node a CDN? (Y/N)
  Replace IRN board
  Cleared? (Y/N)
  Go to Chart 6A
  Done

Chart 6A
  Replace IRN board in faulty node
  Cleared? (Y/N)
  Replace MDL 0 board
  Cleared? (Y/N)
  Replace MDL 1 board
  Cleared? (Y/N)
  Replace adaptor boards on node backplane
  Cleared? (Y/N)
  Call for assistance
  Done

Figure 1-7. Interframe Buffer Error

[Flowchart not reproduced. Box text is listed; the connecting arrows are not recoverable.]

Chart 7
  Interframe buffer parity error
  Examine ROP & UNIX file /etc/log/RPTERR1 for other occurrences
  1st occurrence? (Y/N)
  Transient fault. Monitor the RPTERR1 log file for several weeks to see if fault returns
  Run diagnostics on the node reporting the problem
  ATP? (Y/N)
  Replace packs & diagnose as per TLP list
  ATP? (Y/N)
  Replace the IFB in the node reporting problem
  Cleared? (Y/N)
  Replace the IFB in the upstream node
  Cleared? (Y/N)
  Replace the IRN board in node reporting the error
  Cleared? (Y/N)
  Replace the IRN in the upstream neighbor
  Cleared? (Y/N)
  Call for assistance
  Done

Note 1: If RAC 0 is implicated, the upstream neighbor is the lower node # (LN32-4 is upstream of LN32-5). If RAC 1 is implicated, the upstream neighbor is the higher node # (LN32-6 is upstream of LN32-5). If RPCN, see Note 2.

Note 2: If RPCN & it has no IRN, replace R0 board if ring 0 is implicated or R1 if ring 1 is implicated.
Read Format Error

The output message present on the ROP and the RPTERR1 log file for this fault is as follows:


READ FORMAT ERROR DETECTED, LN00 7 RAC 0.

MSG SRC: LN00 3, msg type: zzzzz

X’00000000 X’FFFFFFFF X’03000008 X’00000380

X’00004000 X’00000300 (3121083924)

Cause

The reporting node, LN00-7 in this example, is reporting that its upstream neighbor on RAC 0 (LN00 6) tried to pass a message that had a bad message length. This error usually indicates that a node on the ring is clipping or mutilating messages as they pass through it. This fault type requires immediate attention. A clipped message, if undetected, could take on the appearance of a valid maintenance message. Such a message could be one that forces all nodes into a set quarantine state, thus removing them from service and resulting in a system outage.

Effect

The node to which the bad message was presented will refuse to accept the message and will send an error report to the home RPCN. This will force the node offering the bad message to report ring blockage to EAR. EAR will attempt to re-establish normal ring communication by performing a level 0 ring recovery. If this fails to correct the error condition, EAR will escalate to a level 1 ring recovery, which could result in nodes being removed and isolated.


The RPTERR1 log file should be examined to determine if this is the first instance of the fault. If this is a recurring fault, all reports must be examined in an effort to determine the ring segment that most likely contains the faulty node.

If MSG SRC data is present in the output message, the suspected faulty node should be one of the nodes between the SRC node and the node reporting the fault (LN00 4 -> LN00 6 in the example above). If the MSG SRC data is not present, several reports must be examined to determine which area of the ring most likely contains the faulty node. For example, if reports are present from both LN00 7 and LN32 7, all nodes between LN00 7 and LN32 7 (LN00 8 -> LN32 6) are probably not the source of the problem for RAC 0 reports.
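As an illustration of the MSG SRC case, the suspect segment is simply the set of nodes strictly between the source node and the reporting node in the direction of travel on RAC 0. The Python sketch below assumes a hypothetical `ring_order` list in ascending node order; the function name and the list are not part of any CNI tool.

```python
def suspect_segment(ring_order, src, reporter):
    """Nodes strictly between `src` and `reporter`, walking the ring in
    ascending order (the RAC 0 direction of travel in the example)."""
    n = len(ring_order)
    stop = ring_order.index(reporter)
    k = (ring_order.index(src) + 1) % n
    suspects = []
    while k != stop:
        suspects.append(ring_order[k])
        k = (k + 1) % n  # wrap around the end of the list (assumed ring closure)
    return suspects


# Hypothetical ring containing only LN00 1 through LN00 8.
ring = [f"LN00 {i}" for i in range(1, 9)]
print(suspect_segment(ring, "LN00 3", "LN00 7"))
# ['LN00 4', 'LN00 5', 'LN00 6']
```

With MSG SRC LN00 3 and reporting node LN00 7, the result is the LN00 4 -> LN00 6 segment named in the text.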


NOTE: WRITE FORMAT ERROR messages may also be present and can be used to assist in locating the faulty segment.

All nodes in the suspected ring segment should be diagnosed. If diagnostics do not find a problem with any node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspected segment using the recommended contact cleaner.

NOTE: Miller Stevenson Company markets an aerosol form of the solvent-lubricant that is recommended (1.0 percent OS-124 in Freon TA) for use on CNI Ring backplanes and circuit packs. This product is marketed as MS-181.


1. Select the first node in the suspected segment and replace the UN303 board. Monitor the RPTERR data daily to determine if the fault has been cleared.

2. If the fault persists, examine the additional faults reported. If the node reporting the fault is in the suspected segment, all nodes from the node reporting this new fault to the previous nodes reporting the fault can be removed from the suspected faulty list.

3. Repeat Step 1 for the next logical link node in the suspected faulty ring segment. If any node contains IFBs, replace these as well once the UN303 has been eliminated as a suspected pack.

4. If the fault persists and all packs in the suspected segment have been replaced, call for assistance.

Write Format Error

The output message present on the ROP and in the RPTERR1 log file for this fault is as follows:


WRITE FORMAT ERROR DETECTED, LN00 7 RAC 0.

X’00000000 X’FFFFFFFF X’03000008 X’00000380

X’00004000 X’00000300 (3121083924)

Cause

The reporting node, LN00-7 in this example, is reporting that a message it was attempting to write to the ring failed a validation check. This message is similar to the READ FORMAT ERROR type in that it usually indicates that a node on the ring is clipping or mutilating messages as they pass through it. This fault type requires immediate attention. A clipped message, if undetected, could take on the appearance of a valid maintenance message. Such a message could be one that forces all nodes into a set quarantine state, thus removing them from service and resulting in a system outage.

Effect

The node that was trying to write the message will not do so, nor will it accept the message being offered to it, and an error report is sent to the home RPCN. The nodes previous to the reporting node will report ring blockage to EAR. EAR will attempt to re-establish normal ring communication by performing a level 0 ring recovery. If this fails to correct the error condition, EAR will escalate to a level 1 ring recovery, which could result in nodes being removed and isolated.


The RPTERR1 log file should be examined to determine if this is the first instance of the fault. If this is a recurring fault, all reports must be examined in an effort to determine the ring segment that most likely contains the faulty node. For example, if reports are present from both LN00 7 and LN32 7, all nodes between LN00 7 and LN32 7 (LN00 8 -> LN32 6) are probably not the source of the problem for RAC 0 reports.
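When many reports must be compared, it can help to tally them by reporting node and RAC. The following Python sketch is one possible way to do that; it assumes only that each report begins with a header line shaped like the READ and WRITE FORMAT ERROR examples shown in this chapter. The function name and the way the log text is obtained are hypothetical.

```python
import re
from collections import Counter

# Matches header lines such as "READ FORMAT ERROR DETECTED, LN00 7 RAC 0."
HEADER = re.compile(r"^(READ|WRITE) FORMAT ERROR DETECTED, (LN\d+ \d+) RAC ([01])\.")


def tally_format_errors(log_text):
    """Count format error reports per (reporting node, RAC, READ/WRITE)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            kind, node, rac = match.groups()
            counts[(node, int(rac), kind)] += 1
    return counts


sample = """READ FORMAT ERROR DETECTED, LN00 7 RAC 0.
MSG SRC: LN00 3, msg type: zzzzz
WRITE FORMAT ERROR DETECTED, LN00 7 RAC 0.
READ FORMAT ERROR DETECTED, LN00 7 RAC 0.
"""
for key, count in sorted(tally_format_errors(sample).items()):
    print(key, count)
```

Nodes that report repeatedly on the same RAC are natural end points when bounding the suspect segment.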

NOTE: READ FORMAT ERROR messages may also be present and can be used to assist in locating the faulty segment.

All nodes in the suspected ring segment should be diagnosed. If diagnostics do not find a problem with any node, attempt to clear the fault by cleaning and reseating the circuit packs in the suspected segment using the recommended contact cleaner.

NOTE: Miller Stevenson Company markets an aerosol form of the solvent-lubricant that is recommended (1.0 percent OS-124 in Freon TA) for use on CNI Ring backplanes and circuit packs. This product is marketed as MS-181.


1. Select the first node in the suspected segment and replace the UN303 board. Monitor the RPTERR data daily to determine if the fault has been cleared.

2. If the fault persists, examine the additional faults reported. If the node reporting the fault is in the suspected segment, all nodes from the node reporting this new fault to the previous nodes reporting the fault can be removed from the suspected faulty list.

3. Repeat Step 1 for the next logical link node in the suspected faulty ring segment. If any node contains IFBs, replace these as well once the UN303 has been eliminated as a suspect pack.

4. If the fault persists and all packs in the suspected segment have been replaced, call for assistance.

Emergency Maintenance

This section is intended to assist craft in those instances where the CNI ring appears to be flat on its back and requires craft intervention to get the system operational.

While this data provides useful information, it should not be used as a replacement for calling for immediate assistance when such a situation occurs. Lucent Technologies personnel should be contacted whenever system recovery is involved, rather than waiting until the “Ring Down Recovery” section of this chapter has exhausted its helpful hints.

Ring Down Recovery

A ring down situation can take several forms. One of these is the case where the CNI ring is repeatedly rolling into either a CNI Level 3 or CNI Level 4 initialization.

The second form a ring down situation can take is where EAR is repeatedly performing various levels of ring recovery in an attempt to isolate the cause of the problem.

The third scenario is one that should never happen, but it is mentioned here for completeness. This is a case where all communication to the ring has been lost, but no integrity process appears to be doing anything about it. No section is dedicated to this scenario, but in the event it does occur, start the recovery process by requesting a CNI Level 3 initialization, and call for assistance immediately.

Rolling CNI Initializations

If the ring is in a state of repeated CNI initializations, perform the following steps:


1. Determine if CNI Safety Net is requesting the CNI initializations. Do this by checking the ROP for the existence of SI15, SI22 or SI24 Defensive Check failures. If present, go to Step 2, else go to Step 5.

2. Disable CNI Safety Net by going to the Emergency Action Interface page and entering a 42 poke command. When the parameter field appears, enter i to inhibit Safety Net. Next, perform a 50 initialization to set the inhibit flag in memory. This should stop the rolling initializations so that the problem can be investigated. If so, go to Step 3, else go to Step 5.

3. If Safety Net was requesting the initializations due to no CDNs being active (SI24 asserts), determine if the rest of the ring appears to be up. If so, go to Step 4; for anything else, go to Step 5.

4. No CDNs are active, but the rest of the ring seems to be up. Go to the “Global CDN Recovery” and “Single CDN Recovery” sections in this chapter for assistance in recovering from this fault.

5. Either the ring is in a rolling initialization due to CNI not being able to get an RPCN up, or SI15/SI22 asserts were present due to CNI Safety Net firing.

6. Verify that there are no power interruptions to the ring.

7. If the problem persists, examine the ROP closely to determine if CNI software is flagging any node, or group of nodes, as being a possible source of the problem. If so, pull the IRN board out of those nodes to force isolation around that segment.

8. If the RPCNs are equipped with IRN boards, verify that they have the proper microcode versions. Again, only MC3F026A1 is approved for use in an RPCN.

9. If the problem persists, power down RPCN32 to force the ring to come up on RPCN00.

10. If the problem persists, restore power to RPCN32. The problem may be related to a bad CU in the ECP. Force the ECP to do a CU switch and attempt a CNI Level 4 initialization.

11. If the problem persists, force isolated segments by removing power from one mounting plate at a time (group of three nodes). After power is removed from a group of nodes, request a CNI Level 4. If the problem persists, restore power to the previous group and remove power from the next group. Repeat this step until every node has been tried in an isolated segment.

12. Again, it is assumed that you have already called for assistance, but if not, do so immediately.
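The branching among Steps 1 through 5 above can be summarized as a small lookup table. The Python sketch below is only a reading aid under that interpretation; the function name and the True/False encoding of each step's question are assumptions made for the example.

```python
# For each branching step, map the answer to that step's question to the
# next step number, as stated in Steps 1 through 3 above.
BRANCHES = {
    1: {True: 2, False: 5},  # SI15, SI22 or SI24 failures present on the ROP?
    2: {True: 3, False: 5},  # Did inhibiting Safety Net stop the rolling inits?
    3: {True: 4, False: 5},  # SI24 asserts and the rest of the ring appears up?
}


def next_step(step, answer):
    """Return the step to go to after answering the question in `step`."""
    return BRANCHES[step][answer]


print(next_step(1, True))   # 2
print(next_step(3, False))  # 5
```

Steps 5 through 12 are then followed in order, as written.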



Figure 1-8. Ring Down

Chart 8 (flowchart; only the box labels survive in this text, and the order and Y/N branches are not recoverable):

- Ring down
- Rolling CNI INITS?
- Ring is down but taking no recovery action
- Request a CNI Level 3 INIT to restart the driver
- Ring up?
- Request a CNI Level 4 INIT to repump the ring
- Ring up?
- Check the ROP for the presence of SI15, SI22 or SI24 asserts
- Present?
- Inhibit safety net from the EAI page. Use pokes 42, I for inhibit and 50 boot to set new value.
- Rolling INITS stop?
- Verify that the RPCNs have the correct IRN microcode. Only MC3F026A1 can be used in an RPCN
- Correct?
- Correct and request a CNI INIT 4
- Ring up?
- Rolling INITS stop?
- Power down RPCN32 and request a CNI INIT 4
- Rolling INITS stop?
- Power RPCN32 back up & power down RPCN00. Request a CNI INIT 4
- Rolling INITS stop?
- Go to Chart 8A (three exits)
- Go to Chart 8B
- Done


Figure 1-8. Ring Down (contd)

Chart 8A (flowchart; only the box labels survive in this text, and the order and Y/N branches are not recoverable):

- Check ROP/RPTERR1 for clues
- Token tracking information?
- Pull the IRN from the two nodes mentioned in the token tracking report & request CNI INIT 4
- Lost token report?
- Repeated RAC parity errors on both rings?
- If each RPCN is reporting a fault, or one RPCN & the upstream neighbor of the other (last node in groups 31 or 63), then there could be two IFB problems. Power down one RPCN to force that segment out of the ring. Place a new IFB in the other RPCN. If the problem is still present, place a new IFB in the neighbor node. If the problem still exists, try a new IRN. If the RPCN is not IRN type, replace the R0 or R1 board based on which ring the fault is reported on, if the fault does not involve both pairs.
- Rolling ring reconfigurations?
- If 1506, 1509, or 1803 IFBs, then pull the IRNs from the two nodes reporting the fault to force an isolated segment.
- Mention missing files?
- Are all link nodes OOS?
- All CDNs OOS?
- Follow normal maintenance procedures to correct faulty nodes
- Ring up?
- Call for assistance (two exits)
- Go to Chart 8B
- Go to Chart 8C
- Done


Global CDN Recovery

This section is intended to provide assistance when all CDNs are out of service and fail to recover after a CNI Level 4 initialization. When this event occurs, execute the following steps in an attempt to clear the fault AND immediately call for assistance.

1. If you have not already done so, inhibit CNI Safety Net by going to the EAI page and entering a 42 command. When asked for the parameter value, enter i. Next, do a 50 boot to set the flag in memory.

2. Was a BWM just applied that required the CDNs to be repumped? If so, back the BWM out.

3. Check the ROP closely to see if there are any error messages present that indicate files may be missing.

4. CDN memory could be scrambled. Inhibit ARR via the inh:dmq:src arr input command. Next, power cycle each CDN, and allow the CDN to initialize its memory (approximately 5 minutes). Once the initialization is completed (the red light on the MASA boards should be extinguished), request an unconditional restoral of each CDN.

5. Perform an ECP stable clear to reinitialize the CNI integrity processes using init:ecp:sc. Attempt to restore the CDNs unconditionally.

6. If the nodes are being removed during the database download portion of recovery (page 2160 shows them in the init state), use UXprint to determine if the nodes are always removed while downloading a specific database.

7. Examine the ROP closely for the existence of either of these messages:

REPT:CDN x, y (CDN-I)
REPT:CDN x, FAULT (CDN-II and CDN-III)

where y is either STACK, MEMORY or UNKNOWN. If present, contact CTS personnel.

8. Check the application database (apxrcv) iun form to verify that the CDNs are defined properly.

9. Check the ECD (apxrcv) ucb form to verify that the CDNs are defined properly.

10. It is assumed you have called for assistance already, but if not, do so immediately.


Single CDN Recovery

This section is intended to provide assistance when a single CDN will not restore to service. When this occurs, execute the following steps in an attempt to clear the fault:

1. Perform manual diagnostics on the suspect link node. If the CDN-I is a single plate RAP, demand phases 54-61 must be requested to test the MASA boards. One phase is required for each MASA board equipped.

2. If the restore fails during the pumping phase (that is, ABORTED PUMP OF IUN LN00 7), check the file /1apx10/ims/cdn/OFC.cdn.lv.x to verify that it is a contiguous file. If it is not, use the fmove command to make it contiguous.

If the node is a CDN-II, check the file /1apx10/ims/cdn2/OFCcdn2 to verify that it is a contiguous file. If it is not, use the fmove command to make it contiguous.

3. If the node is a CDN-I and the fault persists, refer to “CDN-I Fault Isolation” in Chapter 6, Diagnostic User’s Guide, for assistance in running the on-board firmware diagnostics.

If the node is a CDN-II node, try replacing the AP board (TN1630B). If the fault persists, contact the CTS for assistance.

4. If the node is a CDN-I and the fault persists, inspect the RPTERR1 log file for the presence of the Hardware Panic message.

REPT COM100 TBL

LN00 07 NADR: X’C07

Panic : Hardware

Local Bus Parity Error:

CCS0(lba=0x0):CSRs=0x61100028,0x0

MASC0(lba=0x100000):

CSRs=0x422054,0x4c00b500

CCS 61100028

MASC 00422054

NPI 00000000

5. If a message similar to this appears, it is not necessarily a local bus parity error. Go directly to Chart 3B of Figure 1-3 for CDN assistance.

6. If the flowchart fails to clear the fault, call for assistance.

Contents

2

Description of the Ring Subsystem

General
Operation of the Ring
Ring Nodes
  • Ring Peripheral Controller Nodes
  • Basic IMS User Nodes
  • Direct Link Nodes (DLN)
  • Call Processor/Data Base Nodes (CDN)
      CDN-I
      CDN-II
      CDN-IIx
      CDN-III
  • Interframe Buffers
Node Names and Addresses
Ring Message Format
Reconfigurations
  • Node Quarantine
  • Node Isolation
  • The Ring Config Module
Initializations
  • Level-3 IMS Initializations (FPI and Boot)
  • Level-4 IMS Initializations (FPI and Boot)
Audits
  • Central Node Control Audit (AUD CNC)
  • Node State Audit (AUD NODEST)
  • Node Audit


2

Description of the Ring Subsystem

General

The Interprocess Message Switch (IMS) is a packet switch composed of ring-based communication nodes centered upon a 3B21D computer. Each ring node is controlled by a microcomputer called the node processor. The nodes are distributed around dual, parallel communication rings that propagate data in opposite directions. Ring 0, the outer ring in the illustration below, propagates data clockwise; and ring 1, the inner ring, propagates data counter-clockwise. Ordinarily, of the two ring paths, ring 0 is actively involved in transmitting user messages, while ring 1 performs as a path for internal IMS communications.

Each ring node contains one interface to each of the two rings and one interface either to the 3B21D or to a user's external system. Thus, IMS has two types of nodes: nodes interconnecting the ring and the 3B21D, the most important of which are called ring peripheral controller nodes (RPCNs), and nodes interconnecting the ring with the user's external system, most of which are called basic IMS user nodes (basic IUNs). As a processing resource, the centralized 3B21D is also available to users, but its principal purpose is to provide operational, administrative and maintenance control of the switch.


The following figure gives a conceptual illustration of the ring.

Figure 2-1. Conceptual Illustration of an IMS Ring

The real situation is somewhat more complicated than this description, because IMS has other types of nodes and because users are represented not only by an external communication system but also by internal hardware and software residing in certain nodes. A full discussion of all classes of IMS nodes appears shortly below.

IMS may be used either as a local area network or as a switching system. More commonly it is used as a switch to transfer user messages from incoming transmission facilities to user-specified outgoing transmission facilities. A user message typically enters IMS through the external or user interface of an IUN, is formatted and addressed to a destination IUN by the resident node processor, and is inserted on the ring by the resident ring interface. It then passes around the ring to the destination IUN, where it is recognized and extracted by the ring interface, reformatted by the node processor, delivered to the user interface and then returned to the user. In this typical transmission the 3B21D is not directly involved, though it can be involved, depending on user requirements. When access to the 3B21D is needed, a user message enters the ring as described above but is first removed by an RPCN or similarly functioning node, which delivers it to the 3B21D, which processes it. The 3B21D then returns the processed message to an RPCN, which inserts it on the ring, from which it is removed by the destination IUN, which further processes and returns it to the user.


In this illustration of IMS switching, a user message is transferred between processes residing in different processors. By itself the illustration is misleading, because IMS is not an interprocessor message switch but an interprocess message switch. It is capable of transmitting messages between any two processes, whether user- or IMS-owned, residing in the same or in different processors. This capability is provided by a major IMS software module called the message switch.

Operation of the Ring

All ring nodes contain a ring interface. Each ring interface is equipped with a pair of ring access circuits (RACs), one connected to each ring. Each RAC consists of three elements:

- a first-in, first-out buffer (FIFO) that is 10 bits wide,

- circuitry providing receive logic, and

- circuitry providing transmit logic.

The FIFO is actually a component of the ring, which is a mixed medium composed alternately of storage devices and transmission leads. The storage devices are the FIFOs. The transmission leads are a 12-bit ring bus that interconnects the FIFOs (and therefore the RACs). The ring bus contains eight data leads, two formatting leads, and two control leads. A data-available control lead permits the upstream RAC to signal to the downstream RAC that it is offering a byte of data. A data-taken control lead allows the downstream RAC to acknowledge to the upstream RAC that it has accepted the offered byte. Data thus advances between adjacent RACs asynchronously, one byte at a time, by means of continuous handshakes. Upstream and downstream are relative terms. Each RAC is upstream of the RAC to which it offers data and downstream of the RAC from which it receives data.
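The data-available/data-taken handshake can be pictured with a toy model in Python. This is a teaching sketch, not a description of the hardware: the class and function names are invented, and the FIFO depth is a made-up parameter (the text gives only the FIFO width).

```python
from collections import deque


class Rac:
    """Toy downstream RAC with a bounded FIFO (depth is an assumed value)."""

    def __init__(self, depth):
        self.fifo = deque()
        self.depth = depth

    def offer(self, byte):
        """Called when the upstream RAC asserts data-available.

        Returns True when this RAC asserts data-taken (byte accepted),
        False when it withholds data-taken because the FIFO is full.
        """
        if len(self.fifo) >= self.depth:
            return False
        self.fifo.append(byte)
        return True


def send(message, downstream):
    """Offer bytes one at a time; stop at the first byte not taken."""
    taken = 0
    for byte in message:
        if not downstream.offer(byte):
            break
        taken += 1
    return taken


rac = Rac(depth=4)
print(send(b"abcdef", rac))  # 4
```

In the sketch the sender simply stops when a byte is not taken; on the real ring the upstream RAC would keep offering the byte until it is accepted.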

A byte of data may be offered to a RAC either by the upstream RAC or by the resident node processor, which connects with the RAC through an 18-bit DMA channel composed of 16 data leads and two formatting leads. The first 8 bytes of a message from either source consist of header information. Each header byte is examined as it is offered by the second element of the RAC, the receive logic. The receive logic checks for parity and formatting errors and determines message disposition. It also controls the loading of each data byte into the FIFO. The third RAC element, the transmit logic, disposes of the data in the FIFO according to instructions from the receive logic.


If the message was addressed to the resident node or was a broadcast message (see footnote 1), the bytes composing it are offered by means of handshakes to the node processor via the 18-bit DMA channel. If the message was not addressed to the resident node, the bytes composing it are offered by means of handshakes to the downstream node via the next segment of the ring bus.
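The disposition rule just described reduces to a two-way decision. The Python sketch below restates it; the function name, argument names, and return strings are invented for illustration, and the rule is limited to what the paragraph states.

```python
def disposition(destination, resident_node, is_broadcast=False):
    """Decide where the transmit logic offers a message's bytes."""
    if is_broadcast or destination == resident_node:
        # Offered to the node processor via the 18-bit DMA channel.
        return "node processor"
    # Offered to the downstream node via the next ring bus segment.
    return "downstream node"


print(disposition("LN00 7", "LN00 7"))                     # node processor
print(disposition("LN00 9", "LN00 7"))                     # downstream node
print(disposition("LN00 9", "LN00 7", is_broadcast=True))  # node processor
```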

Figure 2-2. A Ring Access Circuit on the IMS Ring

IMS employs a token message on each ring to ensure that only one node at a time writes messages to the ring. A token continuously traverses a ring. When a node is ready to insert a message or a block of messages on a ring, it waits for the upstream node to offer a data byte that its receive logic recognizes as the first byte of the token header. It delays accepting this byte (does not assert the data-taken lead) until it can insert its message or messages, byte by byte, on the ring. Then it accepts and transmits the token message downstream, making it available to the next node that has messages to write.
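The effect of the token on write ordering can be shown with a short Python sketch of one token trip around a ring. The function name and data layout are assumptions for the example, and timing and byte-level detail are ignored.

```python
def token_circuit(pending):
    """Return the order in which messages are written during one token trip.

    pending: list of (node_name, [messages]) pairs in ring order.
    """
    written = []
    for node, messages in pending:
        # The node holds off the token (withholds data-taken) until all of
        # its waiting messages are on the ring, then passes the token on.
        for message in messages:
            written.append((node, message))
    return written


ring = [("LN00 1", ["a"]), ("LN00 2", []), ("LN00 3", ["b", "c"])]
print(token_circuit(ring))
# [('LN00 1', 'a'), ('LN00 3', 'b'), ('LN00 3', 'c')]
```

Because only the node holding off the token writes, messages from different nodes never interleave on the ring.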

1 IMS has two types of broadcast messages: general broadcasts, which are read by every node, and selective broadcasts, which are read by previously defined groups of nodes. Selective broadcasting, achieved by virtual addressing, allows such practices as parallel downloading of data or code into similar node types.


Ring Nodes

IMS has two classes of ring nodes: RPCNs and IUNs. RPCNs are nodes that contain no user software and that interconnect the ring and the 3B21D. IUNs, which contain both IMS and user software, perform a variety of functions. The class of IUNs has two subclasses: unextended IUNs, in which the node processor provides the only processing resource, and extended IUNs, in which the processing function is supplemented by an attached processor. At present, all unextended IUNs contain external user interfaces, but no extended IUNs do. This condition, however, is arbitrary and therefore subject to change. Currently there is one type of unextended IUN: the basic IUN. There are two types of extended IUNs: direct link nodes (DLNs) and call processor/database nodes (CDNs). All ring nodes of either class have a ring interface and a node processor. In this document the units of a node other than the ring interface and the node processor are called auxiliary components.

Ring node hardware utilizes very large scale integration hardware, housing the ring-interface and the node-processor functions in a single integrated circuit pack. These are called integrated ring nodes (IRNs). There are two versions of IRNs: the IRN/IRNB (UN303/UN303B) and the IRN2/IRN2B (UN304/UN304B).

Node processors are microcomputers composed of a CPU, memory, interrupt logic, I/O ports, and DMA circuitry. They are supplemented in DLNs by an additional microcomputer called the attached processor and in CDNs by an additional minicomputer called the ring application processor. In unextended IUNs, the node processor contains both IMS and user code. In extended IUNs, user code resides only in the attached processor, whereas both node and attached processors contain IMS code. The content of user code is determined by user needs. Typically it provides or contributes to such functions as controlling user hardware resident in the node, managing the user's network, and providing real-time user services such as protocol conversion and message addressing.

The code provided by IMS manages the ring-interface and node-processor hardware. It includes code for initialization and automatic maintenance and for such switching functions as message formatting and temporary message storage. It provides an operating system, boot monitor, memory, timers, and measurements. Except for the boot monitor, all code residing in node processors and attached processors is downloaded from the 3B21D.


Ring Peripheral Controller Nodes

RPCNs allow messages to be passively exchanged between the ring and the 3B21D. The exchange is passive because the RPCNs contain no user code that could provide processing of message substance. By contrast, direct link nodes (discussed below) provide active exchange of messages between the ring and the 3B21D by supplementing certain real-time user functions housed in the 3B21D. To minimize the consequences of a wide failure, RPCNs are distributed about the ring with approximately equal numbers of IUNs between them. A minimum of two RPCNs per ring is required. Typically, large rings will have more.

In addition to a ring interface and a node processor, RPCNs contain the following circuit packs:

• A duplex dual serial bus selector (DDSBS) serves as a termination point between the ring and the dual serial channels of the 3B21D. It converts the parallel output of the ring to the serial format of the dual serial channels and vice versa. The DDSBS is duplexed, with one DDSBS function connected to the dual serial channel of the on-line 3B21D control unit and one to the off-line control unit.

• A 3B21D computer interface (3BI) circuit pack serves as a buffer between the node processor and the DDSBS. It also provides data conversion between the node processor's 16-bit data bus and the DDSBS's 36-bit data bus. The 3BI communication occurs either via a DMA channel or a program I/O utility of the 3B21D operating system. The DMA channel is ordinarily used for standard message interchange. The program I/O is initiated and used by the 3B21D to issue urgent commands to the RPCN or to synchronize data transfers.

Basic IMS User Nodes

Basic IUNs interconnect the ring and the user's external system. In addition to a ring interface and a node processor, a basic IUN contains an external user interface. The external user interface and node processor communicate with one another via a shared memory in the external user interface. The MDL circuit pack described below is available as an external user interface for these nodes; or, as with Common Network Interface (CNI) link nodes, users may supply their own interface.


Direct Link Nodes (DLN)

DLNs are designed to supplement real-time processing of user data in the 3B21D. Like RPCNs, DLNs provide message transmission between the ring and the 3B21D. But unlike RPCNs, DLNs contain user code, the presence of which enables them to reduce the processing demands upon the 3B21D by assuming some user processing functions that cannot be performed by basic IUNs.

In addition to a ring interface and a node processor that contains only IMS code, DLNs are composed of the following circuit packs:

• An attached processor that resides on the node-processor bus and communicates with the node processor via a dual-ported memory and hardware interrupts. The attached processor contains both IMS and user code.

• A 3B21D computer interface (3BI) and a duplex dual serial bus selector (DDSBS) that perform in the same way and serve the same functions as they do for RPCNs, as described above.

Call Processor/Data Base Nodes (CDN)

The CDN handles the call processing functions of the FLEXENT™/AUTOPLEX® Wireless Network Systems. There are several versions of the CDN: CDN-I, CDN-II, CDN-IIx, and CDN-III.

CDN-I

IMS offers an extended node for users who require more processing power in the nodes than can be supplied by basic IUNs. The node is called a CDN-I [sometimes referred to as a standard multi-application real time node (SMART node or SN)]. It serves as an alternative to the 3B21D for the substantive processing of user data. Currently, the CDN-I has only an interface to the ring. It is capable, however, of having an external user interface, and it may have one in the future.

In addition to a ring interface and a node processor that contains only IMS code, a CDN-I is composed of the following elements:

• An attached processor called a ring application processor (RAP). The RAP is a 3B15 computer mounted on an IMS backplane that has been redesigned to conform with the design of IMS ring-node frames/cabinets and the 3B15. The older version has 2 megabytes of memory and is capable of growing an additional 94 megabytes. The newer version has 16 megabytes of memory and is capable of growing an additional 112 megabytes. The following circuit packs compose the RAP:


— Central controller cache (CCC)

— Central controller support (CCS)

— Main store controller(s) (MASC)

— Main store arrays (MASAs)

• A power control interface and display (PCID) that provides manual-power, reset, and diagnostics controls and LEDs that indicate power and diagnostic failures.

• A node-processor interface (NPI) that provides message exchange between the node processor and the RAP.

CDN-II

The CDN-II (sometimes referred to as the Turbo CDN) is a newer node that is used to replace the CDN-I. The CDN-II requires only two boards and fits in a standard 3-node shelf or the new 5-node shelf.

The CDN-II provides a newer-technology, higher-performance CDN. The performance of the CDN-II is about four times that of the CDN-I. The CDN-II has a fixed 80 Mbytes of memory and consists of the IRN2B (UN304B) and an AP (TN1630B).

CDN-IIx

The CDN-IIx has identical features to the CDN-II, but different hardware. It uses the IRN2B (UN304B) and an AP (TN1720x) but can have up to 272 Mbytes of memory using multiple AP boards. A CDN-II can be upgraded to a CDN-IIx by ordering a memory growth upgrade kit.

CDN-III

The CDN-III is an improved CDN that may be used to upgrade CDN-II or CDN-IIx type nodes. The CDN-III consists of an IRN2 node core and AP60 attached processor, providing greater processing and memory capacity than previous CDNs. The AP60 uses an MC68LC060 processor.


Interframe Buffers

Interframe buffers (IFBs) are required to extend the parallel ring buses where the distance between adjacent ring nodes is greater than a few inches. In an IRN ring, the distance is 24 inches or more. Such internodal distances occur at the boundaries of frames or cabinets where the two rings must be extended by two lengths of cable. At times they may also occur within frames/cabinets. At these boundaries, an interframe-buffer circuit pack must be inserted at each end of the parallel cables, between the cables and the nodes that are separated by the cables.

Interframe-buffer circuit packs are always employed in pairs. Each member of a pair contains both send and receive circuitry. Therefore, the paired packs are mutually dependent, with each providing half of the buffering function for each parallel ring bus.

The following graphic illustrates the pairing of the interframe buffers.

Figure 2-3. Interframe Buffers

Thus, if either member of a pair fails, the pair fails.

In addition to providing necessary drive capability without slowing down the internodal byte transfer rate, interframe buffers in padded form may be used to increase the effective lengths of small rings, thereby permitting them to employ longer messages. For this purpose, two pairs of 4104-byte buffers may be inserted in small IRN rings. The pairs should be placed diametrically on the ring to minimize the possibility that both would be included in an isolation. If additional interframe buffers are needed, they should be of the standard 16-byte capacity.

The 16-byte capacity is adequate for use on large rings where employment of long messages requires no buffer padding. Technicians should ensure that the actual sizes of their interframe buffers correspond to the sizes entered in equipment configuration data (ECD). See "ECD Values for Interframe Buffers" in Appendix B, Ring Maintenance Reference Material.

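[Figure 2-3 labels: two ring interfaces (RI), each containing RAC 0 and RAC 1, joined through a pair of IFBs. Each IFB contains SEND and RCV circuitry, and the two IFBs are connected by two cables that carry ring 0 and ring 1.]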


Node Names and Addresses

Ring nodes are named as members of the group in which they reside. A group is composed of a maximum of 16 member nodes numbered 00 through 15. Node 00 is always reserved for an RPCN. Nodes 01 through 15 are reserved for other node-types. If a node position is unequipped, the member number is nevertheless reserved for the position.

Node names consist of a node-type identification followed by a 2-digit group number followed by a 2-digit member number. IUN32 10, for example, is an IUN, and it is member 10 (or the 11th node or node position) in group 32. RPCN00 0 is an RPCN, and it is member 0 (or the first node or node position) in group 00.

Member numbers and group numbers are assigned so that they increase in the direction of traffic flow on ring 0. Unlike member numbers, however, group numbers do not necessarily increase by consecutive integers. Thus, a ring might consist of groups 00, 01, 02, 32, 33, and 34, for example. In IMS usage, nodes are identified by the formula RPCNa b or IUNa b, where a is the 2-digit group number and b is the 2-digit member number.

In addition to names, nodes have identifications and physical addresses. (Nodes may also have virtual addresses, but technicians will not encounter or use them.) The identification, a number between 0 and 1023, represents the physical location of the node on the ring. The identification is calculated with the formula 16(a) + b, where a is the group number and b is the member number. The identification appears in decimal or hexadecimal form in various IMS output messages. It is also the address that is strapped on the back of each node by grounding the node ID pins. The pins, which are numbered 0 through 9, represent sequential binary weights (ID 0 = 1, ID 1 = 2, ID 2 = 4, ID 3 = 8, and so on). The sum of the binary weights of all grounded pins is the node identification.

The physical node address, a number between 3072 and 4095, is used in IMS message headers to identify the source and destination addresses of messages. The physical address is calculated by adding 3072 (or in hexadecimal notation, C00) to the node identification. The number 3072 corresponds to the two most significant bits in the 12-bit source- and destination-address fields of message headers, the lower 10 bits being the node identification. Tables in the reference chapter of this document provide translations of both identifications and physical addresses into node names. Technicians will encounter the hexadecimal form of the physical node address in messages output in response to phase 1 and 2 diagnostic failures.
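
The arithmetic described above can be checked with a short script. The following is a minimal sketch, not part of IMS; the function names are illustrative assumptions, and only the formula 16(a) + b, the offset of 3072 (hexadecimal C00), and the binary pin weights come from the text.

```python
ADDRESS_BASE = 0xC00  # 3072: the two most significant bits of the 12-bit address field


def node_identification(group: int, member: int) -> int:
    """Return the node identification 16(a) + b for group a and member b."""
    if group < 0 or not 0 <= member <= 15:
        raise ValueError("member must be 0-15 and group must be non-negative")
    ident = 16 * group + member
    if ident > 1023:
        raise ValueError("identification must fit in 10 bits (0-1023)")
    return ident


def physical_address(group: int, member: int) -> int:
    """Return the physical node address (identification + 3072)."""
    return ADDRESS_BASE + node_identification(group, member)


def grounded_pins(ident: int) -> list:
    """Return the ID pin numbers (0-9) whose binary weights sum to ident."""
    return [pin for pin in range(10) if ident & (1 << pin)]


def decode_physical_address(address: int) -> tuple:
    """Return (group, member) for a physical address between 3072 and 4095."""
    if not 3072 <= address <= 4095:
        raise ValueError("physical address must be between 3072 and 4095")
    return divmod(address - ADDRESS_BASE, 16)


if __name__ == "__main__":
    # IUN32 10: group 32, member 10
    ident = node_identification(32, 10)
    print(ident, hex(ident), grounded_pins(ident))
    print(physical_address(32, 10), hex(physical_address(32, 10)))
```

For IUN32 10 the identification is 16 x 32 + 10 = 522 (hexadecimal 20A), which corresponds to grounding pins ID 1, ID 3, and ID 9, and the physical address is 522 + 3072 = 3594 (hexadecimal E0A).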

Ring Message Format

The figure below illustrates the format of IMS messages as they appear on the 12-bit ring bus (the two control leads are not shown).

Figure 2-4. IMS Message Format

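[Figure 2-4 shows each message byte as a row with bit columns P, C, 7, 6, 5, 4, 3, 2, 1, 0. The first row carries the DC, RR, CF, and CC fields with C = 1. The rows that follow carry the word count, the source address (with SR), the destination address (with DR), data, and last data, each with C = 0.]

LEGEND

CC = Control Code
CF = Control Flag
RR = RAC Reset
DC = Destination Control
SR = Source Ring ID
DR = Destination R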


The illustration leaves blank fill bits and bits that are not examined by ring-interface hardware. The first 8 bytes constitute the message header. The first byte contains a 7-bit control field from which the RAC learns how to respond to the message. Within the first byte, the control code (CC) defines the message function. Functions are token, software, destroy, set/clear quarantine, set/clear isolation, and processor reset. The destination control (DC) identifies the address-type. Types are normal address match, general broadcast, selective broadcast, and take message. In addition to the 8 data-bits, there is a ninth bit, called the control or C-bit, which is always set to logic-one to identify the beginning byte of every message. From association with this feature, the entire first message byte is often referred to in documentation as the control or C-byte. The tenth bit is a parity bit which provides odd parity over the data byte and C-bit. When a RAC writes a message to the ring, it generates the C-bit and modifies the parity bit from node-processor memory to include the C-bit. When a RAC reads a message from the ring, the C-bit is removed and parity is changed back to its original form before being written to node-processor memory.
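
The odd-parity rule can be illustrated with a few lines of code. This is a sketch under stated assumptions rather than a description of RAC hardware: the function names are hypothetical, and the packing of the ten bits (P in bit 9, C in bit 8, data in bits 7 through 0) is assumed from the column order P, C, 7 ... 0 in Figure 2-4.

```python
def ring_parity_bit(data_byte: int, c_bit: int) -> int:
    """Return the parity bit that gives odd parity over data byte, C-bit, and P."""
    ones = bin(data_byte & 0xFF).count("1") + (c_bit & 1)
    # If the nine covered bits already hold an odd number of ones, P is 0.
    return 0 if ones % 2 == 1 else 1


def ring_word(data_byte: int, first_byte: bool) -> int:
    """Pack P, C, and the data byte into one 10-bit value (assumed layout)."""
    c_bit = 1 if first_byte else 0  # C = 1 only on the first byte of a message
    p_bit = ring_parity_bit(data_byte, c_bit)
    return (p_bit << 9) | (c_bit << 8) | (data_byte & 0xFF)


def parity_ok(word: int) -> bool:
    """Check that a 10-bit ring word contains an odd number of ones."""
    return bin(word & 0x3FF).count("1") % 2 == 1


if __name__ == "__main__":
    for byte, first in ((0x00, True), (0xFF, False), (0x5A, True)):
        word = ring_word(byte, first)
        print(f"{byte:#04x} first={first} -> {word:#05x} ok={parity_ok(word)}")
```

Because the C-bit takes part in the parity calculation, setting C to 1 on the first byte inverts the parity bit relative to a calculation over the data byte alone, which is why the RAC must adjust parity when it writes to or reads from the ring.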

The word count in the second message byte informs the RAC of the total number of 32-bit words in the message. Each message contains 4N bytes, where N is the value of this 7-bit word count. All messages are padded out to contain an integral number of 32-bit words. The longest possible message that can be placed on the ring is limited to the maximum value of this word count, which is 127 32-bit words (508 bytes) for rings that allow the short message and 543 32-bit words (2172 bytes) for rings that allow the long message. For explanations of conditions that permit short and long messages, see the discussion of interframe buffers above.
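
The relationship between payload size, padding, and word count can be sketched as follows. The helper names are illustrative assumptions; the 8-byte header, the 4N-byte rule, and the limits of 127 and 543 words are taken from the text above.

```python
HEADER_BYTES = 8          # the first 8 bytes constitute the message header
SHORT_MAX_WORDS = 127     # 508 bytes, rings that allow the short message
LONG_MAX_WORDS = 543      # 2172 bytes, rings that allow the long message


def message_words(payload_bytes: int) -> int:
    """Return N, the number of 32-bit words after padding header plus payload."""
    total = HEADER_BYTES + payload_bytes
    return -(-total // 4)  # ceiling division: pad up to a whole 32-bit word


def message_bytes(payload_bytes: int) -> int:
    """Return the padded message length, 4N bytes."""
    return 4 * message_words(payload_bytes)


def fits(payload_bytes: int, long_messages_allowed: bool) -> bool:
    """Return True if the padded message does not exceed the ring's word limit."""
    limit = LONG_MAX_WORDS if long_messages_allowed else SHORT_MAX_WORDS
    return message_words(payload_bytes) <= limit


if __name__ == "__main__":
    for payload in (0, 1, 500, 501, 2164, 2165):
        print(payload, message_words(payload), message_bytes(payload),
              fits(payload, False), fits(payload, True))
```

Under these assumptions, a 500-byte payload exactly fills a short message (127 words, 508 bytes), while a 501-byte payload pads out to 128 words and requires a ring that allows the long message.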

The third and fourth header bytes contain the source address, and the fifth and sixth header bytes contain the destination address. The ring-interface hardware performs address matching on the 12-bit node address and the 1-bit ring ID (that identifies which of the two rings is used for the message). The lower 10 bits of the ring address are referred to as the node identification. Each node is assigned a unique 10-bit node identification via the ID0-ID9 backplane straps.

This header information enables the RAC to determine message disposition and the source and destination addresses, to check for errors in parity, format, and message length, and to perform hardware control functions required for ring maintenance.

Reconfigurations

The types and number of nodes composing any ring are selected to meet the requirements of a specific user. Thus, only a ring whose components are fully in service may be thought of as properly configured. Yet rings must sometimes be temporarily reconfigured for such reasons as the need to repair or replace equipment. IMS reconfigures a ring by removing one or more nodes from service. Nodes that have been removed from service are ordinarily in one of two states. They may be quarantined or they may be isolated.

Node Quarantine

Quarantining a node consists of electrically severing the node processor from its associated ring interface, an action that prevents the node processor from communicating through or to the ring interface. However, the action does not prevent the 3B21D or other nodes from limited communications with the node processor, which they accomplish by setting registers in the ring interface. When a node is placed in quarantine, both RACs are set to forced-propagate mode, which allows them to continue propagating messages on the rings but prevents them from reading messages from or writing messages to the rings. Quarantining is the appropriate response to a fault that occurs in a node processor or in any of the auxiliary components of a node. Quarantining has the advantage over isolation in that it disturbs the ring subsystem only slightly.

Throughout this document the term "quarantine" is used solely to represent a node that is in the state described above and that is in the active ring. Nodes in isolation or nodes during initialization or recovery sequences may have their node processors electrically severed from their ring interfaces, which are in forced-propagate mode. Such nodes will not be called "quarantined" since they are not in the active ring.

Node Isolation

Quarantining a node insulates the active ring from faults or activities in the node processor and in auxiliary components. Isolating a node insulates the active ring from the entire node. It is achieved by converting the ring subsystem from one dual-ring structure to two single-ring structures. Of the two single-ring structures, one is the active segment that continues to transmit user messages, and the other is the isolated segment that contains the isolated node or nodes. Isolated segments do not have a token message. The following figure schematically represents an isolated ring.

Figure 2-6. Before (top) and After (bottom) Becoming a BISO or EISO Node

[Figure 2-6 labels: each panel shows a Ring 0 DS with RAC 0 and a Ring 1 DS with RAC 1. DS = Data Selector. The legend distinguishes the selected ring path from the unselected ring path.]

Because all nodes have this shunting capability, any node of any class can perform as a BISO or an EISO node. The nodes actually selected to perform these functions are determined by the location of the node(s)-to-be-isolated. The node selected to be the BISO node is ordinarily the first node upstream on ring 0 of the node(s)-to-be-isolated (and therefore the next lower-numbered node), and the node selected to be the EISO node is ordinarily the first node downstream on ring 0 of the node(s)-to-be-isolated (and therefore the next higher-numbered node). If more than one node must be isolated (a phenomenon called a multiple isolation), IMS software chooses to reconfigure the ring in such a way as to include the smallest number of nodes possible. Nodes included in a multiple isolation, not because they contain faults, but because they lie between faulty nodes, are called innocent victim nodes.
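
The selection rule can be sketched in code. The following is an illustration only, not the IMS algorithm: it assumes the ring is given as a list of node names in the direction of traffic flow on ring 0, and the function name and node names are hypothetical. It finds the shortest run of consecutive nodes that contains every faulty node, then reports the nodes immediately upstream (BISO) and downstream (EISO) of that run.

```python
from typing import List, Set, Tuple


def plan_isolation(ring: List[str], faulty: Set[str]) -> Tuple[str, str, List[str], List[str]]:
    """Return (BISO, EISO, isolated segment, innocent victims)."""
    n = len(ring)
    positions = sorted(ring.index(name) for name in faulty)
    if not positions:
        raise ValueError("no faulty nodes given")

    # Try each faulty node as the first node of the segment. The segment must
    # then run downstream as far as the faulty node that cyclically precedes it.
    best_start, best_length = positions[0], n + 1
    for k, start in enumerate(positions):
        last = positions[k - 1]          # k = 0 wraps to the last faulty node
        length = (last - start) % n + 1
        if length < best_length:
            best_start, best_length = start, length

    if best_length > n - 2:
        raise ValueError("no two distinct nodes remain to act as BISO and EISO")

    segment = [ring[(best_start + i) % n] for i in range(best_length)]
    biso = ring[(best_start - 1) % n]            # first node upstream on ring 0
    eiso = ring[(best_start + best_length) % n]  # first node downstream on ring 0
    victims = [name for name in segment if name not in faulty]
    return biso, eiso, segment, victims


if __name__ == "__main__":
    ring = ["RPCN00 00", "IUN00 01", "IUN00 02", "IUN00 03",
            "IUN00 04", "RPCN32 00", "IUN32 01", "IUN32 02"]
    print(plan_isolation(ring, {"IUN00 02", "IUN00 04"}))
```

In the example, faults in IUN00 02 and IUN00 04 produce an isolated segment of three nodes, with IUN00 03 included as an innocent victim because it lies between the two faulty nodes.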

The BISO and EISO nodes also provide the means by which maintenance messages are transmitted between the active and the isolated segments of an isolated ring. BISO and EISO nodes have one RAC participating in the active segment and one RAC participating in the isolated segment. Messages destined for either ring segment may be read from the sending segment by the EISO or BISO RAC participating in it, transmitted via the node processor to the RAC participating in the receiving segment, and then written to the receiving segment. It is by this means that diagnostic code is downloaded by the 3B21D into isolated nodes and diagnostic results are returned to the 3B21D.

Isolation is a more drastic means than quarantine for removing a faulty node from service. It is an appropriate response to a fault in the ring interface or in the medium between ring interfaces (this may be a fault that prevents messages from being propagated on the ring).

The Ring Config Module

When the ring is restarted or when an isolation is imposed or dissolved, the action is performed by the IMS ring config module whose principal acts are:

1. to inhibit the services provided by the message switch, thus preventing the nodes from writing to the ring, a condition known as ring silence

2. to set the data selectors of every node to positions that provide the desired ring structure

3. to test ring continuity, and, if continuity is good,

• to issue one token message, when the ring contains an isolation, or two token messages, when it does not

• to restart the message switch;

or, if continuity is bad,

• to abort and return control to the process that initiated ring config.
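
The control flow of the steps above can be summarized in a short sketch. This is not IMS code: the function name, the callable used for the continuity test, and the returned list of step descriptions are assumptions made for illustration.

```python
from typing import Callable, List


def ring_config(has_isolation: bool, continuity_ok: Callable[[], bool]) -> List[str]:
    """Return the ordered list of actions a ring config run would take."""
    actions = [
        "inhibit message switch (ring silence)",
        "set data selectors for the desired ring structure",
        "test ring continuity",
    ]
    if continuity_ok():
        # One token for a ring that contains an isolation, two when it does not.
        tokens = 1 if has_isolation else 2
        actions.append(f"issue {tokens} token message(s)")
        actions.append("restart message switch")
    else:
        actions.append("abort and return control to the initiating process")
    return actions


if __name__ == "__main__":
    for step in ring_config(has_isolation=True, continuity_ok=lambda: True):
        print(step)
```

Passing the continuity test as a callable keeps the sketch independent of how continuity is actually tested, which the text above does not describe.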

The ring config module may be executed by IMS initialization software, by Error Analysis and Recovery (EAR) software, by Automatic Ring Restoral (ARR) software, or by manual commands to change the structure of the ring. The processes mentioned here are described at length later in this document.

Level-4 IMS Initializations (FPI and Boot)

Level-4 (FPI) initializations begin with a limited initialization of IMS in the 3B21D as described above. Level-4 (BOOT) initializations begin with a full initialization of IMS in the 3B21D as described above. Both level-4s then proceed to initialize the ring with the following sequence of events:

1. RPCNs are downloaded with new operational code and placed in execution.

2. Each node is tested for the ability of its ring-interface hardware to propagate messages on the ring and for the functionality of its data selectors.

3. The ring config module is called to establish a ring structure based on the results of these tests.

4. With the new ring structure in place, tests are made to determine the ability of each unisolated IUN to read messages from, and write messages to, the ring. Nodes that fail the tests are quarantined.

5. All unquarantined and nonisolated nodes are downloaded with operational code and placed into execution. The downloading occurs by means of selective broadcast messages that allow parallel downloading of similar node-types. When downloading is done, the IMS initialization process is done, and the ring is up. IMS level 4s are accompanied by ring silence.

Even if no nodes are operational, IMS level 4 initialization completes so that technicians can conduct diagnostics in an attempt to manually correct the problem.

IMS initializations are reported on the ROP by the REPT IMSDRV INIT output message. This message reports first the completion of the critical stage of initialization and then the completion of the noncritical stage. Initialization of the ring and initialization or restarting of the IMS driver compose the critical stage. The noncritical stage consists of initializing such features in the 3B21D as display pages, measurements, and certain craft state reports.


Audits

The following information about IMS audits is offered chiefly because output messages concerning audits will occasionally appear on the ROP. Technicians should rarely have occasion to use the input commands that manually initiate them.

Central Node Control Audit (AUD CNC)

This is a routine audit that runs according to a user-specified schedule. IMS recommends a 15-minute interval. It also runs during level 0 and level 1A IMS initializations and in response to manual requests. The purpose of the audit is to find and correct inconsistencies in internal records that could interfere with the actions of automatic maintenance. The errors detected by this audit indicate mutilated internal data or other software problems, which often occur as side effects of other events, such as those reported by REPT IMSDRV FLT messages. The central node control audit attempts to correct an error by canceling the maintenance task associated with it. It does not verify that its action was successful. To verify that the error was corrected, a technician must run the audit again, using the AUD:CNC 1 input message.

If the central node control audit finds an error, it reports it in an AUD CNC output message. If it does not find an error, no output message is printed, unless the audit was manually requested. Problems in running the audit are reported in a REPT IMSDRV AUD message. Once started, the audit normally takes under 10 seconds to run.

Node State Audit (AUD NODEST)

This is a routine audit that runs according to a user-specified schedule. IMS recommends a 15-minute interval. It also runs during level 0 and level 1A IMS initializations and in response to manual requests. Its purpose is to detect and correct errors in the node availability map, which is used by software modules such as node audits to identify nodes whose major state is ACT. (See the discussion below of IMS maintenance states.) The audit compares the data in the node availability map with state data in the IMS driver and, when it finds inconsistencies, modifies the map to conform to the state data.

The errors detected by the node state audit indicate mutilated internal data or other software problems, which often occur as side effects of other events, such as those reported by REPT IMSDRV FLT messages. The audit's attempts to correct errors should always succeed. When the audit finds an error, an AUD NODEST output message is printed. When it does not find an error, no output message is printed, unless the audit was manually requested. Problems in running the audit are reported in a REPT IMSDRV AUD message.

Node Audit

An automatic, internal audit of nodes allows maintenance software in the 3B21D to continuously monitor the health of the ring and all ring nodes. The node audit is run routinely every few seconds. By this means, the 3B21D verifies that each active node is operating correctly, checks the communication paths of both rings, and finds nodes that have quarantined themselves or that need to be quarantined. The work of the node audit is transparent to technicians and users of IMS, unless it detects a problem that causes a node to be removed from service.


Contents

Manual Ring Maintenance 3-25

• Ring Maintenance Interfaces 3-25

Alarms 3-25

Critical Alarms 3-25

Major Alarms 3-25

Minor Alarms 3-26

Special IMS Indicators 3-26

Display Pages 3-28

Page 1105 The Ring Status Summary Page 3-28

Page 1106 The Ring Node Status Page 3-32

• Ring Diagnostics 3-36

Obtaining Diagnostic Results 3-37

Diagnostic Listings 3-38

Using Diagnostics 3-39

• Guide to Critical Ring Maintenance 3-39

IMS Input Messages 3-40

Critical Maintenance Procedures for Nodes 3-42

Critical Maintenance Procedures for Nodes in Isolation 3-47

Low-Phase Ambiguity 3-48

Guideline to Single-Node Isolations 3-51

Guideline to Multiple-Node Isolations 3-53

Responding to Ring Down 3-56

Employing Manual Ring Mode 3-58

Ring Application Processor Critical Maintenance Procedure 3-59

Recognizing and Finding Intermittent Faults 3-63

Other Suggestions for Troubleshooting 3-64

New Circuit Pack; Old Failure 3-64

Unconditional Restorals 3-65

Unexplained Loss of Token 3-65

Avoiding Trouble 3-65

Recording Trouble 3-65

New Installations or Ring Growth 3-66

Examples of Ring Maintenance 3-66

• Responses to Single, Ring-Related Faults 3-67

Automatic Recovery from a Transient Fault by EAR Level 0 3-67

Manual Recovery from a Hard Fault 3-70

Automatic Recovery from a Transient Fault by ARR 3-75

Manual Recovery from a Hard Fault on a Small Ring 3-78

• Responses to Multiple, Ring-Related Faults 3-85

Manual Recovery from Multiple Hard Faults 3-85

Automatic Recovery from Two Intermittent Faults 3-101


with nodes means the ring can respond to faults by removing nodes from service, either by quarantining or isolating them. The type of reconfiguration chosen depends on the impact of the fault. If the impact is confined to the internal operations of the node, then the node will be quarantined. But if the fault has disrupted operation of the ring, then the node associated with the fault will be isolated. Automatic node quarantine occurs in response to instructions from the node processor of the faulty node or from the 3B21D. Automatic node isolation occurs when the ring config module is called with instructions to set the data selectors in positions that create an isolated segment.

Reinstatement will succeed in response to most soft faults, while most hard faults require reconfiguration. Soft faults are transient hardware problems or glitches in software, either of which is likely to be temporary. Soft faults may often be corrected simply by resuming operation of the system or of the component they have disrupted. (Sometimes, however, the effects of soft faults are sufficiently severe that recovery requires reconfiguration.) By contrast, hard faults are failures in hardware or software which, once manifested, are likely to persist until they or their causes are corrected.

Both reinstatement and reconfiguration provide rapid recovery, with the former usually being faster but less rigorous. When confronted with a fault in the ring subsystem, ring maintenance software must always choose to resume operation by one of these two means. When its first choice is reinstatement, and that choice fails to achieve a stable and usable ring, it next tries reconfiguration. When, on the other hand, its first choice is reconfiguration, reinstatement will not ordinarily follow, since reconfiguration, being the more thorough action, should succeed in all but the rarest cases.

Reconfiguration precipitates the third type of recovery action employed by ring maintenance, node restoral. Node restoral occurs after operation of the reconfigured ring has resumed. It begins with ring maintenance software testing quarantined or isolated nodes to determine how best to treat them. In some cases, it can and does return them to service by automatic means. When it cannot or does not return them to service, it alerts technicians to repair or replace them and then to return them to service manually.

Reinstatement and reconfiguration occur automatically. The work of node restoral also begins with automatic procedures, which give way to manual means only if the automatic procedures fail repeatedly or if diagnostics reveal a hard fault. Thus the usual role of technicians is to support ring maintenance by manually completing tasks software has begun. In some instances, however, manual intervention in the automatic machinery may be indicated.

The organization of the next two chapters reflects the operational division between automatic and manual ring maintenance. The next chapter describes the maintenance procedures that occur automatically, and the chapter that follows explains the related responsibilities of technicians.

Ring Maintenance

Automatic Ring Maintenance

In the strategy of automatic ring maintenance described above, error analysis and recovery (EAR) software performs the nondeferrable task of reinstating or reconfiguring the ring, while automatic ring recovery (ARR) software performs the deferrable task of node restoral. The following explanation of automatic ring maintenance begins with EAR, and then proceeds to ARR.

EAR or Ring Recovery

This discussion of EAR describes events in the order of their occurrence. EAR recognizes the existence of a fault from audits or by detecting errors in message format or message delivery. The work of error detection occurs chiefly in the nodes, which report errors to EAR in the 3B21D. EAR in the 3B21D then analyzes the errors to determine the type and location of the fault. Its analysis distinguishes between ring-related faults that obstruct the transportation of messages on the ring and node-related faults that prevent the processing and transmission of messages within nodes. Based on this information, together with its knowledge of the current ring structure, it decides whether to reinstate or reconfigure the ring. Ring reinstatement and reconfiguration are achieved by overlapping mechanisms, and these mechanisms are also discussed below.

Error Detection

The ring assumes that faults will produce errors in message format or message delivery, so it searches for faults by looking for errors. Errors may occur as messages are propagated on the ring (that is, within ring interfaces or in the medium between ring interfaces), as messages are transmitted or processed by node processors or auxiliary components, or as messages are transmitted between the ring and the 3B21D.

The task of detecting and reporting errors is assigned chiefly to the ring nodes. By means of circuitry in their ring interfaces and software in their node processors, nodes are usually able to detect errors internal to themselves. Moreover, by means of failures in message delivery, nodes can often detect external errors, that is, errors occurring in association with other nodes. When a node detects an error, it will, if it can, report the error to the 3B21D for analysis.

An error associated with a fault that disrupts traffic on the ring is ordinarily first detected by the circuitry of the ring interface. Every ring interface contains circuits for checking parity on the ring path as well as for detecting format errors in the messages it reads, writes, and propagates. When a ring-interface circuit detects an error, it informs its node processor by means of an interrupt. The node processor then interrogates the ring-interface hardware to determine the cause of the problem and reports, if it can, the identity and location of the error to the 3B21D via one or both rings.

An error associated with a fault that prevents the transmission or processing of messages within nodes will usually be detected by the node processor. Such an error is typically caused by a fault in the node processor or by a node-processor detectable fault in one of the auxiliary components. From some errors of this type, nodes can recover immediately by means of local reinstatement. They may, for example, be able to restart an attached processor that has incurred an error.

Usually, however, reinstatement is not possible, and the node processor responds to the error by placing itself in quarantine, a condition that prevents it from reporting its state to the 3B21D. Instead the 3B21D usually learns of the condition from a report made by the first node that attempts to send a message to the quarantined node. During normal operation, messages are read from the ring by the destination node. A node in quarantine, however, cannot read messages. Instead, a message addressed to it will, after traversing the entire ring, be detected and removed from the ring by the sending node, which will understand this condition as a SOURCE MATCH error and report it to the 3B21D. If a source match fails to materialize, however, or if an injured node processor is unable to quarantine itself, the condition will be detected by a node audit and reported to the 3B21D, which responds, if needed, by quarantining the disabled node.
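
The source-match condition can be illustrated with a toy model of message circulation. This sketch is an assumption-laden simplification, not a model of RAC behavior: the ring is a plain list of node names in traffic order, only one ring is modeled, and the function name and result strings are hypothetical.

```python
from typing import List, Set


def send(ring: List[str], src: str, dst: str, quarantined: Set[str]) -> str:
    """Walk a message downstream from src and report what happens to it."""
    n = len(ring)
    start = ring.index(src)
    for step in range(1, n):
        node = ring[(start + step) % n]
        # A quarantined node only propagates; it cannot read the message.
        if node == dst and node not in quarantined:
            return "DELIVERED"
    # The message traversed the whole ring and returned to its sender.
    return "SOURCE MATCH"


if __name__ == "__main__":
    ring = ["RPCN00 00", "IUN00 01", "IUN00 02", "IUN00 03"]
    print(send(ring, "RPCN00 00", "IUN00 02", set()))
    print(send(ring, "RPCN00 00", "IUN00 02", {"IUN00 02"}))
```

In the second call the destination is quarantined, so the message passes through it unread and comes back to the sending node, which is the condition the sending node reports as a SOURCE MATCH error.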

Source-match errors are one of two means by which ring nodes detect errors external to themselves. The other is ring blockage. Blockage is the condition that exists when an upstream node cannot propagate data to its downstream neighbor. Every node has a timer on the output of each of its two ring paths. The timer expires if a byte of data being offered by the upstream node is not taken by the downstream node within a specified interval. Expiration of the timer implies a problem in the downstream node, for a node processor ordinarily reacts to an error that implicates its ring interface by forcing blockage on its ring input path. In this context, all interconnections between nodes, including interframe buffer circuits, are considered part of the downstream node. When a node processor detects blockage, it immediately drains the ring of any remaining data, including the token message, and reports the blockage to the 3B21D via the alternate ring. [1]

Errors may also be detected during the testing phase of ring initialization. Testing, which is more extensive in level-4 than in level-3 initializations, is in neither of these levels of initialization so detailed as in diagnostics. Nevertheless, errors

[1] The node that first detects blockage drains the ring to avoid confusing the 3B21D as to which node is immediately upstream of the faulty node. If it did not drain the ring, mass congestion would ensue, causing many upstream nodes to experience and report blockage. Even so, the initial blockage condition will often trigger two or three upstream blockage reports before the ring can be drained.


EAR Ring Recovery Intervals and Output Messages

In this document error messages have been classified according to whether they indicate a ring-related fault (a fault that obstructs the transportation of messages on the ring) or a node-related fault (a fault that prevents the processing or transmission of messages within nodes). A message of the first class is usually followed by ring restarts and, if restarts fail, by node isolation. A message of the second class is usually followed by node quarantine. A third class of messages exists that results in no change in ring or node connectivity.

All three message types (including the third class) are reported, usually by nodes to the 3B21D, which in turn formats them and sends them to the MCRT and ROP as REPT RING TRANSPORT ERR messages. A descriptive list of these messages is included in Appendix B, Ring Maintenance Reference Material. The most common ring transport errors, the error types that technicians should probably know well, are:

• Blockage

• RAC Parity/Format Error

• Interframe Buffer Parity Error

• Source Match and SRC Match

• NAUD Failure, and

• Unexplained Loss of Token.

The outages that occur during ring recovery actions are chiefly the result of ring silence. Ring silence is a condition imposed upon the nodes while the ring is restarting, initializing, or reconfiguring to achieve an isolation. During ring silence the nodes are not permitted to write to the ring. Although the actions of the IMS ring config module to restart the ring or to achieve an isolation require only a brief period of ring silence, the periods of silence required by continuity tests are significantly longer. Nevertheless, most EAR ring recovery attempts will be completed very rapidly. The lower levels of EAR escalative recovery actions are brief. A level 0, 1, or 2 recovery attempt may take up to 1 second to complete, while a level 3 attempt will usually take from 1.3 to 2 seconds. The soak periods of levels 4 and 5 make them somewhat more expensive. Typically, a level 4 attempt consumes 11 to 14 seconds and a level 5 attempt 90 seconds to 3 minutes, depending on ring size.

5 Overall system tolerance to these partial ring outages depends on the application. Where applications require very high availability of a particular user-node function, that function can be replicated on two or more nodes. By spacing these nodes equally around the ring, at least one member of the set should remain in the active ring segment for most cases of multiple ring faults.


The brevity of all but the longest of these ring recovery attempts means that technicians will ordinarily learn of them after they have completed. Moreover, with one exception, it is the practice of the 3B21D to queue error messages and send them to the MCRT only after the recovery level to which they apply has completed its attempt to return the ring to service. Technicians may infer, however, that a high-level recovery attempt is underway from previous output messages indicating failed recovery attempts at lower levels, as well as from the blinking of the "no token" lights on the circuit packs of all ring nodes, indicating that tests are occurring.

The output messages concerning each ring recovery attempt will usually consist of the following items of information in the order shown:

1. A REPT RING CFR message announcing a specific level of EAR recovery attempt.
2. If the attempt was successful, a REPT RING CFR message indicating that the ring has been configured and identifying the new ring structure.
3. If the attempt was unsuccessful, a REPT RING CFR message indicating the reason for failure.
4. Separate REPT RING TRANSPORT ERR messages identifying each error that was received by the 3B21D in response to the fault that gave rise to the recovery attempt.

Notice that REPT RING TRANSPORT ERR messages ordinarily appear on the MCRT and ROP following the REPT RING CFR messages to which they apply. Yet, because each of these message types is stamped in milliseconds by the real-time clock, it is possible to confirm their relations. The real-time stamp on a REPT RING CFR message indicates the completion time of the attempt being reported. The real-time stamp on a REPT RING TRANSPORT ERR message indicates the time the report arrived at the 3B21D from a ring node. Remembering that, after receiving a ring transport error report that may lead to node isolation, the 3B21D observes a listening period of 100 milliseconds before analyzing its reports and acting upon them, technicians can reconstruct system events.
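Reconstructing events from the time stamps amounts to re-sorting the printed messages. The Python sketch below shows one way to do that by hand or in a log-analysis script. The record layout, the kind labels "CFR" and "TRANSPORT_ERR", and all function names are assumptions made for this illustration; they are not IMS data structures.

```python
from dataclasses import dataclass

LISTENING_PERIOD_MS = 100  # listening period quoted in the text


@dataclass(frozen=True)
class OutputMessage:
    kind: str      # "CFR" or "TRANSPORT_ERR" (labels chosen for this sketch)
    stamp_ms: int  # real-time stamp in milliseconds
    text: str


def chronological(messages):
    """Return the messages ordered by time stamp rather than by print order."""
    return sorted(messages, key=lambda m: m.stamp_ms)


def errors_behind(cfr, messages):
    """Transport errors that arrived no later than the completion time of cfr."""
    return [m for m in chronological(messages)
            if m.kind == "TRANSPORT_ERR" and m.stamp_ms <= cfr.stamp_ms]


def same_listening_window(first, other):
    """True if other arrived within the listening period opened by first."""
    return 0 <= other.stamp_ms - first.stamp_ms <= LISTENING_PERIOD_MS
```

Error reports that fall within one listening window of each other were, on this reading, analyzed together before the recovery attempt began.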

One exception exists to the rule that the 3B21D queues error messages until the completion of the recovery attempt to which they give rise. If the 3B21D receives a loss-of-token report, then waits the 100-millisecond listening period without receiving another error report, it immediately reports REPT RING TRANSPORT ERR/UNEXPLAINED LOSS OF TOKEN to the MCRT and ROP before jumping to a level-3 recovery attempt. Therefore, in this single case the 3B21D reports events in the order of their occurrence. There is no time stamp on messages announcing loss of token.

Though quarantining a node reconfigures the ring, it is not accomplished by the ring config module and, therefore, produces no REPT RING CFR output message. Instead, technicians learn that a node has become quarantined from RMV RPCN or RMV IUN output messages and from indicators on display pages. Also, when a node experiences a fault that leads to quarantine, it attempts to send a message to the 3B21D identifying the type of error that occurred. Currently EAR does not use the message for fault analysis. It does, however, report the error on the MCRT and ROP in the second line of a REPT ERROR output message. In the event of an intractable problem, technicians should record and report this line. The line will indicate, among other matters, whether the error was soft (requiring no system action), firm (requiring a restart), or hard (requiring a repump of the node software).

ARR or Deferrable Node Recovery

Fundamental to the recovery strategy of automatic ring maintenance is the complementary action of ARR to EAR software. When EAR reconfigures a suspected fault out of the ring, either by quarantining or isolating a node, ARR assumes its responsibility of either returning the node to service or, if it determines that the node should not be returned to service, of directing technicians to repair or replace its faulty equipment and then return it to service manually. ARR determines not to return a node to service when the node has failed diagnostics or when it has become a chronic problem. After either of these events, ARR immediately surrenders control of the node to technicians, whose responsibility it becomes to perform maintenance on it manually.

Overview of ARR Treatment of Out-of-Service Nodes

ARR can return nodes to service by restarting or restoring them. The two methods are achieved under different circumstances and according to different rules.

Node restarts can occur only when a node has quarantined itself. Upon detecting an error in its node processor or in an auxiliary component, a node in the active ring attempts to quarantine itself. It then, in response to most error-types, runs an internal audit to test the integrity of its node-processor operational code and, if the audit passes, attempts, with the assistance of the 3B21D, to restart itself. (If the node is an extended IUN, it will audit the operational code of the attached processor as well.) A restart is done without downloading code. Rather, the node finds a safe place in its current code and places it in execution. A successful restart results in the node being returned to service almost immediately.6 On the other hand, if a node with a faulty node processor or auxiliary component is unable to detect internal faults, unable to quarantine itself, unable to pass an internal audit, or unable to restart after one attempt, the 3B21D will detect its disabled condition, and if it is not already quarantined, quarantine it. Then ARR in the 3B21D will restore the node to service.

6 In response to a few error-types, however, a self-quarantined node does not attempt to restart itself but waits for the 3B21D to detect its state and to return it to service by restoring it in the manner described below.

ARR restores a node by downloading it with new operational code and placing the code into execution. Nodes may be restored either unconditionally without being previously diagnosed or conditionally by having their return to service depend on their passing all automatically-run diagnostic tests.
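The choice between a restart and a restoral described above can be summarized as a single decision. The Python sketch below is a reading aid, not IMS logic: the field names and the function are hypothetical, and the error_allows_restart flag stands in for the "few error-types" of footnote 6, which the text does not enumerate.

```python
from dataclasses import dataclass


@dataclass
class NodeCondition:
    self_quarantined: bool             # the node managed to quarantine itself
    audit_passed: bool                 # internal audit of operational code passed
    restart_attempts_failed: int = 0   # restart attempts that did not succeed
    error_allows_restart: bool = True  # False for the error types of footnote 6


def recovery_method(node: NodeCondition) -> str:
    """Return "restart" (no code download) or "restore" (new code download)."""
    can_restart = (
        node.self_quarantined
        and node.error_allows_restart
        and node.audit_passed
        and node.restart_attempts_failed == 0
    )
    return "restart" if can_restart else "restore"
```

Any condition that prevents a restart falls through to a restoral, which matches the text: the 3B21D detects the disabled node, quarantines it if necessary, and ARR restores it.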

Maintenance States

ARR is driven to do its work by system indicators called IMS maintenance states. Maintenance states identify the operational mode of the ring and the operational mode, functionality, and condition of each ring node. They are determined and announced by programs in the 3B21D, mainly by EAR software.

In addition to driving ARR to do its work, maintenance states serve as a primary source of system information for IMS users and for technicians, who should always consult them before taking any manual action. Technicians may learn of current maintenance states from the IMS 1106 display page or from the OP:RING command. They should keep in mind that because maintenance states represent the central processor's knowledge of a distributed system, this knowledge under certain conditions may be temporarily incorrect. A node processor, for example, is allowed to quarantine itself if it detects certain irregularities in its software, but the 3B21D may not learn of this change of state until it has conducted a node audit or received a source match error.

The following are the different classes of maintenance states:

• Ring state
• Node major state
• Node minor state: ring position
• Node minor state: ring interface
• Node minor state: node processor
• Node minor state: maintenance mode.

These states are explained below.

Ring States

The ring state identifies the current operational mode of the ring. The following states are possible:

• Ring Normal - This state represents the two-ring configuration, with one ring serving as the active path that chiefly transmits user messages and the other serving as a standby path that may also transmit administrative and maintenance messages. A normal ring contains no isolated segment, but it may contain quarantined nodes.
• Ring Isolated - In this state the ring contains an isolated segment. The nodes that bound the isolation are active and are identified as the beginning-of-isolation (BISO) and the end-of-isolation (EISO) nodes. Any node, including an RPCN, may act as a BISO or an EISO node. The ring cannot contain more than one isolated segment.
• Ring Restoring - When Ring Restoring appears as a transitory state, it indicates a condition that occurs very briefly during ring reconfiguration. When Ring Restoring appears as an extended state, it indicates the responses of automatic maintenance to a failed BISO or EISO node. When a BISO or EISO node experiences a node-processor failure, critical node recovery (CNR) software first attempts to conditionally restore it. (Restoral software knows to run only those diagnostic phases that do not require isolation.) If the conditional restoral fails, ring config extends the isolated segment to include the faulty node. Attending to a failed BISO or EISO node is the highest priority activity of ARR/CNR.
• Ring Configuring - In this state the ring is initializing, restarting, being reconfigured to isolate or unisolate one or more nodes, or engaged in one or more levels of EAR escalative recovery action.
• Ring Down - Chief among conditions that cause the ring to go down are when the 3B21D cannot communicate with it through any RPCN or when it is so fragmented by faults that EAR cannot define an active segment long enough to satisfy the criterion for minimum length. The first condition is most likely to occur when, in a two-RPCN environment, one RPCN has been manually taken out of service, after which the other experiences a failure in its 3B interface or duplex dual serial bus selector. During the time the ring is down, it is possible in some applications of IMS that all IUNs will continue to receive and transmit messages on the ring.7 For a fuller discussion of this matter, see the section ``Responding to Ring Down'' in this chapter.

7 Technicians probably have no way of confirming this to be the case.

Node Major States

The node major state identifies the current operational mode of each node. The following states are possible:

• ACT - Active. An active node is on-line and capable, unless the ring is silenced or configuring, of performing all required functions. An active node is neither quarantined nor isolated. In this document, the expression ``to return a node to service'' means to give it ACT status.
• OOS - Out of service. An out-of-service node is unavailable for certain uses. The uses depend upon whether the node is quarantined or isolated. If the ring position (see below) of an out-of-service node is NORM, then the node is quarantined and can propagate messages on the ring, although it cannot read, write, or otherwise process messages. If the ring position of an out-of-service node is ISOL (isolated), the node is entirely excluded from the active ring. Nodes in either OOS state are ordinarily able to receive and transmit only maintenance information and instructions.
• STBY - Standby. This designation is used for RPCNs only. It indicates that a healthy RPCN is prevented from doing its work by the circumstance that the ring is down or configuring. It also appears as a transitional condition when an RPCN is being grown and during system-wide initializations.
• INIT - Initializing. The attached processor of an extended node is being restarted or restored. The INIT state occurs as the second stage of restarting or restoring extended nodes. In the first stage, the node processor is restarted or, in the case of restorals, downloaded with operational code and set to executing. In the second or INIT stage, the attached processor is treated similarly. For DLNs the second stage also includes tests of the DMA channel.
• OFL - Off-line. The node is quarantined out-of-service preliminary to being assigned a role in the active ring. Nodes should not be allowed to remain long in this condition, because their quarantined state prevents their node processor from fulfilling its important and unassignable role of error detection and reporting.
• GROW - Grow. The node is physically being added to or removed from the ring. During growth or degrowth, the node must always be isolated.
• UNEQ - Unequipped. Either the unequipped node has no hardware, or ring connections physically bypass it. Still, a place holder for the node exists in IMS software.

Node Minor States: Ring Position

The ring position of each node indicates its function within the current structure of the ring. The following are the four possible ring positions.

• NORM - Normal. The node is included in the active ring and is neither a BISO nor an EISO node. A node in the NORM state may be quarantined; if it is quarantined, its node major state will be OOS or OFL.
• BISO - The node is included in the active segment of an isolated ring and bounds the beginning of the isolated segment.
• EISO - The node is included in the active segment of an isolated ring and bounds the ending of the isolated segment.
• ISOL - Isolated. The node is contained in the isolated segment of an isolated ring. Its node major state will be OOS or OFL.
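Because the same OOS major state covers both quarantined and isolated nodes, the ring position has to be read together with the major state. The following Python sketch makes that reading explicit. The function name and the returned labels are chosen for this illustration only; the state mnemonics are those defined in the text.

```python
def node_condition(major_state: str, ring_position: str) -> str:
    """Interpret a node major state together with its ring position."""
    if ring_position == "ISOL":
        # Excluded from the active ring, whatever the major state.
        return "isolated"
    if ring_position == "NORM" and major_state in ("OOS", "OFL"):
        # Still propagates ring messages but cannot read or write them.
        return "quarantined"
    if major_state == "ACT":
        # Includes BISO and EISO nodes, which bound an isolation but are active.
        return "active"
    return "other"


print(node_condition("OOS", "NORM"))
```

A technician reading the 1106 page can apply the same two-step check: look first at the ring position, then at the major state.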

Node Minor States: Ring Interface

This state characterizes for each node the current condition of its ring interface.

• USBL - Usable. This is the default state. In other words, IMS regards ring-interface hardware as usable unless it has received an error message or a diagnostic result, or has detected a ring condition, indicating otherwise.
• QUSBL - Quarantine-usable, that is, usable by the ring to propagate data but not usable by the node processor, which is insulated from the ring as in the quarantine (OOS NORM) state. IMS sets ring-interface hardware of any node to QUSBL when diagnostics find or suspect a fault in the ring interface that does not prevent it from propagating messages on the ring. A node that fails only diagnostic phase 10, for example, would be set to QUSBL. When, under these circumstances, a ring interface is set to QUSBL, IMS unisolates the node if possible, quarantines it, and changes its maintenance mode (see below) to manual. Before diagnostics or other maintenance functions are performed on the ring interface of the node, however, the node must be isolated.

  IMS sets the ring interface of an IUN to QUSBL and the node processor to FLTY when, during a level-4 initialization, the node fails a communication test of its ability to receive downloaded code. If this occurs, the ring will return to service with the node in question quarantined and in the automatic maintenance mode.

  IMS sets the ring interface of a node to QUSBL as a way of unisolating a node that is suspected of being faulty but that, as a member of an isolated segment, has passed phases 1 and 2 diagnostics without being subjected to further diagnostic phases.
• FLTY - Faulty. The 3B21D has received information indicating that the ring-interface hardware is faulty. Thus the node is, or is about to be, isolated.
• UNTSTD - Untested. The minor states of nodes are maintained in core memory only, not on disk or in ECD. Therefore, during a level 3 or level 4 initialization, the system loses knowledge of the ring-interface states of out-of-service nodes and must retest them. The testing is done during initialization, during which time their ring-interface states will briefly be UNTSTD.
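The four ring-interface states differ mainly in who may use the interface. The Python sketch below encodes that distinction as a reading aid. The enum, the helper names, and the choice to return None for an untested interface are assumptions of this sketch, not IMS behavior.

```python
from enum import Enum
from typing import Optional


class RingInterface(Enum):
    USBL = "usable"
    QUSBL = "quarantine-usable"
    FLTY = "faulty"
    UNTSTD = "untested"


def may_propagate(state: RingInterface) -> Optional[bool]:
    """Can the ring pass data through this interface? None means not yet known."""
    if state in (RingInterface.USBL, RingInterface.QUSBL):
        return True
    if state is RingInterface.FLTY:
        return False  # the node is, or is about to be, isolated
    return None       # UNTSTD: awaiting the retest done during initialization


def usable_by_node_processor(state: RingInterface) -> bool:
    """Only a fully usable interface is known to be open to the node processor."""
    return state is RingInterface.USBL
```

The QUSBL case is the one worth remembering: the ring can still propagate data through the node, but the node processor stays insulated from the ring.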


Node Minor States: Node Processor

This state characterizes for each node the condition of the node processor and/or of the auxiliary components.

• USBL - Usable. This is the default state. In other words, IMS regards node processors and auxiliary components as usable unless it has received an error message or a diagnostic result, or has detected a ring condition, indicating otherwise.
• FLTY - Faulty. The node processor and/or one or more auxiliary components is known or suspected to be faulty. The 3B21D sets the node-processor state to FLTY when it receives error messages implicating the node processor or an auxiliary component. It also sets the state to FLTY when it learns that a node has quarantined itself. Nodes ordinarily quarantine themselves when they detect a problem in their node processors or in an auxiliary component. Thus the node-processor FLTY state does not necessarily mean that a problem is in the node processor. It could be in the node processor or in any of the auxiliary components of the node.
• UNTSTD - Untested. Node minor states are maintained in current memory only, not on disk or in ECD. Therefore, during a level-3 or level-4 ring initialization, the system loses knowledge of the node-processor states of out-of-service nodes and must retest them. The testing is done during initialization, during which time their node-processor states will briefly be untested.8

Node Minor States: Maintenance Mode

The maintenance mode of a node is always either automatic or manual.

• AUTO - Automatic. In this mode a node is under control of IMS software. Nodes in the ACT state are always under automatic control. Nodes in the OOS state are under automatic control as long as ARR software is acting upon them.
• MAN - Manual. This mode indicates that an out-of-service node is under the control of technicians. Control will change to manual because of the following:

8 If, during ring initialization, a fault occurs requiring an isolation that includes innocent victim nodes, the node-processor hardware of the innocent victims might not have been tested before the isolation occurred and could not be tested during the isolation. In this case, the innocent victims would be quarantined, their ring-interface states set to usable, and their node-processor states set to untested. Then, when the isolation is dissolved, ARR, assuming that UNTSTD equals USBL, returns the nodes to service in accordance with its standard algorithm which is explained below.

Table 3-1. Node Problems Mapped to Maintenance States and EAR Actions (Page 2 of 2)

NODE PROBLEM                                    EAR ACTION              NODE STATE  RING POSITION  RI STATE  NP STATE  MAINT. MODE
Faulty NP or auxiliary component and faulty RI  Isolate the node        OOS         ISOL           FLTY      FLTY      AUTO
Needed to begin an isolation                    Configure as BISO node  ACT         BISO           USBL      USBL      AUTO
Needed to end an isolation                      Configure as EISO node  ACT         EISO           USBL      USBL      AUTO
Untested NP                                     Quarantine the node     OOS         NORM           USBL      UNTSTD    AUTO

Three ARR Rules

In attempting to restore out-of-service nodes, ARR observes the following three rules:

• Restoral priorities rule
• One-restoral-at-a-time rule
• Fourth-time rule

Procedure 3-1. Restoral Priorities Rule

If several nodes are simultaneously out-of-service and still under automatic control, ARR acts to restore them in the order shown below:

1. Inactive BISO and EISO nodes
2. Nodes whose ring-interface state is FLTY (isolated) (In 3.4 and later generics, application-nominated critical nodes with faulty ring interfaces are restored before other nodes with faulty ring interfaces.)
3. Innocent victim RPCNs (isolated)
4. Application-nominated critical nodes with high priority (quarantined)
5. Other RPCNs (quarantined)
6. Application-nominated critical nodes with low priority (quarantined)
7. Innocent victim IUNs (isolated)
8. Other IUNs (quarantined)

Nodes awaiting ARR restoral efforts may be contained in the active ring segment; or they may be contained in, or as BISO and EISO nodes associated with, the isolated segment. Because ARR's highest priority is to dissolve isolations, it deals first with nodes contained in or associated with an isolated segment. First, it attempts to return to service any node that has become inactive after being designated a BISO or EISO node.9 Next, it attempts to restore nodes that, by virtue of having faulty ring interfaces, are responsible for the isolation. Then, it restores healthy nodes that were victims of the isolation. Finally, having dissolved the isolation by restoring all isolated nodes, ARR turns to restore any quarantined nodes. The restoral priority list does not apply to node restarts, however, which occur independent of, and may occur in parallel with, node restorals.
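The restoral priorities rule is in effect a sort by category. The Python sketch below orders a set of waiting nodes by the eight positions of the priority list. The category labels are shortened paraphrases of the list entries, and the node names in the usage example are invented for illustration.

```python
# Categories in restoral order, paraphrased from the priority list (1 = first).
RESTORAL_PRIORITY = (
    "inactive BISO/EISO node",
    "faulty ring interface (isolated)",
    "innocent victim RPCN (isolated)",
    "critical node, high priority (quarantined)",
    "other RPCN (quarantined)",
    "critical node, low priority (quarantined)",
    "innocent victim IUN (isolated)",
    "other IUN (quarantined)",
)
RANK = {category: position
        for position, category in enumerate(RESTORAL_PRIORITY, start=1)}


def restoral_order(nodes):
    """nodes: iterable of (name, category). Returns names, first to restore first."""
    return [name for name, category in sorted(nodes, key=lambda n: RANK[n[1]])]


waiting = [
    ("node-a", "other IUN (quarantined)"),
    ("node-b", "innocent victim RPCN (isolated)"),
    ("node-c", "inactive BISO/EISO node"),
]
print(restoral_order(waiting))
```

Python's sort is stable, so nodes in the same category keep the order in which they were reported. The text does not say how ARR breaks such ties.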

The One-Restoral-at-a-Time Rule

When ARR undertakes to restore a node, whether conditionally or unconditionally, it cannot begin to restore another until any current restoral effort is completed or terminated. To conditionally restore a node, ARR must request that the RTR Maintenance Input Request Administrator (MIRA) do the job.10 To unconditionally restore a node, ARR does not use MIRA but performs the work itself.

Application-Nominated Critical Nodes. The rule that ARR cannot begin to restore a node until its previous restoral attempt completes has one exception. When an application-nominated critical node requires restoral, ARR aborts an ongoing restoral attempt in favor of the critical node, provided that the critical node is higher on the restoral priority list than the node currently being restored. Application-nominated critical nodes occupy the fourth and sixth positions on the list.
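The exception for application-nominated critical nodes reduces to a comparison of list positions. In the Python sketch below, positions are numbered 1 through 8 as in the restoral priority list, with 1 the highest priority. The function name is hypothetical.

```python
# Positions of application-nominated critical nodes on the priority list.
CRITICAL_POSITIONS = {4, 6}


def should_preempt(current_position: int, incoming_position: int) -> bool:
    """True if an ongoing restoral is aborted in favor of the incoming node.

    Only an application-nominated critical node can preempt, and only when it
    stands higher on the list (a smaller position number) than the node being
    restored.
    """
    return (incoming_position in CRITICAL_POSITIONS
            and incoming_position < current_position)
```

For example, a high-priority critical node (position 4) interrupts the restoral of an ordinary quarantined IUN (position 8), but not that of a node with a faulty ring interface (position 2).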

The Fourth-Time Rule

To prevent a transient problem from repeatedly disrupting the ring, ARR keeps a leaky-bucket count of the number of times it has restored a node to service. If, within a 60-minute interval, ARR has restored a node to service three times and is then called upon to restore it a fourth, it refuses to do so. Instead, it leaves it

9 These are termed IMS critical nodes. Their recovery efforts go by the special title critical node recovery (CNR), a title that may appear on IMS display pages.

10 Technicians may learn of the status of IMS requests at MIRA from the RTR OP:DMQ command, as well as from IMS 1105 and 1106 display pages, which are discussed in this chapter.
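The fourth-time rule can be illustrated with a simple counter. The text describes a leaky-bucket count but does not give its leak rate, so the Python sketch below substitutes a sliding 60-minute window, which yields the same "three restorals, then refuse the fourth" behavior. The class and method names are hypothetical.

```python
from collections import deque

WINDOW_S = 60 * 60   # 60-minute interval from the text
MAX_RESTORALS = 3    # the fourth request within the interval is refused


class RestoralCounter:
    """Sliding-window stand-in for ARR's leaky-bucket count (an assumption)."""

    def __init__(self):
        self._times = deque()

    def request(self, now_s: float) -> bool:
        """Return True if a restoral at time now_s is allowed, and record it."""
        # Forget restorals that have aged out of the window.
        while self._times and now_s - self._times[0] >= WINDOW_S:
            self._times.popleft()
        if len(self._times) >= MAX_RESTORALS:
            return False  # fourth time within the hour: refuse
        self._times.append(now_s)
        return True
```

A refused request is not recorded, so the count only reflects restorals that were actually performed.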


Table 3-2. ARR Responses to Maintenance-States (Pages 2 and 3 of 3)

NODE STATE  POSITION  RI STATE  NP STATE  CIRCUMSTANCE                  ARR ACTION 1               ARR ACTION 2
OOS         NORM      USBL      FLTY      1st or 2nd time in hour       pump & return to service
                                          3rd time in hour              isolate & diagnose (pass)  pump & return to service
                                                                        isolate & diagnose (fail)  manual maintenance
                                          4th time in hour              manual maintenance
OOS         NORM      USBL      UNTSTD    n/a                           pump & return to service
OOS         NORM      QUSBL     FLTY      n/a                           isolate & diagnose (pass)  pump & return to service
                                                                        isolate & diagnose (fail)  manual maintenance
OOS         NORM      USBL      FLTY      extended node                 isolate & diagnose (pass)  pump & return to service
                                                                        isolate & diagnose (fail)  manual maintenance
OOS         ISOL      FLTY      USBL      1st, 2nd or 3rd time in hour  isolate & diagnose (pass)  pump & return to service
                                                                        isolate & diagnose (fail)  manual maintenance
                                                                        maintenance
OOS         ISOL      FLTY      FLTY      1st, 2nd or 3rd time in hour  isolate & diagnose (pass)  pump & return to service
                                                                        isolate & diagnose (fail)  manual maintenance
                                                                        maintenance
OOS         ISOL      USBL      FLTY      n/a                           quarantine                 manual maintenance
OOS         ISOL      USBL      USBL      isolation ends                pump & return to service
ACT         BISO      USBL      USBL      isolation ends                chg. BISO to NORM
ACT         EISO      USBL      USBL      isolation ends                chg. EISO to NORM

ARR Recovery Intervals and Output Messages

ARR activities are reflected in the status information provided by the IMS 1105 and 1106 display pages which are described in the next chapter of this document. In addition, results of ARR actions are reported by the following output messages.

Table 3-3. Output Messages that Report ARR Actions

ARR ACTION OR RESULT                                        OUTPUT MESSAGE
Request to quarantine an RPCN                               RMV RPCN...
Request to quarantine an IUN                                RMV IUN...
Request to diagnose an RPCN                                 DGN RPCN...
Request to diagnose an IUN                                  DGN IUN...
Request to diagnose and restore an IUN to service           RST IUN...
Request to diagnose and restore an RPCN to service          RST RPCN...
Abortion of a diagnostics request because of an error       DGN:AUDIT:RING...
Outcome of a request to reconfigure the ring                REPT RING CFR
Abortion of an IUN pump                                     REPT IUN PUMP...
Failure of an IUN restore                                   REPT IUN RST...
Failure of RPCN initialization during a restore or restart  REPT RPC INIT...
Start of an ARR recovery attempt                            REPT ARR AUTORST a b FOR c STARTED
Success of an ARR recovery attempt                          REPT ARR AUTORST a b FOR c SUCCEEDED
Failure of a diagnostic phase                               REPT ARR AUTORST a b FOR c FAILED
Abortion of a diagnostic request                            REPT ARR AUTORST a b FOR c ABORTED
Violation of the fourth-time rule                           REPT ARR AUTORST RECOVERY THRESHOLD EXCEEDED FOR c
Time out of a restoral request                              REPT ARR AUTORST TIMEOUT AWAITING MIRA FOR c
Inhibition of a restoral request                            REPT ARR AUTORST a b FOR c STOPPED <INHIBITED>

The time taken by ARR to return a node to service varies considerably, depending on such factors as the type of restoral and the number of jobs waiting in MIRA's queue. An unconditional restoral usually takes 30 to 90 seconds. A full and successful diagnosis of a basic IUN or RPCN may take 5 to 8 minutes, while a failing diagnosis usually takes somewhat longer. Diagnosis of an extended node takes longer still, perhaps as much as 15 minutes.
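The REPT ARR AUTORST messages listed in Table 3-3 differ only in their closing words, so a script that scans captured ROP output can sort them by outcome. The Python sketch below assumes the message text appears on one line exactly as printed in the table, with a, b, and c standing for variable fields. The real line layout may differ, and the function name is hypothetical.

```python
import re

# Outcome keywords that close the "a b FOR c ..." forms in Table 3-3.
OUTCOME_WORDS = ("STARTED", "SUCCEEDED", "FAILED", "ABORTED", "STOPPED")


def autorst_outcome(line: str):
    """Classify a REPT ARR AUTORST line, or return None for other messages."""
    if "REPT ARR AUTORST" not in line:
        return None
    if "RECOVERY THRESHOLD EXCEEDED" in line:
        return "THRESHOLD EXCEEDED"   # fourth-time rule violated
    if "TIMEOUT AWAITING MIRA" in line:
        return "TIMEOUT"
    for word in OUTCOME_WORDS:
        if re.search(rf"\b{word}\b", line):
            return word
    return None


print(autorst_outcome("REPT ARR AUTORST a b FOR c STARTED"))
```

Pairing each STARTED line with the next outcome line for the same node gives a rough measure of how long a restoral took.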


Manual Ring Maintenance

This chapter explains tools and procedures used in manual ring maintenance and offers suggestions to technicians for solving hard problems and avoiding easy mistakes.

Ring Maintenance Interfaces

Technicians who maintain the ring are supported in their responsibilities by various maintenance interfaces. The maintenance CRT terminal (MCRT) provides an interactive interface that outputs IMS and other system messages and status information while accepting as inputs IMS and other system commands. IMS input and output messages will be recorded on the maintenance read-only printer (ROP), if it is turned on. In addition, various audible and visual alarms act to alert technicians to important IMS events. These maintenance interfaces as they pertain to IMS are explained below.

Alarms

The following alarms indicate trouble that may affect IMS equipment:

Critical Alarms

A critical condition or fault in or associated with the IMS ring will be indicated by an asterisk C (*C) preceding the ROP output message that identifies the problem. It may also be indicated by an audible alarm and a red CRITICAL indicator on each MCRT display-page header.

Major Alarms

A major condition or fault in the IMS ring is indicated by two asterisks (**) preceding the ROP output message that identifies the problem. It may also be indicated by the following:

• An audible alarm
• A red MAJOR indicator on each MCRT display-page header, and
• A red lamp on the aisle containing the frame/cabinet where the fault or failure occurred.

See the ``Special IMS Indicators'' section in this chapter for descriptions of other indicators that may appear with a major alarm.


If a major alarm is caused by a power failure, the POWER indicator on each MCRT display-page header will show red, and display page 1111 will identify the type and location of the problem. If the problem is a failed power converter circuit pack in an IMS frame/cabinet, the lamp at the aisle containing the disabled frame/cabinet will show red, and inside the frame/cabinet the power alarm light at the top-left will show red also.

Minor Alarms

A minor condition or fault in the IMS ring is indicated by one asterisk (*) preceding the ROP output message that identifies the problem. It may also be indicated by the following:

• An audible alarm
• A red MINOR indicator on each MCRT display-page header, and
• A yellow lamp on the aisle containing the frame/cabinet where the fault or failure occurred.

See ``Special IMS Indicators'' below for descriptions of other indicators that may appear with a minor alarm.

If a minor alarm is caused by a power failure, the POWER indicator on each MCRT display-page header will show red, and display page 1111 will identify the type and location of the problem. If the problem is a single failed fan in an IMS frame/cabinet, the lamp at the aisle containing the disabled frame/cabinet will show yellow, and inside the frame/cabinet the power alarm light at the top-left will show red.
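The alarm markers that precede ROP output messages (*C, **, and *) can be read mechanically. The Python sketch below assumes the marker is the first non-blank text on the line, which the manual implies but does not spell out. The function name and the sample lines are illustrative only.

```python
def alarm_level(rop_line: str):
    """Return "critical", "major", "minor", or None for an unalarmed line."""
    text = rop_line.lstrip()
    if text.startswith("*C"):
        return "critical"   # asterisk C
    if text.startswith("**"):
        return "major"      # two asterisks
    if text.startswith("*"):
        return "minor"      # one asterisk
    return None


for sample in ("*C REPT RING INIT", "** REPT ERROR", "* REPT IUN", "REPT RING CFR"):
    print(sample, "->", alarm_level(sample))
```

The checks run from the most specific marker to the least, so a major alarm (**) is not mistaken for a minor one (*).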

Special IMS Indicators

A ring-quarantine (RQ) LED is located on IRN circuit packs. When the RQ LED shows red, it indicates that the node containing the circuit pack is quarantined from the ring.

A no-token (NT) LED is located on IRN circuit packs. The chief purpose of the NT LED is to indicate, by lighting red, when the node is isolated. The NT LED mechanism works by detecting the absence of token messages. The ring interfaces in IRNs, however, cannot make this distinction; so, during periods when diagnostics are occurring, their NT LEDs will blink off and on as test messages pass. At other times, however, IRN NT LEDs on isolated nodes will show constant red. In addition, when all NT LEDs, of whatever type, in the ring are lighted, the ring is down.

Each circuit pack in the ring application processors (RAPs) of CDN-1 is equipped with an LED that indicates when the pack has failed a diagnostic phase. Some of these LEDs also turn on when the RAP is initializing and then turn off when initialization tests confirm that the firmware within the pack is executing. The nature and uses of these LEDs are explained in the section ``Ring Application Processor Critical Maintenance Procedure.''

The application-processor circuit pack in a direct link node (DLN) is equipped with green, red, and yellow LEDs. The green stays on during normal operation and goes off when the node is taken out-of-service, when a hard panic occurs in the node processor, or when diagnostic code begins to be downloaded, whichever occurs first. The red and yellow LEDs come into play as either diagnostic or operational code is downloaded. Diagnostic phase 41 begins with a firmware test. During the test the red and yellow LEDs come on and stay on permanently if the test fails. If the test passes, the red goes off briefly, then joins the yellow back on again as the diagnostic proper begins. If the diagnostic fails, the yellow goes off and the red stays on. If the diagnostic passes, the red goes off and the yellow stays on until the node processor receives the diagnostic results, at which time it goes off. Then red and yellow come on and go off again as operational code is downloaded, and the green comes on as the attached processor is placed in execution. If technicians wish to consult support about the performance of a DLN, they might first observe the behavior of these LEDs so they can report it.

Output messages on the ROP are preceded, when appropriate, by an M or an A, indicating that the action described in the message is the result of a manual or an automatic IMS request. Table 3-4 on page 3-27 shows the IMS output messages accompanied by the types of alarms.

Table 3-4. Alarms Associated with IMS Output Messages (Page 1 of 2)

MESSAGESEVERITY

CRT MAJ MIN

REPT DB INIT X

REPT ERROR X X X

REPT IMSDRV AUD X

REPT IMSDRV FLT X

REPT IMSDRV INIT X X

REPT IUN X

REPT MSDC FLT X

REPT OP_RTM FLT X

REPT PSDO_UMS>P FLT X

REPT RING GROWTH X

REPT RING INIT X X

7/29/2019 172254




401-661-045

Other IMS output messages are not accompanied by audible or visual alarms.

Display Pages

IMS provides technicians with two MCRT display pages: page 1105, the Ring Status Summary Page, and page 1106, the Ring Node Status Page. These pages are similar in appearance and function to RTR display pages, and the procedure used to access them is also the same. The first three lines of the IMS pages, consisting of the standard header information that appears on all RTR display pages, are omitted from the illustrations that follow. For more information on status display pages, see 410-610-160, The FLEXENT™/AUTOPLEX® Wireless Networks, Executive Cellular Processor (ECP) Operations, Administration, and Maintenance Guide.

To access a particular display page, perform the following actions in the order indicated.

1. Type the NORM/DISP key.

2. Place the MCRT in the command mode by typing the CMD/MSG key.

3. Type and enter 1105 or 1106 on the numeric key pad.

During ring initialization and configuration, indicators or data shown on display pages may be invalid or out of date. During disk independent operation, the display page process is terminated.

Page 1105 The Ring Status Summary Page

The 1105 display page provides status information about the entire IMS ring.

Figure 3-1 is typical of an 1105 page for small IMS offices.

Table 3-4. Alarms Associated with IMS Output Messages (Page 2 of 2)

MESSAGE SEVERITY

CRT MAJ MIN

REPT RING TRANSPORT ERR X

REPT TDTP FLT X

AUD CNC X

AUD NODEST X


Figure 3-1. An 1105 Display Page

CMD>                                    -- 1105 RING STATUS SUMMARY --
[Ring Major State] [Ring Error Threshold State]          CMD Function
[ARR Restore; System Indicator; IMSRTS.P indicator]      400 OP Ring Detailed
[ARR Restart] [ACNR Restore or Restart]
00AAAOAAAiigAOO... 01.AAAAOOAA...AAAA 02.AAAAAAAAA...AAA
32AAAAAAAAOOOAAA.. 33.AAAAAAAAAAAAAAA 34.AOOOOOAAAAAAAAA

The 1105 page, as exemplified in the above figure, offers the following information and capabilities. The first line contains, on the left, the CMD> prompt for command entries and, on the right, the page title. To enter display commands, move the cursor to the CMD> prompt by pressing the CMD/MSG key, then enter the command.

The next three lines identify, in square brackets, the locations on the page where the types of information named within the brackets will appear when appropriate. The brackets themselves do not appear on display pages.

• [Ring Major State] marks the location where the current ring state is displayed. One of the following states should always be present:

    RING STATE ACTIVE
    RING STAT ISOLATED SEGMENT
    RING STAT CONFIGURING
    RING STAT DOWN
    RING STAT RESTORE

• [Ring Error Threshold State] marks the location where a message appears when the Ring Error Threshold has been exceeded. The threshold is set by the user and specifies the number of faults per interval of time that are permitted before IMS discontinues its practice of responding initially to ring-related faults with EAR level-0 (restarting the ring) and responds instead with EAR level-1 (isolating the fault) or, for an unexplained loss of token, with EAR level-3 (ring continuity testing). After the threshold is exceeded, an error-free period as long as the threshold interval is required before IMS returns to its normal practice concerning ring restarts. When IMS returns to its normal practice, the Ring Error Threshold Exceeded tag disappears from the 1105 page, and the location is blank.
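The threshold behavior described above can be sketched as a small sliding-window model. This is only an illustration under stated assumptions: the class name, the method names, and the exact boundary handling (for example, whether a fault exactly one interval old still counts) are hypothetical and are not taken from IMS software.

```python
from collections import deque


class RingErrorThreshold:
    """Hypothetical model of the Ring Error Threshold (not IMS code)."""

    def __init__(self, max_faults, interval_s):
        self.max_faults = max_faults   # faults permitted per interval
        self.interval_s = interval_s   # length of the interval, in seconds
        self.faults = deque()          # timestamps of recent faults
        self.exceeded = False
        self.last_fault = None

    def _refresh(self, now):
        # Forget faults that are at least one interval old.
        while self.faults and now - self.faults[0] >= self.interval_s:
            self.faults.popleft()
        # An error-free period of one full interval clears the condition.
        if (self.exceeded and self.last_fault is not None
                and now - self.last_fault >= self.interval_s):
            self.exceeded = False

    def record_fault(self, now, lost_token=False):
        """Record a ring-related fault and return the EAR level to apply."""
        self._refresh(now)
        self.faults.append(now)
        self.last_fault = now
        if len(self.faults) > self.max_faults:
            self.exceeded = True
        if not self.exceeded:
            return 0                    # EAR level-0: restart the ring
        return 3 if lost_token else 1   # level-3: continuity test; level-1: isolate

    def tag(self, now):
        """Text shown at the [Ring Error Threshold State] location."""
        self._refresh(now)
        return "Ring Error Threshold Exceeded" if self.exceeded else ""
```

With a threshold of two faults per 60 seconds, the third fault inside the window switches the response from level-0 to level-1, and the tag clears only after 60 fault-free seconds.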

• The information CMD Function/400 OP Ring Detailed appears permanently on the 1105 page to remind technicians that the page also allows entry, at the CMD> prompt, of the 400 command, which produces the same output as the input message OP:RING;DETD.

• [ARR Restore; System Indicator; imsrts.p Indicator] marks the location where a, b, or c, below, will appear:

  a. A node that ARR is currently attempting to restore, conditionally or unconditionally. The identification reads ARR, followed by the method of restoral (UCL for unconditional, COND for conditional), followed by the node name in the form NODEa b. If ARR is attempting to restore an EISO or BISO node (see ``Three ARR Rules'' above), CNR appears in place of ARR.

  b. One of the following system states of IMS:

     - IMS FPI PROLOGUE (appears during the initial stage of an FPI initialization)
     - IMS SYS BOOT (appears during the initial stage of a level-3 or -4 BOOT initialization)
     - IMS LVL3 INIT (appears during subsequent stages of a level-3 initialization)
     - IMS LVL4 INIT (appears during subsequent stages of a level-4 initialization)
     - IMS SYS CRIT SEQ CMPL (appears at the conclusion of a level-3 or -4 FPI or BOOT initialization)
     - IMS SYS ABORT (appears prior to a level-3 or level-4 BOOT initialization)
     - IMSRTS.P CREATED (see below)

  c. One of the following states of the imsrts.p process, which creates the IMS display pages:

     - IMSRTS.P DIED
     - IMSRTS.P CREATED

  If ARR is not currently attempting to restore a node and none of the system or IMSRTS.P conditions exists, the location is blank.


• [ARR Restart] marks the location where any node (other than an application-nominated critical node) that ARR is currently attempting to restart is identified. Node restarts that are initiated locally by the node processor are neither recognized nor recorded by this indicator.

• [ACNR Restore or Restart] marks the location where any application-nominated critical node (see ``Three ARR Rules'' above) that ARR is currently attempting to restore or restart is identified.

• Because one ARR restart and one ACNR restart may occur in parallel, and because one or both restarts may occur in parallel with a single restore, it is possible to have all three node-activity indicators displayed simultaneously. It is not, however, possible to have two restorals occurring simultaneously, since IMS can restore only one node at a time (see ``Three ARR Rules'' above).

The next section of the display page, beginning in the above example with the fifth line, identifies all frames/cabinets in the IMS system, each node within each frame/cabinet, and the major state of each node. The nodes that occupy a frame/cabinet are called a group. The example shows six groups, identified by their group numbers as 00, 01, 02, 32, 33, and 34. To the right of each group number are characters representing the sixteen nodes or node positions within the group. The first character represents the RPCN, and the next fifteen characters represent IUNs. In the IMS numbering scheme, nodes are identified by the formula RPCNa b or IUNa b, where a is the two-digit group number and b is a number between 00 and 15 that corresponds to the sequential location of the node within its group on the downstream path of ring 0. Thus RPCNs are always numbered 00 and IUNs are always numbered 01 to 15.

The characters also identify the current major state of each of the sixteen nodes, in accordance with the symbols listed in Table 3-5 on page 3-31.

Table 3-5. 1105-Page Symbols of Node Major States

NODE MAJOR STATE                SYMBOL
Active                          A
Standby                         s or S
Out of service, quarantined     O
Out of service, isolated        i
Grow                            g or G
Offline                         f or F
Unequipped                      . or blank space
Initializing                    b or B


In the instances that provide an alternative of an upper- or a lower-case letter, the lower case signifies that the node is isolated, and the upper case signifies that the node is in the active ring. In the example of an 1105 page above:

• RPCN00 00 is in the active node major state
• LN00 01 and LN00 02 are also active
• LN00 03 is out-of-service quarantined
• LN00 04, LN00 05, and LN00 06 are active
• LN00 07 and LN00 08 are out-of-service isolated
• LN00 09 is in the grow state and is isolated
• LN00 10 is active
• LN00 11 and LN00 12 are out-of-service quarantined, and
• LN00 13, LN00 14, and LN00 15 are unequipped (see footnote 12)
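The reading of group 00 given above can be reproduced with a short decoder for one group's sixteen-character status string. This is a minimal sketch: the function name, the tuple layout, and the `iun_type` parameter (the user's name for an IUN, LN in the example) are assumptions made for illustration. The symbol meanings come from Table 3-5.

```python
# Symbols that Table 3-5 lists in both cases (lower case = node is isolated).
TWO_CASE = {"s": "standby", "g": "grow", "f": "offline", "b": "initializing"}


def decode_group(group, symbols, iun_type="LN"):
    """Return (node name, major state, isolated) for each of 16 positions."""
    if len(symbols) > 16:
        raise ValueError("a group has at most sixteen node positions")
    nodes = []
    # A missing trailing character is treated as a blank (unequipped).
    for pos, ch in enumerate(symbols.ljust(16)):
        kind = "RPCN" if pos == 0 else iun_type   # position 00 is the RPCN
        name = f"{kind}{group:02d} {pos:02d}"
        if ch in ". ":
            state, isolated = "unequipped", False
        elif ch == "A":
            state, isolated = "active", False
        elif ch == "O":
            state, isolated = "out of service, quarantined", False
        elif ch == "i":
            state, isolated = "out of service, isolated", True
        elif ch.lower() in TWO_CASE:
            state, isolated = TWO_CASE[ch.lower()], ch.islower()
        else:
            raise ValueError(f"unknown symbol {ch!r}")
        nodes.append((name, state, isolated))
    return nodes
```

Calling `decode_group(0, "AAAOAAAiigAOO...")` yields the same sixteen states listed in the bullets above, for example the grow-and-isolated state for LN00 09.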

Page 1106 The Ring Node Status Page

The 1106 display page provides status information about, and a command interface for, a technician-specified group of nodes. Figure 3-2 is typical of an 1106 page.

12 When a group contains any out-of-service nodes, IMS color-codes the entire group with red background on white lettering. For additional information on the node and ring maintenance states, refer to the ``ARR or Deferrable Node Recovery'' section of this chapter.


Figure 3-2. An 1106 Display Page

CMD>                                     -- 1106 - RING NODE STATUS --
NODE>                                           RING  MAJOR  RI     NP     MAINT
[Ring Status]                        NODE NAME  POS   STATE  STATE  STATE  MODE
[ARR Restore, etc.]               01 RPCN00 00  NORM  ACT    USBL   USBL   AUTO
[ARR Restart]                     02 LN00 01    NORM  ACT    USBL   USBL   AUTO
[ACNR Restore or Restart]         03 LN00 02    BISO  ACT    USBL   USBL   AUTO
CMS      FUNCTION                 04 LN00 03    ISO   OOS    FLTY   USBL   MAN
2xx      RMV node (line xx)       05 LN00 04    ISO   OOS    FLTY   USBL   MAN
3xx      RST node (line xx)(UCL)  06 LN00 09    EISO  ACT    USBL   USBL   AUTO
400      BISO-EISO                07 LN00 14    NORM  OOS    USBL   FLTY   AUTO
401/402  all non-ACT(next/prev)   08 LN00 15    NORM  ACT    USBL   USBL   AUTO
403/404  all Equipped(next/prev)  09
500      DGN Isolated Segment     10
5xx      DGN node (line xx)       11
6nn      Group nn                 12
7xx      RST node (line xx)(COND) 13
                                  14
TOTAL                             15
                                  16

The 1106 page is composed of three areas. The area to the right, beginning with and including the column of line numbers 01 through 16, displays the major and minor states of a group of up to sixteen technician-specified nodes. In this document, this is called the display area. The area at the top left, beginning with CMD> and ending with ACNR Restore or Restart, is the command-interface and system-status area. In this document, this is called the command area. The area below the command area and to the left of the column of line numbers is a nonselectable command menu. In this document, this is called the menu area.

The Menu Area. Entries in the CMS column of the menu area list the input forms for the commands identified under the FUNCTION column. These commands may be typed and entered at the CMD> prompt. The xx in the first, second, seventh, and ninth commands represents a line number, not a node number, from the column of numbers, beginning 01 and ending 16, at the center of the page. Each line number is associated with the node to its right. In the above example, line 02 represents IUN00 01; to quarantine IUN00 01, a technician would enter 202 at the CMD> prompt. By contrast, the nn in the next-to-the-last command represents not a line number but a group number. In the above example, to have the nodes contained in group 32 displayed, a technician would enter 632. Below is a listing of the results obtained from entering these 3-digit commands:

2xx  Quarantines the node identified on line xx.

3xx  Unconditionally restores the node identified on line xx.


400  Displays, if the ring has an isolated segment, the currently isolated nodes, preceded by the BISO node and followed by the EISO node. If the isolated segment contains more than 14 nodes, the display lists first the BISO node, then the first seven isolated nodes downstream of the BISO node, then the last seven isolated nodes upstream of the EISO node, then the EISO node. After the 400 command is entered, the Total line below the menu area displays a number that includes all currently isolated nodes plus the BISO and EISO nodes. Technicians can therefore recognize from the Total line that a portion of an isolated segment is missing from the display (because the isolation contains more than 14 nodes). The count on the Total line updates interactively.

401  Initially provides in the display area a list of nodes in the ring that are neither active nor unequipped. Thus it lists any nodes that are in the out-of-service, standby, initializing, and grow states. After the 401 command is entered, the total number of nonactive nodes is given on the Total line below the menu area and updated interactively. If this number is greater than 16, technicians may page forward and backward in the list by reentering 401 and 402, respectively.

403  Entered the first time, provides a list of nodes in the ring that are equipped. Thus it lists all nodes that are in the active, out-of-service, standby, initializing, and grow states. After the 403 command is entered, the total number of equipped nodes is given on the Total line below the menu area and updated interactively. If this number is greater than 16, technicians may page forward and backward in the list by reentering 403 and 404, respectively.

500  Runs diagnostic phases 1 and 2 on all RACs in the isolated ring segment.

5xx  Runs all automatic diagnostic phases on the node identified at line xx.

6nn  Displays all equipped nodes in group nn, where nn is not the line number but the group number. After the 6nn command is entered, the total number of equipped nodes within the group is given on the Total line below the menu area and updated interactively.

7xx  Conditionally restores the node identified on line xx.
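The truncation rule that the 400 command applies to long isolated segments can be illustrated with a short function. This is a sketch only: the function name, the argument layout, and the placeholder node names in the usage are hypothetical, and the list of isolated nodes is assumed to be given in downstream order from the BISO node to the EISO node.

```python
def isolated_segment_display(biso, isolated, eiso):
    """Return (display rows, Total-line count) for the 400 command.

    isolated: names of the isolated nodes between the BISO and EISO
    nodes, in downstream order (an assumption of this sketch).
    """
    # The Total line counts every isolated node plus the BISO and EISO nodes.
    total = len(isolated) + 2
    if len(isolated) > 14:
        # Only the first seven and the last seven isolated nodes are shown.
        shown = isolated[:7] + isolated[-7:]
    else:
        shown = list(isolated)
    return [biso] + shown + [eiso], total


rows, total = isolated_segment_display(
    "BISO-NODE", [f"NODE{i}" for i in range(20)], "EISO-NODE")
# 16 rows fit the display area, while the Total line reports 22 nodes,
# which signals that part of the segment is not shown.
```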

The Command Area. CMD> is the prompt for any of the 3-character commands listed in the command menu. Entering a valid command here evokes an OK response. Entering an invalid command evokes an NG response. To enter a command, manipulate the cursor with the CMD/MSG key until it is at the prompt. Then type and enter a 3-character command from the CMS column of the menu area. The prompt also accepts as input the numbers of display pages to which the technician wishes to turn.
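As an illustration of the menu forms, the following sketch classifies a 3-character entry and maps it to an OK or NG response. The function names and the validation rules (for instance, limiting xx to lines 01 through 16 and accepting any two-digit group number) are assumptions made for this example; the actual page may apply additional checks, and display-page numbers such as 1105 are not handled here.

```python
FIXED = {
    "400": "display isolated segment (BISO to EISO)",
    "401": "display non-active nodes (next)",
    "402": "display non-active nodes (previous)",
    "403": "display equipped nodes (next)",
    "404": "display equipped nodes (previous)",
    "500": "diagnose isolated segment",
}
BY_LINE = {
    "2": "quarantine",
    "3": "unconditionally restore",
    "5": "diagnose",
    "7": "conditionally restore",
}


def parse_1106_command(cmd):
    """Describe a 3-character 1106 menu command, or return None if invalid."""
    if len(cmd) != 3 or not (cmd.isascii() and cmd.isdigit()):
        return None
    if cmd in FIXED:
        return FIXED[cmd]
    lead, arg = cmd[0], int(cmd[1:])
    if lead == "6":
        return f"display group {arg:02d}"       # nn is a group number
    if lead in BY_LINE and 1 <= arg <= 16:
        return f"{BY_LINE[lead]} node on line {arg:02d}"  # xx is a line number
    return None


def respond(cmd):
    """Return the page response: OK for a valid entry, NG otherwise."""
    return "OK" if parse_1106_command(cmd) is not None else "NG"
```

For example, `parse_1106_command("202")` describes quarantining the node on line 02, and `parse_1106_command("632")` describes displaying group 32, matching the two examples in the menu-area discussion.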

Node> is the prompt for a command that allows technicians to select the sequence of nodes displayed after having entered a 401 or 403 command. To employ this feature, enter 401 or 403, manipulate the cursor with the arrow keys to the Node> prompt, and then type and enter the identification, in the form IUNa b or RPCNa b, of the node you wish to form the starting point of the sequence. The display will be redrawn with the specified node as the last entry in the 401 display and as the first entry in the 403 display. This feature is not available for the 400 and 6nn commands, where its reordering might be confusing.

[Ring Status] marks the location where the current ring state is displayed. One of the following states should always be present:

RING STATE ACTIVE


RING STAT RESTORING

RING STAT CONFIGURING

RING STAT DOWN

[ARR Restore, etc.], [ARR Restart], and [ACNR Restore or Restart] provide the same information as they do for the 1105 display page, as explained above.

Because one ARR restart and one ACNR restart may occur in parallel, and because one or both restarts may occur in parallel with a single restore, it is possible to have all three node-activity indicators appear simultaneously. It is not possible, however, to have two restorals appear simultaneously, since IMS can restore only one node at a time (see ``Three ARR Rules'' above).

The Display Area. The display area lists up to 16 nodes and identifies their major and minor maintenance states. Node major and minor states are explained above in the ``ARR or Deferrable Node Recovery'' section of this chapter. A listing of the maintenance states follows:

• Node Major States
  - ACT - Active
  - OOS - Out of service
  - STBY - Standby
  - INIT - Initializing
  - OFL - Off-line
  - GROW - Grow
  - UNEQ - Unequipped

• Node Minor States: Ring Position
  - NORM - Normal
  - BISO - Beginning of Isolation
  - EISO - End of Isolation
  - ISOL - Isolated

• Node Minor States: Ring Interface
  - USBL - Usable
  - QUSBL - Quarantine-usable
  - FLTY - Faulty
  - UNTSTD - Untested

• Node Minor States: Node Processor
  - USBL - Usable
  - FLTY - Faulty
  - UNTSTD - Untested

• Node Minor States: Maintenance Mode
  - AUTO - Automatic
  - MAN - Manual

Nodes may be added to 401 and 403 displays by manipulating the cursor to any vacant line in the display and typing and entering a node name in the form LNa b or RPCNa b. The display will provide status information for the node and also display the line number in reverse video, indicating its special status. The special-status node will disappear when a new command is entered at the CMD> prompt. Prior to that time the node may be deleted manually by manipulating the cursor to the line and then pressing only the RETURN key.

Ring Diagnostics

IMS provides diagnostic tests for all circuit packs that reside in the ring node frames/cabinets except power supplies. These tests are submitted as requests to MIRA and performed in a manner similar to standard RTR diagnostics. They may be initiated automatically by ARR or manually by technicians through input messages or display-page commands.


Each IMS node-type is tested by a distinct diagnostic routine; each diagnostic routine is composed of units of sequential execution called phases; and each phase tests functionally related hardware. Phases are automatic or optional (available on demand). Automatic phases are executed when a diagnostic is run at the request of ARR or in response to a manual request without the PH option. Optional phases are executed only in response to manual requests in which they are specified in the PH option.

Phases are identified by the node-type on which they are executed and by phase numbers. Node-types are further distinguished by their hardware composition. The currently available node-types are IRN RPCNs, IRN2 RPCNs, IRN LNs (LIN-E/SS7), IRN LNs (LI4S/SS7), IRN DLNEs, IRN DLN30s, IRN CDN-Is, IRN CDN-IIs, IRN CDN-IIxs, CDN-IIIs, SS7NEs, DLN60s, and IRN MDLs. Phase numbers reflect the relative order in which phases are run within a routine.

Diagnostic phases 1 and 2 are special in two ways. They are common to all node-types; and when full, automatic diagnostics are requested, whether manually or by ARR, on any node (thus requiring that the node be isolated), phases 1 and 2 test the entire path within the isolation as a preliminary step to testing the specified node. Testing the isolated path requires partial tests of all nodes and interframe buffers within the isolated segment as well as tests of the isolated RACs of the EISO and BISO nodes. Running phases 1 and 2 also has the effect of clearing RAC status registers. RAC status registers may become improperly set as a consequence of a fault, of the node being powered down, or of the RAC circuit pack being removed or reset.

Phase 40 is a critical juncture in IMS diagnostics. When a diagnostic request includes only phases above 39, IMS quarantines the node before running the diagnostic phases on it. When, on the other hand, a diagnostic request includes any phases below 40, IMS attempts to isolate the node prior to running diagnostics on it. If, however, ring conditions do not permit the node to be isolated, IMS runs all requested phases that do not require the node be isolated while the node is quarantined. These will include all requested phases above 40 and some requested phases below 40.
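The phase-40 rule above amounts to a small decision procedure, sketched here. The function name and the `needs_isolation` parameter are hypothetical: the text does not say which phases below 40 require isolation, so the caller supplies that set as an assumption.

```python
def plan_diagnostic(phases, can_isolate, needs_isolation=frozenset()):
    """Return (node state for the run, phases that will be run).

    phases: explicitly requested phase numbers (must not be empty).
    can_isolate: whether ring conditions permit isolating the node.
    needs_isolation: phases assumed to require an isolated node.
    """
    phases = sorted(phases)
    if not phases:
        raise ValueError("this sketch covers explicit phase requests only")
    if all(p >= 40 for p in phases):
        # Only phases above 39: the node is quarantined, not isolated.
        return "quarantined", phases
    if can_isolate:
        # At least one phase below 40: IMS attempts isolation first.
        return "isolated", phases
    # Isolation not permitted: run what can run on a quarantined node.
    runnable = [p for p in phases if p not in needs_isolation]
    return "quarantined", runnable
```

For instance, a request for phases 1, 2, 10, and 41 on a node that cannot be isolated, with phases 1 and 2 assumed to need isolation, would run only phases 10 and 41 while the node is quarantined.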

Most IMS diagnostic routines terminate at the end of a phase in which a test fails. A few terminate at the end of a failing test. Important exceptions to this statement are as follows: If phase 1 or 2 fails in any node-type, all of phases 1 and 2 are still run. If either or both of phases 1 and 2 fail in RPCNs, phases 10 through 27 are still run unless a test fails in these upper phases, in which case diagnostics terminate at the end of the failing upper phase.

Obtaining Diagnostic Results

Included in Appendix B, Ring Maintenance Reference Material, are two groups of tables that provide IMS diagnostic information. Diagnostic Phase Tables, available for each node type, identify and briefly describe the phases in each routine. Diagnostic Fault Tables, also available for each node type, associate phases with the circuit packs they test, thereby providing a list of suspect circuit packs for any failing phase.

Whether diagnostics are initiated automatically or manually, their results appear as output messages on the ROP. The DGN output message identifies failing phases and failing tests for a faulty node, and the ANALY TLPFILE output message provides a list of suspect circuit packs in the faulty node. The ANALY TLPFILE message, invoked by the TLP option of the RST command, is always included by ARR requests to restore a node. In the ANALY TLPFILE message, each circuit pack associated with a diagnostic failure is assigned a number between one and ten. The number represents the probability, as calculated by IMS software, that the fault is located in the pack; the higher the number, the greater the probability. The DGN and ANALY TLPFILE output messages are the primary sources of diagnostic information for technicians.
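Because a higher number means a more probable fault location, a technician would normally work through the suspect packs from the highest number down. A minimal sketch of that ordering follows; the function name and the pack names in the usage are placeholders, not actual ANALY TLPFILE content or format.

```python
def rank_suspects(suspects):
    """Order (circuit pack, weight) pairs from most to least probable.

    Each weight is the 1-to-10 number assigned in the ANALY TLPFILE message.
    """
    for name, weight in suspects:
        if not 1 <= weight <= 10:
            raise ValueError(f"weight for {name} must be between 1 and 10")
    # sorted() is stable, so packs with equal weights keep their listed order.
    return sorted(suspects, key=lambda pair: pair[1], reverse=True)


# Placeholder pack names, for illustration only.
ordered = rank_suspects([("PACK-A", 3), ("PACK-B", 9), ("PACK-C", 6)])
```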

Diagnostic Listings

If the information provided by ROP output messages fails to identify faulty equipment, further scrutiny of the diagnostic results is possible using diagnostic listings. A diagnostic listing is a document that describes a particular diagnostic phase. Common Network Interface has available the diagnostic listings that pertain to the CNI configuration of the ring. They consist of the listings for ring peripheral controller nodes, link nodes, attached processors, and ring application processors.

A diagnostic listing is composed of a prologue and a statement sequence. The prologue introduces the subject phase by explaining what it tests, how the testing is done, and what hardware is involved. All lines in the prologue begin with the character C, indicating they are comments. The statement sequence consists of information, arranged into numbered statements, about each command within the series of commands that constitutes the phase. Each statement contains a statement number, a source-file version of the command, and an ASCII representation of the executable version of the command. The ASCII representation is on a line that begins with the string * adr, unless the command generates a test, in which case the line begins with * test followed by the test number. Most statements are preceded by one or more comment lines that explain the purpose of the command that follows. Statement numbers correspond to numbers that appear in early termination output messages and in DGN AUDIT RING output messages. They are also used in the EX input message. Test numbers correspond to the test numbers that appear in DGN output messages. For technicians, test numbers are the most important information in diagnostic listings.
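To locate a failing test in a listing, a technician mostly needs to find the line that begins with * test and carries the test number from the DGN output message. The sketch below classifies listing lines by the prefixes described above. The function name is hypothetical, the sample lines in the usage are invented for illustration, and the real layout of a listing line beyond its prefix is not assumed.

```python
import re

# A test line begins with "* test" followed by the test number.
TEST_RE = re.compile(r"^\* test\s+(\d+)")


def classify(line):
    """Return (kind, test number or None) for one diagnostic-listing line."""
    if line.startswith("* test"):
        match = TEST_RE.match(line)
        return "test", int(match.group(1)) if match else None
    if line.startswith("* adr"):
        return "adr", None        # ASCII form of a command that is not a test
    if line.startswith("C"):
        return "comment", None    # prologue and explanatory comment lines
    return "other", None          # e.g. the source-file version of a command


def find_test(lines, number):
    """Return the index of the line for the given test number, or -1."""
    for index, line in enumerate(lines):
        if classify(line) == ("test", number):
            return index
    return -1
```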


Some long diagnostic listings subdivide the statement sequence into program units. Program units correspond to divisions of phases that serve explanatory rather than programming functions. Each program unit is preceded by a prologue that provides introductory information about the commands within the unit.

Using Diagnostics

IMS ring diagnostics serve three principal purposes: to confirm faults, to locate faults, and to verify repairs. When IMS software removes from service a node suspected of being faulty, it sometimes employs diagnostics to confirm and to locate the fault. After replacing or repairing equipment indicated as faulty, technicians employ diagnostics manually to verify that the fault has been corrected before returning the node to service.

Because conditional restoral requests of ARR always include the TLP option, technicians usually have no need to manually diagnose a node in order to confirm or locate its fault. Instead, they should consult the diagnostic results on the ROP that were generated by ARR's restoral attempt. If, however, a restoral attempt fails for nondiagnostic reasons, technicians will ordinarily need to run diagnostics on the node before performing maintenance on it.

Guide to Critical Ring Maintenance

This document uses the term "critical maintenance" for manual actions undertaken to correct faults and to recover the ring. The faults are of the kind that obstruct the transportation of messages on the ring (ring-related faults) or the kind that prevent the processing or transmission of messages within nodes (node-related faults). As applied to nodes and their components, the principles of critical maintenance are essentially the same for all except the ring application processors (RAPs) of CDN-Is, which require unique treatment. Therefore, among the maintenance procedures set forth below, there is a special one for RAPs.

Critical maintenance most often occurs with the ring subsystem in operation, however fragmented the total ring might be by out-of-service nodes. Occasionally, however, critical maintenance is required when, because of ring conditions, the ring subsystem fails and cannot be recovered by automatic means. This state, known as ring down, is also discussed in this chapter and addressed with its own procedure.

The section begins with a discussion of the IMS commands technicians will most often employ in performing critical ring maintenance. The discussion is intended to amplify information contained in the IMS Output Manual; it is not to be used as reference material.


IMS Input Messages

IMS input messages allow technicians to practice critical maintenance by manually controlling various maintenance functions associated with the IMS ring (see footnote 13).

A descriptive list of frequently used IMS input messages follows. Where the word NODE appears in the list, substitute RPCN or the user's name for an IUN (LN, for example).

RMV:NODE  Quarantines the specified node. If the command is executed for a node that has been automatically quarantined, the maintenance mode of the node will change to manual, and the node will remain quarantined until it is manually returned to service by a version of the RST:NODE command.

          Before entering RMV:NODE for an active node with an active external user interface, remove from service the communication link or links that terminate in the node.

DGN:NODE  Executes diagnostic phases on the specified node. If no phases are specified, DGN:NODE attempts to isolate the node and runs all automatic phases on it, with the exceptions described in a and b below.

          a. If a node is in the active segment of an isolated ring but is not a BISO or EISO node, DGN:NODE with no phases specified quarantines the node (if it was not already quarantined) and runs all diagnostic phases that do not require the node be isolated.

          b. If the node is a BISO or EISO node, DGN:NODE with no phases specified extends the isolation to include the node and runs all automatic phases on it. If, however, the extended isolation would create an active ring that is too short to support message transport, the extension is not allowed and the subsequent action is that described in a above.

13 These commands may conform either to the Program Documentation Standards (PDS), except that terminal exclamation marks are supplied automatically by software, or to the Man-Machine Interface Language (MML). Technicians should select one or the other of these message conventions by setting the RTR ECD spooler flag to PDS or MML. For an explanation of the PDS input-message format, consult 3B21D Computer, UNIX RTR Operating System, Input Message Manual, PDS, ``Section 2, User Guidelines.'' For a complete description of PDS, consult the Bell Laboratories Program Documentation Standards Reference Manual. For an explanation of the MML input-message format, consult 3B21D Computer, UNIX RTR Operating System, Input Message Manual, MML, ``Section 2, User Guidelines.'' For a complete description of MML, consult the CCITT MML Recommendations (Z.301-Z.341), which are available from OMNICOM, Inc., Vienna, Virginia.

To set the spooler flag, see the layout for the ECD splrinfo form in the RTR Operating System, Recent Change and Verify Manual for the 3B21D Computer.


          If any phases below 40 are specified, DGN:NODE behaves as above except that it attempts to run only the specified phases.

          If only phases above 39 are specified, DGN:NODE runs the phases on the node after quarantining it (if it was not already quarantined).

          If a node was active or quarantined prior to the request for diagnostics, DGN:NODE attempts to quarantine it after diagnostics have completed. If a node was in another state, DGN:NODE leaves the node in the state in which it found it, provided that diagnostic results do not require a different state. (Technicians would ordinarily return to service a quarantined node that had passed diagnostics by unconditionally restoring it.)

          Before entering DGN:NODE for an active node with an active external user interface, remove from service the communication link or links that terminate in the node.

RST:NODE  Entered unconditionally for an out-of-service node that is not sandwiched in isolation between nodes with faulty ring interfaces, RST:NODE unisolates and/or unquarantines the node (thus placing it in the active ring), downloads operational code into it, places the code in execution, then changes the major state of the node to active. If the node is sandwiched in isolation, RST:NODE entered unconditionally leaves the node isolated, while placing it under ARR control so that it will be automatically restored when ring conditions permit.

          Entered conditionally, RST:NODE completes the same actions as DGN:NODE with no phases specified, then restores the node, provided that it passes diagnostics and is not sandwiched in isolation. If it is sandwiched in isolation, RST:NODE leaves it isolated while placing it under ARR control so that it will be automatically restored when ring conditions permit.

          If a node fails diagnostics, RST:NODE leaves it isolated if its ring-interface state is FLTY, or quarantines it if its ring-interface state is USBL or QUSBL and it is not sandwiched in an isolation.

          If the RST:NODE command is followed by a resource failure that prevents downloading or executing code, a REPT IUN RST output message with failure code 43 will appear on the ROP. When this occurs, technicians should wait a few minutes and try the restoral again.

          Before entering RST:NODE conditionally for an active node with an active external user interface, remove from service the communication link or links that terminate in the node.


          After entering RST:NODE for a node whose communication link has been manually removed from service, it may be necessary to manually return the communication link to service.

OP:RING   Produces an OP RING output message concerning the status or generic identity of specified nodes, of groups of nodes, or of the ring.

CFR:RING  1. Isolates or attempts to end the isolation of specified nodes, or

          2. Initializes the ring if it is down.

          Because the DGN and RST commands provide automatic isolation and unisolation of nodes under most conditions, this command is rarely used. The command is intended primarily for use in the first sense when growing and degrowing nodes and in the second sense when a new ring is being installed under Manual Ring Mode, which is explained below. In daily operations, the first version of the command might be used with the exclude option to isolate a node whose ring-interface state is quarantine-usable prior to changing the ring-interface or IRN circuit pack. With the MOVFLT option, the first version of the command can be used to shift an isolation on a ring that is too small for the isolation to be extended.

          Before the Exclude version of the CFR command is entered for an active node, the node must be removed from service with the RMV:NODE command.
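The state in which RST:NODE leaves a node that fails diagnostics, as described in the RST:NODE entry above, can be summarized in a few lines. This is a sketch under stated assumptions: the function name is hypothetical, and the outcome for a sandwiched node with a USBL or QUSBL ring interface is not stated explicitly for the failing case, so the code assumes it stays isolated, as it does in the other sandwiched cases.

```python
def state_after_failed_rst(ri_state, sandwiched):
    """Return the state of a node after it fails diagnostics during RST:NODE.

    ri_state: ring-interface minor state ("USBL", "QUSBL", or "FLTY").
    sandwiched: True if the node sits in isolation between nodes with
    faulty ring interfaces.
    """
    if ri_state == "FLTY":
        return "isolated"
    if ri_state in ("USBL", "QUSBL"):
        if not sandwiched:
            return "quarantined"
        # Assumption: a sandwiched node cannot leave the isolation.
        return "isolated"
    raise ValueError(f"unexpected ring-interface state {ri_state!r}")
```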

Tables providing brief descriptions of commonly used versions of IMS output messages appear in Chapter 5, Ring Critical Events.

Critical Maintenance Procedures for Nodes

Because of the automatic actions of IMS maintenance software, technicians ordinarily perform critical maintenance on nodes that ARR has attempted unsuccessfully to restore. Most restoral attempts that fail do so because of diagnostic failure. A few fail either because the attempt timed out waiting for a reply from MIRA or because a recurrent error condition caused a node to violate the fourth-time rule, which prevents ARR from restoring the same node for a fourth time within a 60-minute interval. When any restoral attempt fails, ARR announces the event with a version of the REPT ARR AUTORST message on the ROP and changes the maintenance mode of the node to manual, thereby directing technicians to perform maintenance on it.
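The fourth-time rule can be pictured as a per-node sliding window over restoral times. The sketch below is illustrative only: the class and method names are hypothetical, and the exact boundary handling (whether a restoral exactly 60 minutes old still counts) is an assumption rather than documented ARR behavior.

```python
from collections import defaultdict, deque


class FourthTimeRule:
    """Hypothetical model of the ARR fourth-time rule (not IMS code)."""

    WINDOW_S = 60 * 60  # the 60-minute interval, in seconds

    def __init__(self):
        self.history = defaultdict(deque)  # node name -> restoral timestamps

    def may_restore(self, node, now):
        """Return True and record the restoral if the rule permits it."""
        recent = self.history[node]
        # Forget restorals that fall outside the 60-minute window.
        while recent and now - recent[0] >= self.WINDOW_S:
            recent.popleft()
        if len(recent) >= 3:
            return False  # this would be the fourth restoral in the window
        recent.append(now)
        return True
```

Three restorals of the same node at 0, 10, and 20 minutes are permitted; a fourth at 30 minutes is refused, and the node would then be turned over to technicians in the manual mode.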

This section contains three procedures for clearing faults in individual nodes and three procedures for dissolving isolations. Of the procedures for clearing faults, one is to be used when ARR has failed to restore a node, one when critical maintenance is manually initiated, and one when the first two procedures fail to clear a problem and it becomes necessary to consult diagnostic listings. The information provided by these three procedures is entirely sufficient for the maintenance of nodes that are quarantined. Maintenance of isolated nodes, however, involves these issues and others as well. The section ends with procedures for dissolving isolations. One is concerned with single-node isolations; one is concerned with multiple-node isolations; and one, to be used in conjunction with the other two, is concerned with the problems associated with a fault in a BISO or EISO node.

Procedure 3-2. Clearing Faults in Response to ARR Action

ARR turns a faulty node over to technicians isolated when diagnostics or error messages indicate a ring-interface problem that prevents the node from propagating messages on the ring. Otherwise, it turns a faulty node over to technicians quarantined. Thus technicians sometimes do and sometimes do not receive a node from ARR in the proper state for replacing the circuit packs that diagnostics have indicated as possibly faulty. Quarantined nodes with ring-interface problems (ring interface QUSBL) and IRN nodes with node processor problems are turned over to technicians quarantined yet must be isolated before their ring-interface circuit packs are replaced. Nodes requiring backplane repairs must also be isolated.

IMS circuit packs are designed to be replaced while the power supply to the node is on.

1. Learn of the failure of an ARR restoral attempt from a REPT ARR AUTORST RST RQST FOR a FAILED output message, where a is the node that failed. Confirm with the OP:RING command or from the 1106 display page that the failed node is in the manual mode.

2. Note the failing phases and tests from the DGN output message.

3. From the information concerning failing phases, compose a list of suspect circuit packs using the ANALY TLPFILE output message, and obtain from the supply of spare circuit packs one of each pack on your list.

Observing the circuit pack LEDs, ensure that the node containing the listed pack or packs is in the proper state for having the pack(s) replaced.

Table 3-6 describes the various LED indications.

Nodes should be isolated before having any part of their backplanes repaired.

4. Replace the first circuit pack on the list, then proceed as follows:

• If you replaced a ring-interface, a node-processor, or an IRN circuit pack in any node-type other than an RPCN, restore the node conditionally with the RST:NODEa,b command.

• If you replaced any circuit pack in an RPCN other than the DDSBS circuit pack, restore the node conditionally with the RST:RPCNa,b command.

• If you replaced the DDSBS circuit pack of an RPCN, first run all automatic diagnostic phases with the DGN:RPCN command. If the automatic phases pass, next run optional diagnostic phase 14 with the command DGN:RPCNa,b:PH 14,CU c where c is 0 or 1, indicating the off-line control unit of the 3B21D. If the DDSBS circuit pack passed both optional and automatic diagnostic phases, restore the node to service unconditionally using the RST:RPCNa,b;UCL command.

• If you replaced an auxiliary circuit pack of any node other than an RPCN or CDN-I, enter the command DGN:NODEa,b:PHc where c is the range of phases that test the circuit pack you replaced. If the unit passes all specified diagnostic phases, restore the node unconditionally with the RST:NODEa,b;UCL command.

Table 3-6. Circuit Pack LED States

Circuit-Pack Type   Node Type   State                              Indication
auxiliary           any         quarantined or isolated            RQ LED red
IRN                 VLSI        isolated                           NT LED red
IFB                 any         isolate the adjacent node in the   NT LED red
                                same unit as the IFB CP

NOTE: Before pulling any circuit pack in units not equipped with a connector assembly, isolate all nodes serviced by the power supply associated with the connector assembly. In 3-node units, the connector assembly is located at the rear of the backplane at the RI 1 position in the two external nodes and is associated with the nearest power supply. In two-node units, the connector assembly is located at the rear of the backplane at the RI 1 position in both nodes and is associated with the nearest power supply. In eight-node units the connector assembly is located at the back of each power supply and is associated with that power supply.


• If you replaced the DDSBS circuit pack of a DLN, first run all automatic diagnostic phases with the DGN:NODEa,b command. If the automatic phases pass, next run optional diagnostic phase 34 with the command DGN:NODEa,b:PH 34,CU c where c is 0 or 1, indicating the off-line control unit. If the DDSBS circuit pack passed both optional and automatic diagnostic phases, restore the node to service unconditionally using the RST:NODEa,b;UCL command.

• Consult the section ``Ring Application Processor Critical Maintenance Procedure'' for instructions on diagnosing and changing auxiliary circuit packs on a CDN-I.

• If you isolated an RPCN to replace an interframe buffer, restore the node conditionally with the RST:RPCNa,b command. If you isolated any other node-type to replace an interframe buffer, run diagnostic phases 1 through 13 with the DGN:NODEa,b:PH 1-13 command and, if the phases pass, restore the node unconditionally. If you permanently removed an interframe buffer or substituted a buffer with different capacity, change the ECD HV field to reflect the change before restoring the node.

5. If the list of suspect circuit packs contained more than one entry and the node failed

to pass diagnostics after the first listed pack was replaced, reinstall the original pack,

replace the next pack on the list, then repeat the applicable portion of 4 and 5 above.

Continue in this fashion until either the node passes the specified diagnostic tests or

all circuit packs on the list have been replaced and tested. (If the node you are

troubleshooting is critically important or contributing to a multiple isolation, you may

wish to replace simultaneously all its circuit packs and then, at another time, reinstall

the original packs and test them individually to determine which pack was at fault.)

6. If you replaced all circuit packs without the node passing diagnostics, visually inspect the node and its housing. Look for unseated circuit packs, backplane damage, poor grounding connections, and unseated cable connections. Before repairing the backplane, isolate the node.

7. If the backplane is not at fault, consult the sections below on isolations and

trouble-shooting.
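The restore rules in step 4 amount to a lookup from node type and circuit-pack type to the commands to enter. The Python sketch below is only a reading aid: the function name and the pack-type strings are hypothetical, the command templates are copied from the step with their a, b, and c placeholders unfilled, and the interframe-buffer cases are left out.

```python
def commands_after_replacement(node_type, pack):
    """Return the command templates from step 4, in the order to enter them.

    node_type: "RPCN", "DLN", "CDN-I", or any other node-type name.
    pack: "ring-interface", "node-processor", "IRN", "DDSBS", or "auxiliary".
    In a multi-command list, each later command is entered only if the
    diagnostics requested by the earlier ones pass.
    """
    if node_type == "RPCN":
        if pack == "DDSBS":
            # Automatic phases, then optional phase 14, then unconditional restore.
            return ["DGN:RPCN", "DGN:RPCNa,b:PH 14,CU c", "RST:RPCNa,b;UCL"]
        # Any other RPCN circuit pack: conditional restore.
        return ["RST:RPCNa,b"]
    if pack in ("ring-interface", "node-processor", "IRN"):
        # Conditional restore for every node-type other than an RPCN.
        return ["RST:NODEa,b"]
    if node_type == "DLN" and pack == "DDSBS":
        # Automatic phases, then optional phase 34, then unconditional restore.
        return ["DGN:NODEa,b", "DGN:NODEa,b:PH 34,CU c", "RST:NODEa,b;UCL"]
    if pack == "auxiliary":
        if node_type == "CDN-I":
            # Covered by the Ring Application Processor procedure, not here.
            raise NotImplementedError("consult the CDN-I procedure")
        # Diagnose the phases that test the pack, then unconditional restore.
        return ["DGN:NODEa,b:PHc", "RST:NODEa,b;UCL"]
    raise ValueError("combination not covered by step 4")
```

For example, `commands_after_replacement("DLN", "DDSBS")` yields the three DLN commands in the order in which the step lists them.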

Procedure 3-3. Manually Initiated Maintenance of Nodes

In general, technicians should avoid manual intervention of any kind while EAR is attempting to recover the ring and should avoid manually intervening with a node that ARR is attempting to restore.


IMS circuit packs are designed to be replaced while the power supply to the node

is on.

1. Before entering an RMV, DGN, conditional RST, or CFR:RING,NODExx yy;EXCLUDE command for an active node with an active external user interface, remove from service the communication link or links that terminate in the node. After entering an RST command for a node whose communication link was manually removed from service, it may be necessary to manually return the communication link to service.

2. Before manually initiating maintenance on a circuit pack or interframe buffer, remove

the resident or associated node from service. See Table 3-6.

Before replacing a power supply circuit pack in a 3-node unit, isolate the two nodes adjacent to the power supply. In a 2-node unit, isolate the node adjacent to the power supply. In an 8-node unit, isolate the four nodes adjacent to the power supply. In a 5-node unit, learn from the unit horizontal designation strip next to the power supply in question the nodes serviced by the power supply, and isolate either three or two nodes.

Nodes should be isolated before having any part of their backplanes repaired.

3. To quarantine a node, remove it from service with the RMV:NODEa b command.

This action has the effect of changing the maintenance mode of the node to manual,

thus preventing ARR from attempting to restore it.

4. To isolate a node, first remove it from service with the RMV:NODEa b command, and

then isolate it with the CFR:RING,NODExx yy;EXCLUDE command. This also has

the effect of changing the maintenance mode to manual.

5. If a quarantined or isolated node has not had a circuit pack replaced or reset, it may

be restored to service unconditionally.

6. If an isolated node has not had a circuit pack replaced but has been powered down

or had a circuit pack reset, run diagnostic phases 1 and 2 on it with the

DGN:NODEa,b:PH 1-2 command. If it passes it may be restored to service

unconditionally.

7. If a node has had a circuit pack replaced, observe the guidelines set forth in the fifth

step of the procedure ``Clearing Faults in Response to ARR Action.''
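Two rules in this procedure lend themselves to a compact summary: the number of nodes to isolate before replacing a power supply circuit pack (step 2) and the choice of restore action in steps 5 through 7. The Python sketch below is illustrative only. The function and parameter names are hypothetical, and the unconditional restore template RST:NODEa,b;UCL is borrowed from Procedure 3-2 on the assumption that it applies here as well.

```python
# Unit size -> possible counts of nodes to isolate before replacing a
# power supply circuit pack. A 5-node unit needs two or three, depending
# on what the horizontal designation strip shows.
POWER_SUPPLY_ISOLATION = {2: (1,), 3: (2,), 5: (2, 3), 8: (4,)}


def nodes_to_isolate_for_power_supply(unit_size):
    """Return the possible numbers of nodes to isolate for this unit size."""
    return POWER_SUPPLY_ISOLATION[unit_size]


def restore_plan(state, pack_replaced=False, pack_reset=False, powered_down=False):
    """Return the actions for a manually removed node, following steps 5 to 7."""
    if state not in ("quarantined", "isolated"):
        raise ValueError("state must be 'quarantined' or 'isolated'")
    if pack_replaced:
        # Step 7: defer to the replacement guidelines of Procedure 3-2.
        return ["follow the replacement guidelines in Procedure 3-2"]
    if state == "isolated" and (pack_reset or powered_down):
        # Step 6: phases 1 and 2 must pass before the unconditional restore.
        return ["DGN:NODEa,b:PH 1-2", "RST:NODEa,b;UCL"]
    if not pack_reset:
        # Step 5: nothing replaced or reset, so restore unconditionally.
        return ["RST:NODEa,b;UCL"]
    # A quarantined node with a reset pack is not addressed by steps 5 to 7.
    raise ValueError("case not covered by this procedure")
```

The final branch raises an error rather than guessing, because the procedure does not say how to treat a quarantined node whose circuit pack was reset.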


Procedure 3-4. Using Diagnostic Listings

If the information provided by ROP output messages fails to identify faulty equipment, further scrutiny of the diagnostic results is possible using diagnostic listings as explained below:

1. Note the failing phase and failing tests in the DGN output message.

2. Obtain the diagnostic listing(s) for the phase(s) that failed.

3. Read the prologue(s) to the failing phase(s) and, if one exists, the prologue to the

program unit in which failing tests appear. Pay particular attention to any

troubleshooting hints.

4. Read the individual comments on statements that contain failed tests.

5. If this information does not provide guidance on how to clear the fault, consult the ``Recognizing and Finding Intermittent Faults'' and the ``Other Suggestions for Troubleshooting'' sections below for possible solutions.

6. If these sections provide no leads, seek assistance from the CTS.

Critical Maintenance Procedures for Nodes in Isolation

Under circumstances described previously in this document, EAR may respond to conditions on the ring by creating an isolated segment that ARR cannot dissolve. In these cases, dissolving the isolation becomes the responsibility of technicians. Generally, technicians should respond promptly to an isolation, since even a singly-isolated node creates the potential of a massive isolation, in the event that another node must also be isolated.

Dissolving isolations sometimes requires that they be extended to include the BISO or EISO node. There are two reasons why this may need to be done. The first involves the ambiguity IMS experiences in detecting certain types of ring-related faults. The second involves the way in which diagnostic code is transmitted into an isolated segment.

The second can be stated simply. Messages, including messages containing diagnostic code, are sent from the 3B21D to an isolated segment of the ring through the BISO or the EISO node. BISO and EISO nodes have one RAC participating in the active-ring segment and one RAC participating in the isolated-ring segment. Messages destined for the isolated segment are read from the active ring by the active-ring RAC, then transmitted by the node processor to the isolated-ring RAC, which writes them to the isolated segment of the ring. A fault in the isolated-ring RAC of either BISO or EISO node might go undetected, since it would not affect the transportation of messages on the active ring and could show up misleadingly as a diagnostic failure in the isolated node. Therefore, technicians who find that they cannot clear a fault that appears to reside in the isolated node should extend the isolation to include the current BISO and EISO nodes and run diagnostics again.

Low-Phase Ambiguity

The other reason for extending isolations concerns the ambiguity that IMS experiences in detecting certain ring-related faults. Faults that prevent the propagation of messages on the ring usually produce phase-1 and phase-2 diagnostic failures. In the case of such failures, IMS often has the problem of being unable to decide in which of two adjacent RACs a fault resides. Because this problem is associated entirely with the parts of node hardware tested by diagnostic phases 1 and 2, this document calls it ``low-phase ambiguity.''

Low-phase ambiguity does not usually result in the isolation of two nodes because, while one suspect RAC is isolated, the other suspect RAC may be included in the isolated segment as the isolated RAC of the BISO or EISO node. The following figure illustrates the ring structure that permits this practice:

Figure 3-3. Isolated RACs of BISO and EISO Nodes

[Figure: the BISO node, the isolated node, and the EISO node shown side by side, each with a ring interface containing RAC 0 and RAC 1.]

Notice that either RAC 1 of the BISO node or RAC 0 of the EISO node could be included in the isolated segment as a suspect RAC.

IMS has difficulty acknowledging by customary means the fact that it has included possibly faulty RACs in BISO or EISO nodes. A BISO or EISO node, being in the active ring, cannot have its ring interface marked faulty. Therefore, if a RAC of such a node is suspect, this fact will not be indicated in the minor state of the node nor in the TLP information. It will, however, be reflected in tests 5 and 10 of the ROP failure data for diagnostic phases 1 or 2, provided that the RAW option of the DGN command has been specified. (ARR does not specify the RAW option, so the automatically output DGN failure data does not contain this information in full. It does, however, contain failing test 5, which is a sure indication that low-phase ambiguity exists.)

The maintenance principle dictated by low-phase ambiguity is represented in the

following procedure:

Procedure 3-5. Determining the Nodes Involved in Low-Phase Ambiguity

1. After attempting to clear a fault in an isolated node that has failed test 5 of diagnostic phases 1 or 2, run verification diagnostics on the node with the RAW option using the command DGN:NODEa,b;RAW, where NODEa,b is the isolated node.

2. If the node passes all diagnostic phases, restore it to service unconditionally.

3. If the node still fails phases 1 or 2, consult the output message generated by the DGN command with the RAW option, and determine whether it is the BISO or EISO node that is suspected of being faulty. This is an example of an output message when the RAW option of the DGN command has been specified:

    DGN LN32 1 PH 1 STF (14 X'00000000 x'00000000)
    TEST  MISMATCH     ACTUAL  MASK  EXPECTED
    001   X'00010000   N/A     N/A   N/A
    004   X'FF012242   N/A     N/A   N/A
    005   X'00000E01   N/A     N/A   N/A
    006   X'00000044   N/A     N/A   N/A
    007   X'0000002E   N/A     N/A   N/A
    008   X'00000E00   N/A     N/A   N/A
    009   X'00000E04   N/A     N/A   N/A
    010   X'00000E02   N/A     N/A   N/A
    011   X'FF012242   N/A     N/A   N/A


Ignore everything except the mismatch data for tests 005 and 010. If either test 005 or test 010 appears in the DGN output message, the other will appear also, provided that the RAW option to the DGN command has been specified. These tests will always identify two nodes as possibly faulty.

4. Using the physical node-address table in the reference chapter of this document, translate the hexadecimal mismatch data for test numbers 005 and 010 into the node names of two nodes. For example, in the above DGN output message, 00000E01 translates into IUN32 1 and 00000E02 translates into IUN32 2. These are the nodes suspected by IMS of being faulty. In the case of single-node isolations, one of the suspect nodes will be the isolated node and the other will be the BISO or EISO node, the suspect component of which will be the RAC 1 of the former or RAC 0 of the latter.

5. When one suspect node is an EISO or BISO node, manually remove its communication link (if it has an active one) from service, then remove the node from service with the RMV:NODEa b command, thus extending the isolation to include the suspect node in the isolated segment.

6. Perform maintenance on the newly isolated node.
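Steps 3 and 4 can be rehearsed offline by pulling the mismatch data for tests 005 and 010 out of a saved copy of the RAW output. The Python sketch below is illustrative only: the function name and the regular expression are hypothetical, and the lookup table holds just the two translations quoted in step 4, standing in for the physical node-address table in the reference chapter.

```python
import re

# Only the two translations given in the worked example of step 4.
NODE_ADDRESS_TABLE = {"00000E01": "IUN32 1", "00000E02": "IUN32 2"}

# A data row starts with a 3-digit test number followed by X'hhhhhhhh.
ROW = re.compile(r"^\s*(\d{3})\s+X'([0-9A-Fa-f]{8})")

SAMPLE = """\
DGN LN32 1 PH 1 STF (14 X'00000000 x'00000000)
TEST MISMATCH ACTUAL MASK EXPECTED
001 X'00010000 N/A N/A N/A
004 X'FF012242 N/A N/A N/A
005 X'00000E01 N/A N/A N/A
006 X'00000044 N/A N/A N/A
010 X'00000E02 N/A N/A N/A
011 X'FF012242 N/A N/A N/A
"""


def suspect_nodes(dgn_output, table=NODE_ADDRESS_TABLE):
    """Return the node names for tests 005 and 010, or None if either is absent."""
    mismatches = {}
    for line in dgn_output.splitlines():
        match = ROW.match(line)
        if match:
            mismatches[match.group(1)] = match.group(2).upper()
    if "005" not in mismatches or "010" not in mismatches:
        return None
    # Addresses missing from the table are reported as raw hexadecimal.
    return tuple(
        table.get(mismatches[test], "unknown " + mismatches[test])
        for test in ("005", "010")
    )
```

Applied to `SAMPLE`, the function returns the pair IUN32 1 and IUN32 2, matching the translation given in step 4.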

Low-phase ambiguity has bearing on the procedures for treating single- and multiple-node isolations.

The procedures concerning isolations that follow are merely recommended. When circumstances, reason, or user practices dictate acting differently, do so. The procedures are not self-sufficient but build upon the three procedures discussed above for clearing faults in nodes. The order of battle in these procedures is this: first perform maintenance on suspect nodes within the isolated segment. If this fails to dissolve the isolation, next check to see if the isolated RAC of an EISO or BISO node is suspected of being faulty. If so, perform maintenance on it after including it in the isolation. Finally, if no isolated RAC in the EISO or BISO node is suspected of being faulty, extend the isolation to include the BISO and EISO nodes, one at a time, and run diagnostics again on the chance that a fault in one of their isolated RACs is being misread by diagnostic code.
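The order of battle reduces to a small decision about which boundary node to bring into the isolation once maintenance inside the isolated segment has failed. The Python sketch below is a hypothetical reading aid, not part of IMS. The function and argument names are assumptions, and trying the BISO node before the EISO node follows the order used in Procedure 3-6.

```python
def extension_order(test5_indicated, test10_node=None):
    """Return the boundary nodes to include in the isolation, in order.

    Call this only after maintenance on the suspect nodes inside the
    isolated segment has failed to dissolve the isolation.
    """
    if test5_indicated and test10_node is not None:
        # An isolated RAC of a boundary node is suspected: include the
        # node named by the test 10 mismatch data and maintain it.
        return [test10_node]
    # No isolated RAC is suspected: include each boundary node in turn,
    # rerunning diagnostics after each extension.
    return ["BISO", "EISO"]
```

The second branch covers the case in which a fault in an isolated RAC is being misread by diagnostic code, so neither boundary node has been named as suspect.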


Guideline to Single-Node Isolations

Procedure 3-6. Responding to Single-Node Isolations

1. Recognize the existence of an isolated segment from output messages or from

information on 1105 or 1106 display pages. In some cases technicians will

themselves create an isolation, as for example when ARR turns over to technicians

a quarantined node that must be isolated before manual maintenance can be

performed on it.

2. If you are on-site, confirm that the node is isolated by checking its NT LED.

3. Follow the appropriate procedure for the isolated node from the procedures listed

below:

• Clearing Faults in Response to ARR Actions

• Manually Initiated Maintenance of Nodes

If test 5 of a phase-1 or phase-2 failure is indicated, verify your repair using the DGN command with the RAW option specified, thereby learning, when the isolated node still fails diagnostics, whether the isolated RAC of the BISO or EISO node is also suspected by IMS of being faulty.

4. If the procedure that you employed on the isolated node in step 3 failed to end the isolation and test 5 and test 10 of a phase-1 and/or phase-2 failure are indicated, extend the isolation to include the BISO or EISO node identified by the mismatch data for test 10. Use the command RMV:NODEa, b, where NODE is the node name of the node identified by test 10 mismatch data. On small rings you may have to shift, rather than extend, the isolation by employing the MOVFLT option of the CFR:RING command. (If the BISO or EISO node has an active communication link, remove the link from service before removing the node.)

5. Follow the procedure ``Clearing Faults in Response to ARR Actions'' for the newly isolated node.

[Figure: a single-node isolation, showing the BISO node, the isolated node, and the EISO node.]

6. If:

a. the procedure that you employed on the isolated node in step 3 failed to end the isolation

b. and test 5 of a phase-1 and/or phase-2 failure is not indicated,

extend the isolation to include the BISO node with the command RMV:NODEa, b, where NODE is the BISO node. On small rings you may have to shift, rather than extend, the isolation by employing the MOVFLT option of the CFR:RING command. (If the BISO node has an active communication link, remove the link from service before removing the node.)

7. With the former BISO node now in the isolated segment, again diagnose the

originally isolated node.

8. If the originally isolated node now passes diagnostics,

a. diagnose the former BISO node and, if it fails, perform maintenance

on it following the TLP instructions

b. but if it passes, change its ring-interface and node-processor circuit

pack(s), then conditionally restore it to service.

• If the former BISO node now enters the active ring (thereby dissolving the isolation), unconditionally restore the originally isolated node (which should now have become quarantined) to service, and end this procedure.

9. But if the originally isolated node still fails diagnostics after the former BISO node has been included in the isolated segment, reduce the isolation by unconditionally restoring the former BISO node, thereby making it once again the BISO node. (You may have to manually return its communication link to service.)

10. Extend the isolation in the other direction to include the EISO node, and treat the

former EISO node as you did the former BISO node above.

[Figure: the isolation before and after it is extended, showing the BISO node, the former BISO node, the originally isolated node, and the EISO node.]

11. If the originally isolated node still fails diagnostics after the isolation has been

extended in both directions, or if the isolation repeatedly dissolves and returns,

attempt any appropriate procedures described in the section below on

troubleshooting. Then, if the isolation still persists, call the CTS.

Guideline to Multiple-Node Isolations

Isolations of more than two nodes will often contain innocent victims, that is, nodes that are included in the isolation, not because they are faulty, but because they reside between faulty nodes. The ring interfaces and node processors of such nodes will be classified as usable. Unless technicians manually remove innocent victim nodes from service, they will remain in automatic maintenance mode, and ARR will automatically return them to service when the isolation is dissolved.

Procedure 3-7. Responding to Multiple-Node Isolations

1. Recognize the existence and extent of an isolated segment from output messages

or from information on 1105 or 1106 display pages.

2. Identify from DGN output messages the nodes within the isolation regarded by IMS

software as faulty. In nearly all cases the faulty nodes should be the isolated nodes

next to the BISO and EISO nodes. If an interior node is also indicated faulty, ignore

it until partial success in this procedure transforms it into a node next to an EISO or

BISO node.

3. If you are on-site, confirm that the nodes in question are indeed isolated by checking

their NT LEDs.

4. Choose to begin working on either the isolated node next to the BISO node or the isolated node next to the EISO node. Base your choice on the following considerations in the order shown:

a. If diagnostic failure data is given for only one of the two nodes, begin with the node for which you have failure data.

b. If failure data is given for both nodes, begin at the end of the isolation that includes the nodes most important to your operation.

[Figure: a multiple-node isolation, showing the BISO node, the isolated node next to the BISO node, an innocent victim node, the isolated node next to the EISO node, and the EISO node.]

5. For the node you have chosen, follow the procedure ``Clearing Faults in Response to ARR Actions.'' If test 5 of a phase-1 or phase-2 failure is indicated for this node, verify your repair of the node using the DGN command with the RAW option specified, thereby learning, when the isolated node still fails diagnostics, whether the isolated RAC of the adjacent BISO or EISO node is also suspected by IMS of being faulty.

6. If the procedure clears the fault of the isolated node next to the BISO or EISO node,

the ring should now contain only a singly-isolated node, since both the repaired node

and the innocent victim nodes will have returned to the active ring. (An exception to

this statement occurs when the isolated segment contains three faulty nodes. In this

case, restoring one of the external faulty nodes will result in a smaller multiple

isolation. If this occurs, return to the beginning of this procedure and repeat the steps

up to here, then continue on.) Treat the singly-isolated node according to the

procedure for ``Responding to Single-Node Isolations,'' and end this procedure.

7. If, however, the procedure that you employed failed to reduce the isolation and test 5

and test 10 of a phase-1 and/or phase-2 diagnostic failure are indicated, extend the

isolation to include the BISO or EISO node identified by the mismatch data for test

10. Use the command RMV:NODEa, b, where NODE is the name of the node

identified by test 10 mismatch data. On small rings you may have to shift, rather than

extend, the isolation by employing the MOVFLT option of the CFR:RING command.

(If the BISO or EISO node has an active communication link, remove the link from

service before removing the node.)

8. Follow for the newly isolated node the procedure ``Clearing Faults in Response to

ARR Actions.''

9. If the procedure clears the fault of the newly isolated node, the ring should now

contain only a singly isolated node, since the repaired node, the isolated node next

to the original BISO or EISO node, and the innocent victim nodes will have returned

to the active ring. (An exception to this statement occurs when the isolated segment

contains three faulty nodes. In this case, restoring one of the external faulty nodes

will result in a smaller multiple isolation. If this occurs, return to the beginning of this

procedure and repeat the steps.) Treat the singly-isolated node according to the

procedure for ``Responding to Single-Node Isolations,'' and end this procedure.

10. If the previous step of this procedure fails to reduce the isolation or test 5 and test 10

of a phase-1 and/or phase-2 diagnostic failure were not indicated after failure in Step 5 above, go to the other end of the isolated segment and repeat Steps 5 through 9

there.


11. If these steps fail to reduce the isolation, extend the isolation to include either the EISO or BISO node (if one has already been extended, choose the other; if neither has been extended, choose either) with the command RMV:NODEa, b, where NODE is the EISO or BISO node. (If the EISO or BISO node has an active communication link, remove the link from service before removing the node.)

12. With the former EISO or BISO node now in the isolated segment, diagnose the

isolated node next to the former EISO or BISO node; and if the isolated node next to

the former EISO or BISO node now passes diagnostics, change the ring-interface

and node-processor circuit pack(s) of the former EISO or BISO node, then

conditionally restore the former EISO or BISO node to service.

13. If the former EISO or BISO node enters the active ring (thereby reducing the

isolation), treat the remaining isolation according to the procedure for single-node

isolations.

14. If, however, the isolated node next to the former EISO or BISO node still fails

diagnostics, unconditionally restore the former EISO or BISO node to the active ring.

(If you manually removed its communication link from service, you may have to

manually return it to service.) Then extend the isolation at the other end of the

isolated segment (unless you have done so previously), and treat that end in the

same way you have treated this end.

15. If both originally faulty nodes still fail diagnostics after the isolation has been

extended in both directions, or if the isolation returns after nodes have been restored,

follow any appropriate procedures described below in the section on

troubleshooting. Then if the problem still persists, call the CTS.
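The choice made in step 4 of this procedure, namely which end of a multiple-node isolation to work on first, can be written down as a short rule. The Python sketch below is illustrative only. The function name and both argument structures are hypothetical, and the numeric importance ranking stands in for the judgment of the technician about which nodes matter most to the operation.

```python
def choose_starting_end(has_failure_data, importance):
    """Pick the end of the isolation to begin with: "BISO" or "EISO".

    has_failure_data: {"BISO": bool, "EISO": bool}, whether diagnostic
        failure data exists for the isolated node next to that boundary node.
    importance: {"BISO": number, "EISO": number}, a hypothetical ranking
        of the nodes at each end (higher means more important).
    Returns None when no failure data exists for either end, a case that
    step 4 does not address.
    """
    biso, eiso = has_failure_data["BISO"], has_failure_data["EISO"]
    if biso != eiso:
        # Consideration a: failure data for only one end, so begin there.
        return "BISO" if biso else "EISO"
    if biso and eiso:
        # Consideration b: failure data for both ends, so begin at the
        # end that holds the more important nodes.
        return max(("BISO", "EISO"), key=lambda end: importance[end])
    return None
```

The considerations are applied in the order in which step 4 lists them, so importance is consulted only when failure data exists for both ends.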

[Figure: the multiple-node isolation before and after it is extended, showing the BISO and EISO nodes, the former BISO node, the isolated nodes next to the BISO and EISO nodes, and the innocent victim node.]

Responding to Ring Down

IMS in the 3B21D and IMS in the ring are independent of one another to the extent that either can fail while the other remains in operation. This section is concerned with the problems that confront technicians when the ring subsystem fails because of ring conditions and cannot be recovered by automatic means.

The ring subsystem will fail when the 3B21D cannot communicate with the active ring through any RPCN. This condition is most likely to occur in a two-RPCN environment when both RPCNs fail or when the active RPCN fails after the other RPCN had been manually removed from service. In a multiple-RPCN environment, the condition is most likely to occur because of a condition in the 3B21D that would simultaneously disable all RPCNs.

The ring subsystem will also fail if the data length of the active ring becomes shorter than the maximum message length for which the system was engineered. Small rings are susceptible to this problem. The problem is brought about by the ring fragmentation associated with an isolation. An isolation that includes padded interframe buffers may shorten the active ring severely. Padded interframe buffers are redundantly employed in pairs at opposite sides of the ring. Thus a single-node isolation would not usually include both pairs. Still, interframe buffers exist under a kind of quadruple jeopardy, because if either member of a pair fails, the pair fails and must be isolated, and because a pair must also be isolated if either of the nodes adjacent to it fails. Thus, while it is unlikely that both pairs will become isolated, this has occurred.
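The length condition can be illustrated with simple arithmetic: sum the data length contributed by every element still in the active ring and compare the total with the engineered maximum message length. In the Python sketch below, the function names, the element names, and all of the numbers are hypothetical; actual per-element lengths and units come from the engineering data for the ring.

```python
def active_ring_data_length(element_lengths, isolated):
    """Sum the data length of the elements that remain in the active ring."""
    return sum(
        length for name, length in element_lengths.items() if name not in isolated
    )


def ring_stays_up(element_lengths, isolated, max_message_length):
    """True if the active ring can still hold the longest engineered message."""
    return active_ring_data_length(element_lengths, isolated) >= max_message_length


# Hypothetical small ring: four nodes and two padded interframe buffer pairs.
RING = {
    "node1": 8,
    "node2": 8,
    "ifb_pair_a": 64,
    "node3": 8,
    "node4": 8,
    "ifb_pair_b": 64,
}
```

With these made-up figures and a maximum message length of 100, isolating `node2` alone leaves the ring long enough, while isolating `node2` together with `ifb_pair_a` does not.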

Finally, a ring may go down and stay down because of an intermittent fault that confuses initialization tests, or a ring may repeatedly go down because of a fault that is transparent during initialization tests but not during normal operations.

The following procedure for recovering a ring that is down is intended as an instructional paradigm only. Technicians should freely depart from it as circumstances, reason, or user practices suggest. In particular, technicians should not manually intervene until they are certain that IMS software has exhausted all its efforts to recover a down ring. Such recovery efforts are ordinarily directed by user software. Therefore, technicians should consult user documentation to learn how to know when automatic recovery efforts have ended.


Procedure 3-8. Ringdown Response Procedure

1. Following the termination of automatic recovery efforts, immediately attempt to bring the ring up by submitting it to a level-3 and, if that fails, to a level-4 IMS initialization. If it is important to the user that IMS in the 3B21D not abort itself should ring initialization fail, initialize the ring at level 4 using manual ring mode, as explained below.

2. If in response to level-4 initialization the ring fails to come up (as indicated by a REPT RING INIT output message) or to stay up (as indicated by a REPT RING CFR output message), determine the cause of its failure by examining the output messages. The REPT RING INIT messages in question are of two types. One type indicates the reason the ring failed to come up. These reasons include no standby RPC nodes available and no ring segment acceptable for active ring use, with the latter indicating either that no candidate for the active ring-segment contains an RPCN or that no candidate is long enough to satisfy the requirement of minimum length. In the absence of the first message, the second message may be understood to indicate that the problem is length. The second type of REPT RING INIT message identifies nodes that tests conducted during initialization have determined to be faulty.

3. If RPCN failure is the apparent cause, replace all circuit packs with known good

packs in an RPCN that was not isolated before the ring went down. Then initialize

IMS at level 4. If this attempt fails, replace all circuit packs with known good packs in

another RPCN.

4. If ring length is the apparent cause, identify faulty nodes by examining the second

type REPT RING INITmessage. Mentally construct the population and distribution

of nodes within the portion of the ring that is likely to become the isolated segment.

Ask yourself the following questions:

• Are any nodes adjacent to padded interframe buffers listed as faulty?

• If so, are they all external nodes (adjacent to the BISO or EISO nodes) within the portion of the ring likely to become the isolated segment, or is one of them an internal node within that portion?

• If not, are they innocent victim nodes within the candidate for the isolated segment?


5. If nodes adjacent to padded interframe buffers are faulty and one of them is likely to

be an external node in an isolated segment, replace (if you are in an emergency

situation) the ring-interface and node-processor circuit pack(s) on both nodes

adjacent to the interframe buffers and replace both interframe buffers. Then initialize

the ring at level 4.

6. If nodes adjacent to padded interframe buffers are internal nodes (either faulty or

innocent-victim) in the candidate for the isolated segment, approach the problem

following the procedure described above for responding to multiple isolations

(though of course under ring down conditions you will not be able to conduct

diagnostics). Then, if a node adjacent to padded interframe buffers becomes a

probable external node in a candidate for the isolated segment, treat it as in 5 above.

7. Study the MOVFLT option of the CFR:RING command. It may be useful in resolving

an isolation on a very small ring.

8. If none of the above approaches succeeds in recovering the ring, force faults by unseating various ring circuit packs and initializing at level 4. This is a desperate

attempt by trial and error to force an isolation in the hope of getting the ring up. Once

the ring is up, diagnostics can be run on the isolated portion.
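The choice between Steps 3 and 4 turns on which reason messages accompany the failed initialization. The following is a minimal sketch of that choice, assuming the two reasons arrive as the plain strings quoted in Step 2. The function name and the return labels are hypothetical and are not part of IMS.

```python
# Illustrative only: the reason strings are taken from Step 2; the
# function name and return labels are assumptions made for this sketch.
NO_STANDBY_RPCN = "no standby rpc nodes available"
NO_ACCEPTABLE_SEGMENT = "no ring segment acceptable for active ring use"


def apparent_cause(reasons):
    """Classify a failed ring initialization from its reported reasons."""
    seen = {reason.strip().lower() for reason in reasons}
    if NO_STANDBY_RPCN in seen:
        # RPCN failure is the apparent cause: proceed as in Step 3.
        return "rpcn"
    if NO_ACCEPTABLE_SEGMENT in seen:
        # Without the first message, the second points to ring length:
        # proceed as in Step 4.
        return "length"
    return "unknown"


print(apparent_cause(["No ring segment acceptable for active ring use"]))
```

The sketch deliberately checks the RPCN reason first, because Step 2 reads the second message as a length problem only when the first message is absent.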

Employing Manual Ring Mode

Manual ring mode allows the ring to be fully initialized without an accompanying initialization of IMS in the 3B21D. Ordinarily, full ring initialization occurs as a stage in level-4 (BOOT) IMS initialization. Under certain circumstances and for certain users, however, the disruption that IMS initialization entails in the operation of the 3B21D may be unacceptable as, for example, when the ring is down or when ring hardware is being retrofitted to a system that has IMS as a subsystem. In these cases, the ring may be initialized manually.

Procedure 3-9. Manual Initialization of the Ring

Before manual initialization, the ring must be down and enough hardware must be in place to satisfy the requirement for minimum ring size. To initialize the ring manually:

1. Consult ``Setting the ECD Flag for Manual Ring Mode'' in Appendix B, Ring Maintenance Reference Material.


2. Set the ECD Manual Ring Mode flag as described in the above reference. IMS is

programmed to abort if, during initialization, the ring fails to come up. The ECD

manual ring mode flag inhibits this response.

3. If you are employing manual ring mode for a new installation, or if you are experiencing ring down and no RPCNs are in the standby state, restore as many RPCNs as possible. When RPCNs are restored with the ring down, they will be in the STBY, not the ACT, state. This state is expected and sufficient for moving on to Step 4.

4. Enter the command CFR:RING

5. Expect to receive a form of the REPT RING INIT message indicating that the

initialization was or was not successful and a CFR RING COMP message indicating

that the program has completed. Forms of the REPT RING FLT message may also

appear to identify nodes that failed to participate in the initialization.

6. If the initialization was successful, reset the manual ring mode flag to null.

7. If the initialization was not successful, leave the ECD flag set for manual ring mode

and use the information you gained in Step 5 to troubleshoot the ring in the manner

described in ``Responding to Ring Down.''

Ring Application Processor Critical Maintenance Procedure

The ring application processors (RAPs) of the CDN-I must be manually diagnosed and maintained using special procedures. Automatically initiated diagnostics of the RAP sometimes produce deceptive results. If RAP firmware is not executing, diagnostics run on RAP circuit packs (phases 42 through 53) will provide erroneous data about phase and circuit pack failures; yet technicians cannot know from ROP output that the data they are receiving is incorrect. They can, however, receive correct data if, during diagnostics, they are present at the RAP housing and observe the RAP LEDs.

Each RAP circuit pack is equipped with an LED that turns on to indicate that the pack has failed a diagnostic phase. In addition, each of the LEDs on certain packs turns on when the RAP is initializing and then turns off when initialization tests confirm that the firmware is executing. The LEDs thus supply a means by which technicians can observe the progress of RAP diagnostics and of RAP initialization, provided they are present at the RAP housing as these actions occur. And they can be present, because power and diagnostic switches located


on each RAP power control interface and display (PCID) board allow them to control these functions locally. Thus RAP initialization and diagnostics may be run centrally by the host or locally by means of PCID-board switches.

A RAP failure will usually be tested initially by central diagnostics at the request of ARR, and ROP output will indicate the phases that failed and the circuit pack(s) suspected of being faulty. The procedure described below for fully diagnosing a RAP fault begins by tentatively accepting the results of the automatic diagnostics and then proceeds to confirm them. (Notice in the procedure the requirement that a CDN be quarantined when its RAP circuit packs are diagnosed or replaced.)

Procedure 3-10. Manually Confirming RAP Diagnostic Results

1. Remove the CDN from service by quarantining it.

2. Turn off RAP power by toggling the top switch on the PCID board.

3. Replace the first circuit pack listed in the TLP.

4. Test as follows to determine that RAP firmware is capable of initializing the RAP:

Turn on RAP power, observing the LEDs on the following non-MASA circuit packs.

• The node processor interface (NPI) circuit pack.

• The central controller support (CCS) circuit pack.

• The central controller cache (CCC) circuit pack.

• All equipped main store controller (MASC) circuit packs.

When power is restored, the LED of each pack should come on, go off, come back on, and finally go off; and this sequence of LED blinks should be completed for all packs within [18 + (2 × the number of MASA boards) +/- 2] seconds for systems with the 2-Mbyte memory and within [18 + (20 × the number of MASA boards) +/- 2] seconds for systems with the 16-Mbyte memory. If an LED fails to come on initially, turn off RAP power, replace the circuit pack, and repeat this step. If any LED fails to follow the full sequence of blinks, or if all LEDs fail to complete the sequence of blinks within the allotted time, go to Step 7 of this procedure.

5. This step manually diagnoses the node. The following information is helpful in

understanding it:

When diagnostics begin, the LED on each non-MASA circuit pack turns on and stays on until the pack has passed diagnostics. Moreover, diagnostics run on non-MASA packs early-terminate. Therefore, when a non-MASA pack fails


diagnostics, the diagnostic routine ends and the LEDs on the failed pack and on all non-MASA packs that have not yet been diagnosed stay on. MASA LEDs, on the other hand, may or may not come on when diagnostics begin, but they will come on if their circuit packs fail diagnostics. Moreover, MASA diagnostics do not early-terminate. Therefore, it is possible during a single diagnostic routine for a MASA pack to fail and for another pack, perhaps a non-MASA pack further downstream, to fail as well.

Depress the DIAG switch on the PCID board. All non-MASA LEDs should come on, then go off within 6 minutes for systems with the 2-Mbyte memory and within 4 minutes for systems with the 16-Mbyte memory. (If more than one MASC memory group is present, add 2 minutes and 40 seconds for each additional group.) If any LED fails to come on initially, turn off RAP power, replace the circuit pack, and repeat this step. If any LED fails to go off in the time indicated, turn off RAP power, replace the circuit pack, and repeat this step. If more than one LED fails to go off in the time indicated, turn off RAP power, replace the first circuit pack in the following list whose LED is on, and then repeat this step.

a. CCS

b. Memory group 0, that is, MASC_0 and all MASA packs associated with it. (MASC diagnostics depend upon memory from the first memory board, MASA_0, so a fault in one pack may under some circumstances cause the other to fail diagnostics. Therefore, if the situation here or elsewhere indicates that either of these related packs should be replaced but replacing it does not solve the problem, try reinstalling the original pack and replacing the other.)

c. CCC

d. Each additional equipped memory group in numerical order.

e. NPI

If, upon repetition, a replaced circuit pack fails to pass diagnostics, leave RAP

power off, quarantine the node, and contact the CTS.

6. If Step 5 succeeded, unconditionally restore the node to service and end this

procedure.

7. Systematically search for the fault that is preventing initialization by following Steps 7

through 23.

Turn off RAP power. Reinstall the original circuit pack removed in Step 3.

8. Unplug the following circuit packs by opening their latches and pulling them out

about one inch:

• All MASC packs except MASC_0


• The NPI pack

• All MASA packs in memory group 0 except MASA_0.

9. Restore RAP power and observe the LED on the CCS pack. If it goes on, off, on, off

in 33 to 43 seconds, go to Step 24.

10. Turn off RAP power and replace the CCS pack.



12. Turn off RAP power. Reinstall the original CCS pack. Replace the CCC pack.



14. Turn off RAP power. Reinstall the original CCC pack. Replace the MASC pack.



16. Turn off RAP power. Reinstall the original MASC pack. Replace the MASA_0 pack.



18. Measure the voltage at each power converter (PWRB on the main unit and PWRC

on the growth unit) from + pin 056 to gnd pin 032. If the voltage is below the +5.1 to

+5.3 volt range, turn RAP power off and replace the appropriate converter.

19. Restore RAP power and observe the LED on the CCS pack. If it goes on, off, on, off in

33 to 43 seconds, go to Step 24.

20. Steps 20-23 attempt to identify a problem that is not associated with the failure of a

circuit pack.

a. Turn off RAP power.

b. Reinstall the original MASA_0 pack.

c. Check backplane for shorted pins.

d. Check growth unit cables and bus terminators for proper installation, adjusting as needed.

e. Restore RAP power and observe the LED on the CCS pack. If it goes on, off, on, off in 33 to 43 seconds, go to Step 24.


21. If the RAP is not equipped with a growth unit, go to Step 23. Otherwise, turn off RAP

power and remove the basic-unit ends of the six growth cables, leaving them

hanging free. Remove the six terminator resistors from the growth unit and place

them in the positions formerly occupied by the basic-unit ends of the six growth

cables.


22. Restore RAP power and observe the LED on the CCS pack. If it goes on, off, on, off in 33 to 43 seconds, the problem is in the growth-unit backplane. Go to Step 24.

23. Leave the node quarantined, call the CTS, and end this procedure.

24. Manually diagnose the node as follows:

a. Depress the PCID DIAG switch.

b. Check that the CCS, CCC, and MASC_0 LEDs come on.

c. Check that the CCS LED goes off in 25 to 35 seconds for systems with the 2-Mbyte memory and in 35 to 45 seconds for systems with the 16-Mbyte memory.

d. Check that the LEDs on the following circuit packs all go off in the order listed within 2 minutes for systems with the 2-Mbyte memory and within 75 seconds for systems with the 16-Mbyte memory.

1. MASA_0

2. MASC_0

3. CCC

Check that the yellow fail light on the PCID has gone out.

e. If the LED on any of the four circuit packs fails to go off on time or in the indicated sequence, or if the PCID fail light fails to go off, turn off RAP power, replace the faulted pack, turn on RAP power, and repeat this step. If the repetition is unsuccessful, leave the node quarantined and call the CTS.
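The time limits in Steps 4 and 5 of this procedure depend on memory size and on the amount of equipped memory, so it may help to work them out before standing at the RAP housing with a watch. The sketch below is illustrative only. It assumes that the bracketed expressions in Step 4 are read as 18 plus a per-board factor (2 or 20) multiplied by the number of MASA boards, and it encodes the replacement order of Step 5 as a simple list. All function and variable names are hypothetical.

```python
# Per-board factor in seconds, keyed by memory size in Mbytes
# (assumed reading of the Step 4 expressions).
BLINK_SECONDS_PER_MASA = {2: 2, 16: 20}

# Step 5 limit for non-MASA LEDs to go off, in seconds.
DIAG_BASE_SECONDS = {2: 6 * 60, 16: 4 * 60}

# Step 5 allowance for each MASC memory group beyond the first (2 min 40 s).
EXTRA_GROUP_SECONDS = 2 * 60 + 40

# Step 5 replacement order when more than one LED stays on.
REPLACEMENT_ORDER = [
    "CCS",
    "memory group 0",
    "CCC",
    "additional memory groups",
    "NPI",
]


def init_blink_window(masa_boards, memory_mbytes):
    """Return (earliest, latest) seconds for the Step 4 blink sequence."""
    nominal = 18 + BLINK_SECONDS_PER_MASA[memory_mbytes] * masa_boards
    return (nominal - 2, nominal + 2)


def diag_time_limit(memory_mbytes, masc_groups=1):
    """Return the Step 5 limit in seconds for non-MASA LEDs to go off."""
    extra_groups = max(masc_groups - 1, 0)
    return DIAG_BASE_SECONDS[memory_mbytes] + extra_groups * EXTRA_GROUP_SECONDS


def first_pack_to_replace(leds_still_on):
    """Return the first entry in the Step 5 order whose LED is still on."""
    for pack in REPLACEMENT_ORDER:
        if pack in leds_still_on:
            return pack
    return None


print(init_blink_window(masa_boards=4, memory_mbytes=2))
print(diag_time_limit(memory_mbytes=16, masc_groups=3))
print(first_pack_to_replace({"NPI", "CCC"}))
```

Keeping the constants in small tables rather than in the arithmetic makes it easier to correct a value if the limits on a particular system differ from those quoted in the procedure.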

Recognizing and Finding Intermittent Faults

Faults that occur in IMS hardware may be hard, transient, or intermittent. Hard faults permanently disable a component and are easy to find. IMS automatic maintenance software dependably locates hard faults, removes them from the system, and directs technicians to repair them. One-time transient faults, if not easy to find, are easy to deal with. They are caused by temporary hardware problems or glitches in software. Usually they are corrected by the IMS practice of reinstating the ring or a component after a first failure. By contrast, intermittent or recurring transient faults are often neither easy to find nor easy to deal with. If the interval between their occurrences is fairly short and fairly regular, IMS software can


usually locate them. But if the interval between their occurrences is long or very irregular, they may escape the IMS net. In such cases, manual records kept by technicians are the indispensable tool for identifying, finding, and correcting them.

How will an intermittent fault show up? In a ring interface or IRN node processor, an intermittent fault may appear in several guises: as repeated losses of token, as successful ring restarts following instances of blockage, as a node that EAR isolates but ARR returns to service because it passes diagnostics, as a node that ARR turns over to technicians because it has violated the fourth-time rule, or as a combination of these automatic responses. It could also appear as a repeated failure of EAR recovery level 3 to find a fault that levels 1 and 2 had attempted unsuccessfully to isolate. Again, the existence and history of faults of this kind are likely to be caught only in the manual records of technicians.
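Manual records are most useful when they can be tallied by location. The following sketch shows one possible way to do so, assuming each record is kept as a (node, event) pair. The record format, the sample entries, and the cutoff of three events are all hypothetical choices made for illustration and are not prescribed by IMS.

```python
from collections import Counter


def recurring_nodes(records, min_events=3):
    """Return nodes that appear in at least min_events records."""
    counts = Counter(node for node, _event in records)
    return {node: count for node, count in counts.items() if count >= min_events}


# Hypothetical entries from a technician's log.
records = [
    ("IUN31 11", "EAR isolation, ARR restored (ATP)"),
    ("IUN31 9", "blockage, ring restart succeeded"),
    ("IUN31 11", "loss of token"),
    ("IUN31 11", "EAR isolation, ARR restored (ATP)"),
]

print(recurring_nodes(records))
```

A node that keeps reappearing in such a tally, even though each event was recovered automatically, is a candidate for the checks that apply to suspected intermittent faults.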

On nodes suspected of having intermittent faults, perform the following checks:

• Visually inspect the node and its housing. Look for poorly seated circuit packs, backplane damage or improper grounding, and poorly seated cable connections.

• Run diagnostics on the node in the repeat mode.

• Tap on the front of the circuit packs and apply pressure to the backplane with your thumb in an effort to stress cracks and stimulate an intermittent fault to recur.

• Move the circuit packs of a suspected node one by one to another location to see which hardware (if any) the intermittent failure follows. (Make sure you keep careful records of each move.)

IMS attempts to recover automatically from software faults. Thus no regular

software maintenance is required of the Craft. Intermittent faults are more likely to

be in hardware than in software. Nevertheless, when a troubled component consistently passes diagnostics, the fault could be in software.

Other Suggestions for Troubleshooting

The following are hints and advice based upon developer experience.

New Circuit Pack; Old Failure

Technicians are sometimes faced with the following anomaly. A node continues to fail diagnostics after its circuit packs have been replaced, yet no problem is visible in the backplane or ring bus wiring. Faced with this problem, technicians should consider that the fault might lie in the isolated RAC of the BISO or EISO node. An explanation follows:


Messages, including messages containing diagnostic code, are sent from the 3B21D to an isolated segment of the ring through the BISO or the EISO node. BISO and EISO nodes have one RAC participating in the active-ring segment and one RAC participating in the isolated-ring segment. Messages destined for the isolated segment are read from the active ring by the active-ring RAC, then transmitted by the node processor to the isolated-ring RAC, which writes them to the isolated segment of the ring. A fault in the isolated-ring RAC of either the BISO or the EISO node might go undetected, since it would not affect the transportation of messages on the active ring, and could show up misleadingly as a diagnostic failure in the isolated node, thereby creating the maintenance anomaly described above. Therefore, technicians who face this problem should consider extending the isolation to include the current BISO and EISO nodes and running diagnostics on them.

Unconditional Restorals

Do not unconditionally restore a node unless you are certain it is without faults. Even when you are certain, do not unconditionally restore a node that has been powered down, that contains a ring-interface circuit pack that has been reset, or that exists in isolation with a node that has had a ring-interface circuit pack reset, without first running diagnostic phases 1 and 2 on it. When a node or a circuit pack has been powered down, the status registers of its ring-interface hardware may become improperly set, and an unconditional restoral of the node will likely result in a ring transport error and an isolation. Diagnostic phases 1 and 2 reset all ring-interface status registers to their proper positions.


Be aware that some correlation exists between unexplained losses of token and the number of out-of-service nodes, because the node processors of quarantined and isolated nodes cannot fulfill their important and unassignable role in error detection and reporting.

Avoiding Trouble

Be careful not to leave the system unattended with ARR or CNR inhibited.

Recording Trouble

When troubleshooting a ring-related problem, frequently enter the OP:RING;DETD command as a way of providing, on the ROP output, sequential records of ring status. Such records may be useful during postmortems. If a problem is likely to be referred to developers at Bell Laboratories, save the current RPTERR0 and RPTERR1 log files in /etc/log.

Keep records on all circuit pack replacements and failures.


Keep records on all indications of transient and intermittent faults, identifying, if possible, the locations where they occur. Remember that a transient fault may be an intermittent fault in its infancy.

New Installations or Ring Growth

New installations may wish to utilize the manual ring mode, which is explained above. Avoid growing nodes on a live system that is experiencing unexplained transient failures. When installing a new IMS ring or growing a new node, verify that the hardware specified in the ECD UCB hv field matches the hardware that is physically present. Also execute full diagnostics (automatic and optional) on every new ring node, resolving problems until diagnostics indicate ATP. If you encounter troubles, be suspicious of cables. Look for poor or open connectors, for cables connected to the wrong place, and for improper backplane grounding.

Examples of Ring Maintenance

This chapter exemplifies some of the maintenance principles and practices that were formulated in the previous two chapters. Its purposes are to familiarize technicians with the IMS ROP output, to suggest ways for technicians to monitor and interact with automatic maintenance, and to provide technicians with realistic examples of both manual and automatic maintenance activities. Most of the examples represent common scenarios. A few are special cases. Together they compose an IMS tutorial.

Each example is preceded by an introduction. The examples themselves are composed of two elements. A literal reproduction of ROP output in the left column of the page records maintenance-related events occurring in the ring subsystem. A commentary in the right column of the page provides a gloss on the adjacent ROP output. The gloss is selective and cumulative. It usually avoids explaining features that previous entries have explained.

The examples composing this chapter incorporate two recently developed features, ring restart and automatic TLP output. Readers whose systems do not have ring restart should ignore the level-0 recovery efforts in the examples and begin with the level-1s. Readers without the TLP feature may use the DGN output messages to identify probable faulty equipment.

A convention of this chapter is that data in ROP output messages that is not ordinarily used by technicians will be omitted and replaced by rows of periods.


Responses to Single, Ring-Related Faults

The following four examples of ring recovery occur in response to single faults of

the kind that disrupt the transportation of messages on the ring.

Automatic Recovery from a Transient Fault by EAR Level 0

IMS software responds to faults that disrupt the transportation of messages on the ring with the EAR escalative recovery strategy. The first, or 0, level of this strategy consists of restarting the ring in conformity with its structure prior to the fault. Such a response will usually recover the ring subsystem from a transient fault, as it does in this example. Technicians should record the occurrence and, if possible, identify the location of transient faults.


This example occurs on the following ring:

REPT RING CFR

LEVEL 0 RING CONFIGURATION INITIATED BY EAR

NORMAL CONFIGURATION REQUESTED

0 1 4 3600000..........................................(4030614766)

Announces the onset of a level-0 recovery attempt, stimulated by EAR’s receipt of one or more error messages indicating a ring-related fault. The onset time of the attempt appears in milliseconds in parentheses on the bottom line. Other numbers on the bottom line pertain to the ring error threshold. The first digit indicates EAR’s mode, where 0 = ``threshold not exceeded'' and 1 = ``threshold exceeded.'' The second digit identifies the number of ring errors that have occurred within the current threshold interval. The third digit is the user-specified number of errors per threshold interval that causes the threshold to be exceeded. And 3600000 is the user-specified threshold interval in milliseconds. When the second number equals the third, the threshold has been exceeded.
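The bottom line of this message can be broken into its fields mechanically. The following sketch is illustrative only. It assumes the layout seen in this example (four numbers, a row of periods, and the onset time in parentheses), and the function and key names are hypothetical.

```python
import re

# Assumed layout: mode, error count, error limit, interval in ms,
# a row of periods, then the onset time in parentheses.
BOTTOM_LINE = re.compile(
    r"^\s*([01])\s+(\d+)\s+(\d+)\s+(\d+)\s*\.*\s*\((\d+)\)\s*$"
)


def parse_threshold_line(line):
    """Split the bottom line of a REPT RING CFR onset message into fields."""
    match = BOTTOM_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized bottom line: %r" % line)
    mode, errors, limit, interval_ms, onset_ms = (int(g) for g in match.groups())
    return {
        "threshold_exceeded": mode == 1,
        "errors_in_interval": errors,
        "errors_allowed": limit,
        "interval_minutes": interval_ms / 60000,
        "onset_ms": onset_ms,
    }


info = parse_threshold_line(
    "0 1 4 3600000..........................................(4030614766)"
)
print(info)
```

For the line in this example, the fields read as one error counted against a limit of four within a 60-minute interval, with the threshold not yet exceeded.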

REPT RING CFR

RING CONFIGURATION ESTABLISHED (455 ms)

NORMAL CONFIGURATION, NO NODES ISOLATED

.................................(4030614777)(4030615120)

Announces a successful restart of the ring. Thus no manual response is required. 455 ms is the duration of ring silence resulting from the configuration attempt, and in parentheses are the times when the ring configuration job started and was completed.


00AAAAAAAAAAAA.... 01................ 02................

30................ 31.AAAAAAAAAAAAAAA 32AAAAAAAAAAAA....

63.AAAAAAAAAAAAAAA

CMD FUNCTION

400 OP RING DETAILED



RAC PARITY/FORMAT ERROR DETECTED, IUN31 11

RAC 0

.......................................................................

....................................................(4030614653)

IMS in the 3B21D received this and the following two ring transport error messages (at the times in parentheses) as a result of the fault that stimulated the above recovery attempt. This message (the first to arrive) identifies the error type and the node and RAC associated with the error. Notice that ring transport error messages appear on the ROP following the messages announcing the system response to the error.


BLOCKAGE DETECTED, IUN31 9 RAC 0

.......................................................................

.....................................................(4030614663)

The fault spawned two instances of blockage, one from this, the second node upstream of the faulty node...

REPT RING TRANSPORT ERR

BLOCKAGE DETECTED, IUN31 10 RAC 0

.......................................................................

.....................................................(4030614667)

and one from this, the first node upstream of the faulty node. IUN31 9 detected blockage before IUN31 10 could drain the ring. IUN31 10 must have detected blockage prior to IUN31 9, but IUN31 9’s ring transport error report reached the 3B21D first.
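Because the ROP prints the system response before the error messages that provoked it, it can be useful to put the entries of this example back in time order. The sketch below does so with the times in parentheses copied from the messages above. The list layout and function name are hypothetical, and the resulting order is the order in which the 3B21D received or began each item, not necessarily the order in which the nodes detected the errors.

```python
# (message, time in ms) pairs copied from the ROP output in this example.
events = [
    ("LEVEL 0 RING CONFIGURATION INITIATED BY EAR", 4030614766),
    ("RAC PARITY/FORMAT ERROR DETECTED, IUN31 11 RAC 0", 4030614653),
    ("BLOCKAGE DETECTED, IUN31 9 RAC 0", 4030614663),
    ("BLOCKAGE DETECTED, IUN31 10 RAC 0", 4030614667),
]


def chronological(entries):
    """Return entries sorted by time, with offsets from the earliest one."""
    ordered = sorted(entries, key=lambda entry: entry[1])
    start = ordered[0][1]
    return [(time_ms - start, message) for message, time_ms in ordered]


for offset_ms, message in chronological(events):
    print("+%d ms  %s" % (offset_ms, message))
```

Sorted this way, the parity/format error heads the list and the level-0 request follows the last blockage report by about a tenth of a second.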


Manual Recovery from a Hard Fault

After a hard fault, EAR level-0 will ordinarily try unsuccessfully to restart the ring. Then, based upon its analysis of ring transport error messages, EAR level-1 will attempt to locate and isolate the fault. If EAR succeeds, ARR will then attempt to restore the isolated node conditionally and, if it fails, will change the node maintenance mode to manual, thereby directing technicians to perform maintenance on it. This example is composed of the scenario just described.

REPT RING CFR



.....................................................(4030772385)

Prompted by a ring transport error report,

EAR level-0 requests that the ring config

module restart the ring.

REPT RING CFR

RING CONFIGURATION ATTEMPT FAILED 17

COULD NOT ESTABLISH A NORMAL RING CONFIGURATION

.......................................................................

(4030772397)(4030772536)

The continuity test run by the ring config

module failed, an indication that the fault

is probably hard.

REPT RING CFR


ISOLATION FROM IUN31 11 TO IUN31 11 REQUESTED

0 2 4 3600000..................................(4030772561)

EAR level-1 requests that the ring config module isolate the node indicated as faulty by the ring transport error messages.

OP:RING;DETD

RING STAT: ACTIVE

00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA


REPT RING CFR

RING CONFIGURATION ESTABLISHED (658 MS)

BISO NODE = IUN31 10, EISO NODE = IUN31 12

(4030772580)(4030772942)

IUN31 11 is isolated with IUN31 10 acting

as BISO node and IUN31 12 acting as

EISO node.



RAC 0.

................................................(4030772270)


BLOCKAGE DETECTED, IUN31 10 RAC 0.

................................................(4030772278)


BLOCKAGE DETECTED, IUN31 9 RAC 0.

................................................(4030772282)

REPT ARR AUTORST

ARR COND RST FOR IUN31 11 STARTED

ARR requests that MIRA conditionally restore the isolated node. This is ARR’s check that the removal and isolation of the node was necessary. The attempt will generate diagnostic data that the technician should use if called upon to perform maintenance on the node.

RST TERM LN31 11 TASK 3 MSG STARTED

RTR message announcing that ARR’s restoral request is on the active queue and being processed.


The 1105 display page now looks as follows:

RMV IUN31 11 STOPPED 5

RTR message announcing that it could not remove IUN31 11 from service (because EAR had done so previously).

DGN IUN31 11 PH 1 STF (9 X’00000000 X’00000000)

TEST

004...........................................................

005 X’00000dfb................................................

006...........................................................

008...........................................................

009...........................................................

Indicates that during phase 1 diagnostics, some tests (nine in all) failed and none (X’00000000 X’00000000) were skipped. IUN31 11 is not necessarily the node in which phase 1 failed, but the node specified in ARR’s diagnostic request. Since phases 1 and 2 test all RACs in the isolated segment, the fault that produces a phase 1 or 2 failure may not reside in the specified node. The failure of test 005 indicates that, in this instance, low-phase ambiguity exists; in other words, that both a RAC of the isolated node and a RAC of either the EISO or BISO node are suspected of being faulty. See the ``Low-Phase Ambiguity'' section in this chapter.
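The first line of the DGN message carries the node, the phase, the result, the count of failed tests, and the two hexadecimal words that the commentary above reads as the record of skipped tests. The following sketch pulls those fields apart. It is illustrative only: the pattern is inferred from the two DGN lines shown in this example, and the function and key names are hypothetical.

```python
import re

# Pattern inferred from the DGN lines in this example. The character after
# each X is matched loosely because the quote mark varies in printed output.
DGN_LINE = re.compile(
    r"DGN\s+(IUN\d+\s+\d+)\s+PH\s+(\d+)\s+(\w+)\s+"
    r"\((\d+)\s+X.([0-9A-Fa-f]{8})\s+X.([0-9A-Fa-f]{8})\)"
)


def parse_dgn(line):
    """Extract node, phase, result, and failure count from a DGN line."""
    match = DGN_LINE.search(line)
    if match is None:
        raise ValueError("unrecognized DGN line: %r" % line)
    node, phase, result, failed, word1, word2 = match.groups()
    return {
        "node": node,
        "phase": int(phase),
        "result": result,
        "tests_failed": int(failed),
        # Nonzero words are taken to mean that some tests were skipped.
        "tests_skipped": int(word1 + word2, 16) != 0,
    }


print(parse_dgn("DGN IUN31 11 PH 1 STF (9 X’00000000 X’00000000)"))
```

Remember that the node named in the line is the node specified in the diagnostic request, which for phases 1 and 2 is not necessarily the node containing the fault.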



ARR RESORE COND IUN31 11

00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA

CMD FUNCTION



DGN IUN31 11 PH 2 STF (10 X’00000000 X’00000000)

TEST

002...........................................................

004...........................................................

005 X’00000dfb................................................

006...........................................................

007...........................................................

Phase-1 diagnostics test the isolated segment beginning at the BISO node, and phase-2 diagnostics test it beginning at the EISO node. In the case of single-node isolations, the two phases should report failure data for the same node(s), but in the case of multiple isolations they usually report failure data for different nodes.

DGN IUN31 11 terminated at ph 2 stmnt 36 after test 17

Indicates the point in the diagnostic routine at which execution terminated.

ANALY:TLPFILE: IUN31 11 SUMMARY DATA MSG

STARTED

TLP: IUN31 11 PH=1....................................................

TLP: IUN31 11 PH=2....................................................

TLPFILE COMPLETED

Summarizes diagnostic failure data.

Phases cited are those that failed; but

because phases 1 and 2 are at issue,

IUN31 11 is not necessarily the location

of the failure.

DGN IUN 31 11 COMPLETED STF (19........................)

ANALY TLPFILE IUN31 11 TLPSRCH MSG IP

TLPFILE #983090

Short form of this message. The longer

form is next.

ANALY TLPFILE IUN31 11 SUSPECT FLTY EQUIPMENT

CODE   GRP  MEM  CONT  POS  WT  NOTE

UN303  31   11   --    --   10  --

CABLE  --   --   --    --   10  3

This data is printed only after a test fails and only if the TLP option was specified in the DGN command (as it always is by ARR). The entry lists in weighted (WT) order equipment suspected of being faulty. The ``WT'' is a number between 1 and 10. The higher the WT, the greater the likelihood of the equipment being faulty. Because ARR does not specify the RAW option of the DGN command, failure data for test 010 is not given. (See the ``Low-Phase Ambiguity'' section of this chapter.)
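When a suspect list is long, it can help to read it into a structure and confirm the weighted order. The sketch below parses the two rows shown in this example. It is illustrative only: it assumes that each row has exactly the seven whitespace-separated columns named in the heading, and the function and field names are hypothetical.

```python
# Column names follow the heading CODE GRP MEM CONT POS WT NOTE.
FIELDS = ("code", "grp", "mem", "cont", "pos", "wt", "note")

# The two rows of the suspect list shown in this example.
TLP_ROWS = """\
UN303 31 11 -- -- 10 --
CABLE -- -- -- -- 10 3
"""


def parse_suspects(text):
    """Return suspect rows as dictionaries, highest weight first."""
    suspects = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the heading and any line that is not a seven-column row.
        if len(parts) != len(FIELDS) or not parts[5].isdigit():
            continue
        row = dict(zip(FIELDS, parts))
        row["wt"] = int(row["wt"])
        suspects.append(row)
    # sorted() is stable, so rows of equal weight keep their printed order.
    return sorted(suspects, key=lambda row: row["wt"], reverse=True)


for suspect in parse_suspects(TLP_ROWS):
    print(suspect["code"], suspect["wt"], suspect["note"])
```

In this example both entries carry the maximum weight of 10, so the printed order is the only guide to which item to try first.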

RST IUN31 11 STOPPED 1

Because of diagnostic failure (error code 1).

DGN IUN31 11 STF..............................................MSG

COMPL

REPT ARR AUTORST

ARR COND RST FOR IUN 31 11 FAILED

Confirms that ARR’s restoral request has failed. Many IMS processes write to the ROP, at times resulting in some redundancy.


OP:RING;DETD Manual input message.

RING STAT: ISOLATED SEGMENT

BISO: IUN31 10 EISO: IUN31 12

The subnumber 4 under the i in the above output message indicates that the ring interface of IUN31 11 is faulty. The numbers used in this way have the following meanings:

1 = manual mode

2 = RI QUSBL or NP faulty or untested

3 = combination of 1 and 2

4 = RI faulty or untested

5 = combination of 1 and 4

6 = combination of 2 and 4



7 = combination of 1, 2, and 4
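The meanings listed above combine like bit flags: 3 is 1 plus 2, and 7 is 1 plus 2 plus 4. The following sketch decodes a subnumber on that basis. It is illustrative only; treating the values as a bit mask is an inference from the listed combinations, and the function name is hypothetical.

```python
# Base meanings as listed above; other values are treated as sums of these.
BASE_MEANINGS = {
    1: "manual mode",
    2: "RI QUSBL or NP faulty or untested",
    4: "RI faulty or untested",
}


def decode_subnumber(value):
    """Return the base meanings that make up a subnumber from 1 to 7."""
    if not 1 <= value <= 7:
        raise ValueError("subnumber must be between 1 and 7")
    return [text for bit, text in BASE_MEANINGS.items() if value & bit]


print(decode_subnumber(4))
print(decode_subnumber(7))
```

Read this way, the subnumber 4 in this example reports a faulty or untested ring interface without the manual-mode condition.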

OP:RING, IUN31 11 Manual input message.

OP:RING IUN31 11 COMPL

IUN31 11: MJ = OOS; NM = MAN; RI = FLTY; NP = USBL

IN ISOL SEG

Like the TLP and OP:RING;DETD outputs above, this data does not reflect the low-phase ambiguity.

Following the procedures ``Responding to Single Node Isolations'' and ``Clearing Faults in Response to ARR Actions,'' a technician replaces circuit pack UN303 in IUN31 11...

RST:IUN31 11

and conditionally restores the node.

00AAAAAAAAAAAA.... 01................ 02................

30................ 31.AAAAAAAAAAiAAAA 32AAAAAAAAAAAA.... 4

63.AAAAAAAAAAAAAAA


Automatic Recovery from a Transient Fault by ARR

In this example a fault triggers a level-0 recovery attempt that fails; EAR level 1 then isolates the apparently faulty node; and ARR's attempt to restore the node succeeds. Though the fault triggers two levels of EAR responses, no manual action is required other than to record the occurrence and location of the problem as a probable transient fault.


RST IUN31 11 TASK 4 MSG STARTED

RMV IUN31 11 STOPPED 5

DGN IUN31 11 COMPLETED ATP MESSAGE IN

PROGRESS

Repaired IUN31 11 now passes diagnos-

tics.

REPT RING CFR


NORMAL CONFIGURATION, NO NODES ISOLATED

The isolation is dissolved automatically

as IUN31 11 is restored.

(4031118365)(40311118740)

RST IUN31 11 COMPLETED IUN31 11 has been returned to the active

ring, pumped with operational code and

placed in execution.

DGN IUN31 11 ATP MESSAGE COMPLETE

OP:RING;DETD

RING STAT: ACTIVE

00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA


REPT RING CFR


NORMAL CONFIGURATION REQUESTED.

0 3 4 3600000................(4031349825)

REPT RING CFR


COULD NOT ESTABLISH A NORMAL RING CONFIGURATION

.....................................................

(4031349837)(4031350005)

REPT RING CFR

LEVEL 1 RING CONFIGURATION INITIATED BY EAR

ISOLATION FROM IUN31 11 TO IUN31 11 REQUESTED.

0 3 4 3600000.................(4031350030)

REPT RING CFR

RING CONFIGURATION ESTABLISHED (695 ms)


(4031350049)(4031350422)


RAC PARITY/FORMAT ERROR DETECTED. IUN31 11 RAC 0.

........................................(4031349712)



........................................(4031349722)


BLOCKAGE DETECTED, IUN31 10 RAC 0.

........................................(4031349727)



00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA

CMD FUNCTION




OP:RING;DETD

DGN IUN31 11 COMPLETED ATP MESSAGE IN PROGRESS

REPT RING CFR



(4031519404)(4031519780)

RST IUN31 11 COMPLETED

DGN IUN31 11 ATP MESSAGE COMPLETE

REPT ARR AUTORST

ARR COND RST FOR IUN31 11 SUCCEEDED

OP:RING;DETD

00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA

RING STAT: ACTIVE

00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA


Manual Recovery from a Hard Fault on a Small Ring

Small rings with padded interframe buffers are subject to ring fragmentation, a condition that causes the ring to go down. Ring fragmentation will occur when an isolation that includes padded buffers shortens an active ring below its minimum data length. Padded buffers are employed redundantly in pairs at opposite sides of the ring. Thus a single-node isolation on a small ring will never include both pairs, while in many cases a two-node isolation will. Nevertheless, a single-node isolation on small rings can pose problems because of the common need, arising from low-phase ambiguity, to extend isolations to include the BISO or EISO node. (For a discussion of this issue, see the section ``Low-Phase Ambiguity'' in this chapter.) Isolations on small rings often include one pair of padded buffers, and extending the isolation would often include the other pair as well. The conditions that give rise to this problem are illustrated in the following two figures.
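The fragmentation condition described above can be pictured as a simple length check. The sketch below is a toy model only: the node names, the per-node data lengths, and the minimum length are hypothetical values chosen for illustration, not figures taken from IMS.

```python
# Toy model (all numbers hypothetical): each node contributes some data
# length to the ring; nodes carrying a padded buffer pair contribute more.
def active_ring_length(capacity: dict[str, int], isolated: set[str]) -> int:
    """Sum the data length of every node that remains in the active ring."""
    return sum(length for node, length in capacity.items() if node not in isolated)


def would_fragment(capacity: dict[str, int], isolated: set[str], minimum: int) -> bool:
    """True if the isolation shortens the active ring below its minimum length."""
    return active_ring_length(capacity, isolated) < minimum


if __name__ == "__main__":
    # node_a and node_c stand for the two nodes that carry padded buffer pairs.
    ring = {"node_a": 8, "node_b": 2, "node_c": 8, "node_d": 2}
    print(would_fragment(ring, {"node_a"}, minimum=10))            # one pair isolated
    print(would_fragment(ring, {"node_a", "node_c"}, minimum=10))  # both pairs isolated
```

In this model a single-node isolation leaves enough length in the active ring, while an isolation that takes in both padded nodes does not, which mirrors the situation the technician must avoid when extending an isolation.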

Figure 3-4. Manual Recovery - Method One

[Figure labels: Padded Interframe Buffers; Isolated Ring; Active Ring; BISO Node; EISO Node; Isolated Node. Nodes shown: IUN32 1, RPCN32 0, RPCN00 0, each with RAC 0 and RAC 1.]


The following example occurs on the four-node ring just illustrated:

REPT RING CFR



0 1 4 3600000.............................(242674464)

REPT RING CFR



URATION

.......................................................................

(242674474)(242674649)

REPT RING CFR


ISOLATION FROM RPCN32 0 TO RPCN32 0

REQUESTED

0 1 3 3600000.............................(242674676)

REPT RING CFR



(242674689)(242674963)



RAC 0.

......................................................................

............................................(242674346)

In this instance EAR did not receive or did not report blockage.

REPT ARR AUTORST

ARR COND RST FOR RPCN32 0 STARTED

RMV RPCN32 0 STOPPED 5


DGN RPCN32 0 PH 1 STF (11 X’00000000

X’00000000)

TEST..................................................................

002...................................................................

004...................................................................

005 (X’00000e00)......................................................

006...................................................................

007...................................................................

The failure of test 5 means that low-

phase ambiguity exists in this case; in

other words, the IMS regards either RAC

1 in the BISO node or RAC 0 in the

EISO node, or both, as possibly faulty.

DGN RPCN32 0 PH 2 STF (11 X’00000000

X’00000000)

TEST..................................................................

002...................................................................

004...................................................................

005 (X’00000e00).........................................................

006...................................................................

007...................................................................

RPCN32 0 TERMINATED AT PH 27 STMNT 15 AFTER

TEST 8

ANALY:TLPFILE: RPCN32 0 SUMMARY DATA

TLP: RPCN32 0 PH=1....................................................

TLP: RPCN32 0 PH=2....................................................

TLPFILE COMPLETED

DGN RPCN32 0 COMPLETED STF (21 X’00000000 X’00000000)

ANALY TLPFILE RPCN32 0 TLPSRCH

TLPFILE #917573

ANALY TLPFILE RPCN32 0 SUSPECT FLTY EQUIP-

MENT


UN122C 32 0 -- -- 10 --

UN123B 32 0 -- -- 10 --

CABLE -- -- -- -- 10 3

The extended TLP output message does not identify equipment in the BISO or EISO node as faulty, because the ring interfaces of these nodes are necessarily classified as usable.

RST RPCN32 0 STOPPED 1

DGN RPCN32 0 STF (21 X’00000000 X’00000000)


REPT ARR AUTORST

ARR COND RST FOR RPCN32 0 FAILED

Failure of the ARR restoral attempt

results in the maintenance mode of the

node being changed to manual.

OP:RING;DETD

The isolation in this small ring during a time of heavy traffic creates an emergency condition. Following the procedures for ``Clearing Faults in Response to ARR Actions'' and ``Responding to Single-Node Isolations,'' the technician elects to change both UN122C and UN123B in RPCN32 0 but does not troubleshoot the cable. It is possible, of course, that the fault is in the cable, but because this situation involves low-phase ambiguity, it is far more likely that the fault, if it is not in the circuit packs of RPCN32 0, is in the isolated RAC of either the EISO or BISO node.

DGN RPCN32 0;RAW! Then, this being a phase 1 and 2 failure, the technician diagnoses the node using the RAW option so that if phase 1 or 2 still fails, an indication will be given as to whether the isolated RAC of the BISO or EISO node is suspected of being faulty. Of course, the problem could be in the cable of RPCN32 0.

DGN RPCN32 0 TASK 5 MSG STARTED



00AA.............. 01................ 02................

30................ 31................ 32iA..............

5


RMV RPCN32 0 STOPPED 5

DGN RPCN32 0 PH 1 STF (11 X’00000000 X’00000000)

TEST MISMATCH........................

002...................................................................

004...................................................................

005 X’00000e00......................................................

006...................................................................

007...................................................................

008...................................................................

009...................................................................

010 X’00000e01......................................................

011...................................................................

016...................................................................

017...................................................................

The mismatch data for failing test 10 identifies both IUN32 1 and IUN00 1 as suspect nodes. (Hexadecimal e01 is translated by the ``Physical Node Address Hexadecimal Representation'' table in the reference chapter of this document as node 32 1 and hexadecimal c01 is translated as node 00 1.) In this situation, the standard procedure calls for technicians to extend the isolation to include IUN32 1 or IUN00 1 to perform maintenance on it. Extending the isolation to include IUN32 1 would in this instance, however, bring the ring down, because it would result in the isolation of both pairs of padded interframe buffers.
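The hexadecimal values quoted in these examples (c01 for node 00 1, e01 for node 32 1, and, in the multiple-fault example, dfb alongside IUN31 11 and e05/e06 alongside IUN32 5 and IUN32 6) all fit a single arithmetic pattern: a base of X'c00' plus 16 times the group number plus the member number. The Python sketch below applies that inferred pattern. It is an assumption drawn only from these sample values; the ``Physical Node Address Hexadecimal Representation'' table remains the authoritative source, and the names used here are illustrative.

```python
# Assumption inferred from the sample values in this chapter:
#   address = 0xC00 + 16 * group + member
BASE_ADDRESS = 0xC00
MEMBERS_PER_GROUP = 16  # the ring display shows 16 positions per group


def decode_node_address(raw: int) -> tuple[int, int]:
    """Split a mismatch-data word into (group, member) under the inferred pattern."""
    offset = (raw & 0xFFFF) - BASE_ADDRESS
    if offset < 0:
        raise ValueError(f"address below assumed base: {raw:#x}")
    return divmod(offset, MEMBERS_PER_GROUP)


if __name__ == "__main__":
    for word in (0x00000E01, 0x00000C01, 0x00000DFB):
        group, member = decode_node_address(word)
        print(f"X'{word:08x}' -> node {group:02d} {member}")
```

A value that does not decode to a node present on the ring would suggest that the inferred pattern does not apply, in which case the reference table should be consulted directly.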

DGN RPCN32 0 PH 2 STF (10 X’00000000 X’00000000)

TEST MISMATCH

002...................................................................

004...................................................................

005 X’00000e00............................

006...................................................................

007...................................................................

008...................................................................

009...................................................................

010 X’00000c01............................

011...................................................................

016...................................................................

017...................................................................

(See the illustration of the ring that appears at the beginning of this section.) Therefore, the first action (which to conserve space is not shown here) was to extend the isolation to include IUN00 1 and to perform maintenance on it. This action, however, did not find a fault in IUN00 1, and so the isolation was reduced to include once again only RPCN32 0, and the MOVFLT option of the CFR command was employed to shift the isolation from RPCN32 0 to IUN32 1 as played out below.

DGN RPCN32 0 PH 10 ATP....................

DGN RPCN32 0 PH 11 ATP.....................

DGN RPCN32 0 PH 12 ATP.....................

DGN RPCN32 0 PH 13 ATP.....................

DGN RPCN32 0 PH 20 ATP.....................


DGN RPCN32 0 PH 23 ATP.....................

DGN RPCN32 0 PH 24 ATP.....................

DGN RPCN32 0 PH 26 ATP.....................

DGN RPCN32 0 PH 27 ATP.....................

Unneeded output generated by the DGN RAW option could have been stopped by terminating DGN with the STOP:DMQ command.

DGN RPCN32 0 TERMINATED AT PH 27

STMNT 15 AFTER TEST 3

DGN RPCN32 0 STF (21 X’00000000 X’00000000).........

RMV:LN32 1 In preparation for entering the CFR command, the node specified in the command must be removed from service.

RMV IUN32 1 TASK 0

RMV IUN32 1 COMPLETED

OP:RING;DETD

REPT RING CFR

WARNING: BISO AND/OR EISO NODE OOS

BISO NODE = IUN00 1, EISO NODE = IUN32 1

ACTIVE RING SEGMENT NOT LONG ENOUGH

Removing a BISO or EISO node from service would ordinarily cause the isolation to extend to include the out-of-service node. In this case it does not, however, because IMS calculates that doing so would shorten the ring below its minimum data length.

RING STAT: RESTORING


00AA.............. 01................ 02................

30................ 31................ 32iO..............

51


Responses to Multiple, Ring-Related Faults

The following two examples of ring-recovery actions occur in response to multiple

faults of the kind that disrupt the transportation of messages on the ring.

Manual Recovery from Multiple Hard Faults

Multiple faults have the potential of creating massive isolations. Because they usually develop as extensions of single faults, they are best avoided by prompt and effective attention to single faults. The history of the following massive isolation is typical. In the first stage, a single node is isolated, diagnosed at the

CFR:RING,IUN32 1;MOVFLT! With the suspect IUN32 1 quarantined

out-of-service, the technician enters the

MOVFLT version of the CFR command

to shift the isolation to include IUN32 1.

REPT RING CFR


BISO NODE = RPCN32 0, EISO NODE = RPCN00 0

(243506608) (243506934)

REPT ARR AUTORST

CNR UCL REST FOR RPCN32 0 STARTED

ARR undertakes its highest-priority task,

the restoral of a node designated as a

BISO or EISO node.

CFR RING IUN32 1 COMPL The isolation shifted, the ring now has

the structure of the second illustration at

the beginning of this section, and the

probable fault in IUN32 1 may now be

corrected.


BISO: RPCN32 0 EISO: RPCN00 0

00AA.............. 01................ 02................

30................ 31................ 32Ai..............

5


request of ARR as RI faulty, and its maintenance mode changed to manual. Then, before the technician can repair and return it to service, another ring-related fault occurs on a distant part of the ring, with the result that the many nodes lying between the two faulty nodes must be removed from service as victims of the expanded isolation.

The first stage of this example is identical to the example recorded above in ``Manual Recovery from a Hard Fault,'' except that the massive isolation intervenes before the first fault can be repaired.


REPT RING CFR



.....................................................(4030772385)

Prompted by a ring transport error

report, EAR level-0 requests that the ring

config module restart the ring.

REPT RING CFR



URATION

.......................................................................(4030772397)(4030772536)

The continuity test run by the ring config

module failed, an indication that the fault

is probably hard.


00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA

CMD FUNCTION



The 1105 display page now looks as follows:

RMV IUN31 11 STOPPED 5 RTR message announcing that it could

not remove IUN31 11 from service

(because EAR had done so previously).

DGN IUN31 11 PH 1 STF (9 X’00000000 X’00000000)

TEST

004...........................................................

005 X’00000dfb................................................

006...........................................................

008...........................................................

009...........................................................

Indicates that during phase 1 diagnostics, some tests (nine in all) failed and none (X’00000000 X’00000000) were skipped. IUN31 11 is not necessarily the node in which phase 1 failed, but the node specified in ARR’s diagnostic request. Since phases 1 and 2 test all RACs in the isolated segment, the fault that produces a phase 1 or 2 failure may not reside in the specified node. The failure of test 005 indicates that, in this instance, low-phase ambiguity exists; in other words, that both a RAC of the isolated node and a RAC of either the EISO or BISO node are suspected of being faulty. See the ``Low-Phase Ambiguity'' section in this chapter.
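When many such phase headers must be reviewed, the fields described above can be pulled out mechanically. The following Python sketch parses a phase header of the form shown in this example; the regular expression and the field names are illustrative assumptions based only on the sample lines in this chapter, not a documented message grammar.

```python
import re

# Pattern modelled on sample lines such as
#   DGN IUN31 11 PH 1 STF (9 X'00000000 X'00000000)
# The quote after X may appear as ', ’ or ` in printed output.
PHASE_HEADER = re.compile(
    r"DGN\s+(?P<node>[A-Z]+\d+\s+\d+)\s+PH\s+(?P<phase>\d+)\s+STF\s*"
    r"\((?P<failed>\d+)\s*X['’`](?P<w1>[0-9A-Fa-f]{8})\s*X['’`](?P<w2>[0-9A-Fa-f]{8})\)"
)


def parse_phase_header(line: str) -> dict:
    """Extract node, phase, failed-test count, and whether any tests were skipped."""
    match = PHASE_HEADER.search(line)
    if match is None:
        raise ValueError("not a recognised DGN phase STF header")
    skipped_words = (int(match["w1"], 16), int(match["w2"], 16))
    return {
        "node": match["node"],
        "phase": int(match["phase"]),
        "failed_tests": int(match["failed"]),
        # Two all-zero words are read here as "no tests skipped".
        "any_skipped": any(skipped_words),
    }


if __name__ == "__main__":
    print(parse_phase_header("DGN IUN31 11 PH 1 STF (9 X’00000000 X’00000000)"))
```

The node returned is the node named in the diagnostic request, which, as noted above, is not necessarily the node in which the phase failed.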



ARR RESTORE COND IUN31 11

00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA

CMD FUNCTION



DGN IUN31 11 PH 2 STF (10 X’00000000

X’00000000)

TEST

002...........................................................

004...........................................................

005 X’00000dfb................................................

006...........................................................

007...........................................................

DGN IUN31 11 terminated at ph 2 stmnt 36 after test

17

Phase-1 diagnostics test the isolated segment beginning at the BISO node, and phase-2 diagnostics test it beginning at the EISO node. In the case of single-node isolations, the two phases should report failure data for the same node(s), but in the case of multiple-node isolations they usually report failure data for different nodes.

Indicates the point in the diagnostic routine at which execution terminated.

ANALY:TLPFILE: IUN31 11 SUMMARY DATA MSG

STARTED

TLP: IUN31 11 PH=1....................................................

TLP: IUN31 11 PH=2....................................................

TLPFILE COMPLETED

DGN IUN 31 11 COMPLETED STF

(19...................................)

Summarizes diagnostic failure data. Phases cited are those that failed; but because phases 1 and 2 are at issue, IUN31 11 is not necessarily the location of the failure.

ANALY TLPFILE IUN31 11 TLPSRCH MSG IP

TLPFILE #983090

Short form of this message. The longer

form is next.


MENT


UN303 31 11 -- -- 10 --

CABLE -- -- -- -- 10 3

This data is printed only after a test fails and only if the TLP option was specified in the DGN command (as it always is by ARR). The entry lists equipment suspected of being faulty in weighted (WT) order. The WT is a number between 1 and 10; the higher the WT, the greater the likelihood of the equipment being faulty. Because ARR does not specify the RAW option of the DGN command, failure data for test 010 is not given. (See the ``Low-Phase Ambiguity'' section of this chapter.)

RST IUN31 11 STOPPED 1 Because of diagnostic failure (error code

1).

DGN IUN31 11 STF..............................................MSG COMPL


REPT ARR AUTORST

ARR COND RST FOR IUN 31 11 FAILED

Confirms that ARR’s restoral request has failed. Many IMS processes write to the ROP, at times resulting in some redundancy.

OP:RING;DETD Manual input message.



OP:RING, IUN31 11 Manual input message.


IUN31 11: MJ = OOS; NM = MAN; RI = FLTY ; NP =

USBL

IN ISOL SEG

Like the TLP output above, this data

does not reflect the low-phase ambiguity.

REPT RING CFR


ISOLATION FROM IUN31 11 TO IUN31 11

REQUESTED.

0 1 4 3600000................(403082426)

Before the technician can respond to the

single isolation, another fault occurs.

EAR level-0 attempts to restart the ring

in conformity with its isolated structure

prior to the occurrence of the second

fault.

00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA


REPT RING CFR


COULD NOT ESTABLISH BISO NODE = IUN31 10,

EISO NODE = IUN31 12

......................................................................

(403082441)(403082625)

Ring config’s continuity test failed...

REPT RING CFR



REQUESTED.

0 2 4 3600000.................(403082654)

so the isolation must be extended to

include both nodes suspected of having

faulty ring interfaces.


RMV RPCN 32 0 RQSTD; RPC ISOLATION RPTD

...................................(403082796)

This message notifies the technician that an innocent-victim RPCN is being included in the extended isolation.

REPT RING CFR



(403082671)(403082031)

The multiple-node isolation is now established.



RAC 0.

........................................(403082306)



......................................................................

........................................(403082316)



......................................................................

........................................(403082322)

REPT ARR AUTORST


Having failed previously (during the single-isolation stage) to restore IUN31 11, ARR now selects IUN32 6 for a conditional restoral attempt.




DGN IUN32 6 PH 1 STF (9 X’00000000 X`00000000)

TEST....................................................................

004.....................................................................

005 X’00000dfb.........................................................

006.....................................................................

008.....................................................................

009.....................................................................

Phase-1 diagnostic tests begin running

from the BISO node. Therefore, they

identify IUN31 11 as faulty.

DGN IUN32 6 PH 2 STF (11 X’00000000

X`00000000)

TEST....................................................................

002.....................................................................

004.....................................................................

005 X’00000e06.........................................................

006.....................................................................

007.....................................................................

Phase-2 diagnostic tests begin running from the EISO node. Therefore, they identify IUN32 6 (e06) as faulty. The failure of test 005 of phase 2 indicates that low-phase ambiguity exists surrounding IUN32 6. Probably, though not certainly, IUN32 5, whose ring interface is suspected to be faulty, is the node involved in this instance of low-phase ambiguity.

DGN IUN32 6 TERMINATED AT PH 2 STMNT 36

AFTER TEST 17

ANALY:TLPFILE: IUN32 6 SUMMARY DATA

TLP: IUN32 6 PH=1........................................................

TLP: IUN32 6 PH=2........................................................

TLPFILE COMPLETED

DGN IUN32 6 COMPLETED STF (20..................)

ANALY TLPFILE IUN 32 6 TLPSRCH

TLPFILE # 1179716


MENT


UN303 31 12 -- -- 10 --

UN303 31 11 -- -- 10 --

CABLE -- -- -- -- 10 3

Contrast this output with the TLP output when IUN31 11 was singly isolated. Both then and now the ring interface of IUN31 12 was suspect. The difference is that when the suspect RAC of IUN31 12 was part of an EISO node, its ring interface could not be set to FLTY. IUN32 6 is not included because the TLP output reflects only the first failing phase.

RST IUN32 6 STOPPED 1

DGN IUN32 6 STF (20 X`00000000 X`00000000)


REPT ARR AUTORST

ARR COND RST FOR IUN32 6 FAILED

OP:RING;DETD

Notice that the subnumbers produced by the OP:RING;DETD command indicate that, as a result of low-phase ambiguity, four nodes are suspected of having faults in their ring interfaces. Because none of the four is now in the active ring as an EISO or BISO node, each can have its ring interface minor state marked FLTY.

DGN:IUN31 11;RAW! In accordance with the procedures ``Responding to Multiple-Node Isolations'' and ``Clearing Faults in Response to ARR Actions,'' a technician replaces circuit pack UN303 in IUN 31 11 and submits the node to automatic diagnostics with the RAW option.

DGN IUN31 11 TASK 8 MSG STARTED



00AAAAAAAAAAAA.... 01................ 02................

30................ 31.AAAAAAAAAAiiiii 32iiiiiiiAAAAA....

54 45

63.AAAAAAAAAAAAAAA



DGN IUN31 11 PH 1 STF (10 X’00000000 X’00000000)

TEST....................................................................

004.....................................................................

005 X’00000e05...........................................

006.....................................................................

007.....................................................................

008.....................................................................

009.....................................................................

010 X’00000e06........................................................

011.....................................................................

016.....................................................................

017.....................................................................

This output from the manual diagnostic

request with the RAW option shows

IUN32 5 and IUN32 6 as suspected of

having faulty ring interfaces, implying

that IUN31 11 and IUN31 12 have

passed phase 1, a condition that should

cause their ring interface states to

change to QUSBL.

REPT ARR AUTORST


Having failed to restore IUN31 11 and

IUN32 6, ARR now attempts to restore

IUN31 12. This automatic action occurs

at nearly the same time as the manual

diagnostic procedure.

RST IUN31 12 QUEUED TASK 0

DGN IUN31 11 PH 2 STF (11 X’00000000

X’00000000)

TEST....................................................................

002.....................................................................

004.....................................................................

005 X’00000e06..........................................................

006.....................................................................

007.....................................................................

008.....................................................................

009.....................................................................

010 X’00000e05........................................................

011.....................................................................

016.....................................................................

017.....................................................................

DGN IUN31 11 TERMINATED AT PH 2

STMNT 36 AFTER TEST 17


DGN IUN31 11 COMPLETED STF (21...........)

RST LN31 12 TASK 9 ARR restoral request on IUN31 12

started.


DGN IUN31 12 PH 1 STF (10 X’00000000 X’00000000)

TEST....................................................................

004.....................................................................

005 X’00000e05.........................................................

006.....................................................................

007.....................................................................

008.....................................................................

This is output from ARR’s restoral

request.

DGN IUN31 12 PH 2 STF (11 X’00000000 X’00000000)

TEST..................................................................

004.....................................................................

005 X’00000e06.........................................................

006.....................................................................

007.....................................................................

008.....................................................................


AFTER TEST 17


TLP: IUN31 12 PH=1......................................................

TLP: IUN31 12 PH=2......................................................


MENT


UN303 32 6 -- -- 10 --

UN303 32 5 -- -- 10 --

CABLE -- -- -- -- 10 3

Only the extended TLP message explicitly identifies the node(s) within the isolation that may have failed diagnostic phases 1 and 2.


REPT RING CFR



(403041870)(403042272)

This action was triggered by the automatic RST command, which concludes with a request that as much as possible of an isolated segment be included in the active ring. The isolated segment is now reduced to the two nodes whose ring interfaces are still suspected of being faulty.

DGN IUN 31 12 STF...................................................

REPT ARR AUTORST


REPT ARR AUTORST

CNR UCL RST FOR IUN32 4 STARTED

The new BISO node, having been an innocent victim of the isolation, was out of service. Restoring a BISO or EISO node is the highest priority of ARR.

REPT ARR AUTORST

CNR UCL RST FOR IUN32 4 SUCCEEDED


REPT ARR AUTORST


Having previously attempted and failed

to restore IUN32 6, ARR now attempts

to restore IUN32 5. Consult the section

``Restoral Priorities Rule” in this chapter

for an explanation of ARR’s behavior in

the remainder of this example.



DGN IUN32 5 PH 1 STF (10 X’00000000 X’00000000)

TEST....................................................................

004.....................................................................

005 X’00000e05.........................................................

006.....................................................................

007.....................................................................

008.....................................................................

This is output from ARR’s restoral

request for IUN32 5.


DGN IUN32 5 PH 2 STF (11 X’00000000 X’00000000)

TEST..................................................................

004.....................................................................

005 X’00000e06.........................................................

006.....................................................................

007.....................................................................

008.....................................................................


AFTER TEST 17


TLP: IUN32 5 PH=1........................................................

TLP: IUN32 5 PH=2........................................................

ANALY TLPFILE IUN31 12 / SUSPECT FLTY EQUIP-MENT


UN303 32 6 -- -- 10 --

UN303 32 5 -- -- 10 --

CABLE -- -- -- -- 10 3

RST IUN32 5 STOPPED 10

DGN IUN32 5 STOPPED COMPLETED

REPT ARR AUTORST


REPT ARR AUTORST

ARR UCL RST FOR RPCN32 0 STARTED

Having attempted to restore all nodes

whose ring interfaces are possibly faulty,

ARR now unconditionally restores the

innocent victim RPCN...

RST RPC32 0 COMPLETED

REPT ARR AUTORST

ARR UCL RST FOR IUN31 13 STARTED

and then the innocent victim IUNs. (The

ROP output concerning restoral of the

innocent victim IUNs is omitted from this

example.)
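The order in which ARR works through the out-of-service nodes in this example can be summarized as a simple priority sort. The Python sketch below encodes only the order observed in this walkthrough (BISO or EISO node first, then nodes with suspect ring interfaces, then the innocent-victim RPCN, then the innocent-victim IUNs). The role names and the data structure are illustrative assumptions; the ``Restoral Priorities Rule'' section remains the authoritative statement of the rule.

```python
from dataclasses import dataclass

# Assumed priority order, taken from the sequence observed in this example.
ROLE_PRIORITY = {
    "biso_or_eiso": 0,   # unconditional restoral, highest priority
    "suspect_ri": 1,     # conditional restoral attempts
    "victim_rpcn": 2,    # unconditional restoral
    "victim_iun": 3,     # unconditional restoral
}


@dataclass(frozen=True)
class PendingNode:
    name: str
    role: str


def restoral_order(pending: list[PendingNode]) -> list[str]:
    """Return node names in the order this sketch expects ARR to attempt them."""
    ranked = sorted(pending, key=lambda node: ROLE_PRIORITY[node.role])
    return [node.name for node in ranked]


if __name__ == "__main__":
    queue = [
        PendingNode("IUN31 13", "victim_iun"),
        PendingNode("RPCN32 0", "victim_rpcn"),
        PendingNode("IUN32 5", "suspect_ri"),
        PendingNode("IUN32 4", "biso_or_eiso"),
    ]
    print(restoral_order(queue))
```

Because Python's sort is stable, nodes that share a role keep the order in which they were queued.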

REPT ARR AUTORST

ARR UCL RST FOR IUN31 13 SUCCEEDED


OP:RING;DETD


OP:RING, IUN31 11


IUN31 11: MJ = OOS; NM = MAN; RI = QUSBL; NP =

USBL

IN ACT RING

OP:RING, IUN31 12


IUN31 12: MJ = OOS; NM = MAN; RI = QUSBL; NP =

USBL

IN ACT RING

Notice that IUN31 11 and IUN31 12 are

now quarantined and in the manual

mode. They are in the manual mode

because ARR previously failed to restore

them. They are quarantined—classified

as QUSBL—because no diagnostic

phases higher than 2 have been run on

them and, therefore, IMS cannot know

that their ring-interface hardware (except

for the hardware tested by phases 1 and

2—that is, the hardware that propagates

messages on the ring) is usable.



00AAAAAAAAAAAA.... 01................ 02................

30................ 31.AAAAAAAAAAOOAAA 32AAAAAiiAAAAA....

33 55

63.AAAAAAAAAAAAAAA
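A captured ring display such as the one above can be scanned for nodes in a given state. The Python sketch below assumes that each display token is a two-digit group number followed by 16 member positions, and that the characters mean A = active, i = isolated, O = out of service, and . = no node equipped. These meanings are inferred from the examples in this chapter rather than quoted from a display definition, and the function names are illustrative.

```python
import re

# One display token: two-digit group number plus 16 member positions.
GROUP_TOKEN = re.compile(r"(\d{2})([AiO.]{16})")


def parse_ring_display(text: str) -> dict[tuple[int, int], str]:
    """Map (group, member) to its state character; unequipped positions are dropped."""
    states = {}
    for token in text.split():
        match = GROUP_TOKEN.fullmatch(token)
        if match is None:
            continue  # subnumber rows such as "33 55" are ignored here
        group = int(match.group(1))
        for member, state in enumerate(match.group(2)):
            if state != ".":
                states[(group, member)] = state
    return states


def nodes_in_state(states: dict[tuple[int, int], str], state: str) -> list[tuple[int, int]]:
    """List the (group, member) pairs currently shown in the given state."""
    return sorted(node for node, value in states.items() if value == state)


if __name__ == "__main__":
    display = """
    00AAAAAAAAAAAA.... 01................ 02................
    30................ 31.AAAAAAAAAAOOAAA 32AAAAAiiAAAAA....
    33 55
    63.AAAAAAAAAAAAAAA
    """
    ring = parse_ring_display(display)
    print("isolated:", nodes_in_state(ring, "i"))
    print("out of service:", nodes_in_state(ring, "O"))
```

The sketch ignores the subnumber row because the column alignment that ties each subnumber to a node is not preserved in plain text.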


RST:IUN32 6:TLP Following standard procedures, the technician now assigns priority to performing maintenance on the remaining isolated segment. Choosing IUN32 6 because it was an external isolated node in the massive isolation, the technician changes the circuit pack indicated in the original TLP message and then conditionally restores the node to service. (Although manual restoral requests take priority over automatically requested conditional restorals, the former can occur in parallel with automatically requested unconditional restorals, such as are occurring. Therefore, the technician felt free to conditionally restore IUN32 6. If a conflict had existed, allowing the rapid recovery of the many innocent victim nodes to proceed without interruption would usually make sense. The decision to conditionally restore IUN32 6 rather than to follow the somewhat slower procedure of running diagnostics on it with the RAW option was dictated by the high probability that IUN32 5 is the other node involved in this instance of low-phase ambiguity.)

REPT ARR AUTORST


RST:IUN31 11 TASK 1

REPT ARR AUTORST



REPT ARR AUTORST



REPT ARR AUTORST


DGN IUN31 11 COMPL CATP (X’00000000 X’40000000)

See the OM under DGN IUN, Bit 30, which indicates that not all phases ran because the node under test was not the only isolated node.


RST IUN31 15 COMPLETED ROP output concerning ARR’s unconditional restorals of the remaining innocent victims is omitted from this example.



DGN IUN32 6 COMPL CATP (X’00000000 X’40000000)

REPT RING CFR



(403431319)(403431699)

That IMS is dissolving the remaining isolation, returning the ring subsystem to a two-ring structure, indicates the fault was located in IUN32 6.


OP:RING;DETD!

OP RING COMP

RING STAT: ACTIVE



00AAAAAAAAAAAA.... 01................ 02................

30................ 31.AAAAAAAAAAOOAAA 32AAAAAOAAAAAA....

33 3

63.AAAAAAAAAAAAAAA


Automatic Recovery from Two Intermittent Faults

In the following example of ring maintenance, two staggered intermittent faults occur at intervals that frustrate successive EAR recovery attempts by repeatedly violating the 5-second confidence intervals. In this manner the faults drive EAR to level 4 before it can establish a stable, usable ring. The sequence of automatic actions culminates in a restored system, so the technician need only record the occurrences and locations of the two intermittent faults.

This episode occurs in the following ring:

RST:IUN31 12!

Now the only task remaining for the technician is to conditionally restore the remaining out-of-service nodes, none of which will be handled by ARR, since they are all in the manual mode. Probably none of the out-of-service nodes will contain faults, since one has had its ring-interface circuit pack replaced and the other two were designated as possibly faulty as a result of low-phase ambiguity. Nevertheless, the technician restores them conditionally to be certain that a fault undetected in one of them does not lead to another massive isolation. If, while diagnostics are run on these nodes, a fault were to appear elsewhere in the ring, IMS would avoid a massive isolation by immediately returning the node being diagnosed to the active ring.




REPT RING CFR



(403490173)(403490559)

The predictable action that concludes

this example is not reproduced.


REPT RING CFR



0 1 4 3600000.......................(4034364845)

A ring-related fault stimulates EAR to a

level-0 attempt (restart) to recover the

ring.

REPT RING CFR



(4034364857)(4034365210)

The restart succeeds initially, but...



RAC 0

.......................................................................

............................................(4034364730)



.......................................................................

............................................(4034364740)


00AAAAAAAAAAAA.... 01................ 02................


63.AAAAAAAAAAAAAAA

CMD FUNCTION





.......................................................................

............................................(4034364745)

REPT RING CFR



REQUESTED

0 1 4 3600000.......................... (4034368158)

...another fault occurs less than 3 seconds into the recovery, thereby driving EAR to escalate to a level-1 attempt to isolate the faulty node.

REPT RING CFR



(4034368175)(4034368492)

The isolation succeeds momentarily,

but...



RAC 0

.......................................................................

............................................(4034368041)



.......................................................................

............................................(4034368051)



.......................................................................

............................................(4034368056)


UNEXPLAINED LOSS OF TOKEN REPORTED ON

BOTH RINGS.

...within the confidence interval the

3B21D receives notice that the token is

lost without receiving other error reports.

REPT TOKEN TRACK

TOKEN WAS LOST BETWEEN IUN32 5 AND IUN32 6

ON RING: 0

The token-track module reports the

probable location where the token left

the ring.

REPT RING CFR


0 1 4 3600000.............................(4034373503)

When unexplained loss of token occurs

during the confidence interval of levels 0

or 1, EAR jumps to level 3.


REPT RING CFR



(4034374032)(4034374330)

EAR level-3 tests for continuity in the rings. Because the tests succeed, EAR directs ring configuration to establish the normal, two-ring structure. The success of the ring continuity tests is the first clear indication that the recent faults are transient in nature.

REPT RING CFR


0 1 4 3600000..............................(4034376599)

But again the confidence interval fails, so

EAR escalates to level 4.

REPT RING CFR



(4034384478)(4034384790)

Level 4 also finds continuity in the rings and directs ring configuration to establish the normal, two-ring structure. In this instance the recovery outlasts the confidence interval, thereby ending this episode of EAR escalation. Evidently the episode was triggered by two transient faults. The location of one fault is suggested by the short-lived, level-1 isolation of IUN31 11. The location of the other was identified by token track as between IUN32 5 and IUN32 6. The technician who witnesses these events should record the occurrences and locations of the two intermittent faults and perhaps should retain the ROP output of this unusual episode.
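The escalation path in this example can be replayed with a small model. This is only a sketch: the jump from level 0 or 1 to level 3 on unexplained loss of token and the final level 4 come from the example above, while the rule that any other failure raises the level by one, and all names in the code, are assumptions made for illustration.

```python
from enum import Enum


class Event(Enum):
    FAULT = "fault"                        # another fault inside the confidence interval
    TOKEN_LOSS = "unexplained_token_loss"  # token lost with no other error report
    INTERVAL_PASSED = "interval_passed"    # recovery outlasted the confidence interval


MAX_LEVEL = 4  # highest EAR level mentioned in this example


def next_ear_level(level, event):
    """Return the next EAR level, or None when the episode ends.

    Hypothetical model: only the transitions shown in the example are taken
    from the text; stepping up one level on any other failure is an assumption.
    """
    if event is Event.INTERVAL_PASSED:
        return None  # recovery held, so escalation stops
    if event is Event.TOKEN_LOSS and level in (0, 1):
        return 3     # stated in the text: levels 0 and 1 jump to level 3
    return min(level + 1, MAX_LEVEL)  # assumed default: escalate by one level


def replay(events, start=0):
    """Replay a list of events and return the sequence of levels attempted."""
    levels = [start]
    level = start
    for event in events:
        level = next_ear_level(level, event)
        if level is None:
            break
        levels.append(level)
    return levels


# The episode above: fault, token loss, failed interval, then a stable ring.
episode = [Event.FAULT, Event.TOKEN_LOSS, Event.FAULT, Event.INTERVAL_PASSED]
print(replay(episode))  # [0, 1, 3, 4]
```

The printed sequence matches the levels reported in the ROP output of this example (level 0 restart, level-1 isolation, level-3 continuity test, level 4).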


Contents

Loading Memory 4-24

Reading Memory 4-24

Loading and Dumping RGRASP Utility Variables (UVARs) 4-25

Feature Activation 4-25

Feature Deactivation 4-25

• Equipment Configuration Data (ECD) 4-25

• Recent Change Procedures 4-25

• Measurement 4-25

• Network Management Impact 4-26

• Maintenance/Troubleshooting Impact 4-26

• Recording 4-27

• Output Messages 4-30

• Audits 4-31

• Critical Events 4-31

• Support Tools 4-31

• Related Documentation Cross-References 4-31





4

Ring and Ring Node Maintenance Procedures

Introduction

This guide serves as an aid in performing ring and ring hardware maintenance functions. It contains procedures used in detecting, troubleshooting, and clearing faults associated with the ring and ring hardware. The procedures detailed in this guide are only guidelines for resolving ring-associated maintenance problems, and are not the only methods that may be used in performing ring maintenance.

A system called trace provides a formal mechanism for embedding tracepoints within application code for use in testing and debugging. The system collects and forwards the trace messages produced by individual tracepoints to one or more destinations, including log files, ROPs, and MCRTs. The tracepoints are controlled, so a related group scattered throughout the software can be turned on or off at will. The parameters can also be set and changed using craft commands. The trace system is created automatically during initialization; the user may also create it manually. The tracepoints are designed to generate little overhead when disabled, but when used improperly, the trace system can consume large amounts of system resources while yielding little useful information.

Craft commands allow one to totally inhibit all tracepoints, so that no trace messages are generated and the trace system uses little overhead, or to enable subsets of the tracepoints, thus restricting trace output to only that dealing with selected portions of application code. ALW:TRACE and INH:TRACE provide the basic on/off switch for trace. Until ALW:TRACE is invoked, no trace messages can be generated and logged under any circumstances. Similarly, once INH:TRACE is invoked, trace becomes totally dormant except for a certain amount of fixed overhead. If trace is inhibited, the SET:TRACE command allows one to specify which tracepoints are active once trace is again enabled; if trace is active, the command allows one to control the tracepoints during operation. The OP:TRACE command presents a summary of the current status of trace. The output message REPT TRACE reports a tracepoint from a 3B21D computer process or a node processor. The output message REPT TDTP indicates that the trace process has encountered a hardware or software fault. Note also that the trace process is terminated when the system enters disk-independent operation; see the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.
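The relationship between the ALW:TRACE/INH:TRACE switch and the tracepoint selection made with SET:TRACE can be pictured with a toy model. The class, its method names, and the group name "ring_config" are hypothetical and do not represent the trace system's actual interface; the sketch also assumes that selections made while trace is inhibited are kept until trace is allowed again, as the description of SET:TRACE suggests.

```python
class TraceGate:
    """Toy model of the trace on/off switch and tracepoint groups (illustrative only)."""

    def __init__(self):
        self.allowed = False         # no output until ALW:TRACE is invoked
        self.enabled_groups = set()  # groups selected with SET:TRACE

    def alw_trace(self):
        """Model of ALW:TRACE: turn the master switch on."""
        self.allowed = True

    def inh_trace(self):
        """Model of INH:TRACE: turn the master switch off, keep selections (assumed)."""
        self.allowed = False

    def set_trace(self, group, on=True):
        """Model of SET:TRACE: select or deselect a tracepoint group."""
        if on:
            self.enabled_groups.add(group)
        else:
            self.enabled_groups.discard(group)

    def emits(self, group):
        """A tracepoint produces output only if trace is allowed and its group is selected."""
        return self.allowed and group in self.enabled_groups


gate = TraceGate()
gate.set_trace("ring_config")      # selected while trace is still inhibited
print(gate.emits("ring_config"))   # False: ALW:TRACE has not been invoked
gate.alw_trace()
print(gate.emits("ring_config"))   # True
print(gate.emits("other_group"))   # False: group was never selected
```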

Ring maintenance functions for an office serve to detect, troubleshoot, and clear all fault conditions associated with the ring and ring hardware. The most common fault conditions associated with the ring are the following:

• Ring node out-of-service (OOS)

• Single ring node isolation

• Multiple ring node (RN) isolation

• Ring down.

Another less common fault condition on the ring is unexplained loss of token.

These fault conditions are discussed in the remainder of this section. For additional information on ring maintenance.

Direct link nodes (DLNs) follow the same guidelines as link nodes (LNs) in this section. CDN-I nodes also follow these guidelines, except when removing ring application processor (RAP) circuit packs, which require that the power be turned off before circuit pack (CP) extraction.


Ring Fault Conditions and Maintenance Approach

The information contained in this guide provides a maintenance approach for each ring fault condition listed above. These guidelines should be used only after the automatic ring recovery (ARR) has completed its attempt or has restored faulty ring nodes. For additional information concerning the use of ARR, refer to the “Maintenance Description” section in this Manual.

Ring Node Out-of-Service

A ring node can be removed from active service and placed in the out-of-service (OOS) state for many reasons. An RN may be placed in either of the OOS maintenance states: OOS-NORMAL or OOS-ISOLATED. When a node is placed in the OOS-ISOLATED state, the node is first removed from service (OOS-NORMAL) and then isolated from the active ring (OOS-ISOLATED). When a node is removed from service for maintenance or fault-detection purposes that do not interfere with the operation of system functions, the node may be taken OOS-NORMAL. An isolated node is not able to communicate or perform normal node functions with the ring, but is capable of performing and handling maintenance functions. In the OOS-NORMAL state, the node is said to be quarantined. The OOS maintenance states may be observed from the maintenance CRT (MCRT) on the 1106 display page. For additional information concerning OOS nodes in the quarantine state, refer to the “Maintenance Description” section in this Manual.

Ring Node OOS Maintenance Approach

This maintenance approach provides information that aids in diagnosing and correcting faults and in restoring nodes to active service. When a node is quarantined, it is not allowed to communicate with either the 3B21D computer or the ring, and the state of the ring interface is quarantine usable (QUSBL). To verify this state, refer to the OP:RING command in the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual. When a node is in the OOS (quarantined) state, the most likely cause of the failure is the node processor (NP) or link interface. Listed below are guidelines to be used in troubleshooting, correcting, and restoring quarantined nodes to service.

Assumption: An equipment malfunction has been detected, and the fault recovery software has removed the node from service and placed it in the OOS-NORMAL maintenance state, where xx and yy are active nodes. The ARR has attempted to restore the node to service and has failed (manual action is required).


Figure 4-1. Ring OOS Normal

Procedure 4-1. Ring Node OOS Maintenance Guidelines

1. Determine the reason(s) the node has been taken OOS and placed in the quarantine state. Diagnose the faulty (OOS-NORM) node. Use guidelines presented in Chapter 6, Diagnostic User's Guide.

Does the node remain OOS-NORMAL?

No—DONE.

Yes—Proceed to next step.

2. If the node remains OOS-NORMAL, then starting with the OOS-NORMAL node,

isolate and replace all RN CPs in the order of the NP, the link interface, ring interface

0 (RI0), and RI1, and then perform a conditional restore. For very large scale

integration (VLSI) RNs, replace the integrated ring node (IRN) circuit pack and thenthe link interface. If the trouble clears after replacing the CPs in the order listed,

when office traffic is minimal, the original CP(s) should be reinserted one at a time in

the node, and diagnostics should be run to determine the faulty CP(s). If the

diagnostics fail to detect the faulty CP(s), but the previous CP replacements cleared

the trouble, then the CP(s) should be saved, noting the failure conditions. Inform the

CTS of this condition.

3. After replacing the CP(s), if the node still remains OOS, then check the equipment

for shorts, loose wiring, bent or broken pins, etc., and correct any problems

discovered. Also, check to see if proper equipment has been used with the long

message option.

4. Diagnose node (xx) adjacent to the faulty node using guidelines in Chapter 6,

Diagnostic User's Guide .

If problems are located, correct and restore node (xx) to service.






NOTE: Perform an unconditional restore on the OOS-NORMAL node using the command

RST:nodexx y;UCL

where:

For LN—

node = LN

xx = group number

y = node member number

UCL = restores the node without performing diagnostics.

For RPCN—

xx = group number

y = 0


! CAUTION: Do not perform an unconditional restore unless one of the following has occurred:

• A complete diagnostic run has produced an all-tests-passed (ATP) response.

• A complete diagnostic run has produced a conditional all-tests-passed (CATP) response, and the RI and the NP minor states are both usable (USBL).

Does the faulty node remain OOS-NORMAL?

No—DONE.


5. Diagnose node (yy) adjacent to the faulty node.

If problems are located, correct and restore node (yy) to service.

NOTE:Perform an unconditional restore on the OOS-NORMAL node using the command

RST:nodexx y ;UCL

where:


Figure 4-2. Single Node Isolation

Procedure 4-2. Single-Ring Node Isolation Maintenance Guidelines

1. Diagnose the isolated and faulty node using diagnostic guidelines listed in Chapter 6, Diagnostic User's Guide. If the isolation still exists after using these guidelines, proceed to next step.

2. If after diagnosing and troubleshooting the isolated node, the node does not restore to active service (thereby eliminating the isolated segment), diagnose the BISO node using guidelines listed in Chapter 6, Diagnostic User's Guide.

If the ring is too small to allow the adjacent nodes to be isolated, the isolation must

be moved.

To diagnose the BISO node, the node must be excluded from the active ring. To accomplish this, use the RMV command. See the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual. When the BISO node is removed from service (OOS-NORM), it is automatically included in the isolated segment (OOS-ISOLATED). The application may restrict the RMV request.

If the request is accepted, proceed with diagnostics as usual.

If the request is denied, it may be necessary to input the command to remove the application's node from service and to diagnose the node.

To put the signaling link (SLK) in the AVAILABLE-Manual Out-of-Service (MOOS) state, type the following message into the MCRT, and proceed with diagnostics as usual:

CHG:SLK (a, b, [c, d] ); MOOS

where: a = group number (00 - 63)


to replace CPs, and to restore the ring to an operational state. The second approach (B) details guidelines that should be used when the load on the CNI is minimal. The first approach is not intended to be used as the total maintenance approach, and should only be used when time does not allow for diagnostic testing. Otherwise, approach B should be used whenever possible.






Multiple Node Isolation Maintenance Approach

A multiple node isolation occurs when two or more failures occur on the ring, causing a potentially large isolated segment. This maintenance approach provides information that aids in testing, repairing, and restoring nodes in isolation to minimize the effect on service. When there is an isolated segment of multiple nodes, with an established BISO and EISO node, the most probable faulty nodes are the isolated nodes adjacent to the BISO and EISO nodes. This is assumed because both the BISO and EISO nodes of a multiple node isolation are most likely to be established adjacent to the faulty node when attempting to recover from ring error conditions. Therefore, by troubleshooting the nodes adjacent to the BISO and EISO nodes, faults are corrected with the least amount of time and service interruption. For a more complete explanation of BISO and EISO node information, refer to the “Maintenance Description” section in this Manual.

Assumption: An equipment malfunction has been detected, and the fault recovery software has removed multiple nodes from service, reconfigured the ring, and formed an isolated ring segment around the faulty nodes. The ARR has attempted to restore the nodes to service and has failed.

NOTE: If multiple nodes are isolated within a segment, the test approach is to diagnose the isolated node adjacent to the BISO node first, and then the isolated node adjacent to the EISO node. See Figure 4-5. Next, the nodes (xx and yy) must be diagnosed. After these nodes are diagnosed, the BISO and then the EISO nodes are diagnosed. Nodes are diagnosed in this manner because the most probable trouble nodes are established next to, or close to, the BISO and EISO nodes. There may be other nodes within the isolated segment that are not faulty but are included in the isolated segment because they are between the two faulty nodes. When performing maintenance on a multiple node isolation, one should attempt to clear problems associated with either the BISO or the EISO end of the segment to form a single node isolation. Once the single node isolation has been established, follow the single-node isolation test approach.

It has been determined that there are two or more faulty nodes in an isolated segment, and all faulty nodes have been removed from service and isolated from the active ring.
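The test order described in the NOTE above can be written out as a short helper. This is a minimal sketch, assuming the caller already knows which node plays which role; the function name and its parameters are hypothetical, and the node labels are the ones used in Figure 4-5.

```python
def diagnosis_order(biso, eiso, iso_near_biso, iso_near_eiso, inner=()):
    """Return nodes in the order the NOTE suggests diagnosing them.

    Order taken from the text: the isolated node next to BISO, the isolated
    node next to EISO, the remaining named nodes, then BISO, then EISO.
    Missing (None) and repeated entries are skipped so that a short segment
    does not list the same node twice.
    """
    candidates = [iso_near_biso, iso_near_eiso, *inner, biso, eiso]
    seen = set()
    order = []
    for node in candidates:
        if node is not None and node not in seen:
            seen.add(node)
            order.append(node)
    return order


print(diagnosis_order("BISO", "EISO", "iso 0", "iso 1", inner=("xx", "yy")))
# ['iso 0', 'iso 1', 'xx', 'yy', 'BISO', 'EISO']
```

If clearing the fault at one end shortens the segment to a single isolated node, the single-node isolation approach takes over and the rest of this order no longer applies.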


Figure 4-5. Two or More Faulty Nodes

The xx, yy, and zz represent nodes that are in the isolated segment and may or

may not be faulty.

Procedure 4-3. Multiple-Ring Node Isolation Maintenance Guidelines - A

This maintenance approach does not detail direct procedures, but instead provides the user with an understanding about what may be done differently from Approach B to reduce time consumed in restoring the ring and ring hardware.

1. Have “tested good” link node CPs available.

2. When a multiple fault occurs that isolates two or more nodes, causing innocent

nodes to become OOS and included in an isolated segment as depicted in the

diagram above (xx, yy, zz), then perform the following:

a. Replace all CPs within the node at either end of the isolated segment, and perform a conditional restore on the node. Be certain to place all replaced CPs in protected static packaging.

b. After problems are cleared at either end, and the isolation clears or is reduced in size, the innocent OOS nodes should restore to active service automatically, possibly leaving only a single isolated node at the other end.

3. Diagnose and correct all problems associated with the node left isolated. Troubleshoot the node in this manner to avoid including innocent nodes in the isolated segment.

4. When office traffic is minimal, reinsert the original CPs in the faulty node where the CPs were originally replaced, and diagnose (troubleshoot) it until the faulty CP(s) are located.

[Figure 4-5 node order: BISO, iso 0, xx, yy, zz, iso 1]





5. Place all other CPs in the original static wrapping, and store them (the “tested good” CPs) for possible future faults.

Procedure 4-4. Multiple-Ring Node Isolation Maintenance Guidelines - B

1. Diagnose iso 0 using guidelines listed in Chapter 6, Diagnostic User's Guide .

NOTE: If the fault in iso 0 is corrected and the node is restored to service, then the isolated segment of the ring is shortened. This creates a new BISO node and changes the multiple node isolation to a single node isolation, restoring all the innocent OOS nodes.

Does the original isolation still exist, or is iso 0 OOS-NORMAL?

If an isolation still exists, but has been shortened, and iso 0 is OOS-NORMAL and known to be usable, unconditionally restore iso 0 to service, and then proceed to Step 6. Use one of the following commands to restore the node:

• For LNs, enter RST:xx y;UCL!

• For RPCN, enter RST:RPCNxx yy;UCL

where: xx = group number

y = node member number



! CAUTION: Do not perform an unconditional restore unless one of the following has occurred:

• A complete diagnostic run has produced an ATP response.

• A complete diagnostic run has produced a CATP response, and the RI and the NP minor states are both USBL.

If iso 0 remains OOS-NORMAL, refer to “Ring Node OOS Maintenance Approach” in this chapter.

If the original isolation still exists, proceed to next step.

2. Diagnose node xx using guidelines detailed in Chapter 6, Diagnostic User's Guide .


If node iso 0 is in the OOS-NORMAL state, and the original BISO node no longer exists after diagnosing and repairing node xx, then refer to “Ring Node OOS Maintenance Approach.”

If the above statement is true, and all problems are corrected concerning these nodes, then a single node isolation may be formed, including a new BISO node, iso 1, and the EISO node. If this occurs, then refer to “Single Node Isolation Maintenance Approach” for the remainder of these guidelines.

If the original isolation still exists after diagnosing node xx and correcting any problems, then repeat Steps 1 and 2 using nodes iso 1 and yy. If the original isolation still exists, then proceed to the next step.

3. Diagnose the BISO node.

NOTE: The BISO node is an active node on the ring. To diagnose the BISO node, the node must be excluded from the active ring. See Figure 4-6. To accomplish this, use the RMV command. See the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual. When the BISO node is removed from service (OOS-NORM), it is automatically included in the isolated segment (OOS-ISOLATED).

Figure 4-6. New BISO Node

The RMV request may or may not be accepted. If the request is accepted, proceed with diagnostics as usual, using guidelines listed in Chapter 6, Diagnostic User's Guide.

If the request is denied, it may be necessary to remove the node and SLK from service, and then diagnose the node.

To put the SLK in the AVAILABLE-MOOS state, type the following message into

the MCRT, and proceed with diagnostics as usual:

CHG:SLK (a, b, [c, d] ); MOOS


where: a = group number (00 - 63)

b = member number (01 - 15)

c = LI4 circuit pack (0 - 1)

d = LI4 port (0 - 3)

The following message should appear on the MCRT:

CHG SLK a b [c d]

NEW REQUESTED MINOR STATE = MOOS

NOTE: After diagnosing and clearing any problems associated with the BISO node, restore the node to service using the guidelines for restoring all other nodes.

After diagnosing the BISO node, if problems are found and corrected, and if an ATP response is received, the BISO node may be deleted, leaving the iso 0 node in the OOS-NORMAL state. If this occurs, restore iso 0 to service. Refer to “Ring Node OOS Maintenance Approach” in this chapter.

! CAUTION: Do not perform an unconditional restore unless one of the following has occurred:

• A complete diagnostic run has produced an ATP response.

• A complete diagnostic run has produced a CATP response, and the RI and the NP minor states are both USBL.

If problems are corrected with the BISO, iso 0, and xx nodes, then the isolated segment of the ring should shorten, leaving only a single isolated node. If this occurs, refer to “Single Node Isolation Maintenance Approach” in this chapter for the remainder of this test.

If the SLK was manually removed from service, put it back in the AVAILABLE-IS or

AVAILABLE-STBY state by entering the following message at the MCRT:

CHG:SLK (a, b, [c, d] );{ IS | ARST}





CHG SLK a b [c d]

NEW REQUESTED MINOR STATE = e



c = LI4 circuit pack (0 - 1)

d = LI4 port (0 - 3)

4. If the original ring isolation still exists, starting with node iso 0, then xx, and finally the

BISO node, replace all RN CPs in this order: ring interface 0 (RI0), RI1, the NP, and

the link interface. Perform a conditional restore. For VLSI RNs, replace the IRN

circuit pack and then the link interface. If the trouble clears after replacing the CPs in

the order listed, the original CPs should be reinserted one at a time in the node and

diagnostics run to determine the faulty CP(s). If the diagnostics fail to detect the

faulty CP(s), but the previous CP replacement cleared the trouble, then the CP(s)

should be saved, noting the failure conditions. Inform the CTS of the condition.

5. If the original ring isolation still exists, visually inspect affected equipment for shorts,

bent or broken pins, backplane faults, etc. Also ensure that proper equipment has

been used with the long message option. If problems are located, correct the

problems and perform a conditional restore on the affected equipment.

6. If the isolation still exists, or if all problems with the original BISO node, the iso 0

node, and node xx have been cleared, diagnose and attempt to correct problems

associated with nodes iso 1, yy, and the EISO node, using Steps 3 through 5 of

these guidelines. See Figure 4-7.

Figure 4-7. More Than One Faulty Node

NOTE: After correcting and restoring this portion of the isolated segment of the ring, attempt to restore iso 0, xx, and the BISO nodes if problems were not corrected in previous steps.


NOTE: For additional information on the initialization levels, refer to “Initialization,” Part 4 of this manual.

Does the ring initialize?

Yes—Proceed to next step.

No—Proceed to Step 5.

3. Are all nodes that were not previously OOS (except quarantined nodes) before the

ring down state restored to service?

Yes—Proceed to Step 8.

No—Proceed to next step.

4. For all nodes that were not previously OOS before the ring failure, perform an unconditional RST. See Chapter 6, Diagnostic User's Guide, or the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

Did all nodes that were not OOS prior to the ring failure restore?

Yes—Proceed to Step 8.

No—Proceed to next step.

5. Attempt to reinitialize the ring. Perform a level-4 initialization (see the proper application in the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual).

NOTE: For additional information on the initialization levels, refer to “Initialization,” Part 4 of Chapter 6, Diagnostic User's Guide.

Does the ring initialize?


No—Proceed to Step 9.

6. Are all nodes that were not previously OOS prior to the ring failure restored to

service?

Yes—Proceed to Step 8.

No—Proceed to next step.






7. For all nodes that were not previously OOS before the ring failure, perform an

unconditional RST. See Chapter 6, Diagnostic User's Guide .

8. Are there any other nodes OOS left on the ring?

No—DONE.

Yes—Determine the ring condition (single node isolation, multiple node isolation, etc.) and proceed to that condition's maintenance approach presented in this chapter.

9. If the system still doesn't initialize after the level-3 and level-4 initialization attempts,

call the CTS.

Ring Generic Access Package (RGRASP)

Feature Definition

RGRASP is a single-user utility system for the CNI ring nodes.

Interactions

! CAUTION: Care must be exercised when using the RGRASP tool. Improper use of RGRASP can result in program mutilation or excessive utilization of system resources. Both of these consequences of improper use of the tool can lead to call processing downtime and therefore interrupt the operation of a node on the ring or the whole ring.

Feature Description

The RGRASP tool can:

• Set (allow) breakpoints (a breakpoint corresponds to the address of the first byte of a target process instruction).

• Clear breakpoints.

• Report on current status for specified breakpoints.

• Inhibit breakpoints.

• Load a specified RGRASP utility variable (UVAR).

• Dump a specified RGRASP UVAR.

• Load a specified node with data.

• Dump the contents of a specified address in a given node.

• Direct the loading of an address.

• Dump the contents of a specified Application Processor or Node Processor register.

Software Impact

This feature does not impact customer engineerable software resources on APs.

This feature could impact customer engineerable software resources on NPs, dependent on memory size.

Software Description

The software consists of the following processes:

RGP_KER This is the UNIX process kernel for the feature. It acts as the interface between the AM (RGP_CFT and RGP_PRT) and the ring node (monitor) processes.

RGP_CFT This UNIX process handles input commands from the craft shell. It parses and performs some preliminary checking on the input command. Then it relays the command to the RGP_KER process for further processing.

RGP_PRT This UNIX process handles printing of output.

monitor This system process performs the actual operations required to handle breakpoints, memory dumping, and memory loading. It communicates with RGP_KER.

User Profile

This feature and its associated input commands are intended for use by technicians in conjunction with the CTS.

Description of Feature Operation

The following paragraphs describe how this feature can be used.

Initial Setup

First, determine the address in memory that requires investigation. This can be done by using the latest PR/PK listings provided. This address may be provided by the CTS.

Determine which processor should be examined. In the case of the DLN, there is an active and a standby processor. Use the OP:SLK command or poke the 118 page to determine this. As a precaution, set breakpoints in only one processor at a time.

Setting a Breakpoint

You can set a breakpoint in a program using the WHEN:RUTIL input command. Before this can be done, the opcode (OPC) must be known. To verify the OPC, use the DUMP:RUTIL command to dump the memory at the breakpoint address. If the expected OPC does not match the dump output, then the listings do not match the memory. This discrepancy should be cleared up before continuing the procedure. One possible explanation is that the node software is out of date. To eliminate this possibility, you can remove and restore the target node (the node in which the breakpoint is to be set). Doing this ensures that the newest version of code has been pumped from disk. You can use the RMV:LN and RST:LN commands or the 118 poke to achieve this. After the node has been pumped, try dumping the breakpoint address again. If it still does not match, the listings are out of date. In this case, you should stop and get a current listing before proceeding.
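The verification sequence above amounts to a small decision procedure. The following sketch captures only that logic; the function name and the returned action strings are hypothetical, and the opcode values in the usage lines are made-up placeholders, not real dump output.

```python
def check_opcode(expected_opc, dumped_opc, dumped_after_repump=None):
    """Decide the next step when verifying a breakpoint address.

    expected_opc: opcode taken from the listings.
    dumped_opc: opcode read back with DUMP:RUTIL.
    dumped_after_repump: opcode read back again after the node was removed
    and restored, or None if that has not been done yet.
    """
    if dumped_opc == expected_opc:
        return "opcode verified: set the breakpoint"
    if dumped_after_repump is None:
        return "mismatch: remove and restore the node, then dump again"
    if dumped_after_repump == expected_opc:
        return "opcode verified: set the breakpoint"
    return "listings out of date: obtain a current listing"


# Placeholder opcode values, for illustration only
print(check_opcode(0x4E, 0x4E))
print(check_opcode(0x4E, 0x10))
print(check_opcode(0x4E, 0x10, dumped_after_repump=0x10))
```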

The WHEN:RUTIL command allows you to specify actions (commands) to be executed when the breakpoint you set fires. The input message manual page for WHEN:RUTIL defines the actions. Up to 24 actions may be specified in the action list for a single breakpoint. The action list must be terminated by an END:WHEN command. The action list can contain only the END:WHEN command, in which case you will simply know whether a piece of code is being executed.

Only five breakpoints can be set in any one ring node processor.
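
The limits above can be checked before a breakpoint is entered. The following is a minimal sketch, not part of the RGRASP tool: the function names and the use of plain strings for actions are hypothetical, and it assumes that END:WHEN does not count toward the 24-action limit and may appear only as the last entry.

```python
MAX_ACTIONS = 24      # actions per breakpoint, as stated in the text
MAX_BREAKPOINTS = 5   # breakpoints per ring node processor, as stated in the text
TERMINATOR = "END:WHEN"


def check_action_list(actions):
    """Return a list of problems found in a proposed WHEN:RUTIL action list.

    `actions` is a list of strings standing in for action commands; the
    strings are placeholders, not real command syntax.
    """
    problems = []
    if not actions or actions[-1] != TERMINATOR:
        problems.append("action list must be terminated by END:WHEN")
    # Assumption: END:WHEN is only meaningful as the final entry.
    if TERMINATOR in actions[:-1]:
        problems.append("END:WHEN may appear only as the last entry")
    # Assumption: the terminator itself is not counted as an action.
    body_count = sum(1 for a in actions if a != TERMINATOR)
    if body_count > MAX_ACTIONS:
        problems.append(f"{body_count} actions exceed the limit of {MAX_ACTIONS}")
    return problems


def can_set_breakpoint(breakpoints_already_set):
    """True if one more breakpoint fits in a single ring node processor."""
    return breakpoints_already_set < MAX_BREAKPOINTS
```

An action list holding only END:WHEN passes the check, which matches the case where you only want to know whether a piece of code is being executed.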

Loading Memory

You can load memory with the LOAD:ADDR, LOAD:WORD, LOAD:SHORT, or LOAD:BYTE commands within the WHEN:RUTIL command or with the LOAD:RUTIL command. Details on the use of these commands are provided under "Input Messages."

! CAUTION:

Loading memory may drastically change program execution. If not done properly, this can interrupt or degrade service; for example, calls may be lost.

■ LOAD:RUTIL

■ OP:RUTIL or OP:RUTILFLAG

■ WHEN:RUTIL command

Feature Deactivation

You can deactivate the feature, that is, clear all breakpoints in a specified node, with the CLR:RUTIL command. You can clear a specific breakpoint in a specified node with the CLR:RUTILFLAG command.

You can temporarily disable or inhibit all breakpoints in a specified node with the INH:RUTIL command. You can temporarily disable or inhibit a specific breakpoint in a specified node with the INH:RUTILFLAG command.

Equipment Configuration Data (ECD)

ECD are not affected by the RGRASP feature.

Recent Change Procedures

Recent change procedures are not associated with the use of the RGRASP tool.

Measurement

No measurements are provided as part of the RGRASP tool.

Network Management Impact

If the RGRASP tool is used improperly, service interruption or degradation can occur.

Maintenance/Troubleshooting Impact

The RGRASP tool is a debugging tool for CNI ring nodes. It is usable only at nodes that are active from an IMS viewpoint (for example, nodes in the IMS ACT state). Nodes that are quarantined or isolated cannot be accessed with RGRASP.

There are no new diagnostics related to this tool.

RGRASP breakpoints are affected by CNI initialization levels as follows:

Level              Effect

0, 1, FPI, 2, 3    None

4                  Clears all breakpoints

Recording

This tool has no impact on recording.

Procedure 4-6. Input Messages

The following input messages/commands are associated with the RGRASP tool. For more information about each of these messages, refer to the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

! CAUTION:

Incorrect use of these commands may interrupt operation of a node on the ring or the whole ring. READ EACH PURPOSE CAREFULLY.

1. ALW:RUTIL or ALW:RUTILFLAG

The first command allows all breakpoints in the specified node; the second allows a specific breakpoint in the specified node.

2. CLR:RUTIL or CLR:RUTILFLAG

The first command clears all breakpoints in the specified node; the second clears specific breakpoints in the specified node.

3. DUMP:ADDR

Dumps the contents of the specified address in the given node. This command is allowed only within a WHEN:RUTIL command <action-list>.

4. DUMP:REG

Dumps the contents of the specified Application or Node Processor register in the given node. This command is allowed only within a WHEN:RUTIL command <action-list>.

5. DUMP:RUTIL

Dumps the contents of memory at the address range given at the specified node. It can also dump the contents of memory starting at the given address for the specified number of bytes.

Currently a maximum length of 468 bytes is allowed for a single dump operation.

A formatted output of the node's memory contents will follow this input command.
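
Because a single dump is limited to 468 bytes, a larger region has to be covered by several dump operations. The sketch below only does the arithmetic of splitting a region into pieces that respect the limit; the function name is hypothetical and it does not build or send any DUMP:RUTIL command.

```python
MAX_DUMP_BYTES = 468  # per-dump limit stated in the text


def split_dump(start, length, limit=MAX_DUMP_BYTES):
    """Split the region [start, start + length) into (address, byte_count)
    pieces, each no larger than `limit` bytes."""
    if length < 0:
        raise ValueError("length must not be negative")
    if limit <= 0:
        raise ValueError("limit must be positive")
    pieces = []
    offset = 0
    while offset < length:
        count = min(limit, length - offset)
        pieces.append((start + offset, count))
        offset += count
    return pieces
```

For a 1000-byte region this yields two full 468-byte pieces and a final 64-byte piece.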

6. DUMP:UVAR

Dumps the contents of the specified RGRASP UVAR. This command is allowed only within a WHEN:RUTIL command <action-list>.

7. INH:RUTIL or INH:RUTILFLAG

The first command inhibits all breakpoints in the specified node; the second inhibits specific breakpoint(s) in the specified node.

8. LOAD:ADDR

Loads the specified address with the specified data. This command is allowed only within a WHEN:RUTIL command <action-list>.

9. LOAD:BYTE

Loads the address in the given node with the specified data. This command is allowed only within a WHEN:RUTIL command <action-list>.

10. LOAD:REG

Loads an Application or Node Processor register with the specified data in the given node. This command is allowed only within a WHEN:RUTIL command <action-list>.

11. LOAD:RUTIL

Loads the address at the given node with the specified data. The maximum number of data items allowed for loading is 128 bytes or 32 4-byte words.

7. REPT RGP PRT

Prints when anomalies occur within the print process of the RGRASP tool. Indicates the kind of anomaly that has occurred.

8. REPT RUTIL

This message has 40 formats. Formats [1] through [15] report an error condition encountered by the RGRASP RGP_KER process. Formats [16] through [40] print in response to the firing of a breakpoint.

9. WHEN RUTIL

Prints in response to a WHEN:RUTIL command.

Audits

The RGRASP tool does not affect any audits.

Critical Events

The RGRASP tool does not affect any critical events.

Support Tools

The RGRASP tool is a new support tool.

Related Documentation Cross-References

For more details about the use of each input command associated with RGRASP, refer to the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.

For more details about the use of each output message associated with RGRASP, refer to the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

Contents

5 Ring Critical Events

Introduction

Critical Event Message Output

■ Logging Critical Events

■ Short Form CNCE Message

■ Long Form CNCE Message

■ Using the CHG:CEPARM Command

CNCE Descriptions

Introduction

CCS Network Critical Events (CNCE) are predefined events that are considered indicators of abnormal network operation. They are of importance to network operation and to the proper functioning of the office. Both on-site and support system personnel must be immediately aware of events affecting the CCS network. CNCE messages are output as these critical events occur and are referred to as on-occurrence autonomous messages.

CNCE messages are output as critical events occur in the office or as network events are recognized and acted upon. There are approximately 70 critical events in a system. Some critical events pertain to the CCS network in general, while others have significance to the. A CNCE could represent an occurrence, the beginning of some state, or the ending of some state. Events indicating the beginning or ending of a state should occur in pairs. A critical event never represents a length of time.

The naming convention used for critical events is similar to the naming convention used for measurements. It is as follows:

■ The mnemonic represents as closely as possible the actual event. The mnemonic is derived from a set of abbreviations representing typical signaling events. These abbreviations are combined to describe the event.

■ The suffix E means the state indicated by the mnemonic has ended.

■ Names may include letters, digits, or special characters.

■ Names are unique and contain no more than 12 characters.
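
The naming rules and the begin/end pairing can be illustrated with a short sketch. It is an illustration only: the function names are hypothetical, the set of "begin" names must be supplied by the caller, and pairing is tracked per name rather than per link, which is a simplification.

```python
MAX_NAME_LEN = 12  # names contain no more than 12 characters


def check_names(names):
    """Return a list of problems: names that are too long, empty, or repeated."""
    problems = []
    seen = set()
    for name in names:
        if not name or len(name) > MAX_NAME_LEN:
            problems.append(f"{name!r}: must be 1 to {MAX_NAME_LEN} characters")
        if name in seen:
            problems.append(f"{name!r}: duplicate name")
        seen.add(name)
    return problems


def unmatched_begins(events, begin_names):
    """Return the begin events that have no matching end event yet.

    `events` is an ordered list of event names. `begin_names` is the set of
    names that open a state; the matching end event is assumed to be the
    same name followed by the suffix E.
    """
    open_states = set()
    for event in events:
        if event in begin_names:
            open_states.add(event)
        elif event.endswith("E") and event[:-1] in begin_names:
            open_states.discard(event[:-1])
    return sorted(open_states)
```

Given the sequence C6EMR, C6BOLX, C6BOLXE, the second function reports C6EMR as still open, because no C6EMRE has been seen.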

The names given to critical events are used by the Measurement Output Control Table (MOCT), which is described in the "Measurement Output Control Table" section in the. At the end of this section are tables providing explanations of each critical event by name.

Critical Event Message Output

The Critical Event Table (CET) in the MOCT controls the reporting of critical events. The critical event handler is responsible for sending the message to the users specified in the CET. This table includes information indicating which users are to be informed of which particular critical events. The CET also specifies that messages should be recorded in a log file and designates what form of the message the users receive: long or short. Each of these forms is discussed later. Automatic reporting of critical events is in real time.

Logging Critical Events

The recognition of critical events (the occurrences to be reported) takes place in the central processor. The following information is provided to the central processor:

■ Identification of the event that occurred (the CNCE name)

■ When the event occurred (may be set to network or local time)

■ Identification of the peripheral units involved, if required.

The critical event handler immediately generates a CNCE message. The CNCE message is generated in two forms: short form and long form. The CNCE message is automatically recorded in the CNCE log file first, using the long form. Then it is output to the appropriate users in the forms specified in the CET. The CNCEs are output at the MROP locally and are sent to various support system centers over BX.25 links. For more information on the CNCE message forms, see the REPT CNCE message in the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.
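
The order of operations described above (log the long form first, then send each user the form the CET assigns) can be sketched as follows. This is an illustration under stated assumptions, not the handler's implementation: the dictionary layout of the CET, the event fields, and both formatting functions are hypothetical, and the strings produced are not the real REPT CNCE layout.

```python
def format_short(event):
    """Hypothetical short form: event name, time, and hardware identification."""
    return f"{event['name']} {event['time']} {event['unit']}"


def format_long(event):
    """Hypothetical long form: the short form plus the office identification."""
    return f"{format_short(event)} {event['office']}"


def handle_critical_event(event, cet, log, send):
    """Record the long form in the log, then send each user its assigned form.

    `cet` maps an event name to {user: "long" or "short"} (assumed layout).
    `log` is a list standing in for the CNCE log file.
    `send` is a callable taking (user, text).
    """
    long_form = format_long(event)
    log.append(long_form)  # logging always uses the long form and happens first
    for user, form in cet.get(event["name"], {}).items():
        text = long_form if form == "long" else format_short(event)
        send(user, text)
```

An event with no CET entry is still logged but is sent to no one, which is one plausible reading of the text rather than documented behavior.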

The CNCE log file is a circular file stored on disk (/etc/log/CNCELOG). The file contains a minimum of 90 minutes of the most recent CNCE messages. The messages in the log file can be retrieved. The file can be output using the OP:LOG:CNCELOG UNIX system Real Time Reliable (RTR) command (see the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual). Support system users cannot use this command over BX.25 sessions.

Short Form CNCE Message

The short form (shown in Figure 5-1) provides the critical event name, local or network time, and identification of the associated hardware (by point code, link set, or group-member number). The short form is intended mainly for support systems that have a reference database containing details on the hardware identified.

Figure 5-1 shows examples of long and short CNCE messages. Refer to the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual for a description of the fields in a CNCE message. A CNCE message cannot be generated by an input command.

Figure 5-1. CNCE Messages

For CNCE messages related to PBX links, both long and short forms may contain circuit pack and port identification and diagnostic code.

Long Form CNCE Message

The long form (shown in Figure 5-1) includes all the information specified for the short form message. Since the long form is used by the maintenance work force, more detailed information must be provided: in particular, the office identification (CLLI code) and the speed, link type, and protocol of the link. If applicable, it also includes the VFL identification, function number, or subsystem number.

Refer to the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual for a description of the fields in a CNCE message. A CNCE message cannot be generated by an input command.

REPT CNCE, short form:

C6EMRPO 14:00:36:59 32-00

REPT CNCE, long form:

C7LCABMIS 14:00:36:59 7 02-0 ATLN_GA_TL_MS2_06 56. A

Using the CHG:CEPARM Command

The CHG:CEPARM command allows users to change the parameters that control the reporting of certain critical events. It is primarily intended for use by support system users but may be entered by on-site users through the MCRT. The C7NOTRNS and C7MTPERR events currently are controlled in this manner. Both events have "cycle time" and "number of occurrences" parameters. For a description of these events, refer to the table in the following part.

The command is input as follows:

CHG:CEPARM:REPT a, EVENT b, CYCLE c!

Where:

a = Name of the autonomous message event: NOTRNS or MTPERR.

b = Number of occurrences, or messages, per cycle: 0 to 100.

c = Duration of the cycle in seconds: 0 to 60.
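
The ranges above lend themselves to a check before the command is typed. The sketch below fills in the template exactly as printed and rejects out-of-range values; the function name is hypothetical, and the input message manual remains the authority on the exact syntax.

```python
ALLOWED_EVENTS = ("NOTRNS", "MTPERR")  # values of a, from the text


def build_chg_ceparm(event, occurrences, cycle_seconds):
    """Return a CHG:CEPARM command string after validating a, b, and c."""
    if event not in ALLOWED_EVENTS:
        raise ValueError(f"event must be one of {ALLOWED_EVENTS}")
    if not 0 <= occurrences <= 100:
        raise ValueError("occurrences per cycle must be 0 to 100")
    if not 0 <= cycle_seconds <= 60:
        raise ValueError("cycle duration must be 0 to 60 seconds")
    # Mirrors the printed template: CHG:CEPARM:REPT a, EVENT b, CYCLE c!
    return f"CHG:CEPARM:REPT {event}, EVENT {occurrences}, CYCLE {cycle_seconds}!"
```

Note that the event name is given without the C7 prefix, as in the parameter description.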

Upon execution of the command, an output message is generated and the specified parameter values are stored. The values are first written to the /cmp/stp/odata/miscparm disk file and then are used to update the appropriate main memory tables. Any future occurrences of the specified event are reported as indicated by the new parameter values. The above-mentioned file also contains the default values for the parameters.

CNCE Descriptions

The event names appearing in CNCE output messages are derived from the MOCT and are defined as shown in Table 5-1. The descriptions are presented alphabetically by event name. The table shows the information provided by the CNCE message. The field, shown in parentheses after the event name, is the group-member number, the point code, or the link set.

Often, an occurrence not only causes a CNCE message but is also counted as a measurement. Some of the critical events in the table can be better understood by referring to corresponding measurements in the first part of this chapter. That part contains a table with more detailed descriptions of certain events. Some measurement names should be similar to the critical event names.

NOTE: The "C6" or "C7" at the beginning of a CNCE name identifies the event as either CCIS6 or CCS7 link related. The "CP" or "CT" at the beginning of a CNCE name identifies it as PBX node/link related. Others are per office events.
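
The prefix rule in the note can be written as a small lookup. This is a sketch with a hypothetical function name; it applies only the two-character prefixes named in the note.

```python
def classify_cnce(name):
    """Classify a CNCE name by its first two characters, per the note above."""
    prefix = name[:2]
    if prefix == "C6":
        return "CCIS6 link"
    if prefix == "C7":
        return "CCS7 link"
    if prefix in ("CP", "CT"):
        return "PBX node/link"
    return "per office"
```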

Table 5-1. CNCE Descriptions

Name (data) DESCRIPTION

C6ACB (gg-mm)

Change back from a failure that is not a declared failure. This is an automatic change back to a link that previously did an automatic changeover and then was restored. The change back must normally occur within 3 minutes of the changeover. If the LI reports a long key exchange is taking place, this time period is extended to 6 minutes. This event occurs for all automatic change backs exclusive of the C6ACBFLD event. Refer to the L6ACO_ measurements for a description of the changeover/change back sequence. This event is usually preceded by a C6ACO_ event.

C6ACBFLD (gg-mm)

Automatic change back from declared failure. This event indicates that the link was declared failed, has recovered, and traffic has been routed back to the link. This event is preceded by one of the C6FLD_ events (see those descriptions for more information on declared failure). Note that if a link is in the MOOS state and an emergency condition automatically forces the link back into service (called preemption), the C6MCB event occurs rather than this event.

C6ACOCOV (gg-mm)

Automatic changeover initiated by the far end. A changeover involves transferring signaling messages from the unavailable link to some other link. For example, in the case of a B-link, the changeover results in messages being routed to the mate link, and in the case of an A-link, the changeover results in messages being routed to a C-link. When the changeover message is received from the far end, the following occurs:

1. The link is removed from service.

2. No new messages are given to the link. New messages are diverted to the mate link or C-link.

3. Messages remaining in the transmit buffers are retrieved and an attempt is made to transmit these messages on some other link.

4. Only synchronization messages are sent to the far end.

5. The link switches VFLs and attempts to synchronize.

6. If acceptable, the link is proven in (from 3 to 15 seconds required) and restored. Messages are routed back (referred to as change back).

Both VFLs are tested alternately until one syncs. If the link cannot change back within 3 minutes (or 6 minutes if a long key exchange is involved), it is declared failed. Refer to the L6ACO_ measurements for more information.

C6ACOER (gg-mm)

Automatic changeover error threshold has been exceeded. The error rate monitor in the LI maintains a "leaky bucket" count of the number of SUs received in error during normal operation and also a linear count of SUs received in error during prove-in. If either count exceeds some threshold, the error is reported to the node. The node then reports this event, and alternate synchronization and changeover messages are sent to the far end (the far end recognizes this as a changeover request). Similar actions to those described for the C6ACOCOV event are taken.
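
For readers unfamiliar with the term, a "leaky bucket" count rises with each errored unit and drains slowly over time, so only a sustained error rate pushes it past the threshold. The sketch below shows the general technique only. The threshold, the leak interval, and the class name are placeholders; the actual values and mechanics of the LI error rate monitor are not given in this document.

```python
class LeakyBucket:
    """Generic leaky bucket error counter (illustrative parameters only)."""

    def __init__(self, threshold, leak_every):
        self.threshold = threshold    # count above which an error is reported
        self.leak_every = leak_every  # signal units received per one-count leak
        self.count = 0
        self.received = 0

    def record(self, in_error):
        """Record one received signal unit.

        Returns True when the count exceeds the threshold.
        """
        if in_error:
            self.count += 1
        self.received += 1
        # After every `leak_every` units, drain one count (never below zero).
        if self.received == self.leak_every:
            self.received = 0
            if self.count > 0:
                self.count -= 1
        return self.count > self.threshold
```

With a threshold of 3, four consecutive errored units trip the monitor, while isolated errors separated by enough good units never do.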

C6BOFX (gg-mm)

Transmit buffer overflow begins (this occurs only for the telephone message transmit buffer). This event indicates that message(s) have been discarded because the buffer is full. The message is discarded and this event is reported on the first attempt to transmit a message with the buffer full. As long as the buffer is full, messages may be discarded. This event is not reported again at least until buffer overload ends (indicated by the C6BOLXE event). This event should be preceded by the C6BOLX event.



C6BOLX (gg-mm)

Transmit buffer overload begins (only the telephone message transmit buffer). The number of signal units in the buffer has reached the threshold for congestion controls to be activated. This event is reported only once when the threshold has been reached and not again at least until the overload ends. When the overload occurs, the node returns selected outgoing messages to their originations. The originators of these messages in turn control their traffic towards the node experiencing buffer overload. This mechanism is called selected return, and consists of the following:

■ Return some direct signaling messages.

■ Discard all IAMs and COTs and return message refusal to the sending office.

■ Send a group signaling congestion message to all offices that send messages on this link.

Every second, the node checks to see if buffer occupancy has dropped to an abatement threshold (see the C6BOLXE event description). When that occurs, the overload has ended. Should the link remain overloaded for one minute, it is declared failed.
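
The onset/abatement behavior described for C6BOLX and C6BOLXE is a form of hysteresis: overload starts at one occupancy level and ends only when occupancy falls to a lower one. The sketch below models it as a once-per-second check. The threshold values, the class name, and the "DECLARED_FAILED" label are placeholders, and checking onset on the same one-second tick is a simplification.

```python
class OverloadMonitor:
    """Transmit buffer overload with separate onset and abatement thresholds."""

    def __init__(self, onset, abatement, fail_after=60):
        self.onset = onset            # occupancy at which overload begins
        self.abatement = abatement    # occupancy at which overload ends
        self.fail_after = fail_after  # seconds of continuous overload before failure
        self.overloaded = False
        self.failed = False
        self.seconds = 0

    def tick(self, occupancy):
        """Call once per second; return an event label or None."""
        if self.failed:
            return None
        if not self.overloaded:
            if occupancy >= self.onset:
                self.overloaded = True
                self.seconds = 0
                return "C6BOLX"       # reported once, when the threshold is reached
            return None
        if occupancy <= self.abatement:
            self.overloaded = False
            return "C6BOLXE"          # overload (and overflow) considered ended
        self.seconds += 1
        if self.seconds >= self.fail_after:
            self.failed = True
            return "DECLARED_FAILED"  # placeholder label, not a CNCE name
        return None
```

Occupancy between the two thresholds produces no event while overloaded, which is what keeps the report from repeating as the buffer level fluctuates.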

C6BOLXE (gg-mm)

Transmit buffer overload ends. This event indicates that the number of signal units in the transmit buffer has dropped to the abatement threshold after an overload. The node checks the buffer occupancy once each second. When occupancy has reached the abatement threshold, selective message return is ended and this event is reported. Both overload and overflow are considered ended when this event occurs.

C6DOC0 (gg-mm)

Broadcast the "remove dynamic overload controls" message. These messages are in response to messages from end offices requesting the application or removal of a particular DOC state. The corresponding C6DOC_ event occurs when the message is received. The request results in a DOCx message being transmitted backwards for all bands that can send messages to the congested office. The messages are sent on each "trigger" band to the far end offices. The request may be received on a CCS7 link if virtual links are assigned. Those far end offices then apply the controls to all bands associated with the trigger bands. All DOCx messages are one signal unit in length. Two minutes after receiving the last message, an end office automatically removes the controls. The DOC0 broadcast is an explicit request for the end office to remove the controls.



C6DOC1 (gg-mm)

Broadcast the dynamic overload control 1 message. The least severe control. DOC1 and DOC2 are progressive controls used when the congested office is only slightly overloaded or is recovering from a failure. They allow CCS messages to be slowly restored to (or removed from) the affected office. For a description of the broadcast mechanism, refer to the C6DOC0 event.

C6DOC2 (gg-mm)

Broadcast the dynamic overload control 2 message. Refer to the C6DOC1 description.

C6DOC3 (gg-mm)

Broadcast dynamic overload control message to a far end office. The most severe control. Caused by an emergency restart due to a received processor outage. This DOC message is broadcast every minute until congestion is relieved. It stops all CCS messages to the congested office. See the C6DOC0 event for a description of the broadcast mechanism.

C6EMR (gg-mm)

Emergency restart (EMR) begins. The specified link failed at the near end causing a complete failure of banded signaling between this office and the other office. This affects banded signaling, but if a particular office contains only one link, other types of signaling may be affected. If another path is available, the signaling load is transferred to the other link and an EMR condition is not triggered. When the last link in the C-link pool or set fails, emergency restarts are triggered on many A, B, and D-links. Refer to the EMR_ measurement descriptions. Since signaling messages cannot be routed over the affected link, alternate link messages may be lost (such as banded messages). Selective return is used so some direct signaling messages are returned to their originators. The end of the EMR condition is indicated by the C6EMRE event.

C6EMRE (gg-mm)

Emergency restart ends. The link restoral causes an automatic status update for the affected link, bands, and routes. This event indicates the end of the EMR condition on the specified link (regardless of what triggered the EMR).

C6EMRPO (gg-mm)

Emergency restart due to processor outage begins. The specified link receives a processor outage message from the far end while its mate is unavailable. This results in DOC3 messages being broadcast to all offices that could send messages to this link. See the C6EMR event for further description.

C6FLDCOL (gg-mm)

Declared link failure due to a 1-minute continuous receive buffer overload. If there is not an EMR, a changeover is initiated. The link is removed from service and is diagnosed.



C6FLDCOV (gg-mm)

Declared link failure due to an automatic changeover initiated by the far end. The changeover lasted more than 3 minutes (or 6 minutes if a long key exchange is involved). Actions are taken as described under the C6FLDCOL event except no diagnostics are attempted and the changeover (the C6ACOCOV event) precedes this event.

C6FLDER (gg-mm)

Declared link failure due to error threshold exceeded. This is caused by an excessive number of received SUs in error. Actions are taken as described under the C6FLDCOV event except the changeover (the C6ACOER event) precedes this event.

C6FLDPCR (gg-mm)

Declared link failure due to continuous (lasting 30 seconds) far end processor congestion. This event occurs only on A-links. Actions are taken as described under the C6FLDCOL event. The C6PCR description (that event precedes this event) shows how a processor congestion is detected.

C6FLDSNT (gg-mm)

Declared link failure due to a sanity check failure. This failure is due to either software or hardware problems causing abnormal node operation. Automatic diagnostics then attempt to determine the problem. Actions are taken as described under the C6FLDCOL event.

C6MCB (gg-mm)

Manual change back from manual changeover. This event occurs either due to manually restoring the link or due to preemption of the MOOS state by an emergency condition. In the latter case, this event may be preceded by a C6EMR_ event on the mate link. Refer to the L6MCO_ measurements for a description of the changeover/change back sequence.



C6MCOF (gg-mm)

Far end manual changeover request has been received. A changeover involves transferring signaling messages from the unavailable link to some other link, usually due to a need for link changes or maintenance. For example, in the case of a B-link, the changeover results in messages being routed to the mate link; in the case of an A-link, the changeover results in messages being routed to a C-link; and in the case of a C-link, it results in messages being load balanced over the other available C-links. The changeover request may be denied if the mate link is out-of-service or the C-link pool is unable to handle the additional load. When the request is received, the following occurs (if the request is accepted):

1. A manual changeover acknowledgment is sent to the far end, and the link is removed from service.

2. No new messages are given to the link. New messages are diverted to the mate link or C-link.

3. Messages remaining in the transmit buffers are retrieved, and an attempt is made to transmit these messages on some other link.

Refer to the L6MCO_ measurements for more information.

C6MCON (gg-mm)

Near end manual changeover due to local maintenance action. The maintenance and routing actions taken when this event occurs are similar to those taken for the C6MCOF event, except that, before diverting messages to the other link, a manual changeover request is sent to the far end (not an acknowledgment). Upon receipt of an acknowledgment from the far end, the link is removed from service and the diversion is done. Refer to the L6MCO_ measurements for more information.

C6PCR (gg-mm)

Far end 1STP processor congestion event begins. This event occurs only on A-links. It indicates that the base call-processing cycle of the congested office exceeded a specified value for three consecutive cycles. The node uses selective message return to limit traffic to the congested office (described under the C6BOLX event). If a congestion message is received at least every 8 to 10 seconds for 30 seconds, the link is declared failed. The event occurs once when the message is first received and not again at least until congestion ends (indicated by the C6PCRE event).

C6PCRE (gg-mm)

End of received processor congestion. If more than 10 seconds elapse between congestion messages, the event is considered ended.



C6POR (gg-mm)

Adjacent processor outage begins (a PRO has been received). This indicates that the far end office is undergoing initialization or is overloaded. The far end LI goes into the processor outage send mode. In this mode, processor outage (PRO) signal units are transmitted in a continuous stream. This end treats the problem as a link failure (causes a changeover). DOC3 is broadcast every 60 seconds on links to connected offices that go into EMR due to the PROs being received on this link. The DOC message continues until synchronism is restored on this link, which is indicated by no more PROs being received. This event occurs once when the PRO is first received, and not again until the outage ends, which is indicated by the C6PORE event. The C6DOC3 event occurs every 60 seconds as shown above.

C6PORE (gg-mm)

Adjacent processor outage ends. This event occurs when the far end stops sending PRO, synchronism is regained, and the link is restored.

C7ALCIF (gg-mm)

Automatic link check (ALC) failure. When a link is declared failed (a C7FLD_ event), the ALC is initiated. If the ALC is not successful within 15 seconds from the link failure, this event occurs.

C7ACB00 (gg-mm)

Change back from a failure that is not a declared failure. This is an automatic change back to a link that previously did an automatic changeover and then was restored. The change back must normally occur within 3 minutes of the changeover. If the LI reports a long key exchange is taking place, this time period is extended to 10 minutes. This event occurs for all automatic change backs exclusive of the C7ACBFLD event. Refer to the L7ACO_ measurements for a description of the changeover/change back sequence. This event is usually preceded by a C7ACO_ event.

C7ACBFLD (gg-mm)

Automatic change back from declared failure. This event indicates that the link was declared failed, has recovered, and traffic has been routed back to the link. This event is preceded by one of the C7FLD_ events (see those descriptions for more information on declared failure). Note that if a link is in the MOOS state and an emergency condition automatically forces the link back into service (called preemption), the C7MCB event occurs rather than this event.



C7ACOCOV (gg-mm)

Automatic changeover initiated by the far end. A changeover involves transferring signaling messages from the unavailable link to other links. These could be any links in the combined link set or C-links. In the case of a C-link failing, the changeover results in messages being load balanced over the other available C-links. The changeover message and the acknowledgment are both sent on some other link in the specified link's set. When the changeover order is received from the far end, this event occurs and either a changeover or emergency changeover is initiated. An emergency changeover is done when the far end indicates that messages were received out of sequence or when the link node is out-of-service.

The following is the changeover sequence:

1. The link is removed from service and no new messages are given to the link node (message handling pauses).

2. A changeover acknowledgment is sent to the far end on some other link in the set. Messages remaining in the transmit and retransmit buffers are retrieved and are transmitted in sequence on other links. An emergency changeover does not attempt the retrieval from the retransmit buffer (if the link node is out-of-service or the link failed due to a near end PRO, no retrieval is done).

3. Message handling resumes with new messages to the other links.

4. Only synchronization messages are sent on this link.

In the case of an automatic changeover, the link changes back when sync is regained. Then it is proven in (from 3 to 15 seconds required) and restored. CCS messages are routed back to the restored link. If the link cannot sync and change back within 3 minutes (or 10 minutes if a long key exchange is involved), it is declared failed.

C7ACOER (gg-mm)

Automatic changeover error threshold has been exceeded. The error rate monitor in the LI has reported excessive signal unit errors. The monitor is described in more detail under the C6ACOER event. Similar actions to those described for the C7ACOCOV event are taken.



C7FLDCOL (gg-mm)

Declared link failure due to a 1-minute continuous receive buffer overload. This event is followed by a changeover (assuming it is not denied due to a blocked path). The link is removed from service and is diagnosed.

C7FLDCOV (gg-mm)

Declared link failure due to an automatic changeover initiated by the far end. The changeover lasted more than 3 minutes (or 10 minutes if a long key exchange is involved). Actions are taken as described under the C7FLDCOL event except no diagnostics are attempted and the changeover (the C7ACOCOV event) precedes this event.

C7FLDER (gg-mm)

Declared link failure due to error threshold exceeded. This is caused by an excessive number of received SUs in error. Actions are taken as described under the C7FLDCOV event except the changeover (the C7ACOER event) precedes this event.

C7FLDSNT (gg-mm)

Declared link failure due to a sanity check failure. This failure is due to either software or hardware problems causing abnormal node operation. Automatic diagnostics attempt to determine the problem. Actions are taken as described under the C7FLDCOL event.

C7LCABM1X (gg-mm)

Transmit buffer level 1 congestion ends. Buffer occupancy has dropped below the threshold for level 1 abatement after transmit buffer congestion. Messages are not being discarded.

C7LCABM2X (gg-mm)

Transmit buffer level 2 congestion ends. Buffer occupancy has dropped below the threshold for level 2 abatement after transmit buffer congestion. The node reverts to level 1 discard.

C7LCABM3X (gg-mm)

Transmit buffer level 3 congestion ends. Buffer occupancy has dropped below the threshold for level 3 abatement after transmit buffer congestion. The node reverts to level 2 discard.




C7LCDIS1X (gg-mm) Transmit buffer level 1 congestion discard begins. Buffer occupancy has reached the threshold for level 1 discard to be initiated. The SS7 discard strategy (for levels 1, 2, or 3) is as follows: The node first checks the priority of a message before transmitting it. The priority is contained in the service information octet field and is compared with the congestion state of the transmit buffer. If the priority is less than the congestion level, the message is removed and a return message may be sent. The return message is sent only if the return indicator in the received message is set. If the message to be transmitted is a unit data type SCCP message, a UDS message is created and returned to the originator. If the priority of the message is equal to or greater than the congestion level, it is transmitted. This event does not occur again at least until buffer occupancy drops below the level 1 abatement threshold (signaled by the C7LCABM1X event).
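The discard strategy described under C7LCDIS1X can be sketched as a single decision function. This is a minimal illustration, not the node firmware: the Message fields and Action names are hypothetical, and returning a UDS message only when the message is a unit data SCCP message with its return indicator set is one reading of the description above.

```python
# Illustrative sketch only; all names here are hypothetical.
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    TRANSMIT = "transmit"
    DISCARD = "discard"
    DISCARD_AND_RETURN_UDS = "discard_and_return_uds"


@dataclass
class Message:
    priority: int            # priority carried in the service information octet
    is_sccp_unitdata: bool   # True for a unit data type SCCP message
    return_indicator: bool   # True if a return message was requested


def discard_decision(msg: Message, congestion_level: int) -> Action:
    """Compare message priority with the transmit buffer congestion level."""
    # Priority equal to or greater than the congestion level: transmit.
    if msg.priority >= congestion_level:
        return Action.TRANSMIT
    # Priority below the congestion level: the message is removed.
    # Assumption: a UDS message goes back to the originator only for a
    # unit data SCCP message whose return indicator is set.
    if msg.is_sccp_unitdata and msg.return_indicator:
        return Action.DISCARD_AND_RETURN_UDS
    return Action.DISCARD
```

For example, a priority 0 unit data message with the return indicator set, offered while the buffer is at congestion level 2, yields `Action.DISCARD_AND_RETURN_UDS`.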

C7LCDIS2X (gg-mm) Transmit buffer level 2 congestion discard begins. Buffer occupancy has reached the threshold for level 2 discard to be initiated. The C7LCDIS1X event describes the discard strategy.

C7LCDIS3X (gg-mm) Transmit buffer level 3 congestion discard begins. Buffer occupancy has reached the threshold for level 3 discard to be initiated. At this point, all messages are being discarded. The C7LCDIS1X event describes the discard strategy.

C7LCON1X (gg-mm) Transmit buffer level 1 congestion onset begins. The congestion onset thresholds (levels 1, 2, or 3) are higher than the corresponding abatement levels but lower than the corresponding discard levels. At each onset level, the node reports the congestion state to the central processor. Network management messages (transfer controlled) are then broadcast to adjacent signaling points to limit messages to the affected node. To avoid further congestion of the transmit buffer, the far end initiates the discard strategy used by nodes at the discard level (described under the C7LCDIS1X event).

If the node remains in the same congestion level (1, 2, or 3) for 60 seconds, it is taken OOS and diagnosed.
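The relationship between the abatement, onset, and discard thresholds for one congestion level can be sketched as a small state tracker. This is a simplified illustration under stated assumptions: the class names and the occupancy values are hypothetical, only a single level is modeled, and the 60-second removal rule is not included.

```python
# Illustrative sketch only; class names and threshold values are hypothetical.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LevelThresholds:
    level: int       # congestion level: 1, 2, or 3
    abatement: int   # occupancy below which congestion at this level ends
    onset: int       # occupancy at which congestion onset is reported
    discard: int     # occupancy at which discard begins

    def __post_init__(self):
        # Onset is higher than abatement but lower than discard.
        if not (self.abatement < self.onset < self.discard):
            raise ValueError("expected abatement < onset < discard")


class LevelMonitor:
    """Tracks one congestion level and names the events it would report."""

    def __init__(self, thresholds: LevelThresholds):
        self.t = thresholds
        self.congested = False    # set at onset, cleared at abatement
        self.discarding = False   # set at discard, cleared at abatement

    def update(self, occupancy: int) -> List[str]:
        events = []
        t = self.t
        if not self.congested and occupancy >= t.onset:
            self.congested = True
            events.append(f"C7LCON{t.level}X")
        if self.congested and not self.discarding and occupancy >= t.discard:
            self.discarding = True
            events.append(f"C7LCDIS{t.level}X")
        # Discard is not reported again until occupancy falls below abatement.
        if self.congested and occupancy < t.abatement:
            self.congested = False
            self.discarding = False
            events.append(f"C7LCABM{t.level}X")
        return events
```

With hypothetical thresholds of 40, 60, and 80, an occupancy sequence of 65, 85, 85, 30 would report onset, then discard, then nothing, then abatement.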

C7LCON2X (gg-mm) Transmit buffer level 2 congestion onset begins. Messages are being discarded according to the level 1 strategy. The node reports the level 2 congestion state to the central processor. Actions are taken as described under the C7LCON1X event.








C7LCON3X (gg-mm) Transmit buffer level 3 congestion onset begins. Messages are being discarded according to the level 2 strategy. The node reports the level 3 congestion state to the central processor. Actions are taken as described under the C7LCON1X event.

C7LSF (linkset) Link set failure begins. When the last available link in the set fails, this event occurs. If the failure of the link set results in failure of the associated combined link set, another C7LSF CNCE message is output with the combined link set identification. The end of this event is signaled by the C7LSFE event. The CLF_ measurements describe the various link set failure scenarios. If this failure causes some destination to become isolated from this office (for example, all signaling paths to a signaling point have failed), this event is accompanied by a C7SPI event.

C7LSFE (linkset) Link set failure ends. When any link in the set restores, this event occurs.

C7MCB (gg-mm) Manual change back from manual changeover. This event occurs either due to manually restoring the link (at the near end or far end) or due to preemption of the MOOS state by an emergency condition. When the link regains sync, a change back declaration is sent to the far end. The link state is changed to OOS and new messages are diverted back to the link. Until all acknowledgments are received, these messages are not transmitted; messages are diverted to other links if the link fails to return to service. Note that this event occurs before the link is made available.

C7MCOF (gg-mm) Far end manual changeover request has been received, usually due to a need for link changes or maintenance. The far end has requested and permission has been granted to initiate a changeover. Either a changeover or emergency changeover is initiated. The sequence is described under the C7ACOCOV event.

C7MCON (gg-mm) Near end manual changeover due to local maintenance action. The changeover could be denied if removing the link from service would cause the far end to become inaccessible. This end requests permission from the far end to initiate a changeover (the far end recognizes a C7MCOF event). If the far end grants permission, either a changeover or emergency changeover is initiated. The sequence is described under the C7ACOCOV event.

C7POR (gg-mm) Adjacent processor outage event begins (the end of this event is signaled by the C7PORE event). Refer to the C6POR description.




C7PORE (gg-mm) Adjacent processor outage event ends. Refer to the C6PORE description.

C7SPI (pointcode) An adjacent signaling point isolation begins due to local failure. A link failed, causing a complete failure of all signaling paths to the indicated destination from this office. This condition is usually accompanied by a C7LSF event. The end is indicated by the C7SPIE event. See the SPI_ measurements for more detail.

C7SPIE (pointcode) Adjacent signaling point isolation ends. Some failed path to the indicated destination has restored due to a local link set recovery. This event indicates that the destination is no longer isolated from this office.

C7SPIPO (pointcode) An adjacent signaling point isolation begins due to a far end processor outage. A link failed due to receiving PROs from the far end, causing a complete failure of all signaling paths to the indicated destination from this office. See also the C7SPI description. The end of this condition is indicated by the C7SPIE event.

C7SSAF (subsystem) Received a subsystem allowed message. Receiving an SSA message indicates that the subsystem (either local or nonlocal) has become allowed. SSA messages sent by the far end are in response to subsystem status test messages. This event (and the C7SSPF event described below) occurs only if both of the following two conditions are met:

- The indicated subsystem is in the same region, and
- It is simplex, or duplex with the mate subsystem prohibited.

C7SSPF (subsystem) Received a subsystem prohibited message. SSP messages sent by the far end are in response to signaling messages destined for the indicated prohibited subsystem. Receiving an SSP message indicates that the subsystem (either local or nonlocal) has become prohibited, causing it to be blocked. The C7SSAF description details certain conditions for the generation of this event.

CPARSFLD (PBX Link) Automatic return to service from a declared failure.

CPALCIF (PBX Link) Automatic link check (ALC) failure on the specified link. When a link is declared failed (the CPFLD or CPFLDNS event), the ALC is initiated. If the ALC is not successful within 15 seconds from the link failure, this event occurs.








CPDSERVF (PBX Link) A SERV message exchange has failed on the specified D-channel link. The SERV message is sent several times and, if no acknowledgment is received (the T321 timer expires), this event occurs. This indicates either a layer 3 protocol problem, a provisioning problem, or a hardware failure other than facility failure. This event occurs when a link attempts to transition to the IS state. Note that since the SERV message exchange is not done for standby links, a standby link could have latent layer 3 problems.

CPDSTBY (PBX Link) A duplex D-channel link has transitioned to the standby state. If the link was in declared failure, this event indicates that it has recovered.

CPDUMOOS (PBX Link) The mate D-channel link fails while the indicated link is in the manual out-of-service (MOOS) state. No switchover occurs until manual action removes the MOOS state. If the link remains in MOOS, the system attempts to recover the mate link normally. This event is a warning of possible service outage.

CPFLD (PBX Link) Declared link failure (this only applies to PBX links with diagnostic). The link state is changed to OOS and the central processor is informed. For a D-channel link failure, this event indicates a signaling path failure; therefore, any associated B-channels are removed from service. There are various reasons for the failure, including:

- Layer 1 protocol down (probably failure of DS0 or DS1, no explicit indication of L1 failure)
- Layer 2 protocol down (protocol exceptions and inability to establish link within 90 sec.)
- DDS code received
- Disconnect message received from far end
- Level 2 error threshold exceeded (usually facility problems).

CPFLDNS (PBX Link) Nonsignaling declared link failure of a mated link. The signaling path is still available on the backup link. The link state is changed to OOS. For the reasons for this event, see the CPFLD description.

CPMOOS (PBX Link) Manual out-of-service (MOOS) begins.

CPMOOSE (PBX Link) Manual out-of-service ends.




CTREDAL (PBX Node) Red alarm declared (near end DS1 facility failure). This is the second most severe trouble condition for a PBX node. This event obstructs sensing of the yellow alarm condition. Note that this means that there may be no explicit clearing of any yellow alarm in progress (normally indicated by the CTYELALC event).

CTREDALC (PBX Node) Red alarm cleared. Any yellow alarm in progress is also cleared.

CTYELAL (PBX Node) Yellow alarm declared.

CTYELALC (PBX Node) Yellow alarm cleared.





Contents

6 Diagnostic User’s Guide

Introduction
Overview
- Diagnostics
- Hardware and Interfaces
- System Maintenance Interfaces
Performing Diagnostics
- Diagnostic Message Structure
- System Diagnostics
Use of DGN Commands
Obtaining the Status of Diagnostics
Node Diagnostic Phase Descriptions
Circuit Pack Trouble Location Guide
Diagnostic Listings
Clearing Troubles Using the Diagnostic Listings
LNs with Unequipped LI Boards - MV Updates
Ring Node Addressing
Automatic Diagnostics and Restorals
Manual (Unit) Diagnostics
Manual Diagnostics Using the 1106 Display Page
Manual Diagnostics Using the DGN Command
Manual Diagnostics Procedure Using the RST Command
CDN-I Fault Isolation
Panic Messages
RAP Diagnostic Firmware
Interactive Diagnostics
- Denied Diagnostic Requests
- Inhibiting Diagnostic Requests
- Diagnostic Aborts and Audits
Aborts
Audits
Audit Failures
Operating System Diagnostics





6


Introduction

This chapter serves as an aid for performing diagnostics on ring nodes (RNs) in a Common Network Interface ring-based office. When diagnostics are performed, the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual should also be used.

Diagnostics are performed both automatically and manually. Automatic diagnostics are performed by automatic ring recovery (ARR). For more information concerning ARR, refer to the “Maintenance Description” section in this manual. Manual diagnostics are performed with the aid of input messages at the Maintenance CRT (MCRT).

Overview

Diagnostics

Diagnostics serve two major purposes. First, diagnostics are run for fault detection and resolution, and are invoked by manual requests. Diagnostics are also invoked by error analysis programs as part of the automatic ring recovery (ARR) of a node that has been removed due to a fault condition. Second, diagnostics are invoked for the purpose of repair verification.


The CNI diagnostics provide diagnostic testing for the system. These diagnostics are performed in a manner similar to those of the 3B21D computer system, but diagnose totally different equipment. For a complete list and details on 3B21D computer diagnostics and UNIX system RTR, refer to the UNIX System RTR 3B20/3B21 Operator’s System Maintenance Manual, 254-303-106.

Hardware and Interfaces

The CNI utilizes the 3B21D computer as the central processing unit. The function of the CNI is to receive messages from incoming applications and route them to an outgoing application. It utilizes a ring communication bus to totally interconnect all application terminations and the 3B21D computer. The ring is a dual bus configuration, and is designed such that faulty circuits can be eliminated from the active system for an indefinite period of time.

The CNI diagnostics primarily test ring node hardware that is contained in the ring node frame/cabinet (RNF/C). The types of RNs are:

- Ring Peripheral Controller Nodes (RPCNs)
- Link Nodes (both LIN-E/SS7 and LI4S/SS7 nodes)
- Direct Link Nodes (DLNs)
- DLN30 nodes
- DLN60 nodes
- CDN-I, CDN-II, CDN-IIx, and CDN-III nodes
- MDL nodes
- Ethernet Interface Node(s) (EINs)

Very large scale integration (VLSI) is used for RNs. The VLSI ring node combines the two RIs and the NP of the ring node into one circuit pack called the IRN.

The CNI utilizes a link interface to provide an interface between the ring and any office in the network, thus the name Common Network Interface. The CNI diagnostics primarily test this link interface.

The following is a description of the ring nodes and their contents.

NOTE: Parentheses () have been used throughout these circuit pack listings to designate that more than one type of circuit pack may exist for a particular ring node, depending upon which generic is being used (although it is preferred that the most current circuit packs be in operation). For more information, and for the features provided by each circuit pack, refer to SD 3F019-02 (Application Schematic for CNI).

Table 6-1. Discontinued Availability CP Listings

MD CIRCUIT PACK    UNIT NAME    UPDATE CIRCUIT PACK
UN122, UN122B      RI0          UN122C
UN123              RI1          UN123B
TN913              NP           TN922
UN303              IRN          UN303B
TN917              LI-E         TN917B
TN1506             LI-E         TN1803

- IRN RPC node
  - Integrated ring node (IRN) UN303() (VLSI)
  - Dual duplex serial bus selector (DDSBS) TN69B
  - 3B computer interface (3BI) TN914

- IRN2 RPC node
  - Integrated ring node (IRN2) UN304()
  - 3B computer interface (3BI) TN914

- IRN link (LIN-E/SS7) node
  - Node processor (NP) TN922
  - Integrated ring node (IRN) UN303() (VLSI)
  - Not encrypted TN916 or encrypted TN917() or memory data link (MDL) TN1317

- IRN link (LI4S/SS7) node
  - Integrated ring node (IRN) UN303() (VLSI)
  - 4-Port Link Interface 0 (LI4 0) TN1316 (LI4S) (the TN1316 has an APA 12A CP, rear mount)

- IRN DLNE node
  - Integrated ring node (IRNB) UN303B (VLSI)
  - 3B computer interface (3BI) TN914
  - Attached processor (AP) TN1630

- IRN2 DLN30 node
  - Integrated ring node (IRN2B) UN304B
  - 3B computer interface (3BI) TN914

- IRN2 DLN60 node
  - TN918
  - TN1803
  - TN1508

- IRN CDN-I node
  - Integrated ring node (IRN) UN303()
  - Node processor interface (NPI) TN1349
  - 3B15 computer line of boards:
    - Central controller cache (CCC) UN237(1) or UN626 for the 16-Mbyte memory board option
    - Central controller support (CCS) UN236(1) or UN625 for the 16-Mbyte memory board option
    - Main store controller (MASC) UN95(1-6) or UN507(1) for the 16-Mbyte memory board option
    - Main store array (MASA) TN56(1-48) or TN1398(1-8) for the 16-Mbyte memory board option
    - Power control interface and display (PCID) TN1128

- IRN2 CDN-II node
  - Attached processor (AP) TN1630B

- IRN2 CDN-IIx node
  - Attached processor (AP) TN1720x

NOTE: The x represents boards lettered TN1720A through TN1720H depending upon the amount of memory installed. Each board has 32 Mbytes of memory.

- IRN2 CDN-III node
  - Integrated ring node (IRN2B) UN304
  - TN918
  - TN1803
  - TN1508

- IRN MDL node (includes CSN, DSN, and ICN)
  - Integrated ring node (IRN) UN303()/UN304
  - MDL TN1640

- IRN2 EIN node
  - Integrated Ring Node 2 (IRN2) UN304B
  - TN4016
  - Paddleboard, 9822EB
  - ED3F064-37 G80 cable

An RPCN is a node where packetized information is removed from the ring and transferred to the 3B21D computer for processing, or reenters the ring after processing. It is the node on the ring where packetized information enters or exits a transmission facility. Both the RPCN and the DLNs are located in the RNF/C. DLNs function like RPCNs but have DMA capability. They contain the same circuit packs as an RPCN plus an attached processor (AP). CDN-I nodes are also located in the RNF/C. They are basically a VLSI ring node with a modified 3B15 computer as the user apparatus circuit.

The Underwriters Laboratories (UL) listed RNF/C provides ring bus connections between the RNs, access to analog and digital facilities, and access to the 3B21D computer via the RPCNs.

System Maintenance Interfaces

Local maintenance access and status information for the 3B21D computer is obtained through video terminals and receive-only printers (ROPs). The Maintenance Terminal (MCRT) provides the primary interface and communications for system control and display (C&D), input and output messages, and the 3B21D computer emergency action interface (EAI) control and display. Inputs entered at the MCRT are monitored via the CTS.

The ROP provides hard copies of the MCRT input and output messages, report status information, fault conditions, audits, and diagnostic results.

If remote maintenance is provided, it has the same terminal access and terminal capabilities as the on-site user. Because both the remote and local users have simultaneous access to the 3B21D computer, it is advised that diagnostic input requests be coordinated through the on-site MCRT user.

Performing Diagnostics

When performing manual RN diagnostics, input and output messages are entered and interpreted from the maintenance terminal. For this reason, basic terminal familiarization and operating knowledge is required. An understanding of input messages and knowledge of the message data fields and formats are also important.

UNIX system Real Time Reliable (RTR) or UNIX system RTR Very Large Main Memory (VLMM) provides assistance to users for entering input messages. It can be used to complete or correct errors caused by the user. Invalid values are rejected and accompanied by an appropriate error acknowledgment. Further help can be obtained by entering a question mark (?). A prompting mode can be used to lead the user through the input message. When a complete input message has been constructed, the user may either execute it or cancel it. The help session is then completed; that is, help is provided for only one input message at a time.

Diagnostic Message Structure

Listed within the following paragraphs are basic guidelines for understanding the PDS input message format. For a detailed explanation of the message structure, refer to the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.

An input message can contain up to 96 characters, separated by colons (:) into fields. The fields of an input message are identified as the action, identification, and data fields, with each field being variable in length. These fields are briefly explained below:

- Action Field: An action verb (keyword) identifies the action the system should perform. This is a verb such as diagnose (DGN), inhibit (INH), remove (RMV), or restore (RST).
- Identification Field: Consists of one, two, or three fields called subfields. These subfields are separated by semicolons (;), with each containing one or more keywords. The identification field aids in structuring the message to permit a complete specification, or provides other information further identifying the object of the action.
- Data Field: This field is either null or composed of additional variable information pertaining to the message. This information is in keyword format with keywords separated by commas.
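The field layout described above can be sketched as a small parser. This is an illustration of the separators only (colons between fields, semicolons between identification subfields, commas between keywords), not the actual message processor; the function names and the node identifier used in the example are hypothetical.

```python
# Illustrative sketch only; names and the sample node "LN32 4" are hypothetical.
MAX_MESSAGE_LENGTH = 96  # an input message can contain 96 characters


def split_keywords(text: str) -> list:
    """Split a comma-separated keyword string, dropping empty entries."""
    return [kw.strip() for kw in text.split(",") if kw.strip()]


def parse_input_message(message: str) -> dict:
    """Split an input message into action, identification, and data fields."""
    if len(message) > MAX_MESSAGE_LENGTH:
        raise ValueError("input message exceeds 96 characters")
    fields = message.split(":")          # fields are separated by colons
    if len(fields) > 3:
        raise ValueError("expected at most action, identification, and data")
    action = fields[0].strip()
    if not action:
        raise ValueError("missing action field")
    identification = fields[1] if len(fields) > 1 else ""
    data = fields[2] if len(fields) > 2 else ""
    # Identification subfields are separated by semicolons.
    subfields = [split_keywords(s) for s in identification.split(";") if s.strip()]
    return {
        "action": action,
        "identification": subfields,
        "data": split_keywords(data),    # a null data field gives an empty list
    }
```

For example, `parse_input_message("DGN:LN32 4;RPT 3,RAW:PH 10,TLP")` separates the action `DGN`, the identification subfields `LN32 4` and `RPT 3,RAW`, and the data keywords `PH 10` and `TLP`.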

A general format for input messages and some output messages is shown in Figure 6-1.

ACTION    IDENTIFICATION                               DATA
          subfield    subfield    subfield
(verb)  : (object)  ; (object)  ; (action option)  :

Figure 6-1. General Format for Input/Output Messages

A typical diagnostic input message varies in length and field identifiers. The sample message below shows field separation and identification. Each field is separated by a colon (:), and square brackets [ ] indicate optional information.

DGN:NODExx y [;[RPT n][,RAW][,UCL]][:PH n [,TLP] | :TLP]

where:

DGN = the action field
NODExx y [;[RPT n][,RAW][,UCL]] = the identification field (NODE = LN or RPCN)
[:PH n [,TLP] | :TLP] = the data field.


System Diagnostics

Diagnostics may be performed manually. However, when the system detects a fault, diagnostics are performed automatically (ARR). The diagnostics in this section cover only the manual portions of system diagnostics, and present information to familiarize the user with the various diagnostic (DGN) input commands, phase descriptions, message interpretation, and other diagnostic information. For more information concerning ARR, refer to the “Maintenance Description” section in this manual.

DLNs and CDN-I use the same commands as LNs for diagnostics.

Use of DGN Commands

The manual command, DGN, is used to perform diagnostics on ring nodes. The DGN command has several formats, and some are detailed in the table “DGN Message Input Variations.”

The term nodexx y, used with the DGN commands in the following table and throughout this document, identifies any node and its group and member number. Insert the appropriate node type before using commands from this manual. DLNs and CDN-I are treated like LNs for diagnostics.

Table 6-2. DGN Message Input Variations

COMMAND                 FUNCTION
DGN:nodexx y            Runs all automatic phases on nodexx y.
DGN:nodexx y:PH a       Runs only the specified phase (a) on nodexx y.
DGN:nodexx y:PH a-b     Runs all automatic phases within the specified range (a through b) on nodexx y.
DGN:nodexx y;RPT n      Runs all automatic phases on nodexx y and repeats execution "n" times, where n ≤ 255.
DGN:nodexx y;RAW        Runs all automatic phases on nodexx y and prints the diagnostic results of every phase at the MROP.
DGN:nodexx y;UCL        Runs all automatic phases on nodexx y. Early terminations built into data tables are ignored.
DGN:nodexx y:TLP        Runs all automatic phases on nodexx y and executes the trouble locating process at the conclusion of the diagnostics. This process prints at the MROP and MCRT a list of possible faulty equipment.
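The variations in Table 6-2 can be composed programmatically. The following is a minimal sketch that only assembles the command text; the function name, its parameters, and the sample node identifiers are hypothetical, and the lower bound of 1 on the repeat count is an assumption (the table states only that n is at most 255).

```python
# Illustrative sketch only; build_dgn and its parameters are hypothetical.
def build_dgn(node_type, group, member, *, repeat=None, raw=False, ucl=False,
              phase=None, phase_end=None, tlp=False):
    """Assemble a DGN input message of the form
    DGN:NODExx y [;[RPT n][,RAW][,UCL]][:PH n [,TLP] | :TLP]
    """
    options = []                       # action options, after the semicolon
    if repeat is not None:
        if not 1 <= repeat <= 255:     # assumption: at least 1; table gives n <= 255
            raise ValueError("RPT count must be between 1 and 255")
        options.append(f"RPT {repeat}")
    if raw:
        options.append("RAW")
    if ucl:
        options.append("UCL")

    data = []                          # data field keywords, after the colon
    if phase is not None:
        data.append(f"PH {phase}" if phase_end is None else f"PH {phase}-{phase_end}")
    elif phase_end is not None:
        raise ValueError("phase_end given without phase")
    if tlp:
        data.append("TLP")

    message = f"DGN:{node_type}{group} {member}"
    if options:
        message += ";" + ",".join(options)
    if data:
        message += ":" + ",".join(data)
    if len(message) > 96:              # input messages are limited to 96 characters
        raise ValueError("input message exceeds 96 characters")
    return message
```

For example, `build_dgn("RPCN", "00", 0, repeat=3, raw=True)` produces `DGN:RPCN00 0;RPT 3,RAW`. Passing the group as a string leaves any zero padding to the caller.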






Obtaining the Status of Diagnostics

When performing ring node diagnostics, it may be necessary to obtain visual status of the system, the ring itself, or the status of a particular node. One way to obtain a status report is with the OP input message. Listed in the “OP:RING Input Message Variations” table are formats for most of the OP commands that produce status reports. If other information or formats pertaining to the OP command are desired, refer to the 401-610-055 Input Message Manual.

Another means of obtaining a status report of the system is by calling up the 1105 or 1106 display page from the MCRT. See the “Trouble Indicators, Error Analysis, and Display Pages” section in this manual.

Node Diagnostic Phase Descriptions

The diagnostic routines for ring nodes are broken down into phases. These phases are described in the “Diagnostic Phases” tables for each type of node. Phases are arranged to test functionally related groups of hardware. Each phase may test all or part of the hardware on a single CP, or several CPs. Also, each ring node is diagnosed by its own set of diagnostic phases. Certain hardware components, such as the NP, are used by every type of ring node. Therefore, the phases that correspond to these hardware components are also used by every type of ring node. DLNs use the IRN LN phases plus the DLN phases. The CDNs use the IRN LN phases plus the CDN phases.

Table 6-3. OP:RING Input Message Variations

INPUT MESSAGE              FUNCTION
OP:RING,nodexx y           Provides status information for the specified node (RPCN or LN).
OP:RING,GRP xx             Provides status information for all nodes on a specified frame/cabinet (GRP xx).
OP:RING or OP:RING;SUM     Provides summary information for the ring.
OP:RING;DETD               Provides detailed status of the ring.
OP:OOS                     Provides status information for all equipment which is out-of-service.
OP:RING,nodexx y;GEN       Requests generic information for the specified node (RPCN or LN).


Table 6-4. IRN and IRN2 RPCN Node Diagnostic Phases

PHASE           PHASE DESCRIPTION
01              Tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0. Phase 1 also tests that any interframe buffers and all IRN boards in the isolated segment are equipped in accordance with ECD data, and that any interframe buffers in the isolated segment exhibit the proper data storage capacity.
02              Tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1. Phase 2 also tests that any interframe buffers and all IRN boards in the isolated segment are equipped in accordance with ECD data, and that any interframe buffers in the isolated segment exhibit the proper data storage capacity.
10              Tests the interface between the Dual Serial Channel (DSCH) and the DDSBS.
11              Tests the interface between the DDSBS and the 3BI.
12              Verifies that RAC0 can detect bad parity in a ring message.
14              Runs off-line CU to DDSBS tests (Demand phase only).
20              Tests the NP RAM memory, NP parity checker, and generator circuitry.
21 (IRN only)   Tests everything but the memory in the node-processor function. THIS PHASE IS NOT VALID FOR IRN2.
30              Tests part of both RAC circuits, and the RAC to the NP interface. Partially tests the interface between both RACs and the ring bus.








Table 6-5. IRN LN (LIN-E/SS7) Node Diagnostic Phases

PHASE   PHASE DESCRIPTION
01      Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.
02      Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.
10      Tests part of both RACs, the RAC to the NP interface, and the interface between both RACs and the ring bus. Checks the capacity of the interframe buffers associated with the node under test.
20      Tests the NP RAM memory, NP parity checker and generator circuitry.
21      Tests the NP programmable master and slave interrupt controllers and associated circuitry. It also tests the NP programmable interval timer circuitry.
39      Verifies the ability of the node to read, write and propagate a maximum-length long message (demand only phases for transition load).
40      Tests hardware in the LI board or the LI-NP interface.
41      Tests the sanity of the microprocessor and the ROM.
47      Tests the 2.4 and 4.8 data service units, along with their respective VFLA or DSA units. CCS7 will ATP by default.
48      Ensures that the firmware and the hardware on the LI board will function as a whole.


Table 6-6. IRN LN (LI4S/SS7) Node Diagnostic Phases

PHASE           PHASE DESCRIPTION
01              Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.
02              Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.
10              Tests part of both RACs, the RAC to the NP interface, and the interface between both RACs and the ring bus. Checks the capacity of the interframe buffers associated with the node under test.
21 (IRN only)   Tests the NP programmable master and slave interrupt controllers and associated circuitry. It also tests the NP programmable interval timer circuitry. THIS PHASE IS NOT VALID FOR IRN2.
39              Verifies the ability of the node to read, write and propagate a maximum-length long message (demand only phases for transition load).
50              Tests the LI4 0 local RAM and the Dual Port RAM from the Node Processor. The LI4 is held reset.
51              Tests the NP-LI4 0 interface and DPRAM from the NP view while the microprocessor on the Link Interface board is running. This phase is downloaded to the LI4 0 via the NP.
52              Tests the 8086 microprocessor on the LI4 0 board. A subset of the instruction set of the 8086 is exercised to verify that the microprocessor operates properly. This phase is downloaded to the LI4 0 via the NP.
53              Tests the DPRAM and the parity check circuit. This phase is downloaded to the LI4 0 RAM via the NP.
54              Tests the Programmable Interrupt Controllers and the Programmable Interval Timers. This phase is downloaded to the LI4 0 RAM via the NP.
55              Tests the DMA, Serial Communications Chip (SCC), part of the Programmable Interrupt Controller, timers, and the formatting chips of LI4 0 when the LI4D is tested (TN1315).
56              No tests are run; ATPs are by default. If TLP is run, the APA13 and the DSA (Z2556L1A/2) are noted but no tests are run. Thus, when link maintenance is performed, this equipment must be taken into consideration.



Table 6-7. IRN DLNE Node Diagnostic Phases

PHASE   PHASE DESCRIPTION
01      Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.
02      Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.
10      Tests part of both RACs, the RAC to the NP interface, and the interface between both RACs and the ring bus. Checks the capacity of the interframe buffers associated with node under test.
12      Verifies that RAC0 can detect bad parity in a ring message.
21      Tests the NP programmable master and slave interrupt controllers and associated circuitry. It also tests the NP programmable interval timer circuitry.
30      Tests the interface between the DSCH and the DDSBS.
31      Tests the interface between the DDSBS and the 3BI.
32      Tests the ability of NP to go insane and set the “Interrupt Request Flag” when the 3BI has an error.
33      Tests the interface between the 3BI and the NP.
34      Runs off-line CU to DDSBS tests. (Demand phase only.)
35      Cooperates with the 3B21D driver to test the DMA capability via the 3BI.
40      Tests the hardware in the LI board or the LI-NP interface.
41      Tests the sanity of the microprocessor and the ROM.
42      Tests the interface between DMA and 3BI.

Table 6-8. IRN2 DLN30 Node Diagnostic Phases (Page 1 of 2)


01* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

02* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

10* Tests part of both RACs, the RAC to the IRN2 interface, and the interface between both RACs and the ring bus.

12* Verifies that RAC0 can detect bad parity in a ring message.

20* Tests the IRN2 RAM memory, IRN2 parity checker and generator circuitry.

30 Tests the interface between the DSCH and the DDSBS.

31 Tests the interface between the DDSBS and the 3BI.

32 Tests the ability of NP to go insane and set the “Interrupt Request Flag” when the 3BI has an error.

33 Tests the interface between the 3BI and the NP.

34 Runs off-line CU to DDSBS tests. (Demand phase only.)

35 Cooperates with the 3B21D driver to test the DMA capability via the 3BI.

40* Tests the shared static memory in the AP30 from the IRN2 side.


41* Tests the shared static memory from the AP30 side, the local parity error snapshot register, and the main 16 Megabytes of DRAM on the AP30.

42* Tests the DMA capability via the 3BI. The DMA is from the 3B21D to/from the AP Dual Port Memory (DPM).

43† Tests the 4 D-channel data links on the AP30.

* Automatic

† Demand-Only

Table 6-8. IRN2 DLN30 Node Diagnostic Phases (Page 2 of 2)


Table 6-9. IRN2 DLN60 Node Diagnostic Phases


01* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

02 Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

10 Tests part of both RACs, the RAC to the IRN2 interface, and the interface between both RACs and the ring bus.

20 Tests the IRN2 RAM memory, IRN2 parity checker and generator circuitry.

40 Tests the shared static memory in the AP60 from the IRN2 side.

41 Tests the shared static memory from the AP60 side, the local parity error snapshot register, and the main 32 Megabytes of DRAM on the AP60.

* Demand-only


Table 6-10. IRN CDN-I Diagnostic Phases (Page 1 of 2)


01 Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

02 Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

10 Tests part of both RACs, the RAC to the NP interface, and the interface between both RACs and the ring bus. Checks the capacity of the interframe buffers associated with node under test.

20 Tests the NP RAM memory, NP parity checker and generator circuitry.

21 Tests the NP programmable master and slave interrupt controllers and associated circuitry. It also tests the NP programmable interval timer circuitry.

40 Tests the NPI from the IRN side.

42 Tests the CCS board.

43 Tests the MASC 0 memory group.

43 (16 meg) Tests the MASC 16 memory group.

44 Tests the CCC board.

45 Tests the NPI from the RAP side.


53 Runs a comprehensive end-to-end test.

54 (16 meg) Tests the MASA 0.

55 (16 meg) Tests the MASA 1.

56* (16 meg) Tests the MASA 2.






* Demand-only

Table 6-10. IRN CDN-I Diagnostic Phases (Page 2 of 2)



Table 6-11. IRN2 CDN-II/CDN-IIx Diagnostic Phases (Page 1 of 2)


01* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

02* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

10* Tests part of both RACs, the RAC to the IRN2 interface, and the interface between both RACs and the ring bus.

20* Tests the IRN2 RAM memory, IRN2 parity checker and generator circuitry.

40* Tests the shared static memory in the AP30’ from the IRN2 side.

41* Tests the shared static memory from the AP30’ side, the local parity error snapshot register, and the main 16 Megabytes of DRAM on the AP30’.

43 Tests the 4 D-channel data links on the AP30’.

44 Tests the overall functionality of the mezzanine memory.

45 For CDN-II, tests the 1st 32 Mbytes of the mezzanine memory. For CDN-IIx, tests the 1st 32-Mbyte block of the mezzanine.

46 For CDN-II, tests the 2nd 32 Mbytes of the mezzanine memory. For CDN-IIx, tests the 2nd 32-Mbyte block of the mezzanine.

47 For CDN-IIx only, tests the 3rd 32-Mbyte block of the mezzanine.

NOTE: For APX6.1 prior to Software Update that includes diagnostics for CDN-IIx, Phases 43 and 45 through 52 are demand-only phases; Phase 44 is an automatic phase.

For APX6.1 with the Software Update that includes diagnostics for CDN-IIx and for APX7.0, Phase 43 does not apply; and Phases 44 through 52 are automatic phases.

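48 For CDN-IIx only, tests the 4th 32-Mbyte block of the mezzanine.

49 For CDN-IIx only, tests the 5th 32-Mbyte block of the mezzanine.

50 For CDN-IIx only, tests the 6th 32-Mbyte block of the mezzanine.

51 For CDN-IIx only, tests the 7th 32-Mbyte block of the mezzanine.

52 For CDN-IIx only, tests the 8th 32-Mbyte block of the mezzanine.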

* Automatic.

Table 6-11. IRN2 CDN-II/CDN-IIx Diagnostic Phases (Page 2 of 2)



Table 6-12. IRN2 CDN-III Diagnostic Phases


01 Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

02 Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

10 Tests part of both RACs, the RAC to the IRN2 interface, and the interface between both RACs and the ring bus.

20 Tests the IRN2 RAM memory, IRN2 parity checker and generator circuitry.

40 Tests the shared static memory in the AP60 from the IRN2 side.

41 Tests the shared static memory from the AP60 side, the local parity error snapshot register, and the main 32 Megabytes of DRAM on the AP60.

44 Tests the database memory control circuits.

45* Tests the 1st 128 Mbytes of the AP60 0.5 Gbyte database memory array.

46* Tests the 2nd 128 Mbytes of the AP60 0.5 Gbyte database memory array.

47* Tests the 3rd 128 Mbytes of the AP60 0.5 Gbyte database memory array.

48* Tests the 4th 128 Mbytes of the AP60 0.5 Gbyte database memory array.

* Demand-only.

Table 6-13. IRN2 EIN Node Diagnostic Phases


01* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

02* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

10* Tests part of both RACs, the RAC to the IRN2 interface, and the interface between both RACs and the ring bus.

20* Tests the IRN2 RAM memory, IRN2 parity checker and generator circuitry.

40* Tests the shared static memory in the AP30’ from the IRN2 side.

41* Tests the shared static memory from the AP30’ side, the local parity error snapshot register, and the main 16 Megabytes of DRAM on the AP30’.

43 Tests the 4 D-channel data links on the AP30’.

44 Tests the overall functionality of the mezzanine memory.

45 For CDN-II, tests the 1st 32 Mbytes of the mezzanine memory. For CDN-IIx, tests the 1st 32-Mbyte block of the mezzanine.

46 For CDN-II, tests the 2nd 32 Mbytes of the mezzanine memory. For CDN-IIx, tests the 2nd 32-Mbyte block of the mezzanine.

47 For CDN-IIx only, tests the 3rd 32-Mbyte block of the mezzanine.

* Automatic.


* Automatic

Circuit Pack Trouble Location Guide

On the following pages are checklists of probable or suspected faulty circuit packs, to be used when a diagnostic phase has failed for a particular ring node. These listings are ordered from the most to the least probable cause of failure. When diagnosing ring nodes, if the diagnostic result returned is some-tests-failed (STF), refer to the “Trouble Location CP List” tables for the location of the faulty or suspected faulty CP(s). The TLP option delivers the same information as these tables and can also be used to identify faulty or suspected faulty CPs. The TLP output is valid only for the first failing phase and only when all phases are run.
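The lookup described above can be sketched in a few lines of Python. This is only an illustration, not a tool that ships with the ECP: the dictionary layout and the function names are assumptions, and the data is a small subset copied from Table 6-16 (IRN and IRN2 RPC). A string value stands for a "Same as Phase NN" row.

```python
# Illustrative subset of Table 6-16 (IRN and IRN2 RPC). Each phase maps to its
# suspected circuit packs, most probable first. A string value stands for a
# "Same as Phase NN" row. The layout is an assumption made for this sketch.
RPC_TROUBLE_LOCATION = {
    "01": ["UN303()/UN304B", "TN915/TN918", "TN1508/TN1803", "Ring Bus Cable"],
    "02": "01",
    "32": ["UN303()/UN304B", "TN915/TN918", "TN1508/TN1803"],
    "33": "32",
}


def first_failing_phase(results):
    """Return the first phase whose result is STF, or None if none failed.

    `results` is a list of (phase, result) pairs in the order the phases ran.
    """
    for phase, result in results:
        if result == "STF":
            return phase
    return None


def suspected_packs(table, phase):
    """Return the ordered suspect list for a phase, following 'Same as' rows."""
    entry = table[phase]
    while isinstance(entry, str):
        entry = table[entry]
    return list(entry)


# Hypothetical run in which phases 02 and 32 both report STF.
results = [("01", "ATP"), ("02", "STF"), ("32", "STF")]
phase = first_failing_phase(results)
print(phase, suspected_packs(RPC_TROUBLE_LOCATION, phase))
```

Only the first failing phase is looked up, which mirrors the rule that the TLP output is valid only for the first failing phase.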

Table 6-14. IRN MDL (SCN, DSN, ICN) Diagnostic Phases


01* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC0. Phase 1 also tests that a message can be relayed from the BISO node to the EISO node via the isolated segment over ring 0, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

02* Tests that each node in the isolated segment is able to set and clear its data selector via hardware commands at RAC1. Phase 2 also tests that a message can be relayed from the EISO node to the BISO node via the isolated segment over ring 1, and that any interframe buffers in the isolated segment are equipped in accordance with ECD data and exhibit the proper data storage capacity.

10* Tests part of both RACs, the RAC to the NP interface, and the interface between both RACs and the ring bus. Checks the capacity of the interframe buffers associated with node under test.

20* Tests the IRN2 RAM memory, IRN2 parity checker and generator circuitry.

40* Requests download of diagnostic driver code to the IRN2 and initiates its execution to diagnose the Ethernet interface hardware. Testing ends at the loopback relay on the ELI circuit pack, CP TN4016.

The TLP capability has been enhanced to provide more extensive on-line interpretation of the isolated segment diagnostic failure (phases 1 and 2). This assists in the direct localization of ring faults to nodes (or circuit packs) within a multinode isolated segment other than the node being diagnosed.

Visual indicators in the form of LEDs located on the CPs can also be used to locate faulty CPs. More information on visual indicators appears elsewhere in this manual.

NOTE: Parentheses () have been used throughout these circuit pack listings to designate that more than one type of circuit pack may exist for a particular ring node, depending upon which generic is being used (although it is preferred that the most current circuit packs be in operation). For the features provided by each circuit pack, refer to SD 3F019-02, the Application Schematic for CNI.

Table 6-15. Discontinued Availability CP Listings

MD CIRCUIT PACK    UNIT NAME    UPDATED CIRCUIT PACK

UN122, UN122B      RI0          UN122C

UN123              RI1          UN123B

TN913              NP           TN922

UN303              IRN          UN303B

TN917              LI-E         TN917B

TN1506             LI-E         TN1803

Table 6-16. IRN and IRN2 RPC Trouble Location CP List (Page 1 of 2)

DIAGNOSTIC PHASE    TABLE    PROBABLE/SUSPECTED FAULTY PACK    UNIT NAME

01 UN303()/UN304B IRN/IRN2

TN915/TN918 IFB

TN1508/TN1803 IFB

Ring Bus Cable RNF/C

02 rpc02.I Same as Phase 01 Same as Phase 01


10 rpc10.I TN69B DDSBS

KBN15 (3B21D) DSCH

11 rpc11.I TN914 3BI

TN69B DDSBS


UN303()/UN304B IRN/IRN2



14 (Demand-only phase) rpc14.I TN69B DDSBS

KBN15 (3B21D) Off-Line DSCH

20 rpci20.I UN303()/UN304B IRN/IRN2

21 rpci21.I UN303()/UN304B IRN

30 rpci30.I UN303()/UN304B IRN/IRN2

32 rpc32.I UN303()/UN304B IRN/IRN2

TN915/TN918 IFB

TN1508/TN1803 IFB

33 rpc33.I Same as Phase 32 Same as Phase 32

Table 6-16. IRN and IRN2 RPC Trouble Location CP List (Page 2 of 2)



Table 6-17. IRN LN (LIN-E/SS7) Trouble Location CP List (Page 1 of 2)

DIAGNOSTIC PHASE    TABLE    PROBABLE/SUSPECTED FAULTY PACK    UNIT NAME

01 iuin01.I UN303() IRN

TN915/TN918 IFB

TN1506/TN1508/TN1509 IFB


02 iun02.I Same as Phase 01 Same as Phase 01

10 iuni10.I UN303() IRN

12 iun12.I UN303() IRN

TN915/TN918 IFB

TN1506/TN1508/TN1509 IFB

13 iun13.I Same as Phase 12 Same as Phase 12

20 iuni20.I UN303() IRN

21 iiuni21.I UN303() IRN

39 iun39.I UN303() IRN

40 cBph0.40.I TN916 LI-NE

TN917() LI-E

UN303() IRN

41 cBph1.41.I Same as Phase 40 Same as Phase 40


47 cBph7.47.I* TN916 LI-NE

TN917() LI-E

TN919 VFLA

2024-A, 2048-A Data Sets

TN922 NP

LINK Cabling

48 cBph8.48.I TN919 (CCS6) VFLA

2024-A, 2048-A (CCS6)

TF9 (CCS7) Facility Int.

Z2466L1A/2† (CCS7) Data Sets

TN916 LI-NE

TN917() LI-E

TN922 NP

Link Cabling

* Phase 47 - CCS7 will ATP by default.

† Phase 48 - test 47 will fail if Z24556L1A/2 is in Local Loop (LL).

Table 6-18. IRN LN (LI4S/SS7) Trouble Location CP List (Page 1 of 2)

DIAGNOSTIC PHASE    TABLE    PROBABLE/SUSPECTED FAULTY PACK    UNIT NAME

01 iun01.l UN303()/UN304B IRN/IRN2

TN915/TN918 IFB

TN1506/TN1508/TN1509 IFB


Table 6-17. IRN LN (LIN-E/SS7) Trouble Location CP List (Page 2 of 2)



02 iun02.l Same as Phase 01 Same as Phase 01

10 iuni10.l UN303()/UN304B IRN/IRN2


TN915/TN918 IFB

TN1508/TN1803 IFB

13 iun13.l Same as Phase 12 Same as Phase 12


21 iuni21.l UN303() IRN

50 LI4ph0.50.l UN303() IRN

TN1316 LI4S 0

51 LI4ph1.5i1.l Same as Phase 50 Same as Phase 50

52 LI4ph2.52.l TN1316 LI4S 0

53 LI4ph3.53.l TN1316 LI4S 0

54 LI4ph4.54.l TN1316 LI4S 0

55 LI4ph5.55. TN1316 LI4S 0

56 LI4ph6.56.l ATPs by default (the APA13 and the DSA (Z2556L1A/2) are noted but no tests are run).

Table 6-18. IRN LN (LI4S/SS7) Trouble Location CP List (Page 2 of 2)




Table 6-19. IRN DLNE Trouble Location CP List (Page 1 of 2)

DIAGNOSTIC PHASE    TABLE    PROBABLE/SUSPECTED FAULTY PACK    UNIT NAME


TN915/TN918 IFB

TN1508/TN1803 IFB



Same asPhase 01

10 iuni10. UN303()/UN304B IRN/IRN2


TN915/TN918 IFB

TN1508/TN1803 IFB

13 iun13.l Same as Phase 12 Same as Phase 12



30 iun30.l TN69B DDSBS

KBN15 (3B21D) DSCH

31 iun31.l TN914 3BI

TN69B DDSBS





34 (Demand-only phase) iun34.I TN69B DDSBS

KBN15 (3B21D) Off-line DSCH

35 iun35.I Same as Phase 33 Same as Phase 33

40 ap68.40.I TN1340 (2 Meg) AP

TN1641 (8 Meg) AP

TN1630 (4ESS Only) LI4E

41 ap68.41.I TN1340 (2 Meg) AP

TN1641 (8 Meg) AP


42 ap68.42.I UN1340 (2 Meg) IRN

TN1340 (2 Meg) AP

TN1641 (8 Meg) AP


Table 6-19. IRN DLNE Trouble Location CP List (Page 2 of 2)




Table 6-20. IRN2 DLN30 Trouble Location CP List (Page 1 of 2)



01* iun01.l UN304 IRN2

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8


02* iun02.l Same as Phase 01 Same as Phase 01

10* iuni10. UN304 IRN2


TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8

13* iun13.l Same as Phase 12 Same as Phase 12

20* iuni20.l UN303()/UN304B IRN/IRN2

30 iun30.l TN69B DDSBS

KBN15 (3B21D) DSCH


TN69B DDSBS


UN304 IRN2


UN304 IRN2

34 (Demand-only phase) iun34.I TN69B DDSBS

KBN15 (3B21D) Off-line DSCH

35 iun35.I Same as Phase 33 Same as Phase 33

40* ap68.40.I TN1630B AP30

41* ap60.41.I TN1630B AP30

42* ap68.42.I TN1630B AP30

43† Ii4e.43.I TN1630B AP30

* Automatic

† Demand-Only




01 iun01.l UN304 IRN2

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8



Same asPhase 01

10 iuni10. UN304B IRN2





12 iun12.l UN304B IRN2

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8


Same asPhase 12

20 iuni20.l UN304B IRN2

40 ap68.40.I TN2522 AP60

41 ap68.41.I TN2522 AP60

Table 6-22. IRN CDN-I Manual Trouble Location CP List (Page 1 of 3)



01 iun01.l UN303 IRN

UN303B IRNB

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8


02 iun02.I Same as Phase 01 Same as Phase 01

10 iuni10.I UN303 IRN

UN303B IRNB




12 iun12.I UN303 IRN

UN303B IRNB

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8

13 iun13.I Same as Phase 12 Same as Phase 12

20 iuni20.I UN303 IRN

UN303B IRNB

21 iuni21.I Same as Phase 20 Same as Phase 20

40 irap40.I TN1349 NPI

42 irap42.I UN236 CCS

UN625 CCS16

43 irap43.I TN56 MASA (0-7)

UN95 MASC 0

43 (16meg) irap43_16.I TN1398 MASA16 (0-7)

UN507 MASC16

44 irap44.I UN237 CCC

UN626 CCC16

45 irap45.I TN1349 NPI

46 irap46.I TN56 MASA (0-7)

UN95/UN295 MASC1

47 irap47.I Same as Phase 46 Same as Phase 46

48 irap48.I Same as Phase 46 Same as Phase 46





49 irap49.I Same as Phase 46 Same as Phase 46


Same asPhase 46


Same asPhase 46


Same asPhase 46

53 irap53.I all all

54 irap54.I TN1398 MASA16 (0)

55 irap55.I TN1398 MASA16 (1)

56* irap56.I TN1398 MASA16 (2)

57* irap57.I TN1398 MASA16 (3)

58* irap58.I TN1398 MASA16 (4)

59* irap59.I TN1398 MASA16 (5)

60* irap60.I TN1398 MASA16 (6)

61* irap61.I TN1398 MASA16 (7)

* Demand-only




Table 6-23. IRN2 CDN-II/CDN-IIx Manual Trouble Location CP List (Page 1 of 2)




TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8


02* iun02.l Same as Phase 01 Same as Phase 01

10* iuni10.I UN304 IRN2


TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8

13* iun13.l Same as Phase 12 Same as Phase 12

20* iuni20.l UN304 IRN2

40* ap68.40.I TN1630B (CDN-II), TN1720() (CDN-IIx) AP30’

41* Ii4e.41.I TN1630B (CDN-II), TN1720() (CDN-IIx) AP30’

43 Ii4e.43.I TN1630B AP30’

44 ap30.44.I TN1630B (CDN-II), TN1720() (CDN-IIx) AP30’

45 ap30.45.I TN1630B (CDN-II), TN1720() (CDN-IIx) AP30’

46 ap30.46.I TN1630B (CDN-II), TN1720() (CDN-IIx) AP30’

47 ap30.47.I TN1720() (CDN-IIx) AP30’


NOTE: For APX6.1 prior to Software Update that includes diagnostics for CDN-IIx, Phases 43 and 45 through 52 are demand-only phases; Phase 44 is an automatic phase.

For APX6.1 with the Software Update that includes diagnostics for CDN-IIx and for APX7.0, Phase 43 does not apply; and Phases 44 through 52 are automatic phases.

48 ap30.48.I TN1720() CDN-IIx AP30’

49 ap30.49.I TN1720() CDN-IIx AP30’

50 ap30.50.I TN1720() CDN-IIx AP30’

51 ap30.51.I TN1720() CDN-IIx AP30’

52 ap30.52.I TN1720() CDN-IIx AP30’

* Automatic

Table 6-24. IRN2 CDN-III Trouble Location CP List (Page 1 of 2)



01 iun01.l UN304 IRN2

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8



Same asPhase 01

10 iuni10. UN304B IRN2

Table 6-23. IRN2 CDN-II/CDN-IIx Manual Trouble Location CP List (Page 2 of 2)



* Automatic

12 iun12.l UN304B IRN2

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8


Same asPhase 12

20 iuni20.l UN304 IRN2

40 ap60.40I TN2523 AP60

41 ap60.41I TN2523 AP60

44 ap60.44I TN2523 AP60

45 ap60.45I TN2523 AP60

46 ap60.46I TN2523 AP60

47 ap60.47I TN2523 AP60

48 ap60.48I TN2523 AP60

Table 6-25. IRN2 EIN Node Trouble Location CP List (Page 1 of 2)




TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8


Table 6-24. IRN2 CDN-III Trouble Location CP List (Page 2 of 2)




* Automatic

02* iun02.l Same as Phase 01 Same as Phase 01

10* iuni10. UN304B IRN2

12* iun12.l UN304B IRN2

TN918 IFB-U

TN1803 IFB-4K/8

TN1508 IFB-16/8

13* iun13.l Same as Phase 12 Same as Phase 12

20* iuni20.l UN304 IRN2

40* ein40.I TN4016 ELI

Table 6-26. IRN MDL (CSN, DSN, ICN) Trouble Location CP List



01 iun01.l

02 iun02.l Same as Phase 01 Same as Phase 01

10 iuni10. UN303()/UN304() IRN/IRN2

12 iun12.l UN303()/UN304() IRN/IRN2

TN915/TN918 IFB

TN1508/TN1803 IFB

Table 6-25. IRN2 EIN Node Trouble Location CP List (Page 2 of 2)



Diagnostic Listings

When diagnostic failures still exist after replacing hardware as recommended in the “Manual Trouble Location Circuit Pack List” tables, analysis of diagnostic test results is important. This is accomplished using the diagnostic output message and the diagnostic listings, if available. The diagnostic listings are files that end with a .l suffix (such as iun01.l or rpc01.l). See the manual trouble location circuit pack list tables. Generally the first failing phase and the first few failing tests within that phase are useful for analysis. If this data is not on hand, run diagnostics using the RAW option to print all test failures at the ROP.

A diagnostic listing consists of a prologue, followed by one or more program units. Each program unit has a prologue, which gives information about what is tested, how the testing is done, and the hardware involved. The remainder of the program unit consists of the diagnostic command lines, comment lines, and lines that are the ASCII equivalent of the data found in the corresponding object file. The command lines direct the sequence of diagnostic test execution.

13 iun13.l Same as Phase 12 Same as Phase 12

20 iuni20.l UN303()/UN304() IRN/IRN2

21 (IRN only) iuni21.I UN303() IRN

40 (IRN only) iun40.I TN1640 MDL_0

40 (IRN2 only) i2mdI40.I TN1640 MDL_0

41 (IRN only) iun41.I Same as Phase 40 Same as Phase 40

41 (IRN2 only) i2un41.I Same as Phase 40 Same as Phase 40

50 (IRN only) iun50.I TN1640 MDL_1

50 (IRN2 only) i2mdI50.1 TN1640 MDL_1

51 (IRN only) Demand Phase iun51.I Same as Phase 50 Same as Phase 50

51 (IRN2 only) Demand Phase i2mdI51.I Same as Phase 50 Same as Phase 50

Table 6-26. IRN MDL (CSN, DSN, ICN) Trouble Location CP List




Each diagnostic command begins with a statement number. This is the statement number that is referred to in the interactive diagnostics (EX) input and output messages (see “Performing Diagnostics” in this chapter), in early termination output messages, or in the DGN AUDIT RING output message. Some diagnostic command lines are preceded by one or more comment lines. These are lines that begin with the character C. They are intended to give the purpose of the command line that follows them.

Each diagnostic command line is followed by a line that shows, in ASCII format, the data corresponding to the command that is contained in the associated executable object file. This line begins with the string * adr unless the command generates a test; in that case, the line begins with the string * test. The test numbers in the diagnostic listings correspond to the test numbers in the diagnostic output messages. The only data on this line of importance to on-site users are the test numbers.
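As a rough aid for reading a listing, the line types described above can be told apart by their leading characters. The Python sketch below is an assumption-laden illustration: the sample lines and the command name in them are invented, and the exact column layout of a real .l file is not shown in this manual, so only the documented prefixes (C, * adr, * test, and a leading statement number) are used.

```python
def classify_listing_line(line):
    """Classify one line of a diagnostic listing by its documented prefix."""
    text = line.strip()
    if not text:
        return "blank"
    if text.startswith("* test"):
        return "test-data"      # object-file data for a command that generates a test
    if text.startswith("* adr"):
        return "adr-data"       # object-file data for any other command
    if text == "C" or text.startswith(("C ", "C\t")):
        return "comment"        # comment describing the command that follows
    if text.split()[0].isdigit():
        return "command"        # command lines begin with a statement number
    return "other"              # prologue text and anything unrecognized


# Hypothetical sample lines; real listings will differ in detail.
sample = [
    "C verify that RAC0 detects bad parity",
    "12 hypothetical_command",
    "* test 3",
]
for line in sample:
    print(classify_listing_line(line), "|", line)
```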

NOTE: For the rdgnrsl diagnostic command, a separate line is shown to illustrate that all failed test numbers that are returned from the NP are reported by adding 20 to the failed test number that is actually returned.
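The offset in the note above is simple arithmetic, shown here as a minimal Python sketch. The function names are hypothetical; only the value 20 comes from the note.

```python
RDGNRSL_TEST_OFFSET = 20  # offset stated in the note for the rdgnrsl command


def reported_test_number(np_test_number):
    """Test number as reported, given the number the NP actually returned."""
    return np_test_number + RDGNRSL_TEST_OFFSET


def np_test_number(reported):
    """Test number the NP actually returned, given the reported number."""
    return reported - RDGNRSL_TEST_OFFSET


print(reported_test_number(5), np_test_number(25))
```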

Clearing Troubles Using the Diagnostic Listings

If a trouble is not cleared after replacing the hardware as listed in the manual trouble locating procedures tables, the following procedure is recommended:

1. From the ROP, examine the diagnostic output message to determine which phases failed.

2. Obtain the diagnostic listing files (if available) and read the prologues for the phase and program unit in which the failing test occurs.

3. Find the diagnostic commands associated with the failed tests by checking the test numbers.

4. Read the comments (lines beginning with a C) on the lines that precede the command line to gain an understanding of where the problem is located.

5. If unable to determine how to proceed in clearing the trouble, seek assistance from the CTS.

LNs with Unequipped LI Boards - MV Updates

When an LN is equipped in an active ring but does not contain a link interface (LI4) circuit pack, diagnostic phases 50 through 56 (LI diagnostics) should not be run on that link node. For this situation, the unit control block (UCB) of the equipment configuration database (ECD) must be modified to accommodate the unequipped LI4 by changing the member version (MV) field on the UCB form for the LN. If the LN is not equipped with an LI4 circuit pack, enter 0x3 in the MV field. If the LN is equipped with an LI4 circuit pack, enter 0x3d in the MV field.
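The rule for the MV field and for phases 50 through 56 can be summarized in a short Python sketch. It is illustrative only: the function names are hypothetical, and the actual change is made on the UCB form in the ECD, not through code like this.

```python
MV_WITHOUT_LI4 = 0x3      # MV value when the LN has no LI4 circuit pack
MV_WITH_LI4 = 0x3D        # MV value when the LN has an LI4 circuit pack
LI_PHASES = range(50, 57)  # LI diagnostic phases 50 through 56


def mv_field_value(li4_equipped):
    """Return the MV value to enter on the UCB form for the LN."""
    return MV_WITH_LI4 if li4_equipped else MV_WITHOUT_LI4


def phases_to_run(requested, li4_equipped):
    """Drop the LI phases from a requested phase list when no LI4 is equipped."""
    if li4_equipped:
        return list(requested)
    return [phase for phase in requested if phase not in LI_PHASES]


print(hex(mv_field_value(False)), phases_to_run([1, 2, 10, 50, 56], False))
```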

Ring Node Addressing

The addressing of ring nodes and the manner in which frames/cabinets are identified are for maintenance purposes (see Tables 6-21 through 6-24). An address is identified in terms of an integer sequence number and may be represented in decimal or hexadecimal notation. The decimal notation represents the physical node identification, ranging from 0 to 1023, where 1023 is the maximum number of ring nodes at a location. Another decimal listing, ranging from 3072 to 4095, represents the physical node addresses in machine logic. These notations are not usually seen by users.

The other type of node address is in hexadecimal notation. These addresses are important in analyzing the mismatch data produced when Phase 1 or 2 at an RN fails. The suspected faulty node(s), as well as the beginning of isolation (BISO) and the end of isolation (EISO) nodes, are identified by hexadecimal physical node addresses. The following tables contain these addresses. Additional information on node addressing can be found in the “Maintenance Description” section in the CNI Maintenance Manual, 256-090-202.
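The decimal and hexadecimal tables that follow obey a simple pattern: the physical node ID equals the group number times 16 plus the member number, and the hexadecimal address is the same value written as three hex digits. The Python sketch below reproduces that pattern. The function names are hypothetical, and the machine-logic mapping (adding 3072 to the physical node ID) is an assumption inferred from the two ranges having the same size; the manual does not state it.

```python
GROUPS = 64                # groups 00 through 63
MEMBERS_PER_GROUP = 16     # member 0 is the RPCN, members 1-15 are IUNs
MACHINE_LOGIC_BASE = 3072  # assumption: 3072..4095 corresponds to IDs 0..1023


def physical_node_id(group, member):
    """Decimal physical node ID for a group and member number."""
    if not 0 <= group < GROUPS:
        raise ValueError("group must be 0-63")
    if not 0 <= member < MEMBERS_PER_GROUP:
        raise ValueError("member must be 0-15")
    return group * MEMBERS_PER_GROUP + member


def hex_node_address(node_id):
    """Three-digit hexadecimal form of a physical node ID."""
    return format(node_id, "03X")


def group_and_member(hex_address):
    """Recover (group, member) from a hexadecimal physical node address."""
    node_id = int(hex_address, 16)
    if not 0 <= node_id < GROUPS * MEMBERS_PER_GROUP:
        raise ValueError("address out of range")
    return divmod(node_id, MEMBERS_PER_GROUP)


def machine_logic_address(node_id):
    """Assumed machine-logic form of a physical node ID (see lead-in)."""
    return MACHINE_LOGIC_BASE + node_id


node = physical_node_id(25, 15)
print(node, hex_node_address(node), group_and_member("200"))
```

A helper like `group_and_member` is convenient when mismatch data from a failed Phase 1 or 2 names BISO, EISO, or suspect nodes by hexadecimal address and the group and member numbers are needed.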


Table 6-27. Physical Node ID (Decimal Representation) (Page 1 of 3)

GRP MEMBER NUMBER (0 is RPCN, 1 - 15 is IUN)

# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

00 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

01 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

02 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

03 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63

04 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79

05 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95

06 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111

07 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127

08 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143

09 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159

10 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

11 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191

12 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207

13 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223

14 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239

15 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255

16 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271

17 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287

18 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303

19 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319

20 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335

21 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351

22 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367

23 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383

24 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399

25 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415

26 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431

27 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447

28 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463

29 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479

30 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495

31 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511

32 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527

33 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543

34 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559

35 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575

36 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591

37 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607

38 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623

39 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639

40 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655

41 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671

42 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687

43 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703

44 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719

45 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735

46 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751

47 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767

48 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783

49 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799

50 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815

51 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831

52 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847

53 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863

54 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879

55 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895

56 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911

57 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927

58 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943





59 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959

60 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975

61 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991

62 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007

63 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023




Table 6-28. Physical Node ID (Hexadecimal Representation) (Page 1 of 3)


# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

00 000 001 002 003 004 005 006 007 008 009 00A 00B 00C 00D 00E 00F

01 010 011 012 013 014 015 016 017 018 019 01A 01B 01C 01D 01E 01F

02 020 021 022 023 024 025 026 027 028 029 02A 02B 02C 02D 02E 02F

03 030 031 032 033 034 035 036 037 038 039 03A 03B 03C 03D 03E 03F

04 040 041 042 043 044 045 046 047 048 049 04A 04B 04C 04D 04E 04F

05 050 051 052 053 054 055 056 057 058 059 05A 05B 05C 05D 05E 05F

06 060 061 062 063 064 065 066 067 068 069 06A 06B 06C 06D 06E 06F

07 070 071 072 073 074 075 076 077 078 079 07A 07B 07C 07D 07E 07F

08 080 081 082 083 084 085 086 087 088 089 08A 08B 08C 08D 08E 08F

09 090 091 092 093 094 095 096 097 098 099 09A 09B 09C 09D 09E 09F

10 0A0 0A1 0A2 0A3 0A4 0A5 0A6 0A7 0A8 0A9 0AA 0AB 0AC 0AD 0AE 0AF

11 0B0 0B1 0B2 0B3 0B4 0B5 0B6 0B7 0B8 0B9 0BA 0BB 0BC 0BD 0BE 0BF

12 0C0 0C1 0C2 0C3 0C4 0C5 0C6 0C7 0C8 0C9 0CA 0CB 0CC 0CD 0CE 0CF

13 0D0 0D1 0D2 0D3 0D4 0D5 0D6 0D7 0D8 0D9 0DA 0DB 0DC 0DD 0DE 0DF

14 0E0 0E1 0E2 0E3 0E4 0E5 0E6 0E7 0E8 0E9 0EA 0EB 0EC 0ED 0EE 0EF

15 0F0 0F1 0F2 0F3 0F4 0F5 0F6 0F7 0F8 0F9 0FA 0FB 0FC 0FD 0FE 0FF

16 100 101 102 103 104 105 106 107 108 109 10A 10B 10C 10D 10E 10F

17 110 111 112 113 114 115 116 117 118 119 11A 11B 11C 11D 11E 11F

18 120 121 122 123 124 125 126 127 128 129 12A 12B 12C 12D 12E 12F

19 130 131 132 133 134 135 136 137 138 139 13A 13B 13C 13D 13E 13F

20 140 141 142 143 144 145 146 147 148 149 14A 14B 14C 14D 14E 14F

21 150 151 152 153 154 155 156 157 158 159 15A 15B 15C 15D 15E 15F

22 160 161 162 163 164 165 166 167 168 169 16A 16B 16C 16D 16E 16F

23 170 171 172 173 174 175 176 177 178 179 17A 17B 17C 17D 17E 17F

24 180 181 182 183 184 185 186 187 188 189 18A 18B 18C 18D 18E 18F

25 190 191 192 193 194 195 196 197 198 199 19A 19B 19C 19D 19E 19F








32 200 201 202 203 204 205 206 207 208 209 20A 20B 20C 20D 20E 20F

33 210 211 212 213 214 215 216 217 218 219 21A 21B 21C 21D 21E 21F

34 220 221 222 223 224 225 226 227 228 229 22A 22B 22C 22D 22E 22F

35 230 231 232 233 234 235 236 237 238 239 23A 23B 23C 23D 23E 23F

36 240 241 242 243 244 245 246 247 248 249 24A 24B 24C 24D 24E 24F

37 250 251 252 253 254 255 256 257 258 259 25A 25B 25C 25D 25E 25F

38 260 261 262 263 264 265 266 267 268 269 26A 26B 26C 26D 26E 26F

39 270 271 272 273 274 275 276 277 278 279 27A 27B 27C 27D 27E 27F

40 280 281 282 283 284 285 286 287 288 289 28A 28B 28C 28D 28E 28F

41 290 291 292 293 294 295 296 297 298 299 29A 29B 29C 29D 29E 29F







48 300 301 302 303 304 305 306 307 308 309 30A 30B 30C 30D 30E 30F

49 310 311 312 313 314 315 316 317 318 319 31A 31B 31C 31D 31E 31F

50 320 321 322 323 324 325 326 327 328 329 32A 32B 32C 32D 32E 32F

51 330 331 332 333 334 335 336 337 338 339 33A 33B 33C 33D 33E 33F

52 340 341 342 343 344 345 346 347 348 349 34A 34B 34C 34D 34E 34F

53 350 351 352 353 354 355 356 357 358 359 35A 35B 35C 35D 35E 35F

54 360 361 362 363 364 365 366 367 368 369 36A 36B 36C 36D 36E 36F

55 370 371 372 373 374 375 376 377 378 379 37A 37B 37C 37D 37E 37F

56 380 381 382 383 384 385 386 387 388 389 38A 38B 38C 38D 38E 38F

57 390 391 392 393 394 395 396 397 398 399 39A 39B 39C 39D 39E 39F




# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15













# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15


Table 6-29. Physical Node Addresses (Decimal Representation) (Page 1 of 3)

GRP Member Number (0 is RPCN, 1 - 15 is IUN)

# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

00 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087

01 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103

02 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119

03 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135

04 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151

05 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167

06 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183

07 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199

08 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215

09 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231

10 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247

11 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263

12 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279

13 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295

14 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311

15 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327

16 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343

17 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359

18 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375

19 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391

20 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407

21 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423

22 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439

23 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455

24 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471

25 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487

26 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503

27 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519

28 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535






29 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551

30 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567

31 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583

32 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599

33 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615

34 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631

35 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647

36 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663

37 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679

38 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695

39 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711

40 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727

41 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743

42 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759

43 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775

44 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791

45 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807

46 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823

47 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839

48 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855

49 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871

50 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887

51 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903

52 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919

53 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935

54 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951

55 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967

56 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983

57 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999

58 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015



# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15


59 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031

60 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047

61 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063

62 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079

63 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095



# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15






Table 6-30. Physical Node Addresses (Hexadecimal Representation) (Page 1 of 3)

GRP Member Number (0 is RPCN, 1 - 15 is IUN)

# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

00 C00 C01 C02 C03 C04 C05 C06 C07 C08 C09 C0A C0B C0C C0D C0E C0F










10 CA0 CA1 CA2 CA3 CA4 CA5 CA6 CA7 CA8 CA9 CAA CAB CAC CAD CAE CAF

11 CB0 CB1 CB2 CB3 CB4 CB5 CB6 CB7 CB8 CB9 CBA CBB CBC CBD CBE CBF

12 CC0 CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 CCA CCB CCC CCD CCE CCF

13 CD0 CD1 CD2 CD3 CD4 CD5 CD6 CD7 CD8 CD9 CDA CDB CDC CDD CDE CDF

14 CE0 CE1 CE2 CE3 CE4 CE5 CE6 CE7 CE8 CE9 CEA CEB CEC CED CEE CEF

15 CF0 CF1 CF2 CF3 CF4 CF5 CF6 CF7 CF8 CF9 CFA CFB CFC CFD CFE CFF

16 D00 D01 D02 D03 D04 D05 D06 D07 D08 D09 D0A D0B D0C D0D D0E D0F










26 DA0 DA1 DA2 DA3 DA4 DA5 DA6 DA7 DA8 DA9 DAA DAB DAC DAD DAE DAF

27 DB0 DB1 DB2 DB3 DB4 DB5 DB6 DB7 DB8 DB9 DBA DBB DBC DBD DBE DBF

28 DC0 DC1 DC2 DC3 DC4 DC5 DC6 DC7 DC8 DC9 DCA DCB DCC DCD DCE DCF


29 DD0 DD1 DD2 DD3 DD4 DD5 DD6 DD7 DD8 DD9 DDA DDB DDC DDD DDE DDF

30 DE0 DE1 DE2 DE3 DE4 DE5 DE6 DE7 DE8 DE9 DEA DEB DEC DED DEE DEF

31 DF0 DF1 DF2 DF3 DF4 DF5 DF6 DF7 DF8 DF9 DFA DFB DFC DFD DFE DFF

32 E00 E01 E02 E03 E04 E05 E06 E07 E08 E09 E0A E0B E0C E0D E0E E0F










42 EA0 EA1 EA2 EA3 EA4 EA5 EA6 EA7 EA8 EA9 EAA EAB EAC EAD EAE EAF

43 EB0 EB1 EB2 EB3 EB4 EB5 EB6 EB7 EB8 EB9 EBA EBB EBC EBD EBE EBF

44 EC0 EC1 EC2 EC3 EC4 EC5 EC6 EC7 EC8 EC9 ECA ECB ECC ECD ECE ECF

45 ED0 ED1 ED2 ED3 ED4 ED5 ED6 ED7 ED8 ED9 EDA EDB EDC EDD EDE EDF

46 EE0 EE1 EE2 EE3 EE4 EE5 EE6 EE7 EE8 EE9 EEA EEB EEC EED EEE EEF

47 EF0 EF1 EF2 EF3 EF4 EF5 EF6 EF7 EF8 EF9 EFA EFB EFC EFD EFE EFF

48 F00 F01 F02 F03 F04 F05 F06 F07 F08 F09 F0A F0B F0C F0D F0E F0F










58 FA0 FA1 FA2 FA3 FA4 FA5 FA6 FA7 FA8 FA9 FAA FAB FAC FAD FAE FAF



# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
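The tabulated values appear to follow a simple rule: the physical node ID is the group number times 16 plus the member number, and the physical node address is that ID offset by 3072 (hexadecimal C00). The following Python sketch reproduces table entries under that assumption. The function names are illustrative only and are not part of any ECP or CNI software.

```python
GROUPS = 64            # group numbers 00 - 63
MEMBERS = 16           # member numbers 0 - 15 (0 is the RPCN)
ADDRESS_BASE = 0xC00   # 3072, the tabulated address of group 00, member 0


def physical_node_id(group: int, member: int) -> int:
    """Return the physical node ID for a group/member pair."""
    if not 0 <= group < GROUPS:
        raise ValueError("group must be in the range 0-63")
    if not 0 <= member < MEMBERS:
        raise ValueError("member must be in the range 0-15")
    return group * MEMBERS + member


def physical_node_address(group: int, member: int) -> int:
    """Return the physical node address (node ID plus the C00 base)."""
    return ADDRESS_BASE + physical_node_id(group, member)


def as_hex(value: int) -> str:
    """Format a value as three uppercase hexadecimal digits, as in the tables."""
    return format(value, "03X")


if __name__ == "__main__":
    # Group 42, member 15: decimal address and its hexadecimal form.
    print(physical_node_address(42, 15), as_hex(physical_node_address(42, 15)))
```

A table entry that disagrees with this arithmetic is more likely a printing error than a real exception, but the tables remain the authority.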






Automatic Diagnostics and Restorals

Automatic restoral of nodes is a feature provided by the node recovery monitor (NRM). Only nodes in the OOS major state are considered for restoral by the NRM. Depending on the minor states, a conditional, unconditional, or no restoral request is issued.

The NRM ensures that any node entering a state which indicates that it is eligible to be restored to service is the object of an appropriate restoral attempt within a few minutes, unless other work takes precedence. The NRM must perform the following tasks:

1. Attempt recovery of faulted nodes, including any associated ring isolations. A node can be faulted when a problem is detected during operation or when it fails to become active during system-wide initialization.

2. Recover usable nodes which become available due to removal of a ring isolation.

3. Detect, and make ineligible for automatic recovery, those nodes which are too frequently faulted and recovered.

4. Inhibit the automatic starting of node restorals:

• During a system-wide initialization.

• When the ring maintenance state indicates that the ring is undergoing reconfiguration or is down.

5. Submit all conditional restorals under software known as ARR.

When a requested restoral is not successful, or the internal timer awaiting job completion expires, the following message is generated:

REPT ARR AUTORST FAILURE FOR aaaa b

where: aaaa b = identifying name of the node.

59 FB0 FB1 FB2 FB3 FB4 FB5 FB6 FB7 FB8 FB9 FBA FBB FBC FBD FBE FBF

60 FC0 FC1 FC2 FC3 FC4 FC5 FC6 FC7 FC8 FC9 FCA FCB FCC FCD FCE FCF

61 FD0 FD1 FD2 FD3 FD4 FD5 FD6 FD7 FD8 FD9 FDA FDB FDC FDD FDE FDF

62 FE0 FE1 FE2 FE3 FE4 FE5 FE6 FE7 FE8 FE9 FEA FEB FEC FED FEE FEF

63 FF0 FF1 FF2 FF3 FF4 FF5 FF6 FF7 FF8 FF9 FFA FFB FFC FFD FFE FFF



# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15


If the ECD restoral threshold is exceeded, the following output message is generated:

REPT ARR AUTORST THRESHOLD EXCEEDED FOR aaaa b

If a time-out occurs while waiting for a reply message, this output message is generated:

REPT ARR AUTORST TIMEOUT AWAITING MIRA FOR aaaa b

For additional information regarding the REPT ARR AUTORST messages, refer to the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.
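When ROP output is captured to a log, the three REPT ARR AUTORST variants can be told apart by their fixed text. The Python sketch below is one way to do so. It assumes each message arrives on a single line and that everything after FOR is the node name. The sample node name in the usage line is hypothetical, and real messages may carry additional header fields that this sketch does not handle.

```python
import re

# Fixed message texts taken from this section; the trailing part is the node name.
_PATTERNS = (
    ("failure",
     re.compile(r"^REPT ARR AUTORST FAILURE FOR (?P<node>.+)$")),
    ("threshold_exceeded",
     re.compile(r"^REPT ARR AUTORST THRESHOLD EXCEEDED FOR (?P<node>.+)$")),
    ("timeout_awaiting_mira",
     re.compile(r"^REPT ARR AUTORST TIMEOUT AWAITING MIRA FOR (?P<node>.+)$")),
)


def classify_autorst(line: str):
    """Return (kind, node_name) for a REPT ARR AUTORST line, else None."""
    text = " ".join(line.split())  # collapse runs of whitespace
    for kind, pattern in _PATTERNS:
        match = pattern.match(text)
        if match:
            return kind, match.group("node")
    return None


if __name__ == "__main__":
    # "LN32 4" is a made-up node name used only for illustration.
    print(classify_autorst("REPT ARR AUTORST FAILURE FOR LN32 4"))
```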

The following priorities determine the order in which nodes eligible for automatic restoral are served:

1. A nominated critical node (typically the BISO or EISO node)

2. Nodes with faulty ring interfaces

3. RPCNs eligible for unconditional restorals

4. RPCNs eligible for conditional restorals

5. IUNs eligible for unconditional restorals

6. IUNs eligible for conditional restorals.
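The priority list above can be read as a sort key. The following Python sketch orders a set of candidate nodes accordingly. The Candidate record and its field names are assumptions made for illustration and do not describe how the NRM actually stores node data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str                          # hypothetical node name
    kind: str                          # "RPCN" or "IUN"
    restoral: str                      # "unconditional" or "conditional"
    critical: bool = False             # nominated critical node (BISO or EISO)
    faulty_ring_interface: bool = False


def restoral_priority(node: Candidate) -> int:
    """Map a candidate to its priority (1 is served first, 6 last)."""
    if node.critical:
        return 1
    if node.faulty_ring_interface:
        return 2
    if node.kind == "RPCN":
        return 3 if node.restoral == "unconditional" else 4
    return 5 if node.restoral == "unconditional" else 6


def service_order(nodes):
    """Return candidates in the order they would be served."""
    # sorted() is stable, so equal-priority nodes keep their input order.
    return sorted(nodes, key=restoral_priority)


if __name__ == "__main__":
    queue = [
        Candidate("LN05 3", "IUN", "conditional"),
        Candidate("RPCN32 0", "RPCN", "conditional"),
        Candidate("LN00 1", "IUN", "unconditional", critical=True),
    ]
    print([n.name for n in service_order(queue)])
```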

For a more detailed description of automatic node restorals and ARR, refer to the “Maintenance Description” section in the 401-610-055 Input Message Manual.

Manual (Unit) Diagnostics

Presented on the following pages are variations of procedures that are used in performing RN diagnostics. Each procedure completely performs the diagnostic tasks. The procedures are presented to illustrate that there is no one defined procedure for performing RN diagnostics. The user may determine which procedure to use, depending upon the extent of the diagnostic task, but in general, use of the 1106 page will provide adequate results.

NOTE: Replace the term nodexx y within each input command with the appropriate node being diagnosed (or RPCN). Also, before any manual diagnostics begin, ARR should be inhibited to prevent automatic diagnostics (ARR) from attempting to diagnose and restore nodes scheduled for manual diagnostics. See INH:DMQ in the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.

Before any node associated with an active link can be removed from service for diagnostic purposes, the appropriate link must be removed from service.

To put the signaling link (SLK) in the AVAILABLE-Manual Out-of-Service (MOOS) state, enter the following message at the MCRT, and proceed with diagnostics as usual.

CHG:SLK (a, b, [c, d]); MOOS

where:

a = group number (00 - 63)
b = member number (01 - 15)
c = LI4 circuit pack (0 - 1)
d = LI4 port (0 - 3)

CHG SLK a b [ c d ]

NEW REQUESTED MINOR STATE = MOOS

If the SLK was manually removed from service, after diagnostics put it back in the AVAILABLE-In Service (IS) or Standby (STBY) state by entering the following message at the MCRT:

CHG:SLK (a, b, [c, d]); {IS | ARST}

where:

a = group number (00 - 63)
b = member number (01 - 15)
c = LI4 circuit pack (0 - 1)
d = LI4 port (0 - 3)



CHG SLK a b [ c d ]

NEW REQUESTED MINOR STATE = IS

where:

a = group number (00 - 63)
b = member number (01 - 15)
c = LI4 circuit pack (0 - 1)
d = LI4 port (0 - 3)
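Because the CHG:SLK fields have fixed ranges, it can help to check them before typing the message. The Python sketch below validates the four fields and fills in the template shown above. It copies the punctuation of the printed template literally, which is an assumption: the exact separators and spacing accepted at the MCRT should be confirmed in the Input Message Manual. The function name is illustrative.

```python
VALID_STATES = ("MOOS", "IS", "ARST")  # states named in this section


def chg_slk(group, member, state, pack=None, port=None):
    """Return a CHG:SLK message string after range-checking its fields."""
    if not 0 <= group <= 63:
        raise ValueError("group number must be 00-63")
    if not 1 <= member <= 15:
        raise ValueError("member number must be 01-15")
    if state not in VALID_STATES:
        raise ValueError("state must be MOOS, IS, or ARST")
    if (pack is None) != (port is None):
        raise ValueError("LI4 circuit pack and port must be given together")
    fields = [f"{group:02d}", f"{member:02d}"]
    if pack is not None:
        if not 0 <= pack <= 1:
            raise ValueError("LI4 circuit pack must be 0-1")
        if not 0 <= port <= 3:
            raise ValueError("LI4 port must be 0-3")
        fields += [str(pack), str(port)]
    # Punctuation mirrors the printed template; it is not verified syntax.
    return f"CHG:SLK ({', '.join(fields)}); {state}"


if __name__ == "__main__":
    print(chg_slk(32, 4, "MOOS"))
    print(chg_slk(0, 1, "IS", pack=1, port=3))
```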

Refer back to these procedures as required when performing manual diagnostics.

There are basic events that must be accomplished when performing RN diagnostics. Input messages and formats can vary. As indicated in earlier paragraphs of this guide, some input messages cause the system to perform all diagnostic activities, such as removing the node from service, isolating the node, diagnosing the node, unisolating the node, and restoring the node to service. Yet, there are other input messages, where each individual event is acted upon according to the diagnostic message used. When performing RN diagnostics with the use of a conditional restore (RST) or with the DGN command, a basic sequence of events (excluding obtaining a status report) autonomously occurs in the manner listed below:

1. The node under test (NUT) must first be removed from service. This is done by changing its state to out-of-service normal (OOS-NORMAL), if it was in the ACT state prior to performing the diagnostics. For additional information on node state changes, see the “Maintenance Description” section in this Manual.

2. The NUT is changed to the OOS-ISOLATED state to route incoming and outgoing traffic around the NUT. The request to isolate the NUT may be denied for reasons not listed here.

3. The node under test is diagnosed.

4. If the NUT was in the active ring prior to Step 2, after all diagnostic phases have run, the NUT is configured back into the active ring (OOS-NORMAL). The configuration can be denied if the diagnostics determined that the ring interface (RI) minor state is faulty (FLTY).

5. Finally, after successfully configuring the node back into the active ring, the NUT is restored to service. It is automatically pumped with operational code, placed into execution, and changed to the active (ACT) state.

NOTE: If the request was a DGN rather than an RST, the node is not restored to service.
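The sequence above can be summarized as a list of events and the node state after each one. The Python sketch below models only the normal path, in which every request is granted and the diagnostics allow the node back into the ring. It is a reading aid under those assumptions, not a model of the actual maintenance software, and the event labels are illustrative.

```python
def diagnostic_sequence(command: str, initial_state: str = "ACT"):
    """Return [(event, resulting_state), ...] for a DGN or conditional RST.

    Assumes the node was in the active ring and that no request is denied.
    """
    if command not in ("DGN", "RST"):
        raise ValueError("command must be DGN or RST")
    events = []
    if initial_state == "ACT":
        # Step 1: only an ACT node needs to be removed from service first.
        events.append(("remove from service", "OOS-NORMAL"))
    events.append(("isolate", "OOS-ISOLATED"))                           # Step 2
    events.append(("diagnose", "OOS-ISOLATED"))                          # Step 3
    events.append(("configure back into active ring", "OOS-NORMAL"))     # Step 4
    if command == "RST":
        # Step 5 applies to RST only; a DGN leaves the node out of service.
        events.append(("restore to service", "ACT"))
    return events


if __name__ == "__main__":
    for event, state in diagnostic_sequence("RST"):
        print(f"{event:35s} -> {state}")
```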






When a diagnostic failure cannot be corrected by CP replacement using the manual trouble locating process (see the trouble location circuit pack list tables in this chapter), check:

• Interframe buffering cables

• Backplane and pins

• Wiring.

Before replacing any cables or changing any connections or pins, refer to the appropriate maintenance manuals.

The following pages provide procedures used in performing RN diagnostics. Any of the following procedures can perform a diagnostic task. The following procedures are used for diagnosing either RPCNs or IUNs. Each procedure is totally independent and should not be combined with another.

Manual Diagnostics Using the 1106 Display Page

The 1106 display page, sometimes called the ring node status page (RNSP), allows you to perform diagnostics and remove or unconditionally restore any node in the office. These functions are performed on the frame/cabinet that is displayed on the MCRT. To obtain proper MCRT operation and page display instructions, see “Trouble Indicators, Error Analysis, and Display Pages” in this Manual. When the Index Page display has been obtained, enter 1106 on the command line at the top of the MCRT. Before any node supporting an active link is taken out of service, the associated link must first be removed from service. The link should also be placed in its previous state after diagnostics are completed.

Refer to “Manual (Unit) Diagnostics” in this chapter for procedures to add and remove links. From this point, the following may be performed to diagnose, remove, restore, or display a particular frame/cabinet group:

Procedure 6-1. The 1106 Page Diagnostic Procedure

NOTE: Before any manual diagnostics begin, ARR should be inhibited to prevent automatic diagnostics (ARR) from attempting to diagnose and restore nodes that are queued for, or actively undergoing, manual diagnostics. See the INH:DMQ message in the CNI Input Message Manual, 256-090-204.

1. From the MCRT

Display the frame/cabinet group to be diagnosed by entering the following command:

6xx


where: xx = group number.

2. If a node is to be removed from service (OOS-NORMAL) for any reason, the following input command is used:

2xx

where: xx = display line number of the node to be removed from service.

The node state changes to OOS-NORMAL.

3. From the MCRT

To diagnose a node from this frame/cabinet group, enter the following command:

5xx

where: xx = display line number of the node to be diagnosed.

See the DGN command in the 401-610-057 Output Message Manual for the response to the completion of the diagnostics.

If the diagnostic result is:

STF: Determine which phase(s) failed, and record the CP number(s) for that phase. See the trouble location circuit pack list tables in this chapter for additional information.

Conditional all-tests-passed (CATP): Determine the reason for the CATP response.

If the reason is “the node was not singly isolated,” go to Step 4. Conditionally restore (RST) the adjacent nodes. When these nodes have been restored, conditionally restore this node, the first failing node.

If the reason is “the node was not isolated,” correct all problems so that a duplex ring exists and conditionally restore this node.

If the reason is “the ring is down,” correct all problems so that an active ring exists and conditionally restore this node.

For additional information on ring configuration and maintenance, see the “Maintenance Description” section in this manual.

No-tests-run (NTR): If an NTR response is received, go to Step 3. If the problem persists, seek technical assistance.

ABT: If an ABORT is received, determine the reason(s) for the ABORT. After determining the reason(s) for the ABORT, go to Step 3, and/or seek technical assistance.

4. From the MCRT

Unconditionally restore the node to service by entering the following input command:

3xx






where: xx = display line number of the node to be unconditionally restored.

! CAUTION:

Do not perform an unconditional restore unless one of the following has occurred:

• A complete diagnostic has produced an all-tests-passed (ATP) response.

• A complete diagnostic has produced a CATP response, and the RI and the NP minor states are both USBL.

The node which was being diagnosed should return to the system ACT state, and this should complete the diagnostic tests.
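The CAUTION above amounts to a two-case check. A minimal Python sketch of it follows. The function and parameter names are illustrative, and the result and state strings are the abbreviations used in this chapter.

```python
def unconditional_restore_allowed(result, ri_state=None, np_state=None):
    """Return True if the CAUTION permits an unconditional restore.

    result   -- outcome of a complete diagnostic: "ATP", "CATP", "STF", ...
    ri_state -- ring interface (RI) minor state, for example "USBL" or "FLTY"
    np_state -- node processor (NP) minor state
    """
    if result == "ATP":
        return True
    if result == "CATP":
        # A CATP qualifies only when both minor states are usable.
        return ri_state == "USBL" and np_state == "USBL"
    return False


if __name__ == "__main__":
    print(unconditional_restore_allowed("CATP", "USBL", "USBL"))
    print(unconditional_restore_allowed("CATP", "FLTY", "USBL"))
```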

Procedure 6-2. Manual Diagnostics Using the DGN Command

This procedure uses the DGN command. When this command is entered at the MCRT, the following sequence of events normally occurs. For exceptions, see the DGN: or DGN:RPCN command in the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.

1. If the node is active or handling traffic, the node is removed from service

(OOS-NORMAL).

2. The node under test is isolated (OOS-ISOLATED).

3. Diagnostics are performed on the NUT.

4. The node is unisolated (OOS-NORMAL) and configured back into the active ring.

5. The node is not restored to service.

Procedure 6-3. The DGN Command Diagnostic Procedure

When using the DGN command, the following procedure should be used to restore a node to service:


NOTE: Before any manual diagnostics begin, ARR should be inhibited to prevent automatic diagnostics (ARR) from attempting to diagnose and restore nodes that are queued for, or actively undergoing, manual diagnostics. See the INH:DMQ message in the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.

1. At the MCRT:

Obtain a report on the status of a node in a particular group, or the status of the ring, by entering the following input message, or a variation thereof, as shown in the “OP: Ring Input Message Variations” table, or refer to the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.

OP:RING,nodexx y

For LN:
node = LN
xx = group number
y = node member number.

For RPCN:
node = RPCN
xx = group number
y = node member number.

NOTE: The input message above provides the status information for a specified RN. For the message completion response, observe the MCRT or the ROP. To determine what response message to expect and for an explanation of such, see the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

2. At the MCRT—

If there is an active link supported by this node, remove it from service using the procedures listed previously in this section.

Request diagnostics of the node by entering the following input message, or a variation thereof, as listed in the “DGN Message Input Variations” table. For a complete listing of all DGN input command variations, see the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.






DGN:nodexx y

For LN:
node = LN
xx = group number
y = node member number.

For RPCN:
node = RPCN
xx = group number
y = node member number.

NOTE: The input message listed above runs all automatic phases on the specified RN. To determine what response message to expect and for an explanation of this message, see the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

3. At the ROP —

Examine the copy of the DGN printout to determine the status of the diagnostics

tests (determine which phases failed or passed).

If an ATP response is received at the ROP, proceed to Step 4.

If an STF, NTR, or CATP response is received at the ROP, go to Step 5.

4. At the MCRT —

If a link associated with this node was removed from service prior to diagnostics, put

the link back in service using the procedures listed previously in this section.

Unconditionally restore the node to service by entering the following input

message:

RST:nodexx y ;UCL

For LN:
node = LN
xx = group number
y = node member number
UCL = restores the node without diagnostics.


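For RPCN:
node = RPCN
xx = group number
y = node member number
UCL = restores the node without diagnostics.

! CAUTION:

Do not perform an unconditional restore unless one of the following has occurred:

• A complete diagnostic has produced an all-tests-passed (ATP) response.

• A complete diagnostic has produced a CATP response, and the RI and the NP minor states are both USBL.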

NOTE: If the major state of the node is OOS-ISOLATED, this input message requests that the node be included back into the active ring. If configuring the node back into the active ring is successful, the node major state is changed to ACT and the node is pumped with the required operational code. If the node is unable to be configured back into the active ring, the restore is stopped and the node is left in the OOS-NORMAL state. If the node was not originally OOS, the restore is stopped and the node is left in the state it was in prior to the restoral request. The node's major state must be changed to OOS via a recent change and verify (RCV) command before it can be restored. For additional information concerning a node state change, refer to the “Maintenance Description” section in this manual.

NOTE: If the major state is changed to ACT, the DGN diagnostics are complete. Omit the remainder of this test procedure.

NOTE: Perform Steps 5 through 8 only if an ATP response is not received in Step 3.

5. From the ROP:

If the diagnostic result is:

STF: Determine which phase(s) failed, and record the CP number(s) for that phase. See the trouble location circuit pack list tables in this chapter for additional information on RNs. Proceed to Step 6.

CATP: Determine the reason for the CATP response.

If the reason is “the node was not singly isolated,” go to Step 4. Conditionally restore (RST) the adjacent nodes. When these nodes have been restored, conditionally restore this node, the first failing node.

If the reason is “the node was not isolated,” correct all problems so that a duplex ring exists and conditionally restore this node.

If the reason is “the ring is down,” correct all problems so that an active ring exists and conditionally restore this node.

For additional information on ring configuration and maintenance, see the “Maintenance Description” section in this manual.

NTR: If an NTR response is received, go to Step 1 or Step 2. If the problem persists, seek technical assistance.

ABT: If an “ABORT” is received, determine the reason(s) for the ABORT. See the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

After determining the reason(s) for the ABORT, go to Step 1 or Step 2, and/or seek technical assistance.

6. At the ring node frame/cabinet (RNF/C):

Use the trouble location circuit pack list tables in this chapter to determine the equipment location for each suspected or faulty CP.

7. At the RNF/C:

Replace the faulty CP(s) using the procedure described in Chapter 7, Equipment Handling Procedures.

8. If time permits and there is uncertainty about node operation, repeat diagnostics to confirm proper system operations. Go to Step 2.
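The branching in Steps 3 and 5 of this procedure can be condensed into a lookup. The Python sketch below returns a short description of the next action for each diagnostic result. The wording of the returned strings paraphrases the steps above, and the function and table names are illustrative only.

```python
# Follow-up actions for the three CATP reasons quoted in Step 5.
CATP_ACTIONS = {
    "the node was not singly isolated":
        "conditionally restore the adjacent nodes, then this node",
    "the node was not isolated":
        "correct all problems so that a duplex ring exists, "
        "then conditionally restore this node",
    "the ring is down":
        "correct all problems so that an active ring exists, "
        "then conditionally restore this node",
}


def next_action(result, catp_reason=None):
    """Return the next action for a DGN result in Procedure 6-3."""
    if result == "ATP":
        return "Step 4: unconditionally restore the node"
    if result == "STF":
        return "Step 6: locate and replace the suspected circuit pack(s)"
    if result == "CATP":
        return CATP_ACTIONS.get(
            catp_reason, "determine the reason for the CATP response")
    if result in ("NTR", "ABT"):
        return ("Step 1 or Step 2: repeat; seek technical assistance "
                "if the problem persists")
    raise ValueError(f"unknown diagnostic result: {result!r}")


if __name__ == "__main__":
    print(next_action("CATP", "the ring is down"))
```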

Procedure 6-4. Manual Diagnostics Procedure Using the RST Command

This procedure uses the RST input command. This command provides the same functions as the DGN command, with the addition of an automatic restoral at the completion of running the diagnostic phases. The restoral is conditional upon an ATP or CATP diagnostic result, with the RI and NP minor states both being usable (USBL). This command normally performs the following sequence of events. For exceptions, see the RST:/RST:RPCN input command in the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual.

1. Conditionally removes the node from service (OOS-NORMAL).

2. Isolates (OOS-ISOLATED) the node.

3. Runs all automatic phases on the node.

4. Unisolates the node (OOS-NORMAL).

5. Restores the node to service (ACT).

For additional information on the normal sequence of events when using the RST command, see the 401-610-055 Input Message Manual.

Procedure 6-5. The RST Command Diagnostic Procedure

When using the RST command, the following procedure can be used:

1. At the MCRT—

Obtain a report on the status of a node in a particular group, or the status of the ring, by entering the following input message, or a variation thereof, as shown in the “OP: Ring Input Message Variations” table.

OP:RING,nodexx y

For LN:
node = LN
xx = group number
y = node member number.

For RPCN:
node = RPCN
xx = group number
y = node member number.




NOTE: The input message listed provides the status information for a specified RN. To determine what response message to expect and for an explanation of such, see the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.






2. At the MCRT —

If there is an active link supported by this node, remove it from service using the procedures listed previously in this section.

Request node test by entering the following input message:

RST:nodexx y

For LN:
node = LN
xx = group number
y = node member number.

For RPCN:
node = RPCN
xx = group number
y = node member number.




NOTE: Upon entering the RST command at the MCRT, the following events normally occur:

1. The node is conditionally removed from service (OOS-NORMAL). The ring quarantine (RQ) LED on the node processor or IRN lights if the remove above was successful.

2. The node is isolated from the active ring (OOS-ISOLATED). The no token (NT) LED lights at the node under test if the node is successfully configured out of the active ring.

3. All diagnostic phases are run on the specified node under test.

4. If the diagnostic result is an ATP response, the node is configured back into the active ring. When the node is successfully configured back into the active ring, it is restored to service. If the node is unable to configure back into the active ring, it is left in the OOS state.

To determine what completion response message to expect and for an explanation of such, see the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

If a link associated with this node was removed from service prior to diagnostics, put the link back in service using the procedures listed previously in this section.

NOTE: If the node is left in the OOS state, and the response STF, CATP, or NTR is received at the ROP, further diagnostics are required. Depending upon the severity of the failure(s), that is, if a particular phase or range of phases failed, choose a DGN input message as listed in the “DGN Message Input Variations” table or from the CNI Input Message Manual, 256-090-204, which matches the circumstances of the failed phase(s), and perform Steps 3 through 9.

At the ROP—

From the printout received at the ROP (this step), determine which phase(s) failed.

If an ATP response is received at the ROP, all diagnostics are complete and the rest of this test procedure should be omitted.

If only a particular phase failed, proceed to Step 4, and enter message as listed in instructions.

If a range of phases failed, enter the appropriate input message from “DGN Message Input Variations” table in Step 4, and proceed with the test.

NOTE: Perform Steps 4 through 9 only if a CATP, NTR, or STF response is received in Steps 2 and 3.

At the MCRT—

Request diagnostics for the failing phase by entering the following input message, or a variation thereof, as listed in “DGN Message Input Variations” table:

DGN:nodexx y :PH a

For LN—


y = node member number
PH = phase
a = number of a particular phase to run

For RPCN —

node = RPCN
xx = group number
y = node member number
PH = phase
a = number of the particular phase to run.

NOTE: To determine what completion response message to expect and for an explanation of the message, see the 401-610-055 Input Message Manual or the 401-610-057 Output Message Manual.
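As a rough illustration of how the DGN template is filled in, the following sketch assembles a request string from the node type, group number, member number, and an optional phase. It is an assumption-laden example, not a definitive statement of the input syntax: the helper name `dgn_message`, the two-digit zero padding of the group number, and the exact spacing are all hypothetical, so consult the Input Message Manual for the authoritative format.

```python
from typing import Optional

def dgn_message(node: str, group: int, member: int,
                phase: Optional[int] = None) -> str:
    """Build a DGN request string following the printed template.

    Assumptions: node is "LN" or "RPCN"; the group number is written
    as two digits; spacing mirrors the template in this section.
    """
    if node not in ("LN", "RPCN"):
        raise ValueError("node must be 'LN' or 'RPCN'")
    message = f"DGN:{node}{group:02d} {member}"
    if phase is not None:
        message += f":PH {phase}"    # restrict the run to one phase
    return message
```

For example, `dgn_message("RPCN", 0, 0, 3)` yields `DGN:RPCN00 0:PH 3` under these assumptions, and omitting the phase requests all phases.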






1. At the ROP—

Examine the printout and ascertain the failed phase(s), record the CP number(s), and use the trouble location circuit pack list tables in this chapter to determine the equipment location of the failed or faulty CP(s). The TLP option can also be used to determine the location of suspected faulty equipment.

2. At the RNF/C—

Replace the faulty CP using the procedure described in Chapter 7, Equipment Handling Procedures.

3. If time permits and there is uncertainty about node operation, repeat diagnostics to confirm proper system operation. Go to Step 2.

CDN-I Fault Isolation

Panic Messages

Panic messages are intended for use in analyzing software problems. They are, for the most part, not useful for hardware fault isolation. Recurring panic messages should be reported to the CTS. One valuable exception is the hardware panic message indicating that the microsecond timer on the NPI board is malfunctioning. This timer is not tested by the diagnostic but is tested in the background of the operational software. If this message is received, the NPI board should be replaced. If the panic persists, replace the CCS board.

Formerly, when a CDN-I crashed because of hardware problems, diagnostics were relied on to recover the node. Each RAP circuit pack is diagnosed by a particular diagnostic phase. A failing diagnostic phase is supposed to isolate the fault to the pack associated with that phase number.

The diagnostics rely on the RAP firmware being operational. The diagnostic is a diagnostics driver that is pumped to the IRN. The driver sends commands to the RAP firmware, allowing the diagnostics to be executed for a given board. A large percentage of the circuitry on every pack on the RAP local bus must be operational for this to work, and even more circuitry must be operational for firmware execution of the power-up initialization sequence. If the RAP cannot initialize, diagnostics are impossible.

Diagnostic responses received at the host fall into one of three categories:

• A normal response containing failure data.

• A response without failure data because the RAP is hung in a diagnostic phase (the board being diagnosed is at fault).

• A response without failure data because the RAP firmware is not executing.

The first two categories of fault can be isolated using standard diagnostic procedures. More than likely, however, the RAP firmware is not executing (a category 3 failure). In the automatic recovery procedure, diagnostics are run on a particular sequence of boards. The first board (on the RAP local bus) of this sequence always fails regardless of which board is bad.
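The three response categories can be expressed as a small decision function. This is a minimal sketch for illustration; the enum, the function names, and the two boolean inputs are assumptions, and the host software does not necessarily expose the information in this form.

```python
from enum import Enum

class ResponseCategory(Enum):
    NORMAL_WITH_FAILURE_DATA = 1   # category 1
    HUNG_IN_PHASE = 2              # category 2: board under test is at fault
    FIRMWARE_NOT_EXECUTING = 3     # category 3

def classify_response(has_failure_data: bool,
                      firmware_executing: bool) -> ResponseCategory:
    """Map a diagnostic response to one of the three categories."""
    if has_failure_data:
        return ResponseCategory.NORMAL_WITH_FAILURE_DATA
    if firmware_executing:
        return ResponseCategory.HUNG_IN_PHASE
    return ResponseCategory.FIRMWARE_NOT_EXECUTING

def standard_procedures_apply(category: ResponseCategory) -> bool:
    """Only the first two categories can be isolated with standard diagnostics."""
    return category is not ResponseCategory.FIRMWARE_NOT_EXECUTING
```

A category 3 result is the case in which the first board in the recovery sequence is reported as failed no matter which board is actually bad, so the reported board should not be trusted on its own.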

RAP Diagnostic Firmware

Each circuit pack on the RAP bus in a CDN-I is equipped with a diagnostic fail LED. The system initializes with all LEDs on and, if all diagnostics are successful, the LEDs turn off. The diagnostics can be run locally by pressing the DIAG button on the PCID. The LEDs can also be used to mark the progress of the initialization when power is applied to the RAP. When the RAP appears as though it is not initializing, it is very difficult to isolate the faulty pack because many packs can affect the bus. Fortunately, the minimum number of packs on the local bus required for firmware operation is just three (CCS, CCC, MASC_0).

Utilizing RAP firmware greatly reduces RAP downtime as compared with running the diagnostics from the host. Refer to the section “Ring Application Processor Critical Maintenance Procedure” in Chapter 3, Ring Maintenance.

Interactive Diagnostics

Interactive diagnostics (EX) are used to exercise a node in the interactive mode. Interactive diagnostics are used to enter a mode of operation whereby diagnostic execution is controlled to exercise any particular phase or portion of diagnostic execution. Interactive diagnostics can be used to replace regular diagnostic execution when the following is to be performed:

1. To run diagnostics up to a particular point of execution and stop

2. To perform a specific group of tasks repeatedly

3. To start and to stop a loop of diagnostic executions

4. To step through a set of diagnostic commands

5. To suspend diagnostic execution for a specific time period.

NOTE: This capability is limited to data table statements; that is, downloaded diagnostic code when executed cannot be controlled interactively.

When EX is begun, the following sequence of events occurs:

1. The LN or RPCN is first removed from service following the rules of the RMV:LN or RMV:RPCN input messages.

2. The node is isolated if the node’s major state is OOS, GROW, OFFLINE, or UNAV. Otherwise, the diagnostic request is aborted.

3. The EX demand executions are performed.

4. Upon successful completion of the EX routine, an attempt is made to include the node back into the active ring if it was in the active ring prior to entering the EX command. Otherwise, the node is left in the isolated segment. In all cases, the node is left in the OOS state.
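The EX start-up rules can be summarized in a few lines of logic. The sketch below is a hypothetical model, not system code: the function name, the dictionary keys, and the representation of states as plain strings are assumptions introduced for this example.

```python
ISOLATABLE_STATES = {"OOS", "GROW", "OFFLINE", "UNAV"}

def ex_outcome(major_state: str, was_in_active_ring: bool,
               ex_succeeded: bool) -> dict:
    """Model the outcome of an EX request for one node.

    major_state is the node's major state after the remove attempt.
    """
    if major_state not in ISOLATABLE_STATES:
        # Event 2: the request is aborted if the node cannot be isolated.
        return {"aborted": True, "reinclude_attempted": False,
                "final_state": None}
    # Event 4: re-inclusion is only attempted after a successful EX run
    # on a node that was in the active ring beforehand.
    reinclude = ex_succeeded and was_in_active_ring
    return {"aborted": False, "reinclude_attempted": reinclude,
            "final_state": "OOS"}
```

Note that the final state is "OOS" whenever the request is not aborted, so a restore is still needed after interactive diagnostics.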

Procedure 6-6. Interactive (EX) Diagnostic Procedures

When it is desired to perform interactive diagnostics, the following procedure should be used:

1. To start the interactive diagnostic mode:

From the MCRT—

If there is an active link supported by this node, remove it from service using the procedures listed previously in this section.

Enter the EX command for the desired node. This command returns a slot

number.

For an LN—

EX:xx y :PH b

For RPCN—

EX:RPCNxx y :PH b [,c ]

where:

xx = group number
y = node member number
b = phase(s) to be executed
c = statement number

2. From MCRT or ROP—

Wait for the display of EX:STARTED AT STATEMENT a, which indicates that the

interactive mode has started.


3. From the MCRT—

Execute the diagnostics by entering the EX commands as listed or in the order that

the diagnostics are to be performed:

To pause or suspend diagnostic execution at a specified statement number within a diagnostic phase for an RN, enter the following command:

EX:PAUSE;nodexx y :ST e

where:

node = LN or RPCN
xx = group number
y = node member number
b = phase(s) to be executed
e = statement number

See the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual for the system response to the message.

To put the diagnostics in a loop between the specified statement numbers for any RN, enter the following command:

EX:LOOP;nodexx y :ST f - g

See the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual for the system response to the message.

To step through the diagnostics and to suspend at a specified statement number for any RN, enter the following command:

EX:STEP;nodexx y :ST e

See the 401-610-057 Output Message Manual for the system response to the message.

To stop the looping started by the EX:LOOP command for any RN, enter the following command:

EX:STOP;nodexx y

See the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual for the system response to the message.

To exit from the interactive mode for any RN, enter the following input command:






STOP:DMQ;nodexx y

If a link associated with this node was removed from service prior to diagnostics,

put the link back in service using the procedures listed previously in this section.

Denied Diagnostic Requests

When a manual request is denied, the following message is printed at the ROP:

<type> : NO node AVAILABLE _ RETRY LATER

where:

<type> = Type of request:

DGN - Manual diagnostic
EX - Interactive diagnostic
RMV - Remove node
RST - Restore node.

Reenter the request at a later time.
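For sites that scan ROP output with scripts, a denial line can be recognized and mapped back to the request type. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the parser relies only on the request type before the first colon and the words RETRY LATER, because the exact spacing and punctuation of the printed message may differ from the template shown here.

```python
REQUEST_TYPES = {
    "DGN": "Manual diagnostic",
    "EX": "Interactive diagnostic",
    "RMV": "Remove node",
    "RST": "Restore node",
}

def parse_denied_request(line: str):
    """Return (type, description) for a denial line, or None otherwise."""
    head, separator, rest = line.partition(":")
    request = head.strip()
    if not separator or request not in REQUEST_TYPES:
        return None
    if "RETRY LATER" not in rest:
        return None
    return request, REQUEST_TYPES[request]
```

A caller that receives a non-None result would simply re-enter the same request at a later time, as the text directs.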

When an automatic request is denied, the user does not receive any notification, and no action on the user’s part is required. For additional information concerning denied diagnostic requests.

Inhibiting Diagnostic Requests

A diagnostic inhibit (INH) is used to inhibit (stop) automatic diagnostic requests. Any process that sends a restore, remove, or diagnostic request to the system for processing can be prevented from being activated for any specified amount of time. A reminder that a specific inhibit is in effect is output at the display terminal at specified intervals. The message format for inhibiting a diagnostic request is as follows:

INH:DMQ;SRC a, TINH b, AINH c

where:

INH = inhibit
DMQ = diagnostics
SRC a = identity of process to be inhibited
TINH b = time in minutes that inhibit lasts
AINH c = alarm intervals in minutes.
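To show how the three parameters fit into the template, the sketch below assembles an INH:DMQ string. Treat it as an illustration rather than a definition of the input syntax: the helper name is hypothetical, the positive-value checks are an assumption, and the source identifier used in the example (REX) is assumed to be an acceptable SRC value.

```python
def inh_dmq(source: str, inhibit_minutes: int,
            alarm_interval_minutes: int) -> str:
    """Build an INH:DMQ request following the printed template.

    source                  -> SRC a  (process to be inhibited)
    inhibit_minutes         -> TINH b (how long the inhibit lasts)
    alarm_interval_minutes  -> AINH c (reminder interval)
    """
    if inhibit_minutes <= 0 or alarm_interval_minutes <= 0:
        raise ValueError("times must be positive numbers of minutes")
    return (f"INH:DMQ;SRC {source}, TINH {inhibit_minutes}, "
            f"AINH {alarm_interval_minutes}")
```

Under these assumptions, `inh_dmq("REX", 60, 15)` produces `INH:DMQ;SRC REX, TINH 60, AINH 15`, that is, a one-hour inhibit with a reminder every 15 minutes.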


For more details and an explanation of the INH:DMQ command, refer to the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

Diagnostic Aborts and Audits

Aborts

At times when performing diagnostics, it may be necessary to abort or cancel a request in the active queue if:

• The request was entered by mistake.

• A request of higher importance is in the waiting queue, and an active queue must be cleared to allow room for another.

• An interactive diagnostic is to be exited.

• The active and waiting queues of all requests must be cleared for the field update of diagnostic files.

When it is necessary to abort or cancel a diagnostic request, the following procedure should be used:

1. At the MCRT—

Enter the following input command:

OP:DMQ

The output from this command tells the user the slot number and queue assigned to a particular job. The source in the output message may be (but is not limited to) one of the following:

• ARR - Automatic ring recovery

• ADP - Automatic diagnostic process

• MAN - Manual requests input by the user

• PSM - Power switch monitor

• REX - Routine exercise.

2. At the maintenance terminal—

Enter the following command to abort a diagnostic request in the active queue or cancel it from the waiting queue.

STOP:DMQ;nodexx y






Audits

At various points in the diagnostic execution process, checks are performed to verify that the diagnostic system is functioning properly. These checks verify that:

• Called functions give correct return codes

• Needed system resources are available

• Necessary files can be opened or read, and executed

• Hardware errors have not occurred

• Illegal operations are not attempted

Audit Failures

If an audit fails, a report is printed at the MCRT. The user should respond to the audit report in the following manner:

1. If a diagnostic test or phase fails prior to an audit failure, clear the problem indicated by the test failure. This may also clear the audit failure.

2. Save the printout pertaining to the audit failure and refer to the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual:

• to determine the reason for the audit failure,

• to determine whether or not the CTS should be contacted,

• and to see if any additional data should be collected.

When a diagnostic is aborted, one of two messages is printed at the MTTY and the ROP. Only one format is listed and explained here. For details and an explanation of the second format, refer to the 401-610-057 Output Message Manual.

DGN AUDIT RING

R = b

SYSTEM DATA

D = n

T = i A = j S = k I = l PH = p

where:

b = reason for the audit (in hexadecimal notation)
n = error code returned on a failing system call or a failing function call (in decimal notation)
i = last test executed (in decimal notation)
j = data table address (in hexadecimal notation)
k = data table statement number (in decimal notation)
l = task routine index (in hexadecimal notation)
PH = phase number being executed when the DGN was aborted (in decimal notation).
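Because the report mixes hexadecimal and decimal fields, it is easy to misread a value when reviewing a printout. The following sketch converts each field using the base given in the field list. The function name and the sample values are invented for illustration, and the regular expression assumes the `KEY = value` layout shown in the format above.

```python
import re

HEX_FIELDS = {"R", "A", "I"}          # printed in hexadecimal
DEC_FIELDS = {"D", "T", "S", "PH"}    # printed in decimal

def parse_audit_report(text: str) -> dict:
    """Extract the numeric fields of a DGN AUDIT RING report as integers."""
    fields = {}
    pattern = r"\b(PH|[RDTASI])\s*=\s*([0-9A-Fa-f]+)"
    for key, value in re.findall(pattern, text):
        base = 16 if key in HEX_FIELDS else 10
        fields[key] = int(value, base)
    return fields

# Hypothetical report used only to exercise the parser.
sample = """DGN AUDIT RING
R = 1F
SYSTEM DATA
D = 2
T = 14 A = 3A0 S = 27 I = B PH = 3"""
```

With the sample above, the audit reason 1F is read as 31 and the data table address 3A0 as 928, while the test number 14 and statement number 27 stay as written.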

For additional information concerning audits, refer to the “Audits” section of this manual.

Operating System Diagnostics

The procedures and information needed for performing 3B21D-2 and UNIX

system RTR or UNIX system RTR VLMM diagnostics are provided in the UNIX System RTR 3B20/3B21 Operator’s System Maintenance Manual , 304-046.



Contents

7 Equipment Handling Procedures

Introduction
Equipment Description and Handling Precautions
• Power Packs and Fusing Descriptions
Power Pack Description and Replacement Procedures
Fuse Description and Replacement Procedures
• Fan and Filter Maintenance
Ring Node Frame Fan Unit Description
Ring Node Cabinet Fan Unit Description
Analog Facility Access Frame Fan Unit Description
Filter Maintenance
Ring Node Circuit Pack Handling Precautions
• Ring Node Equipment Visual Indicators
• Removing Affected Equipment From Service
• UN122C and UN123B Combination Circuit Pack Installation
• Voice Frequency Link Hardware Equipment Replacement Procedures





7


Introduction

This chapter contains guidelines and precautions to be followed when working with equipment in a Common Network Interface (CNI) office. These guidelines and precautions must be followed closely before and during the handling of all circuit packs (CPs). Since improper handling may cause isolation of the ring or total system failure, they are of extreme importance. Use them in conjunction with Chapter 4, Ring and Ring Node Maintenance Procedures, and Chapter 6, Diagnostic User’s Guide.

Equipment Description and Handling Precautions

The following precautions are for ring maintenance functions. Failure to follow these procedures could result in damage to highly integrated CPs or in loss of service caused by isolating or totally interrupting the ring. These procedures cover the handling and the replacement of equipment only. The equipment has been Underwriters Laboratories (UL) approved and consists of the following components:

• Integrated ring circuit packs (described for each ring node type in the Overview of Chapter 6, Diagnostic User’s Guide)

• Power converter packs

• Ring node frame/cabinet (RNF/C) fan units.


NOTE: When handling ring and ring node (RN) equipment, the appropriate light emitting diodes (LEDs) must be illuminated to prevent severe system interruption or failure.

Power Packs and Fusing Descriptions

The power packs and fuses associated with the RNF/C and power distribution frame/cabinet provide the necessary power for equipment located on each RNF/C. A power island (PI) supplies backup power in the event of primary power loss. The PI provides from 5 to 30 minutes of battery holdover, depending on the load and the number of battery strings used, and is contained in two to four 3B21D computer-type cabinets. The fan units provide the necessary equipment cooling.

NOTE: DLNs and CDNs use the same procedures as RNF/C(s). The term “LN” is used in these procedures to represent all of these nodes.

Power Pack Description and Replacement Procedures

Each unit on the RN frame/cabinet uses two 495FA or 410AA power converters to supply power to the three LNs associated with that particular unit. Therefore, one power converter supplies power to one and a half LNs. The loss of either converter affects the operation of two of the three LNs in that unit. CDN-I uses 410AA power converters: one for the node, one for the RAP and Link Node unit, and two for each additional memory growth unit. Likewise, each RPCNU uses two 495FA power converters. Loss of either converter affects the operation of that RPCNU.

Before replacing a power supply circuit pack in a 3-node unit, isolate the two nodes adjacent to the power supply. In a 2-node unit, isolate the node adjacent to the power supply. In an 8-node unit, isolate the four nodes adjacent to the power supply. In a 5-node unit, use the unit horizontal designation strip next to the power supply in question to learn which nodes are serviced by that power supply, and isolate either three or two nodes.

No power pack should be removed without first removing the associated LNs or RPCN from service. Power may be affected by a faulty power converter, a short in one of the associated circuit packs, or an incorrect or missing current programming resistor on an LN circuit pack. Use Table 7-1 to determine which nodes must be removed when removing power supplies in the RNF.
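The entries of Table 7-1 follow a regular pattern for 3-node units: each pair of power units serves three consecutive nodes, and the middle node is shared by both converters. The sketch below encodes the table and a closed-form rule that reproduces it. The rule is an observation drawn from the ten table entries, not something stated in the manual, and the function name is hypothetical; the table itself remains the authority.

```python
# Table 7-1 as printed: power unit -> nodes to remove.
POWER_UNIT_INDEX = {
    1: (1, 2), 2: (2, 3), 3: (4, 5), 4: (5, 6), 5: (7, 8),
    6: (8, 9), 7: (10, 11), 8: (11, 12), 9: (13, 14), 10: (14, 15),
}

def nodes_for_power_unit(unit: int) -> tuple:
    """Compute the node pair for a power unit (1 through 10)."""
    if not 1 <= unit <= 10:
        raise ValueError("Table 7-1 covers power units 1 through 10")
    base = 3 * ((unit - 1) // 2)          # three nodes per pair of power units
    first = base + (1 if unit % 2 else 2) # odd unit: first two nodes; even: last two
    return (first, first + 1)
```

For example, power units 3 and 4 both involve node 5, which is why that node appears on both rows of the table.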






Procedure 7-1. Replacing Ring Node Frame/Cabinet Power Packs

1. At the maintenance cathode ray tube (MCRT), determine affected equipment

location.

2. Press the alarm release (ALM-RLS) key to silence the audible alarm.

NOTE: The audible alarm may also be silenced by pressing the alarm cutoff (ACO) key at the alarm frame.

3. Remove either the two associated LNs or the affected RPCN from service. Enter:

RMV:nodexx y

where:

node = LN or RPCN
xx = Ring node group number
y = Node position in the ring node group (member number).

Table 7-1. Power Unit Index

REPLACE POWER UNIT    REMOVE NODES
1                     1, 2
2                     2, 3
3                     4, 5
4                     5, 6
5                     7, 8
6                     8, 9
7                     10, 11
8                     11, 12
9                     13, 14
10                    14, 15

4. Isolate the associated RPCN or LNs from the active ring by entering:

CFR:RING a , b ;EXCLUDE

where:

a = Ring node (if b is present, a is the first of a range of RNs in the direction of flow of Ring 0), in the form {RPCNx y | x y}
b = Last node in the range begun by ‘a’, in the same form
EXCLUDE = Request to isolate specified node(s) from the active ring.

5. At the affected RNF/C, locate the correct faulty converter.

6. Obtain the proper replacement power pack using precautions for handling RN

equipment CPs.

! CAUTION: Before removing the affected power pack, ensure that the associated RPCN or LN(s) has been removed from service and isolated. Refer to Table 7-1 to determine the proper nodes to remove from service.

7. At the faulty equipment location, replace the faulty power pack (observe all

equipment handling precautions).

8. At the RN control panel, press the PWR ALM RESET button to restore the frame/

cabinet to normal operation.

9. At the 410AA or 495FA power converter, verify that the power alarm lamp and the

LEDs are illuminated.

10. Place the faulty power pack in protective static wrapping, and return it to storage for

later repair.

11. Before returning the node(s) to service, diagnose the node by entering the following

at the MCRT:

DGN:nodexx y

where:

DGN = Requests the run of all diagnostics phases








NOTE: Before unconditionally restoring the node to the ring, it is strongly recommended that at least Phase 1 and Phase 2 diagnostics are run on the node. The above procedure will execute full diagnostics.

12. After diagnostics returns an ATP message, restore node(s) removed from service by

entering the following at the MCRT:

RST:nodexx y ; UCL

where:

node = An LN or RPCN
xx = Ring node group number
y = Node position in the ring node group (member number)
UCL = Restore node unconditionally.

For further reference, see Chapter 6, Diagnostic User’s Guide.

If the power failure is not corrected after replacing the power converter, there may be a short in the LN. If a short on an LN circuit pack is the cause of a power failure, use the following procedure to correct the malfunction:

Procedure 7-2. Fixing Power Failures Caused by a Shorted Link Node Circuit Pack

1. At the MCRT, determine the affected equipment location.

2. Press the ALM-RLS key to silence the audible alarm.

NOTE: The audible alarm may also be silenced by pressing the ACO key on the control panel of the affected RNF/C.

3. At the affected equipment location, locate the nodes affected by the power loss.

4. At the MCRT, remove either the two associated LNs or the affected RPCN from service. Enter the following command:

RMV:nodexx y

where:

node = An LN or RPCN
xx = Ring node group number
y = Node position in ring node group (member number).

5. Isolate the associated RPCNs or LNs from the active ring. Enter:

CFR:RING a , b ;EXCLUDE

where:

a = Ring node (if b is present, a is the first of a range of RNs in the direction of flow on Ring 0), in the form {RPCNx y | x y}
b = Last node in the range begun by ‘a’, in the same form
EXCLUDE = Request to isolate specified node(s) from the active ring.

6. At the faulty equipment location, unplug all circuit packs affected by the power loss. This includes either the affected RPCN or two associated LNs.

! CAUTION: Before removing the affected power pack, ensure that the associated RPCN or LN(s) has been removed from service and isolated. Refer to Table 7-1 to determine the proper nodes to remove from service.

7. At the faulty power pack, recycle power to the affected power converter.

8. If the converter does not turn on with no load on it, then replace the CP. Place the

faulty power pack in protective static wrapping and return it to storage for later repair.

9. If the converter powers up, reinsert the suspect CPs one at a time. At the faulty equipment location, plug in each circuit pack removed in Step 6. The CP with the short will power down the power converter.

10. Replace the faulty circuit pack with a new one.

11. If the problem is corrected after replacing the faulty CP, place the faulty CP in protective static wrapping and return it to storage for later repair.

12. At the RN control panel, press the PWR ALM RESET key to restore the frame/cabinet to normal operation.

13. Before returning the node(s) to service, diagnose the node by entering the following at the MCRT:

DGN:nodexx y






where:

DGN = Requests the run of all diagnostics phases
node = An LN or RPCN
xx = Ring node group number

NOTE: Before unconditionally restoring the node to the ring, it is strongly recommended that at least Phase 1 and Phase 2 diagnostics are run on the node. The above procedure will execute full diagnostics.

14. After diagnostics returns an ATP message, restore the node(s) removed from

service by entering the following at the MCRT:

RST:nodexx y ; UCL

where:

node = An LN or RPCN
xx = Ring node group number
y = Node position in the ring node group (member number).
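Procedures 7-1 and 7-2 share the same command order: remove from service, isolate from the ring, do the hardware work, diagnose, and restore unconditionally only after an ATP result. The sketch below captures that ordering as a simple checklist helper. The step labels and the function name are invented for this example and are not command names; the actual input messages are the RMV, CFR, DGN, and RST templates shown in the procedures.

```python
from typing import List, Optional

# Hypothetical labels for the ordered stages of the procedure.
STEPS = ["RMV", "CFR_EXCLUDE", "REPLACE", "DGN", "RST_UCL"]

def next_step(done: List[str],
              dgn_result: Optional[str] = None) -> Optional[str]:
    """Return the next stage to perform, or None when all are complete.

    The unconditional restore is withheld until diagnostics return ATP.
    """
    for step in STEPS:
        if step not in done:
            if step == "RST_UCL" and dgn_result != "ATP":
                return "DGN"    # diagnostics must pass before restoring
            return step
    return None
```

If diagnostics return STF, CATP, or NTR, the helper keeps pointing back at the diagnostic stage, which reflects the requirement that the fault be cleared before the node is restored.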


Fuse Description and Replacement Procedures

System interruption and/or the loss of other LNs or RPCNs may be caused by the loss of a 10-amp fuse on the RNF/C. Also, the loss of a 20-amp fuse on the power distribution frame (PDF), the 20-amp fuse on the DC power distribution cabinet (DCPD), or the 25-amp fuse on the Global Power Distribution Frame (GPDF) may cause failure of either one RPCNU or one LN unit. The loss of a 250-amp fuse at the battery plant could affect a total of four RNFs or RNCs. This causes the failure of sixty LNs and the possible failure of two RPCNs. When this fuse is lost, a major alarm is triggered in the office, and the condition must be corrected as soon as possible.

Procedure 7-3. Fuse Replacement for Ring Node Frame/Cabinet Failures

1. At the MCRT or the affected equipment, determine the blown fuse location.

2. At the MCRT, press the ALM-RLS key to silence the audible alarm.


NOTE: The audible alarm may also be silenced by pressing the ACO key on the control panel of the affected RN frame/cabinet.

3. To avoid ring interruption, the affected ring nodes should be taken out of service and isolated from the active ring before the power converter is removed. If the RNs are not already OOS and isolated, enter the following commands:

RMV:nodexx y
CFR:RING a, b ;EXCLUDE

where:

node = LN or RPCN
xx = The ring node group number
y = Position in the ring node group (member number)
a = Ring node (if b is present, a is the first of a range of RNs in the direction of flow on Ring 0), in the form {RPCNx y | x y}
b = Last node in the range begun by ‘a’, in the same form
EXCLUDE = Request to isolate specified node(s) from the active ring.

4. At the faulty equipment location, unseat the affected power converter (that which is

associated with the blown fuse and OOS nodes).

5. Replace the faulty fuse.

6. Reseat the power converter. If the fuse does not blow again, proceed to Step 8.

7. Otherwise, the power converter must be replaced:

• unseat the affected power converter,

• insert a new fuse, replace the power converter,

• place the faulty power converter in protective static wrapping,

• and return it to storage for later repair.

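8. At the RN control panel, press the PWR ALM RESET key to restore the frame/cabinet to normal operation.

9. The lamp test key can be used to test the power alarm (PA) and fuse alarm (FA) lamps.

10. Before returning the node(s) to service, diagnose the node by entering the following at the MCRT: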

DGN:nodexx y






where:

DGN = Requests the run of all diagnostics phases
node = LN or RPCN

NOTE: Before unconditionally restoring the node to the ring, it is strongly recommended that at least Phase 1 and Phase 2 diagnostics are run on the node. The above procedure will execute full diagnostics.

11. After diagnostics returns an ATP message, restore the node(s) removed from service by entering the following at the MCRT:

RST:nodexx y ; UCL

where:

node = LN or RPCN
xx = Ring node group number



Disruption of either one LN unit or one RPCNU may be caused by a blown 20-amp fuse on the PDF or DCPD. Loss of the fuse also affects the two power converters on the LN or RPCN unit.

Procedure 7-4. Fuse Replacement for Power Distribution Frame/Cabinet Failures

1. At the MCRT, determine affected equipment location. Locate the PDF or DCPD

blown fuse and the RN equipment affected by it.

2. At the affected RN control panel, press the ALM-RLS key to silence the audible

alarm.

NOTE:The audible alarm may also be silenced by pressing the ACO key at the affected

RNF/C.


3. To avoid ring interruption, the affected ring nodes should be taken out of service and

isolated from the active ring before the power converter is removed. If the RNs are

not already OOS and isolated, enter the following commands:

CFR:RING,a, b ;EXCLUDE
RST:nodexx y ; UCL

where:

a = Ring node (if b is present, a is the first of a range of RNs in the direction of flow of Ring 0)
b = Last node in the range begun by ‘a’
EXCLUDE = Request to exclude specified node(s) from the active ring
node = LN or RPCN
xx = Ring node group number
y = Node position in the ring node group (member number).

4. At the faulty equipment location, unseat the affected power converters and circuit

packs. Remove the fan fuse(s).

5. At the PD frame/cabinet, remove the blown fuses (both the main and indicator

fuses). The GPDF does not have indicator fuses.

6. Insert the charging tool into the indicator fuse slot, and press the charge key on the

PD control panel. The GPDF does not have a charging probe.

When this key is pressed, the charge indicator LED illuminates and slowly decays

to off as the fuse location becomes fully charged.

7. Insert a new 20A main fuse and remove the charging tool. The GPDF uses a 25-amp fuse.

8. Reinsert the indicator fuse.

9. At the affected RNF/C, reseat the power converters and replace the fan fuse.

10. Reseat all circuit packs.

If all fuses hold (on both the RNF/C and the PD frame/cabinet), proceed to the next step. Otherwise, correct the problem using guidelines for the appropriate condition.

11. At the RN control panel, press the PWR ALM RESET key to restore the frame/cabinet to normal operation.

12. Before returning the node(s) to service, diagnose the node by entering the following at the MCRT:






DGN:nodexx y

where:

DGN = Requests the run of all diagnostics phases



NOTE: Before unconditionally restoring the node to the ring, it is strongly recommended that at least Phase 1 and Phase 2 diagnostics are run on the node. The above procedure will execute full diagnostics.

13. After diagnostics returns an ATP message, restore the node(s) removed from service by entering the following at the MCRT:

RST:nodexx y ; UCL

where:

node = LN or RPCN

For further reference, see Chapter 6, Diagnostic User’s Guide.

Procedure 7-5. Fixing Blown Fuse or Power Failures of the Digital Facility Access Frame/Cabinet

There are also cases where fuses and power failures may occur on the digital facility access (DFA) frame/cabinet or the analog facility access frame (AFAF).

1. At the affected equipment control panel, press the ACO key to silence the alarm.

2. At the affected equipment location, locate the blown fuse(s).

3. Unseat the appropriate 495H1 and the 393A power converters (those associated

with the blown fuse or fuses).

4. At the fuse location, replace the blown fuse(s).

5. Reseat both the 495H1 and the 393A power converters.


6. When powering up the DFA frame/cabinet, a major alarm may be activated before

the power converters stabilize. If a major alarm sounds, continue; otherwise, the

problem is corrected.

7. At the DFA control panel, press the POWER ALARM RESET key to restore the frame/cabinet to normal operation.

8. Press the ACO key to silence the alarm.

Procedure 7-6. Fixing Blown Fuse or Power Failures of the Analog Facility Access Frame

1. At the affected equipment control panel, press the ACO key to silence the alarm.

2. At the affected equipment location, locate blown fuse(s).

3. Unseat the associated 133K and the 130D power converters.

NOTE:Ensure the correct power converters are removed (those associated with the

blown fuse or fuses).

4. At the fuse location, replace the blown fuse(s).

5. Reseat both the 133K and the 130D power converters.

6. At the AFAF control panel, press the POWER ALARM RESET key to restore the

frame to normal operation.

7. If the alarm is due to a power failure in the fan system, do the following:

a. At the affected AFAF, replace the blown fuse. If the fuse blows again, proceed to Step b; otherwise, the problem is corrected.

b. Replace the fan or restore it to an operational state.

c. On the 64C2 data mounting unit, press the alarm reset key.

d. Press the alarm reset (ARS) key.






Fan and Filter Maintenance

Two frames/cabinets are equipped with fan units: the Ring Node Frame/Cabinet (RNF/C) and the Analog Facility Access Frame (AFAF). Each fan unit has a removable wire mesh air filter. When a fault is detected in one of the fans, the fan alarm (ALM) lamp on the unit and the power alarm (PWR ALM) lamp at the control panel both illuminate. Since the fans are used for cooling, corrective action must be taken as soon as possible.

The fans should be checked for proper operation every 6 months. Also, the filters should be cleaned and, if necessary, replaced every 6 months.

Ring Node Frame Fan Unit Description

The Ring Node Frame (RNF) fan unit contains three fans (1, 2, and 3) and a fan failure detector, with each fan being powered through individual fuses. These fuses are in a panel at the base of the RNF. The fans are located just above the fuse panel to force cooled air up through the entire frame and thus maintain the proper operating temperature. An RNF should be able to function properly with the loss of one fan, but with the loss of two fans, the equipment rapidly overheats. If there is only one operational fan in an RNF and there are no office spares, then a fan must be taken from another RNF and placed in the faulty unit. It is imperative that each RNF have at least two operational fans. It is also recommended that the office have two spare fans.

Ring Node Cabinet Fan Unit Description

The Ring Node Cabinet (RNC) fan unit contains four fans (1, 2, 3, and 4) and a fan failure detector, with each fan being powered through individual fuses. These fuses are in a panel at the base of the RNF. The fans are located at the bottom of the cabinet to force cooled air up through the entire cabinet and thus maintain the proper operating temperature. An RNC should be able to function properly with the loss of two fans, but with the loss of three fans, the equipment rapidly overheats. If there is only one operational fan in an RNC and there are no office spares, then a fan must be taken from another RNC and placed in the faulty unit. It is imperative that each RNC have at least two operational fans. It is also recommended that the office have two spare fans.
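The fan counts and failure tolerances described for the RNF and RNC can be summarized in a short sketch. The function and type names below are illustrative only and do not correspond to any system software; the numbers are taken from the two descriptions above.

```c
#include <stdbool.h>

/* Illustrative unit types; names are hypothetical. */
enum unit_type { UNIT_RNF, UNIT_RNC };

/* Total fans per unit: an RNF has three, an RNC has four. */
int total_fans(enum unit_type t)
{
    return t == UNIT_RNF ? 3 : 4;
}

/* Fans that may fail before the equipment rapidly overheats:
 * one for an RNF, two for an RNC. */
int tolerated_failures(enum unit_type t)
{
    return t == UNIT_RNF ? 1 : 2;
}

/* True if the unit should still function properly with `failed` fans out. */
bool cooling_adequate(enum unit_type t, int failed)
{
    return failed >= 0 && failed <= tolerated_failures(t);
}

/* True if a fan must be taken from another unit: only one fan
 * (or none) is still operational and the office has no spares. */
bool must_borrow_fan(enum unit_type t, int failed, int office_spares)
{
    int working = total_fans(t) - failed;
    return working <= 1 && office_spares == 0;
}
```

For example, `cooling_adequate(UNIT_RNF, 2)` is false, which matches the statement that an RNF rapidly overheats with the loss of two fans.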

Analog Facility Access Frame Fan Unit Description

In the AFAF, there is one fan unit for each equipped data set unit. Thus, each frame can have up to two fan units. An AFAF fan unit contains three fans, but is replaceable only as a unit. Power for each unit is through individual fuses located in the fuse panel at the base of the frame. The data set unit power converter provides fan failure detection. The fan unit forces cooled air through the data set mounting to maintain the proper operating temperature. Although the data sets can function properly with a fan unit failure, corrective action should be taken as soon as possible.

Standard and K-cabinets have six fans in the middle of the cabinet: three in front and three in back. The three fans in front cool the upper half of the cabinet, and the three fans in back cool the lower half of the cabinet. These fans vary in speed from 1700 RPM to 3400 RPM. The LEDs and toggle switch for the fans are located on the back of the cabinet.

When a fan failure is detected (as indicated by the ALM and PWR ALM lamps illuminating), one of the following procedures should be used to correct the fault.

Procedure 7-7. Ring Node Frame/Cabinet Fan Replacement Guidelines

1. At the control panel of the affected RNF, retire any audible alarm by pressing the

ALARM CUTOFF key.

2. At the fuse panel, ensure there are no loose or blown fuses. If replacing a fuse

corrects the problem, do not replace the fan, but proceed to Step 8.

3. Power down the faulty fan by releasing the associated fuse (BF0, DF1, or FF2).

4. At the front of the unit, disconnect the faulty fan from the unit by unplugging the 48 V

DC power cabling to the fan.

5. Remove the fan by loosening the two screws on the face of the fan and sliding the fan out the front of the unit.

6. Secure the new fan in place with the two screws, and plug in the power cable.

7. At the fuse panel, reinsert the associated fuse.

8. At the fan unit, press the black FAN ALM RST key. This should extinguish the FAN

ALM lamp.

9. At the control panel, press the PWR ALM RESET key to restore the frame to normal

operation.

Procedure 7-8. AFAF Fan Replacement Guidelines

1. At the control panel of the affected AFAF, retire any audible alarm by pressing the

ALARM CUTOFF key.

2. At the fuse panel, ensure there are no loose or blown fuses. If replacing a fuse

corrects the problem, do not replace the fans, but proceed to Step 8.

3. Power down the faulty fan unit by releasing the associated fuse (AF0 or BF1).

4. At the rear of the unit, disconnect the unit by unplugging the 48 V DC power cabling.

5. At the front of the unit, remove the fans by loosening the two screws on either side of

the unit (just above the filter) and sliding it out the front.

6. Secure the new unit in place with the two screws and plug in the power cable.

7. At the fuse panel, reinsert the associated fuse.

8. At the right of the data unit, set the ON/RST toggle switch to the RST position and

then back to the ON position. This should extinguish the FAN ALM lamp.

9. At the control panel, press the PWR ALM RESET key to restore the frame to normal

operation.

Filter Maintenance

The air filters are intended to eliminate dust from the cooling air. Dust buildup on frame circuitry could lead to improper system operation. Although no alarms are associated with the fan filters, they must be properly maintained by periodic replacement.

The RNF/C filters are positioned horizontally just above the fan unit. To replace the RNF/C fan filter, simply slide it out the front of the frame/cabinet. On frame installations, remove the handle from the old filter and attach it to the new filter. On cabinet installations, simply replace the old filter.

The AFAF filters are positioned horizontally just below the fan unit(s). To replace the AFAF data unit fan filter, the data unit cover must first be opened. The filter then simply slides out the front of the frame.


In the newer cabinets, the filters are above and below the front fan unit. To replace

the filter, slide the filter out of the cabinet and replace it with a new filter.

Ring Node Circuit Pack Handling Precautions

Before any RN equipment is replaced on a functional ring, certain handling precautions must be observed. This section presents some of those precautions.

Before removing, installing, or handling any ring node CP, a proper ground connection must be made. If it is not, static electricity may damage the CP or cause further damage to an already faulty CP. To avoid this discharge of electricity, a static control wrist strap (3M-2200 series) must be worn at all times when handling RN CPs.

Before touching the CP, connect the wrist strap lead to a nonelectrical metallic portion of a frame/cabinet or to any appropriate location where CPs are repaired or handled. The wristband portion of the strap must be placed around the wrist.

The 3M-2200 series wrist strap must also be worn when handling new or repaired CPs. New CPs are always wrapped in a static protective wrapper to avoid static discharge damage. Therefore, when handling a new CP, keep it in the static-proof wrapper until the appropriate ground connections are made and the pack is ready to be inserted. Also, when handling old or defective CPs, the same static precautions must be observed as when handling a new CP. Static discharge can cause further damage to a CP, thereby affecting repair procedures. The old or defective CP should be wrapped in the protective wrapping, labeled with diagnostic failure information, and returned for repair.

When a ring node CP is pulled for inspection, or for the purpose of replacement, the pack and the connections must be checked to ensure that:

• Backplane pins do not come out with the pack

• No pins are bent when the replacement CP is inserted.

Extreme care must be used when handling the ring interface CPs. These CPs require considerable force to insert and remove. Therefore, whenever replacing or inspecting these CPs, check them carefully and use care in applying pressure to them.

Ring Node Equipment Visual Indicators

Located on most ring CPs are visual indicators that indicate faulty or out-of-service (OOS) states. They indicate when particular maintenance functions may or may not be performed. These indicators, the ring quarantine (RQ), no token (NT), error, and diagnostic fail lamps, are found on the NP, RI1, IRN, RAP, AP, and circuit packs.

The RQ visual indicator is located on the NP, IRN, and the LI4 CPs, and indicates that the circuit is presently in the OOS maintenance state but is still part of the active ring. The NT lamp is located on the RI1 and IRN circuit packs and indicates that there is no token message traversing the ring. This is an indication that the node (RPCN or) is in the OOS maintenance state and is isolated from the active ring. When a node’s NT lamp is illuminated, any CP may be removed from that node without affecting system operation. The attached processor uses a red LED to indicate an error. A red LED also indicates diagnostic failures on a RAP board.

The PWR ALM lamp illuminates on the RN control panel for:

• Fuse failure

• Unplugged power converter

• Fan unit failure. If more than one fan fails, a major alarm sounds. If the problem is not corrected, a total RNF/RNC failure may occur.

The NT lamps are also adjacent to nodes equipped with IFBs. Before any IFB circuit pack can be replaced, the NT lamps of both adjacent nodes must be illuminated. There are only two IFBs per frame/cabinet. These are located at the RPCN node if equipped, or the first and last of the RNF/C. Since the IFB is adjacent to one node within its own RNF/C and another in the next RNF/C in line, the NT lamp adjacent to the suspected IFB on the associated frame/cabinet, and the NT lamp on the frame/cabinet next in line must be illuminated before the IFB circuit pack can be extracted.

Removing Affected Equipment From Service

When service has been interrupted because of faulty equipment, or when system maintenance requires replacing CPs, the node associated with the equipment must be removed from service. It is important to note that if there is another isolated segment on the ring, caution must be exercised. All affected nodes and equipment must first be removed from service before any equipment can be replaced. Removal of a node in this case could create a larger isolated segment. Therefore, all isolated segments on the ring should be corrected before other maintenance functions are performed on the ring. For example, if there is an isolated segment on the ring and another trouble is detected 50 nodes away, the original isolation should be corrected before attempting to correct the new problem. This eliminates the possibility of expanding the isolated segment over the additional 50 nodes.

System software puts faulty equipment OOS in one of two manners: normally and isolated. By taking it OOS normally, the system leaves it in the OOS-NORMAL maintenance state. In this state, the equipment is still part of the active ring. However, when the system removes the equipment from service and isolates it from the active ring, it is in the OOS-ISOLATED maintenance state. In this state, the node is a functional part of the ring for maintenance purposes only.

Equipment Replacement Procedures

Before any ring node equipment involving CPs is replaced or handled, all precautions and illuminated LEDs must be observed. When performing diagnostics, faulty CPs are listed in the manual trouble locating process. Therefore, all precautions must be followed before replacing these CPs.

Following is a summary of the sequence of events that must take place when replacing equipment. When a malfunction or faulty equipment is detected:

1. Press the alarm cutoff (ACO) button at the affected equipment, or the ALM-RLS key

at the MCRT, to silence the audible alarm.

2. Before attempting to change, inspect, or handle any CP, ground yourself using the

static control wrist strap (3M-2066).

3. At the faulty equipment location, determine which CP is faulty. On the RNFs or

RNCs, nodes are grouped closely together. Individual CPs are distinguishable by a

color-coded bar above and across each ring node unit. To ensure that the proper

pack is removed, examine each color-coded bar before any pack is extracted. Using the identification numbers on the faulty CP (be sure to check microcode, version, and issue), obtain the proper replacement CP.

4. Make sure the wrist strap is grounded and remove the suspect CP.

5. Insert the replacement CP from the storage cabinet.

6. Wrap up the old CP and place it in a carton for return.

7. Perform diagnostics on any affected equipment, and if all goes well, restore it to

service.

8. If diagnostics fail, the faulty CP may not have been removed. At the replacement CP, ensure that the proper LEDs are illuminated for the type of CP replaced:

RI1    The NT lamp on this CP is illuminated and both RQ lamps are illuminated on the NP and circuit packs.

RI0    The RQ lamp on the adjacent pack and the NT lamp on the adjacent RI1 CP are illuminated.

NP     The RQ lamp on this CP is illuminated, the adjacent RI1 NT lamp is illuminated, and the adjacent RQ lamp is illuminated.

Link   The RQ lamp on this CP is illuminated, the RQ lamp on the adjacent NP pack is illuminated, and the RI1 NT lamp is illuminated. The MDL boards are not equipped with LEDs.

IFB    The NT lamps adjacent to the IFB are illuminated.

AP     Both RQ lamps on the adjacent NP and the CPs are illuminated, along with the NT lamp on RI1.

RAP    The RQ lamp on the adjacent IRN is illuminated, and the PCID and power converter for the RAP are turned off.

IRN    The RQ and NT lamps on this CP are illuminated, and the RQ lamp on the adjacent circuit pack is illuminated.

The CP names and associated identification numbers are as follows:

• IRN/IRNB UN303 or UN303B (VLSI only)

• IRN2/IRN2B UN304 or UN304B

• IFB-U TN918

• IFB-P TN915

• IFB-4K TN1506

• IFB-F TN1508

• IFB-F TN1509

• IFB-F TN1803

• IFB-F TN4016

• 3BI TN914

• DDSBS TN69B

• LI TN916 or TN1317

• LI4S TN1316

• LI4D TN1315

• T1FA UN291

• LI4S TN1316

• 12A Applique APA12


• AP:

  - AP68 TN1340 (2 meg) or TN1641 (8 meg) for DLN

  - AP30 TN1630 for DLNE or DLN30

  - AP30’ TN1630B with 64-Mbyte mezzanine memory for DLNE-AP30’ or CDN-II

  - AP30’ TN1630B with 64- to 256-Mbyte mezzanine memory for CDN-IIx

• NPI TN1349

• RAP 3B15 computer boards

  - CCC UN237 (1) for 2-mbyte, UN626 for 16-mbyte

  - CCS UN236 (1) for 2-mbyte, UN625 for 16-mbyte

  - MASC UN95 (1-6) or UN507 (1) for 16-mbyte memory board option

  - MASA TN56 (1-48) or TN1398 (1-8) for 16-mbyte memory board option

  - PCID TN1128.

As stated earlier, all faulty equipment must be OOS before maintenance is performed. If the equipment has not been automatically made OOS, then it must be manually removed from service before any CPs are handled. Ring node CPs must be isolated before they can be removed. Also, caution is again stressed when isolating nodes in a ring that already contains isolated nodes. To avoid increasing the size of the original ring isolation, problems associated with the previous ring isolation should be corrected before isolating any other nodes. Isolating additional nodes can be dangerous, in that the isolation may cover too large a segment of the ring, thereby not leaving enough active nodes to have a sufficiently operational ring.

Procedure 7-9. Ring Hardware Circuit Pack Replacement Procedures

The following are guidelines for removing, inspecting, or handling CPs located in an IUN, RPCN, DLN, or CDN unit. These are the RI0, RI1, NP, 3BI, DDSBS, IRN, NPI, AP, CCS, CCC, MASC, MASA, PCID, LI4D, LI4S, APA12, and IFB circuit packs.

When replacing circuit packs in ring nodes, it is important that the proper node and associated nodes are removed and isolated. There are two power supplies for each shelf, each power supply feeding 1 ring nodes. Table 7-2 displays additional nodes that must be isolated and removed when replacing a circuit pack in a node.

Assumption: Diagnostics have determined that there are faulty CPs in one or more nodes on the ring.

1. At the MCRT, press the ALM-RLS key if necessary to silence alarms.

NOTE: An audible alarm may also be silenced by pressing the ACO key at the affected RNF/C.

2. If the node with the faulty CP and associated nodes have not been removed from service, remove them. Refer to Table 7-2 to determine which nodes to remove and isolate. At the MCRT, enter:

Table 7-2. Ring Node Power Supply Index

REPLACE CIRCUIT PACK    REMOVE AND
IN RING NODE:           ISOLATE NODES:

1                       1, 2
2                       1, 2, 3
3                       2, 3
4                       4, 5
5                       4, 5, 6
6                       5, 6
7                       7, 8
8                       7, 8, 9
9                       8, 9
10                      10, 11
11                      10, 11, 12
12                      11, 12
13                      13, 14
14                      13, 14, 15
15                      14, 15


RMV:nodexx y

where:

node = RPCN or LN

xx = Ring node group number (00-63)

y = Node position in the ring node group. (0 for RPCN, 1-15 for )

3. At the MCRT, isolate the associated node from the active ring. Enter:

CFR:RING a, b ;EXCLUDE

where:

a = Ring node. If b is present, a is the first of a range of RNs (in the direction of flow on Ring 0). In the form of {RPCNx y | x y}

b = Last node in the range begun by ‘a’ in the same form.

EXCLUDE = Request to isolate specified node(s) from the active ring.

4. At the faulty equipment location, obtain CP identification for the faulty pack. Get the

proper replacement CP (use caution handling the new pack).

5. Ensure that the appropriate node is OOS, proper LEDs are illuminated, and that you

are properly grounded to avoid static discharge.

6. Replace the faulty/suspected CP.

NOTE: Ensure that the adjacent NP and (LI4 and APA12) CP RQ lamps are illuminated before removing either of these affected CPs.

NOTE: Ensure that the adjacent RI1 NT and the adjacent RQ lamps are both illuminated before removing either of these CPs.

NOTE: Since most CPs require considerable force to insert or remove, extreme caution must be exercised. Carefully inspect the CP edge connector and the backplane connector for bent or missing pins.

7. Place the old (or faulty) CP in the protective static wrapping, and return it to the

storage cabinet for later repair.

8. At the affected RN control panel, press the PWR ALM RESET button to restore the

frame/cabinet to normal operation.

9. Diagnose the node by entering the following at the MCRT:

DGN:nodexx y

where:

DGN = Requests the run of all diagnostic phases

node = LN or RPCN

NOTE: Before unconditionally restoring the node to the ring, it is strongly recommended that at least Phase 1 and Phase 2 diagnostics are run on the node. The above procedure will execute full diagnostics.

RST:nodexx y ; UCL

where:

node = LN or RPCN
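The lookup in Table 7-2 follows a regular pattern: the nodes form groups of three (1-3, 4-6, and so on), and the middle node of a group requires all three to be removed. The following is a minimal sketch of that lookup, together with a helper that fills in the RMV template from step 2. The function names are hypothetical, and the exact spacing of the generated command is an assumption based on the template as printed; the input manual page remains the authority on command syntax.

```c
#include <stdio.h>

/* Fill out[] with the nodes to remove and isolate when replacing a
 * circuit pack in `node`, following Table 7-2. Returns the number of
 * entries written, or 0 if `node` is outside the table (1-15). */
int nodes_to_isolate(int node, int out[3])
{
    if (node < 1 || node > 15)
        return 0;

    int first = ((node - 1) / 3) * 3 + 1;  /* first node of the group of three */
    int pos = node - first;                /* 0 = first, 1 = middle, 2 = last */
    int n = 0;

    if (pos == 0) {                        /* first node: itself and the middle */
        out[n++] = first;
        out[n++] = first + 1;
    } else if (pos == 1) {                 /* middle node: all three */
        out[n++] = first;
        out[n++] = first + 1;
        out[n++] = first + 2;
    } else {                               /* last node: the middle and itself */
        out[n++] = first + 1;
        out[n++] = first + 2;
    }
    return n;
}

/* Build an "RMV:LNxx y" string from the template "RMV:nodexx y".
 * group is the ring node group number (00-63), member the node
 * position within the group. Formatting details are assumed. */
int format_rmv(char *buf, size_t len, int group, int member)
{
    return snprintf(buf, len, "RMV:LN%02d %d", group, member);
}
```

For example, replacing a pack in node 5 yields nodes 4, 5, and 6, and each of these would then be removed and isolated before the pack is handled.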



UN122C and UN123B Combination Circuit Pack Installation

The UN122C and UN123B CPs are used for the token tracking feature. Each frame must contain at least one UN122C and UN123B in a node to allow for token tracking capability.

1. Determine which CPs are to be used for token tracking.

2. The selected node and all nodes sharing the same FA495 converter must be

isolated from the active ring. If the UN122C and UN123B candidate node is at the

end of the unit, the middle node must also be removed from service. If the node is in

the middle of the unit, all three nodes on the unit must be removed.

3. To be sure the minor link state of the token tracking node is in the MOOS state, enter:

CHG:SLK=a-b :MOOS


4. To request full diagnostics on the token tracking node, enter:

DGN:LNa=b

5. Resolve all troubles if the diagnostics fail.

6. To be sure the minor state of the neighbor node(s) is in the MOOS state, enter:

CHG:SLK=a-b :MOOS

7. To remove appropriate neighbor nodes from ring service, enter:

RMV:LNa=b

8. Isolate the token tracking node and the neighbor nodes from the active ring. Enter

this command for each of the nodes:

CFR:RING,LNa=b :EXCLUDE

9. Replace the existing CPs with the new UN122C and UN123B CPs. Be sure to use a

wrist strap to protect from electrostatic discharge.

10. Update the in-core ECD for the token tracking node. First, change the UCB major

state from OOS to GROW. Update the hv values. Now change the major state from

GROW to OOS. See Table 7-3 for the appropriate hv values.

11. To request a full diagnostics on the token tracking node, enter:

DGN:LNa=b

12. Wait for the diagnostics on the token tracking node to run all tests pass (ATP). From the maintenance terminal, go to the 199 page and execute the “activate” RC/V form to copy the in-core copy of the ECD to disk.

13. To restore the neighbor nodes, enter:

RST:LNa=b

14. If the token tracking node is an IUN node, run diagnostic phases 12 and 13 on the

token tracking node. If the token tracking node is an RPC node, run diagnostic

phases 32 and 33. After these diagnostics run ATP, enter the following to restore the

token node:

RST:LNa=b

Table 7-3. Hardware Version Values (with IFB)

NP, RI0, & RI1 CPs*        POSITION   HV VALUE FOR IFB TYPE
                           IN RNF/C   TN918    TN915    TN1506   TN1508   TN1509/TN1803

TN913, UN122, UN123        Lowest     0x0001   0x0002   0x0004   0x0005   0x0006
                           Highest    0x0010   0x0020   0x0040   0x0050   0x0060
TN913, UN122B, UN123B      Lowest     0x0801   0x0802   0x0804   0x0805   0x0806
                           Highest    0x0810   0x0820   0x0840   0x0850   0x0860
TN913, UN122B†, UN123B†    Lowest     0x1001   0x1002   0x1004   0x1005   0x1006
                           Highest    0x1010   0x1020   0x1040   0x1050   0x1060
TN913, UN122C, UN123B      Lowest     0x1801   0x1802   0x1804   0x1805   0x1806
                           Highest    0x1810   0x1820   0x1840   0x1850   0x1860
TN913, UN122C†, UN123B†    Lowest     0x2001   0x2002   0x2004   0x2005   0x2006
                           Highest    0x2010   0x2020   0x2040   0x2050   0x2060
TN922, UN122, UN123        Lowest     0x0101   0x0102   0x0104   0x0105   0x0106
                           Highest    0x0110   0x0120   0x0140   0x0150   0x0160
TN922, UN122B, UN123B      Lowest     0x0901   0x0902   0x0904   0x0905   0x0906
                           Highest    0x0910   0x0920   0x0940   0x0950   0x0960
TN922, UN122B†, UN123B†    Lowest     0x1101   0x1102   0x1104   0x1105   0x1106
                           Highest    0x1110   0x1120   0x1140   0x1150   0x1160
TN922, UN122C, UN123B      Lowest     0x1901   0x1902   0x1904   0x1905   0x1906
                           Highest    0x1910   0x1920   0x1940   0x1950   0x1960
TN922, UN122C†, UN123B†    Lowest     0x2101   0x2102   0x2104   0x2105   0x2106
                           Highest    0x2110   0x2120   0x2140   0x2150   0x2160
UN303 (IRN)                Lowest     0x8001   0x8002   0x8004   0x8005   0x8006
                           Highest    0x8010   0x8020   0x8040   0x8050   0x8060
UN303† (IRN)               Lowest     0x8801   0x8802   0x8804   0x8805   0x8806
                           Highest    0x8810   0x8820   0x8840   0x8850   0x8860
UN303B (IRNB)              Lowest     0x9001   0x9002   0x9004   0x9005   0x9006
                           Highest    0x9010   0x9020   0x9040   0x9050   0x9060
UN303B† (IRNB)             Lowest     0x9801   0x9802   0x9804   0x9805   0x9806
                           Highest    0x9810   0x9820   0x9840   0x9850   0x9860
UN304B (IRNB)              Lowest     0xc001   0xc002   0xc004   0xc005   0xc006
                           Highest    0xc010   0xc020   0xc040   0xc050   0xc060
UN304B† (IRNB)             Lowest     0xc801   0xc802   0xc804   0xc805   0xc806
                           Highest    0xc810   0xc820   0xc840   0xc850   0xc860

* The RI CPs may be equipped with the Long Message Strap (LMS). This option is indicated in these tables with the † symbol next to the CP number. Otherwise, the RI is not equipped with the LMS option.

Example: RI types UN122/UN123B

Remove the letter suffix (B) from the UN122/UN123B RI board code. Then look for the UN122/UN123 RI type, your node processor (NP) type, and the interframe buffer (IFB) type required, to locate the hardware version value.

The IFB unit name indicates the buffer capacity and the ring speed. In cases where it is necessary to identify a specific IFB, the following terminology and convention should be used:

Table 7-4. Hardware Version Values (No IFB)

NP, RI0, & RI1 CPs*          HV VALUE

TN913, UN122, UN123          0x0000
TN913, UN122B, UN123B        0x0800
TN913, UN122B†, UN123B†      0x1000
TN913, UN122C, UN123B        0x1800
TN913, UN122C†, UN123B†      0x2000
TN922, UN122, UN123          0x0100
TN922, UN122B, UN123B        0x0900
TN922, UN122B†, UN123B†      0x1100
TN922, UN122C, UN123B        0x1900
TN922, UN122C†, UN123B†      0x2100
UN303 (IRN)                  0x8000
UN303† (IRN)                 0x8800
UN303B (IRNB)                0x9000
UN303B† (IRNB)               0x9800
UN304B (IRNB)                0xc000
UN304B† (IRNB)               0xc800

* The RI CPs may be equipped with the Long Message Strap (LMS). This option is indicated in these tables with the † symbol next to the CP number. Otherwise, the RI is not equipped with the LMS option.
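Comparing Tables 7-3 and 7-4 shows a regular pattern: each value in Table 7-3 equals the no-IFB value from Table 7-4 with an IFB type code placed in the lowest hexadecimal digit (Lowest position) or the next digit (Highest position). The sketch below reproduces that pattern as a cross-check when entering hv values. The pattern is inferred from the tables rather than stated in the text, and all names are hypothetical; the tables themselves remain the authority.

```c
#include <stdint.h>

/* IFB type codes as they appear in the low digit of the "Lowest"
 * rows of Table 7-3. IFB_NONE corresponds to Table 7-4. */
enum ifb_code {
    IFB_NONE          = 0,
    IFB_TN918         = 1,
    IFB_TN915         = 2,
    IFB_TN1506        = 4,
    IFB_TN1508        = 5,
    IFB_TN1509_TN1803 = 6
};

/* Position of the node in the RNF/C, as in the POSITION column. */
enum rnfc_position { POS_LOWEST, POS_HIGHEST };

/* Combine a Table 7-4 base value with an IFB code. The code goes in
 * bits 0-3 for the Lowest position and bits 4-7 for the Highest. */
uint16_t hv_value(uint16_t base, enum ifb_code ifb, enum rnfc_position pos)
{
    uint16_t field = (pos == POS_LOWEST) ? (uint16_t)ifb
                                         : (uint16_t)(ifb << 4);
    return (uint16_t)(base | field);
}
```

For a TN913/UN122C/UN123B node (base 0x1800) with a TN1506 IFB, this gives 0x1804 in the Lowest position and 0x1840 in the Highest position, matching the corresponding row of Table 7-3.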


Example: IFB-4K/6

This is an IFB with 4K bytes of buffer running at the ring speed of 6 MHz.

The following information is a summary of current IFBs:

EXISTING CONVENTION        CODE      NEW CONVENTION

IFB                        TN918     IFB (IFB-16)
PIFB padded IFB (IFB-P)    TN915     IFB-P (IFB-512)
                           TN1506    IFB-4k/6
                           TN1508    IFB-16/8
                           TN1509    IFB-4k/8
                           TN1803    IFB-4k/8

The plain term IFB should be used whenever it is not necessary to refer to a particular vintage of this circuit.

Voice Frequency Link Hardware Equipment Replacement Procedures

The voice frequency link (VFL) is composed of a VFL access CP (TN919) and a 2024A or a 2048A data set (the latter is used for 4.8 Kbps applications). The following are guidelines and precautions for replacing a VFL access CP or a data set.

Procedure 7-10. Voice Frequency Link Access Circuit Pack Replacement Procedures

1. At the affected equipment location or the MCRT, silence any audible alarm by pressing the ACO key or the ALM-RLS key.

3. Obtain the replacement VFL access CP.

! CAUTION:

Keep the CP in the protective wrapping until it is ready to be inserted in the frame/cabinet.

4. At the MCRT, put the SLK in the UNAV-TEST state. Use the “Change Analog SLK

VFL Access Circuit Board Procedures” in the section referred to above.

NOTE: If the SLK is already in the AVL-OOS state, it can be moved directly to the UNAV-TEST state without first being moved to the AVL-MOOS state.

5. At the affected equipment location, remove the suspect VFL access CP and insert

the new CP.

6. Wrap the suspect CP, and place it in a carton to be returned for repair.

7. Restore the SLK to service. Use the “Change Analog SLK VFL Access Circuit Board Procedures” in the section referred to above.

Procedure 7-11. Data Set Replacement Procedures

1. At the affected equipment location or the MCRT, silence any audible alarm by

pressing the ACO key or the ALM-RLS key.



3. Obtain the replacement data set.

! CAUTION:

Keep the data set in the protective wrapping until it is ready to be inserted in the frame/cabinet.

4. At the MCRT, put the SLK in the UNAV-TEST state. Use the “Change Analog SLK

Data Speed Procedures” in the section referred to above.


NOTE: If the SLK is already in the AVL-OOS state, it can be moved directly to the UNAV-TEST state without first being moved to the AVL-MOOS state.

5. At the back of the data set unit, remove the appropriate data set cables and the suspect data set.

6. On the data set unit, verify that the rise time option switches are set correctly:

• In the open position, the rise time is set for fast.

• In the closed position (toward numbers), the rise time is set for slow.

7. Insert the new data set and connect the data set cables.

8. Wrap the suspect data set, and place it in a carton to be returned for repair.

9. Set the data set options and restore the SLK to service. Use the “Change Analog SLK Data Speed Procedures” in the section referred to above.

A

Ring Error Analysis and Recovery

Introduction

This appendix provides information about the ring node portion of the ring error analysis and recovery mechanisms. The error handling for ring errors is split between the node and the 3B21D. When an error is detected by a node, that node will perform some recovery action and then report the error by sending a message to the 3B21D. The 3B21D will then take some corrective action and notify the craft via a message printed on the ROP. This document describes all errors reported to the 3B21D by the node. Included is a description of the error, the recovery action taken by the node, and the state of the node after the recovery is complete.

Data Structures

The following structures define the error message the node sends to the 3B21D. Throughout this document, this message will be referred to as the “error message” when discussing data that will be sent from the node to the 3B21D. Normally when an error occurs, the node will send error messages on both rings to the 3B21D. This ensures that a message will reach the 3B21D. In some cases, this is not possible, and these cases will be noted.

This is the 3B21D view of the error message layout. See header file ims/com/head/ims_emsgs.h for the NP view.


General Information

In the following descriptions, the terms “upstream node” and “downstream node” will be used. These terms describe relative positions of nodes and are based on the direction of data flow on the rings. Basically, any particular node will RECEIVE data from its “upstream” neighbor and will SEND data to its “downstream” neighbor. Since the data flows in opposite directions on the two rings, a node’s upstream neighbor on ring 1 is the downstream neighbor on ring 0 and its upstream neighbor on ring 0 is the downstream neighbor on ring 1.
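The upstream/downstream relationship can be illustrated with a short sketch. It assumes, purely for illustration, that the n nodes on the ring are indexed 0 through n-1 in the direction of data flow on ring 0; this indexing and the function names are not part of the system software.

```c
/* Index of the node that `node` sends data to on the given ring.
 * Ring 0 flows toward increasing index; ring 1 flows the opposite
 * way (assumed indexing, see the text above). */
int downstream(int node, int n, int ring)
{
    return ring == 0 ? (node + 1) % n : (node + n - 1) % n;
}

/* Index of the node that `node` receives data from on the given ring.
 * This is the downstream neighbor on the opposite ring. */
int upstream(int node, int n, int ring)
{
    return downstream(node, n, ring == 0 ? 1 : 0);
}
```

With eight nodes, node 3 sends to node 4 and receives from node 2 on ring 0, and the roles of nodes 2 and 4 are swapped on ring 1.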

struct immemsg
{
    struct immsg_hd immh;               /* IMS mtce. message header */
    NODE_PADD node;                     /* phys. addr. of ring node */
    unsigned char imm_etype;            /* IMS error message type */
    unsigned char erring;               /* faulty ring */
    union vardata {
        struct {
            union {
                struct header dhead;    /* header from failing msg */
                struct {
                    short tokblk;       /* Blockage occurred on the token */
                    short flsint;       /* False interrupt indicator */
                    short spare2;
                    short spare3;
                } misc;
            } un;
            struct _riracstat ports;    /* the rac error ports */
            struct _riracstat opports;  /* opposite rac ports */
        } specific;
        unsigned char dchar[24];        /* general information */
        unsigned short dshrt[12];
        long dlong[6];
    } data;
};

The following pages contain several headings. The “error code” is the defined symbol for the particular error and is placed in the immemsg.imm_etype field in the error message. The faulty ring is indicated in the erring field in the error message.

The “description” is a detailed description of the error and the “node recovery action” is a description of the node recovery process. The “variable data” is a description of the variable data in the error message. This data is intended to be used by the 3B21D when analyzing the error and will differ depending on the error type. There may be other data in the error message that is provided to be printed at the ROP.

The “ROP data” is a description of the data that is printed on the ROP. This data is taken from the error message.

The error message will be in the following general form:


See the output manual page for the complete description of the ROP output message. When this message is printed, various data fields will be included in the printout, and it is assumed that data taken from the error message from the node will be printed in the following order:

0xAAAAAAAA 0xBBBBBBBB 0xCCCCCCCC 0xDDDDDDDD

0xEEEEEEEE 0xFFFFFFFF (TTTTTTTTTT)

AAAAAAAA - immemsg.data.dlong[0]

BBBBBBBB - immemsg.data.dlong[1]

CCCCCCCC - immemsg.data.dlong[2]

DDDDDDDD - immemsg.data.dlong[3]

EEEEEEEE - immemsg.data.dlong[4]

FFFFFFFF - immemsg.data.dlong[5]

TTTTTTTTTT - The value of the real time clock.
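As an illustration of this ordering, the sketch below formats the six dlong words and the clock value into the two-line layout shown above. It uses a stand-in union rather than the real struct immemsg, and it makes several assumptions that are not confirmed by the text: that each dlong is 32 bits wide (suggested by the eight hex digits per field), that hex digits print in lowercase, and that the real time clock prints as ten zero-padded decimal digits. The names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for union vardata; 32-bit dlong words are an assumption. */
union vardata_sketch {
    unsigned char dchar[24];
    uint16_t      dshrt[12];
    uint32_t      dlong[6];
};

/* Write the ROP data lines for d into buf: four words on the first
 * line, two words plus the clock value on the second. Returns the
 * snprintf result. Clock formatting is assumed, not documented. */
int format_rop_data(char *buf, size_t len,
                    const union vardata_sketch *d, unsigned long rtc)
{
    return snprintf(buf, len,
                    "0x%08lx 0x%08lx 0x%08lx 0x%08lx\n"
                    "0x%08lx 0x%08lx (%010lu)",
                    (unsigned long)d->dlong[0], (unsigned long)d->dlong[1],
                    (unsigned long)d->dlong[2], (unsigned long)d->dlong[3],
                    (unsigned long)d->dlong[4], (unsigned long)d->dlong[5],
                    rtc);
}
```

Reading a printed report back is the reverse operation: the first hex word on the first line is dlong[0], and the last hex word on the second line is dlong[5].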

Blockage Error

Error Code

_RG_BLKG, _RG_RDBLK


Description

The blockage timer has timed out waiting for transfer of data. The following table contains the error flags that are used to determine this error. At the present time, only _RG_BLKG is reported to the 3B21D, regardless of the type of node. The _RG_RDBLK is provided for future use with the IRN.

IRN - The IRN nodes report blockage in two situations: the downstream node does not take the data or the read FIFO does not take the data. The first is called propagate blockage and the latter is called read blockage. Propagate blockage means the downstream node is the cause of the fault, whereas read blockage indicates that the reporting node is at fault.

Node Recovery Action

IRN - The node will be put in force read to clear the ring. This action will remove the token from the ring. After error recovery, the node will be in total silence.

When the blockage is detected, the error message cannot be sent on the faulty ring so an error message must be sent on the opposite ring. Consider a case of blockage on a ring that has an isolated segment. If the error message is sent on the opposite ring to the home RPC, it may go through the EISO or BISO node and return to the faulty RAC before it reaches the home RPC. If an error message is sent to each RPC, it has a better chance of arriving at an RPC before it reaches the EISO or BISO node and is looped back. Therefore, the node will try to send error messages to each RPC on the opposite ring.

If a blockage is detected by an EISO or BISO node, the node cannot send error messages because of the blockage, but it will still perform the recovery action described above with one additional step, which is that inhibit input will be set. The total effect is that the blockage is not reported and the ring has no token. The first indication of trouble in the 3B21D is that it will receive an “unexpected loss of token” error message. See the _RG_NOTOKEN error description.

RAC ERROR FLAGS

IRN ERROR

PRPBLK (_RG_BLKG) Propagate Blockage

RDBLK (_RG_RDBLK) Read Blockage

If the blockage is a read blockage, the hardware will destroy the message and switch the RAC to the force propagate mode. The token will remain on the ring and the ring will continue to operate normally. The read blockage is reported to the 3B21D with the _RG_RINH error code. This code is used to indicate that the blockage was the fault of the reporting node and not the downstream node. The error is reported by sending error messages to each RPC on the opposite ring.

If a blockage occurs on a broadcast message, the error flags will indicate both propagate and read blockages. This case will be handled as a propagate blockage.
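
The handling of the two IRN blockage flags can be summarized in a short sketch. The bit values assigned to PRPBLK and RDBLK below are hypothetical (the actual positions in the RAC status ports are not given here), and the function name is illustrative only.

```python
PRPBLK = 0x1  # hypothetical bit value for the propagate blockage flag
RDBLK = 0x2   # hypothetical bit value for the read blockage flag


def classify_blockage(flags):
    """Return (kind, reported_code) for the IRN blockage flags, or None if neither is set."""
    if flags & PRPBLK:
        # Also covers a blockage on a broadcast message, where both flags are
        # set and the case is handled as a propagate blockage.
        return ("propagate", "_RG_BLKG")
    if flags & RDBLK:
        # Read blockage: the reporting node is at fault; reported as _RG_RINH.
        return ("read", "_RG_RINH")
    return None


print(classify_blockage(PRPBLK | RDBLK))
```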

Variable Data

immemsg.data.specific.ports -

RAC status ports from the faulty ring.

immemsg.data.specific.opports -

RAC status ports from the opposite ring. See Notes.

immemsg.data.specific.un.misc.tokblk -

Block on token code, which indicates whether the token was being held by the node when the blockage timeout occurred. Nonzero values indicate that the node found evidence that it was holding the token. See file ims_emsgs.h for details.

ROP Data

BLOCKAGE DETECTED (LN/RPCN)XX YY RAC (0/1)

0xaabbccdd 0xeeffgghh 0xjjkkllmm 0xnnppqqrr

0xssttuuvv 0xwwxxyyzz (TTTTTTTTTT)

aabb - Block on token code (see description above).

ccdd - not used.

ee - The node’s home RPC overflow state (IRN only).

ff - The node’s overload state (IRN only).

gg - The node’s overflow state (IRN only).

hh - The node’s silence state (IRN only).

jj - node type, 3 = IRN.

kk - port C, faulty ring.


ll - port B, faulty ring.

mm - port A, faulty ring.

nnpp - not used.

qq - port E, faulty ring (IRN only).

rr - port D, faulty ring (IRN only).

ss - not used.

tt - port C, opposite ring. See Notes.

uu - port B, opposite ring. See Notes.

vv - port A, opposite ring. See Notes.

wwxx - not used.

yy - port E, opposite ring (IRN only). See Notes.

zz - port D, opposite ring (IRN only). See Notes.

NOTE: This status port information is from the RAC that is used to transmit the error report. For this particular error type, the status is always from the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.
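
As a reading aid, the field layout of the BLOCKAGE DETECTED printout can be expressed as a small decoder. This is a sketch only: the function and dictionary key names are hypothetical, and it assumes each letter pair corresponds to one byte of the printed 32-bit word, most significant byte first, as the printout suggests.

```python
def decode_blockage_rop(words):
    """Split the six 32-bit words of a BLOCKAGE DETECTED report into named fields."""
    if len(words) != 6:
        raise ValueError("expected six 32-bit words")

    def byte(word, index):
        # index 0 is the leftmost (most significant) byte as printed
        return (word >> (24 - 8 * index)) & 0xFF

    w0, w1, w2, w3, w4, w5 = words
    return {
        "block_on_token": (w0 >> 16) & 0xFFFF,   # aabb (ccdd is not used)
        "home_rpc_overflow": byte(w1, 0),        # ee (IRN only)
        "overload": byte(w1, 1),                 # ff (IRN only)
        "overflow": byte(w1, 2),                 # gg (IRN only)
        "silence": byte(w1, 3),                  # hh (IRN only)
        "node_type": byte(w2, 0),                # jj, 3 = IRN
        "faulty_ports": {
            "C": byte(w2, 1), "B": byte(w2, 2), "A": byte(w2, 3),   # kk ll mm
            "E": byte(w3, 2), "D": byte(w3, 3),                     # qq rr
        },
        "opposite_ports": {
            "C": byte(w4, 1), "B": byte(w4, 2), "A": byte(w4, 3),   # tt uu vv
            "E": byte(w5, 2), "D": byte(w5, 3),                     # yy zz
        },
    }


sample = [0x00010000, 0x01020304, 0x03AABBCC, 0x0000DDEE, 0x00112233, 0x00004455]
print(decode_blockage_rop(sample)["node_type"])
```

The opposite-ring ports should be interpreted with the NOTE above in mind; they are meaningless when the report was sent by an RPC node.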

Hard Ring Parity Errors

Error Code

_RG_HPTY

Description

This error indicates a byte with bad parity has been presented to the input of the RAC. A “hard” parity error is a parity error that cannot be cleared by the node. The faulty byte will not be accepted by the node and the upstream node will eventually detect blockage.

RAC ERROR FLAGS

IRN ERROR

PTYERR Parity error

Node Recovery Action

Since it has been determined that this is a hard error, inhibit input is set to prevent the faulty byte from producing recurring error interrupts, then the RAC error latches are cleared. Because the reporting node will not accept data from the upstream node, that node will report a blockage condition. Error messages are sent on both rings to the home RPC.

Variable Data

msg->specific.ports -


msg->specific.opports -

RAC status ports from the ring that was used to write the error message to the 3B21D. This information was taken just before the error message was written.

ROP Data

RAC PARITY/FORMAT ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)



aa - The node’s home RPC overflow state (IRN only).

bb - The node’s overload state (IRN only).

cc - The node’s overflow state (IRN only).

dd - The node’s silence state (IRN only).

eeffgghh - not used.





nnpp - not used.



ss - not used.





wwxx - not used.



NOTE: This status port information is from the RAC that is used to transmit the error report. In most cases, this is the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.

Orphan Byte Error

Error Code

_RG_ORBYTE

Description

An “orphan byte” has been presented to the input of the RAC. An orphan byte condition occurs when the RAC is expecting a “C” byte but the byte received is not a “C” byte. At the present time, the orphan byte is reported to the 3B21D using the _RG_HPTY error code. The _RG_ORBYTE code is provided for future IRN application.

IRN - In the case of the orphan byte, 2 bytes are accepted into the input FIFO of the IRN. The bytes are not read into memory and will be held until the error condition is cleared.

RAC ERROR FLAGS

IRN ERROR

ORBYTE Orphan byte

Node Recovery Action

IRN - The error interrupt is disabled to prevent recurring interrupts. The error latches are cleared and the 3B21D is notified of the message.

Two bytes may have been accepted by the input FIFO. A processor RAC reset must be issued to clear the orphan byte(s) from the input FIFO. The input is inhibited to prevent the input FIFO from accepting more bytes.

Because the reporting node will not accept data from the upstream node, that node will report a blockage condition.

The orphan byte error is reported by sending error messages to each RPC only on the opposite ring.

Variable Data




RAC status ports from the ring that was used to write the error message to the 3B21D. This information was taken just before the error message was written.

ROP Data

RAC PARITY/FORMAT ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)













nnpp - not used.



ss - not used.




wwxx - not used.



NOTE: This status port information is from the RAC that is used to transmit the error report. For this particular error type, the status is always from the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.

Soft Ring Parity Error

Error Code

_RG_SPTY

Description

This error indicates a ring parity error occurred but was subsequently cleared by the recovery routine.

IRN - Because of the difference in the recovery action, orphan byte errors will not be included in this error class. All orphan byte errors will be hard errors.

RAC ERROR FLAGS

IRN ERROR

PTYERR Parity Error

Node Recovery Action

The node was able to clear the parity error latch, so the parity error is considered to be a transient error. Error messages are sent to the home RPC via both rings and the node will be in its normal operating condition. The RAC port information in the error message will show the RAC status of the faulty ring before the error was cleared.

Variable Data





RAC status ports from the ring that was used to write the error message to the 3B21D. This information was taken just before the error message was written.

ROP Data

TRANSIENT RAC ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)












nnpp - not used.


rr - port D, faulty ring (IRN only).

ss - not used.





wwxx - not used.



NOTE: This status port information is from the RAC that is used to transmit the error report. In most cases, this is the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.

Interframe Buffer Parity Error

Error Code

_RG_IFBP

Description

The upstream interframe buffer has detected a parity error.

Node Recovery Action

Inhibit input will be set on the faulty ring and an error message will be sent on both rings to the home RPC. The inhibit input is effective at the input of the interframe buffer. This will cause the node upstream of the interframe buffer to report a blockage.

Variable Data




RAC ERROR FLAGS

IRN ERROR

IFBPF IFB parity error


ROP Data

INTERFRAME BUFFER PARITY ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)












nnpp - not used.



ss - not used.




wwxx - not used.






RAC Output Parity Error

Error Code

_RG_ROPF

Description

An explanation of the RAC hardware is needed to understand this error code, which is another form of blockage. When a node detects blockage while propagating a message, the hardware will be set to force read the remainder of the message that was being propagated and will then stop the ring. If the blockage occurred while the node was writing data to the ring, the write is stopped and the contents of the RAC FIFO are read into memory. As part of the recovery procedure, the data that was read into memory is checked for valid parity. Bad parity would explain the blockage because the downstream node will not accept data with bad parity.

To get this error, the RAC must have received good data either from the upstream node or the node processor, but it tried to transmit bad parity to the downstream node. This implies the RAC hardware is faulty. If a node reports this error, the downstream node should have reported a hard parity error.

If this error occurs during a write, a partial message may have been written to the ring and this will cause one or more downstream nodes to report a read format error.

IRN - This error code will not be reported from an IRN if the blockage is a read blockage. In that case, no data will be read into the NP memory.


Node Recovery Action

The node recovery action will be the same as in the blockage error (_RG_BLKG).

RAC ERROR FLAGS

IRN ERROR

PRPBLK Propagate Blockage

Propagate Blockage

Variable Data





ROP Data

RAC OUTPUT PARITY ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)



aabb - not used.

ccdd - not used.

ee - The node’s home RPC overflow state (IRN only).


gg - The node’s overflow state (IRN only).

hh - The node’s silence state (IRN only).





nnpp - not used.



ss - not used.




wwxx - not used.

yy - port E, opposite ring (IRN only). See Notes.

zz - port D, opposite ring (IRN only). See Notes.


NOTE: This status port information is from the RAC that is used to transmit the error report. For this particular error type, the status is always from the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.

Write Format Error

Error Code

_RG_WFMT, _RG_WRSMM, _RG_WRTOSHRT, _RG_WRLEN

Description

These error codes indicate some error occurred while a node was attempting to write a message to the ring. At the present time, all write errors are reported with the _RG_WFMT error code, regardless of the type of node reporting the error. The other error codes are provided for future use with the IRN.

IRN - This error code may indicate one of the following:

a. Write source match error. The node tried to write a message to the ring, but the source address did not match the node’s address or the source ring in the message did not match the ring being used.

b. Write too short. A “C” byte was presented to the header FIFO before the FIFO had received enough of the header to determine the disposition of the message.

c. Write length error. When a write is performed, a counter is loaded with the length value from the message. If the write FIFO becomes empty and the write DMA channel asserts the end of DMA signal (EOD) before the counter reaches zero, a write length error is indicated. This error means the RAC saw at least the first 6 bytes of the message and was able to determine the disposition of the message. If this error occurs, a partial message was sent on the ring and downstream node(s) may report read format errors (_RG_RFMT).

RAC ERROR FLAGS

IRN ERROR

WRSMERR (_RG_WRSMM) Write source match

W2SHRT (_RG_WRTOSHRT) Write too short

WRLEN (_RG_WRLEN) Write length error
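
The write length check described in item c can be modeled in a few lines. This is a software illustration of a hardware counter, not the RAC logic itself: the function names are hypothetical, and counting the length in bytes is an assumption.

```python
def write_length_error(declared_length, bytes_supplied):
    """Model the write length counter: True if the data ran out before the count hit zero.

    declared_length: length value loaded from the message (assumed to be in bytes).
    bytes_supplied: bytes delivered before the write FIFO emptied and EOD was asserted.
    """
    remaining = declared_length
    for _ in range(bytes_supplied):
        if remaining == 0:
            break
        remaining -= 1
    return remaining > 0


def reported_write_code(specific_code, use_specific_codes=False):
    """At the present time every write error is reported as _RG_WFMT."""
    return specific_code if use_specific_codes else "_RG_WFMT"


if write_length_error(declared_length=12, bytes_supplied=8):
    print(reported_write_code("_RG_WRLEN"))
```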


Node Recovery Action

The write in progress is removed from the write queue and a _RETRY code is returned to the writer. Inhibit input is set and an error message is sent to the home RPC on both rings.

Variable Data



msg->specific.opports -

RAC status ports from the ring that was used to write the error message to the 3B21D. This information was taken just before the error message was written.

msg->specific.dhead -

The header of the message that was being written to the ring.

ROP Data

WRITE FORMAT ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)



aabbccdd eeffgghh - Header of the message that was being written to the ring.





nnpp - not used.


rr - port D, faulty ring (IRN only).

ss - not used.





wwxx - not used.





Read Format Error

Error Code

_RG_RFMT, _RG_RDTO, _RG_RDLEN

Description

.

IRN - Read length error. A “C” byte was received before the end of the message was reached.


Node Recovery Action

The error latch is cleared, and the received message is discarded. An error message is sent to the home RPC on both rings. Note that IUNs will not report a read format error if it occurs on a broadcast message. Only RPCs will report read format errors on broadcast messages.

Variable Data



RAC ERROR FLAGS

IRN ERROR

Read timeout

RDLEN (_RG_RDLEN) Read length error

RAC status ports from the ring that was used to write the error message to the 3B21D. This information was taken just before the error message was written.


The header of the message that was being read from the ring.

ROP Data

READ FORMAT ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)

MSG SRC: (LN/RPCN)GG MM, MSG TYPE: (NORMAL/BROADCAST/SEL BROADCAST/TAKE)



MSG SRC, MSG TYPE - MSG SRC and MSG TYPE are the source node and message type respectively, extracted from the first word of the message header: 0xaabbccdd. When the node is unsuccessful in recovering the message involved in the READ FORMAT ERROR, 0xaabbccdd is set to 0xffffffff.

aabbccdd eeffgghh - Header of the message that was being read from the ring. If the node could not recover the message that was read from the ring, these fields will be set to 0xffffffff.





nnpp - not used.



ss - not used.




wwxx - not used.




NOTE: This status port information is from the RAC that is used to transmit the error report. In most cases, this is the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.

Received Too Short Error

Error Code

_RG_RDTOSHRT

Description

.

IRN - Read too short. A second “C” byte was received before a

complete ims header had been received.


Node Recovery Action

The error latch is cleared and the partial header is discarded. The node will return to its normal operating mode. It is assumed that an upstream node mutilated the message. Error messages are sent to the home RPC on both rings.

Variable Data






RAC ERROR FLAGS

IRN ERROR

R2SHRT Read too short.

ROP Data

READ TOO SHORT DETECTED (LN/RPCN)XX YY RAC (0/1)



aabbccdd eeffgghh - The partial header that was read into memory. If the node could not recover the message that was read from the ring, these fields will be set to 0xffffffff.





nnpp - not used.



ss - not used.




wwxx - not used.





Read Inhibit Error

Error Code

_RG_RINH


Description

When a blockage occurs during a write, the data in the FIFO should be transferred to the NP memory. If a blockage occurs during a read or while propagating a message, the data up to the next “C” byte should be read into memory. This error code is set if it appears that no data was put into memory. Either problem indicates that the RAC hardware is faulty or there is a problem with the DMAC.

This error code indicates that the reporting node caused the blockage, not the downstream node.

IRN - At the present time, this error code is used in the IRN to report a read blockage.


Node Recovery Action

The recovery action will be the same as that for the blockage error (_RG_BLKG).
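
The relationship between the three blockage-related codes (_RG_BLKG, _RG_ROPF, and _RG_RINH) can be sketched as a simple decision. The function name, its inputs, and the order in which the checks are made are assumptions for illustration; the node software may evaluate these conditions differently.

```python
def select_blockage_code(bytes_in_memory, bytes_with_bad_parity):
    """Pick the code reported after blockage recovery (check order is an assumption).

    bytes_in_memory: number of bytes the recovery procedure found in NP memory.
    bytes_with_bad_parity: how many of those bytes failed the parity check.
    """
    if bytes_in_memory == 0:
        # No data reached memory: the reporting node caused the blockage.
        return "_RG_RINH"
    if bytes_with_bad_parity > 0:
        # The RAC tried to transmit bad parity to the downstream node.
        return "_RG_ROPF"
    # Data was recovered with good parity: ordinary blockage.
    return "_RG_BLKG"


print(select_blockage_code(bytes_in_memory=16, bytes_with_bad_parity=1))
```

On an IRN, a read blockage is reported with _RG_RINH as well, since no data is read into NP memory in that case.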

Variable Data


RAC status ports from the faulty ring.



ROP Data

READ INHIBIT ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)



aabb - not used.

ccdd - not used.



RAC ERROR FLAGS

IRN ERROR

PRPBLK Propagate Blockage







nnpp - not used.



ss - not used.




wwxx - not used.



NOTE: This status port information is from the RAC that is used to transmit the error report. For this particular error type, the status is always from the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.

Excessive Ring Command Interrupts

Error Code

_RG_XHCMD

Description

In order to detect and recover from problems caused by certain types of circulating hardware control messages, ring error interrupts generated by hardware control message execution are counted and thresholded at IRN RPCs. This report indicates that the number of these ring command interrupts generated at the reporting IRN RPC has exceeded a threshold. A leaky bucket thresholding technique is used to determine when the number of interrupts is excessive: a count of ring command events is incremented during processing of ring error interrupts, and decremented on each 10 ms clock interrupt. After incrementing the leaky bucket count, the ring error interrupt handler compares the count against a pre-defined threshold; if the count has exceeded the threshold, a circulating hardware control message is assumed to be the cause. The leaky bucket count increment, decrement, and threshold are parameters defined in header file rg.ear.h. Two separate thresholds are defined: one for use during the normal RPC operational state (RPCS4), and one for use during the RPC initialization and ring maintenance states (RPCS2 and RPCS3).

This error condition is most likely an indication of a circulating broadcast type hardware control message, one of the “nonlethal” control types that do not quarantine or NP reset the affected nodes. A circulating nonbroadcast RAC reset message will also generate ring command interrupts in this way. A less likely cause is faulty ring interface hardware that generates an unclearable ring command interrupt. Refer to the contents of RAC status port D for an indication of the type of hardware control command that generated the excessive interrupt activity.
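
The leaky bucket technique described above can be sketched as follows. The class and method names are hypothetical, the numeric parameters in the usage lines are made-up values (the real ones are defined in rg.ear.h and are not reproduced here), and clamping the count at zero on decrement is an assumption.

```python
class LeakyBucket:
    """Leaky bucket counter for ring command interrupts (illustrative sketch)."""

    def __init__(self, increment, decrement, threshold):
        self.increment = increment
        self.decrement = decrement
        self.threshold = threshold
        self.count = 0

    def on_ring_command_event(self):
        """Called while processing a ring error interrupt; True if the count is excessive."""
        self.count += self.increment
        return self.count > self.threshold

    def on_clock_tick(self):
        """Called on each 10 ms clock interrupt; the count never drops below zero (assumed)."""
        self.count = max(0, self.count - self.decrement)


bucket = LeakyBucket(increment=4, decrement=1, threshold=10)  # made-up parameters
bucket.on_ring_command_event()   # count = 4
bucket.on_ring_command_event()   # count = 8
bucket.on_clock_tick()           # count = 7
print(bucket.on_ring_command_event())  # count = 11, exceeds the threshold
```

A node following the text would keep one threshold for RPCS4 and another for RPCS2 and RPCS3; in this sketch that amounts to constructing the bucket with the threshold for the current state.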


Node Recovery Action

As indicated above, only IRN RPC nodes detect and report this error condition. When the condition is detected, the RPC takes a recovery action designed to halt and destroy circulating hardware control messages: propagate inhibit is set on both rings, and after a time delay to allow the circulating message to traverse the ring and return, inhibit input is set on both RACs and both RACs are reset. If these recovery actions do not clear all errors on the interrupting ring, the error interrupt on that ring is disabled. After this recovery action has been completed, the problem is reported to the 3B21D.

Variable Data


RAC status ports from the interrupting ring, prior to the node recovery actions.


RAC status ports from the interrupting ring, after the node recovery actions have been completed.

ROP Data

EXCESSIVE RING CMD INTERRUPTS DETECTED, RPCNXX YY RAC (0/1)



aa - RPC node state.

bb - a flag to indicate whether the leaky bucket counter’s value incremented past the threshold over a span of multiple ring error interrupts (flag = 01) or entirely during the processing of one ring error interrupt (flag = 00).

ccdd - value of ring cmd interrupt leaky bucket counter, after it was incremented and found to exceed the counter threshold.

ee - leaky bucket counter increment, on each ring command event.

ff - leaky bucket counter decrement, on each 10 ms clock tick.

gghh - leaky bucket counter threshold.

jj - node type, 3 = IRN (should always indicate IRN).

kk - port C, interrupting ring (prior to recovery actions).

ll - port B, interrupting ring (prior to recovery actions).

mm - port A, interrupting ring (prior to recovery actions).

nnpp - not used.

qq - port E, interrupting ring (prior to recovery actions).

rr - port D, interrupting ring (prior to recovery actions).

ss - not used.

tt - port C, interrupting ring (after recovery actions).

uu - port B, interrupting ring (after recovery actions).

vv - port A, interrupting ring (after recovery actions).

wwxx - not used.

yy - port E, interrupting ring (after recovery actions).

zz - port D, interrupting ring (after recovery actions).

Token Removed from Ring

Error Code

_RG_RDTOKEN


Description

The opns module has determined this node removed the token from the ring. The token was taken from the ring as if there was a legitimate destination address match. The node message switch was delivering messages from the ring buffers and a message destined for the _TOKEN channel was encountered. There are no ring status ports to check; this is purely a software decision.

However, if the token was actually removed from the ring, the INACT bit in the RAC status information may be set.


Node Recovery Action

The node takes no recovery action; it only reports the error by sending an error message on both rings to the home RPC.

Variable Data


The header of the suspected token.

ROP Data

DEQUEUED TOKEN DETECTED (LN/RPCN)XX YY RAC (0/1)



aabbccdd eeffgghh - The header of the suspected token.


kkllmm - not used.

nnppqqrr - not used.

ssttuuvv - not used.

wwxxyyzz - not used.

Source Match Error

Error Code

_RG_SRCM

Description

The node placed a message on the ring, but the destination node was not able to remove the message from the ring. The message traveled completely around the ring and returned to the source node.

The ring hardware generates a ring error interrupt when a source match occurs. However, the ring error interrupt handler simply clears the error, letting the node message switch software determine which messages arriving from the ring are source matches. The node message switch software declares a source match when a message arrives from the ring for which all of the following are true: (a) the 12-bit source address field contains an address matching the node’s physical address, (b) the source ring ID bit field matches the ring the message was received on, and (c) the 12-bit destination address field does not contain an address matching the physical address of the node. This software definition of source match lets a node send a message to itself.

The hardware generates a source match interrupt under the following conditions:

IRN - A source match error interrupt is generated when a message arrives with a source address that matches the node’s physical address, and a destination address that does not match the node’s physical or virtual address.
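
The software definition of a source match, conditions (a) through (c) above, reduces to a three-part predicate. The sketch below uses hypothetical parameter names and takes the already-extracted header fields as integers; how those fields are packed in the IMS header is not shown here.

```python
ADDRESS_MASK = 0xFFF  # source and destination addresses are 12-bit fields


def is_software_source_match(src_addr, src_ring, dst_addr, node_addr, rx_ring):
    """Apply conditions (a), (b), and (c) of the node message switch definition."""
    source_is_this_node = (src_addr & ADDRESS_MASK) == (node_addr & ADDRESS_MASK)  # (a)
    same_ring = src_ring == rx_ring                                                # (b)
    dest_is_other_node = (dst_addr & ADDRESS_MASK) != (node_addr & ADDRESS_MASK)   # (c)
    return source_is_this_node and same_ring and dest_is_other_node


# A message this node sent to another node has come all the way around ring 0.
print(is_software_source_match(0x123, 0, 0x456, node_addr=0x123, rx_ring=0))
```

Condition (c) is what lets a node send a message to itself: when the destination equals the node's own physical address, the predicate is false and the message is delivered normally.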


Node Recovery Action

IRN - The source match error latch is cleared.

Variable Data


The header from the message that caused the source match.

ROP Data

RMV (LN/RPCN)XX YY; SRC MATCH RPTD BY (LN/RPCN)AA BB

0xaabbccdd 0xeeffgghh (TTTTTTTTTT)

RAC ERROR FLAGS

IRN ERROR

Ring source match


aabb - The source address of the source match message.

ccdd - The control word of the source match message.

ee - The destination function of the source match message.

ff - The source function of the source match message.

gghh - The destination address of the source match message.

Miscellaneous RAC Problem

Error Code

_RG_RACPROB

Description

An error interrupt is generated and the error cannot be cleared. This error is a “catch-all” to handle some unexpected hardware or software condition.

When any error interrupt is generated, the status ports are saved, the errors are cleared, and some recovery action is taken. After the recovery has completed, the status ports are checked again. If errors still exist, the cycle of clearing the error and performing the recovery is repeated. If the number of times the cycle is repeated exceeds a predefined threshold, it is assumed that the error is permanent and the RAC problem is reported to the 3B21D.

It is possible for this error to be caused by a circulating message on the ring.
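
The clear-and-recover cycle described above can be sketched as a bounded retry loop. The function name, the two callables, and the threshold value are hypothetical stand-ins for node software that is not shown in this document.

```python
def handle_error_interrupt(errors_present, clear_and_recover, repeat_threshold):
    """Repeat the clear/recover cycle; return "_RG_RACPROB" if the error looks permanent.

    errors_present: callable that re-reads the status ports and returns True if errors remain.
    clear_and_recover: callable that clears the errors and performs the recovery action.
    repeat_threshold: predefined limit on the number of repeated cycles (value assumed).
    """
    clear_and_recover()              # initial handling of the interrupt
    repeats = 0
    while errors_present():
        repeats += 1
        if repeats > repeat_threshold:
            return "_RG_RACPROB"     # error assumed permanent; report to the 3B21D
        clear_and_recover()
    return None                      # errors cleared; nothing to report


# Usage with a stub whose error never clears.
print(handle_error_interrupt(lambda: True, lambda: None, repeat_threshold=3))
```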


Node Recovery Action

The error is reported and inhibit input is set to prevent the recurring interrupt if it is caused by ring messages. The error interrupt is also disabled.

Variable Data





immemsg.data.specific.un.misc.flsint -

Will be 1 if the problem is a false interrupt; otherwise, it will be 0.

ROP Data

GENERAL RAC ERROR DETECTED (LN/RPCN)XX YY RAC (0/1)



aabb - False interrupt indicator. If this field is 0x1, the problem was a false interrupt generated by a RAC.

ccdd - not used.









nnpp - not used.



ss - not used.




wwxx - not used.



NOTE: This status port information is from the RAC that is used to transmit the error report. For this particular error type, the status is always from the RAC opposite to that on which the error occurred. If the error report was sent by an RPC node, this status information is meaningless.


Unexpected Loss of Token

Error Code

_RG_NOTOKEN

Description

The node is trying to write to the ring, and a timer expired while waiting for the token to arrive at this node. This timing interval is 60 msec. This error is reported only by an RPC and then only when it has attempted to write to the ring.


Node Recovery Action

The node reports the error and the pending write is removed from the write queue.

Variable Data

None.

ROP Data

UNEXPLAINED LOSS OF TOKEN ON aa

aa - RING 0, RING1 or BOTH RINGS.

Checksum Audit Failure

Error Code

_BADTXTCS

Description

The node checksum audit on a text or data section has failed.


Node Recovery Action

The node reports the error but takes no recovery action.

ROP Data

RMV (LN/RPCN)XX YY; NODE CKSUM ERROR

0xaabbccdd 0xeeffgghh (TTTTTTTTTT)

aa - Current audit number.

bb - Accumulated sum.

cc - Not used.

dd - Reference sum.

eeff - Segment that the audit was running in.

gghh - Offset to the beginning of the section that was being audited.

Node Processor Parity Failure

Error Code

_RG_NPPF

Description

This error code should never be reported to the 3B21D, but it is included here for reference. If a node processor parity failure occurs, the node will “panic” but it will not send an error message. If there is bad parity and an attempt is made to send a message, it may create parity errors at the downstream node and cause that node to be removed from service.

If an NP parity error occurs while writing a message to the ring, the write will be terminated. This will chop off the end of the message and cause the downstream node(s) to report a read format error.

The 3B21D will be unaware of the problem until a message destined for the node is returned as a source match.

Design Issues

Some new error codes were created, but they were mapped to existing error codes. These new codes were provided for future use in the 3B21D.

1. Presently, the indication of which ring is at fault is the upper bit of the error code in the error message. Would it be any simpler to dedicate a separate field in the error message for this purpose?


Yes. A spare field in the error message will be assigned to use as the faulty ring indicator. The bit will still be set until the 3B21D code is changed to the new field in the message.

2. The error messages will contain an indication of the type of node that sent

the message.

3. In the current configuration, the error message may contain the RAC ports of both rings. Is this necessary?

The info from the opposite port is not usually used by the 3B21D, but in some cases the additional information in the ROP printout is helpful in the analysis of the problem. For that reason, the opposite port information will be retained whenever possible.

This status information is really data that is obtained from the RAC on which the error message was transmitted. When an error message is sent on both rings, it is not possible to tell which RAC this status belongs to. Should something be added to the error message to indicate which RAC the message was transmitted from? Should this status information be provided when the reporting node is an RPC?

4. Should the orphan byte error be handled separately from the parity error?

Yes. Previously, these errors were grouped together because the recovery action was the same in either case. That is no longer true, so a new error code will be assigned for orphan byte errors. Also, the orphan byte error requires that error messages be sent to all RPCs on the opposite ring.

5. There are three error codes that indicate blockage, _RG_BLKG,

_RG_ROPF, and _RG_RINH. Is it necessary to have all of the error codes?

The ring error analysis in the 3B21D relies on the different error codes to

determine how to recover from the error.

6. What if there is a source match and the destination address of the source match message is a virtual address? How does the 3B21D know which node to remove? Is it going to have to wait until the neighbor audit runs to discover which node is in error?

This is a known hazard associated with using virtual addresses. The source match will not be reported if the destination is a virtual address. The faulty node will not be removed until the neighbor audit runs.

7. At the present time, the recovery strategy for a write format error (_RG_WFMT) sets inhibit input, which will cause the upstream node to see a blockage. Is this overkill? There seem to be a couple of things to consider. Why should we block the ring because one node cannot write? The problem with causing blockage is that all traffic on the ring is lost, and this seems a harsh penalty to pay for a write format error.

It seems logical that we could clean up after the error and try to continue normal operation. The 3B21D could then make the decision whether to remove the node from service.

This is also the case with the input format error (_RG_IFMT). This is really a “read too short” error. Inhibit input is also set when this error occurs.

The final decision was to set inhibit input in the IRN to make it look like an older node.

8. In some error cases in older nodes, inhibit input is set to prevent recurring error interrupts. Will that work in the IRN? Or would it be better to disable the ring interrupt?

The IRN will continue to use inhibit input to prevent recurring interrupts. If it disabled interrupts, it would be difficult for the node to determine when to reenable the interrupt.

9. The error report printed at the ROP presently contains data taken from the error message. This is provided to help analyze the problem. The amount of data printed may change and is subject to the time required to print the message. The time used to print the report affects the total error recovery time.

10. The “loss of token” error message is sent to the 3B21D if the timer times out during a token write or if it times out during a priority write. Should there be separate codes for the different write failures?

The final decision was not to create a new error code.

11. The input format error is really a “read too short” error, so the error code is

changed from _RG_IFMT to _RG_RDTOSHRT.



B

Ring Maintenance Reference Material

Ring Transport Errors

This section provides brief descriptions of the circumstances that are associated with each type of REPT RING TRANSPORT ERROR message. The messages are classified according to the consequences of the errors that the messages report. The REPT RING TRANSPORT ERR/ UNEXPLAINED LOSS OF TOKEN message is listed separately as belonging to a class by itself.

Ring-Related Errors

The following ring transport errors indicate faults that obstruct the transportation of messages on the ring. Such faults usually lead to ring restarts and/or node isolations.

BLOCKAGE

A node’s blockage timer timed out waiting for the downstream node or interframe buffer board (IFB) to accept an offered data byte. The blocked

node will clear the ring by reading all data from the ring, including the token message. It then reports the condition to the 3B20D/3B21D by sending on

the opposite ring a BLOCKAGE Ring Transport Error Message to each RPCN.


RAC OUTPUT PARITY ERROR

A node attempted to transmit bad parity to the downstream node or IFB. Since bad parity is not accepted by the downstream node or IFB, the

transmitting node eventually detects blockage and reads the data with bad

parity into memory as part of the blockage recovery process. Upon recognizing the bad parity, the transmitting node will take the same recovery action as with BLOCKAGE, except that this error is reported instead of BLOCKAGE.

READ INHIBIT ERROR

Blockage occurred during a read or while propagating a message and no data was read into NP memory as part of the blockage recovery process.

The node will take the same recovery action as with BLOCKAGE.

RAC PARITY/FORMAT ERROR

A node reporting this error will not accept data from its upstream neighbor,

thereby forcing the upstream node to detect ring blockage. The following two conditions cause this error. (1) A ring data byte with bad parity has

been offered to the node, and the node recovery action of resampling the data could not clear the error. (If bad parity were due to a transient error,

resampling should clear it.) (2) An “orphan byte” has been offered to the node. An orphan byte condition occurs when a node expects to receive a control byte but is offered another byte instead. The control byte is the first

byte of data in an IMS message. A special signal lead on the ring bus is asserted only during the control byte, thereby allowing the receiving node

to identify the control byte from all other message bytes.

INTERFRAME BUFFER PARITY ERROR

The upstream interframe buffer has detected a ring parity error. The IFB will not accept any more data, thereby forcing blockage in the node upstream from the IFB.

WRITE FORMAT ERROR

Some error occurred while a node was attempting to write a message to the ring. For example, the message may have had a source address that does not match that of the writing node, or the message specified an

improper message length. A node reporting this error will not accept ring data from its upstream node, thereby forcing the upstream node to detect

blockage.

GENERAL RAC ERROR

A “catch-all” error type used to report an unexpected node hardware or

software condition. A node reporting this error will not accept ring data from its upstream node, thereby forcing the upstream node to detect blockage.






DEQUEUED TOKEN

A ring node reports this error when it finds that it has read the token message from the ring. This error is intended to detect failures that cause a

node to inadvertently read data from the ring.

RING INTERFACE FAILURE

During a boot, ring maintenance activity found an RPC’s ring interface to be

faulty.

PIO FAILURE

A Programmed IO operation at an RPCN from the 3B20D/3B21D failed.

RPCN ISOLATION

An RPCN was removed from service due to isolation. The RPCN may or may not be an innocent victim. This condition is reported as a ring transport

error but is actually a status message, since it is a condition imposed upon an RPCN by the 3B20D/3B21D as a result of ring transport error messages it has previously received.

Node-Related Errors

The following ring transport errors indicate faults that prevent the processing and transmission of messages in nodes. They usually lead to node quarantine.

SOURCE MATCH

A ring message returned to the sending node because the destination

node did not remove the message from the ring.

SRC MATCH

This is the same as the SOURCE MATCH error, except the detection was

made by the node audit (NAUD) operation.

NAUD FAILURE

The node audit operation failed in a communication test with a node.

RPCN PANIC

This is a failure condition in RPCN software.

RPCN STATE CHANGE FAILURE

The RPCN failed to confirm that it has followed a 3B20D/3B21D directive to

change into a particular software state during ring maintenance activity.


UNXPCTD STATE CHNG MSG

This is similar to the RPCN STATE CHANGE FAILURE. Without having been sent a 3B20D/3B21D directive, an RPCN reported that it has

changed into a particular software state.

RING WRITE FAILURE

An RPCN reported that it failed to write a message to the active ring.

MSG RELAY FAILURE

This is similar to the RING WRITE FAILURE. An RPCN failed in relaying a message from the 3B20D/3B21D onto one of the rings during ring

maintenance activity.

RING READ FAILURE

An RPCN reported that it failed to read a message from the active ring.

UNXPCTD SET QUA

The 3B20D/3B21D received an unprovoked confirmation from an RPCN that it has been directed to quarantine itself.

RAC CONTROL FAILURE

During ring maintenance activity, the ability of the 3B20D/3B21D to control an RPC’s ring access circuit (RAC) failed.

Errors Without Consequences

The following ring transport errors cause no system action other than a report.

TRANSIENT RAC ERROR

A ring data byte with bad parity was offered to the node and node recovery action of resampling the data cleared the error. Had the error not been

cleared, a RAC PARITY/FORMAT ERROR would have been reported. If occurrences of this error exceed a specified rate, a RAC PARITY/FORMAT ERROR will be reported and the node will be isolated.

READ FORMAT ERROR

A node read a message that was shorter than the length indicated in the message header, but at least the length of an IMS header (8 bytes). The

received message is discarded. IUNs will not report this error if it occurs on a broadcast message, but RPCNs will.






READ TOO SHORT ERROR

A node read a message that was shorter than an IMS header (8 bytes). The partial message header is discarded.
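
The three entries above can be illustrated with a short sketch. It is not node firmware: the function and class names are hypothetical, and the rate limit used for escalation (`max_errors` occurrences within `window` seconds) is an assumption, since the text says only "a specified rate". The 8-byte IMS header length and the IUN/RPCN reporting difference are taken from the descriptions above.

```python
from collections import deque

IMS_HEADER_LEN = 8  # bytes, as stated in the error descriptions


def classify_read(received_len, header_len_field=None):
    """Classify a received message by its length.

    received_len: number of bytes actually read from the ring.
    header_len_field: length claimed in the message header, or None
    when the header could not be read.
    """
    if received_len < IMS_HEADER_LEN:
        return "READ TOO SHORT ERROR"
    if header_len_field is not None and received_len < header_len_field:
        return "READ FORMAT ERROR"
    return "OK"


def reports_read_format_error(node_type, is_broadcast):
    """IUNs do not report the error on broadcast messages; RPCNs do."""
    return not (node_type == "IUN" and is_broadcast)


class TransientErrorMonitor:
    """Escalate transient parity errors that occur too often.

    max_errors and window are hypothetical values chosen for the sketch.
    """

    def __init__(self, max_errors=3, window=60.0):
        self.max_errors = max_errors
        self.window = window
        self._times = deque()

    def record(self, now):
        """Record one transient error at time `now` (seconds)."""
        self._times.append(now)
        # Forget occurrences that have aged out of the window.
        while self._times and now - self._times[0] > self.window:
            self._times.popleft()
        if len(self._times) > self.max_errors:
            return "RAC PARITY/FORMAT ERROR"
        return "TRANSIENT RAC ERROR"
```

For example, `classify_read(12, 20)` yields READ FORMAT ERROR because 12 bytes is a complete header but less than the 20 bytes the header claims, while `classify_read(4)` yields READ TOO SHORT ERROR.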


Unexplained Loss of Token

RPCNs have reported loss of token to the 3B20D/3B21D, but no node has

reported another ring transport error type to identify the cause or location of the ring problem.
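
As a quick reference, the classification used in this section can be written as a lookup table. The sketch below only restates the groupings given above; the structure and the function name `error_class` are illustrative and are not part of any ECP software.

```python
# Error names copied from the section above, grouped by class.
CLASSES = {
    "ring-related": (
        "BLOCKAGE", "RAC OUTPUT PARITY ERROR", "READ INHIBIT ERROR",
        "RAC PARITY/FORMAT ERROR", "INTERFRAME BUFFER PARITY ERROR",
        "WRITE FORMAT ERROR", "GENERAL RAC ERROR", "DEQUEUED TOKEN",
        "RING INTERFACE FAILURE", "PIO FAILURE", "RPCN ISOLATION",
    ),
    "node-related": (
        "SOURCE MATCH", "SRC MATCH", "NAUD FAILURE", "RPCN PANIC",
        "RPCN STATE CHANGE FAILURE", "UNXPCTD STATE CHNG MSG",
        "RING WRITE FAILURE", "MSG RELAY FAILURE", "RING READ FAILURE",
        "UNXPCTD SET QUA", "RAC CONTROL FAILURE",
    ),
    "no consequences": (
        "TRANSIENT RAC ERROR", "READ FORMAT ERROR", "READ TOO SHORT ERROR",
    ),
    "unexplained": ("UNEXPLAINED LOSS OF TOKEN",),
}

# Usual system reaction for each class, as described in the text.
USUAL_CONSEQUENCE = {
    "ring-related": "ring restart and/or node isolation",
    "node-related": "node quarantine",
    "no consequences": "report only",
    "unexplained": "no cause or location identified",
}


def error_class(name):
    """Return the class of a ring transport error name, or 'unknown'."""
    key = " ".join(name.upper().split())  # normalize case and spacing
    for cls, names in CLASSES.items():
        if key in names:
            return cls
    return "unknown"
```

A call such as `USUAL_CONSEQUENCE[error_class("SRC MATCH")]` returns the usual reaction for that message class.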

Some IMS Input Messages

The following tables identify commonly used versions of some IMS input messages. In these tables, as elsewhere in this document, the following conventions are used: in the expression NODEa b, substitute RPCN, IUN, or LN for NODE, substitute the 2-digit group number for a, and substitute the 2-digit member number for b. For a complete listing of all IMS input messages and their variations, consult the 401-610-055 FLEXENT™/AUTOPLEX® Wireless Networks INPUT MESSAGES Message Manual or the 401-610-057 FLEXENT™/AUTOPLEX® Wireless Networks OUTPUT MESSAGES Manual.

Table B-1. Some Versions of the RST Input Message

Message                              Result

RST:NODEa b                          Restores the specified node conditionally.

RST:NODEa b:TLP                      Restores the specified node conditionally and
                                     executes the Trouble Locating Procedure, thus
                                     generating at the conclusion of a failed
                                     diagnostic a list of circuit packs suspected
                                     of being faulty.

RST:NODEa b;UCL                      Restores the specified node unconditionally.

CFR:RING                             Returns all eligible, isolated nodes to the
                                     active ring.

CFR:RING ,NODEa b                    Returns the specified, isolated node, if it is
                                     eligible, to the active ring.

CFR:RING ,NODEa b ,NODEa b           Returns the specified range of nodes, if they
                                     are eligible, to the active ring.

CFR:RING ,NODEa b ,NODEa b;INCLUDE   Returns the specified range of nodes, if they
                                     are eligible, to the active ring.

CFR:RING ,NODEa b;EXCLUDE            Isolates the specified node, if it is eligible.

CFR:RING ,NODEa b NODEa b;EXCLUDE    Isolates the specified range of nodes, if they
                                     are eligible.

CFR:RING,NODEa,b;MOVFLT              Moves the indication of a faulty ring interface
                                     from the currently isolated node to the node
                                     identified as NODEa,b and causes the isolation
                                     to shift so that NODEa,b becomes the newly
                                     isolated node and the currently isolated node
                                     becomes the EISO or BISO node. See “Manual
                                     Recovery from a Hard Fault on a Small Ring” in
                                     Chapter 3, Ring Maintenance.

Setting the ECD Flag for Manual Ring Mode

The manual ring mode flag is field 22 of the UNODS 0 UCB form. The following is the procedure to set/reset this flag:

1. After executing the trbegin form, enter the form name ucb on the forms selection page.

2. ODIN will display the database operation page and request the action desired. Enter u to indicate a form update is required.

3. ODIN will display page 1 of the UCB form and position the cursor at field 1.

4. Advance the cursor to field 3 by depressing the <CR> key twice.

5. Enter UNODS in field 3. Advance the cursor to field 4 by depressing the <CR> key, then enter 0 (zero) in field 4.

6. If the form name is found, ODIN will display the current values in the ECD for each field for page 1 of the form. A prompt for the next operation desired will appear at the lower portion of the screen.

7. Enter 2 in response to move to page 2 of the UCB form. ODIN will display page 2 of the form, and another operation prompt will appear.

8. Enter c in response to indicate that a field is to be changed. ODIN will then prompt for the field number.

9. Enter 22 in response to specify field 22 (the equippage field). ODIN will position the cursor at field 22.

10. Change the value of field 22 as follows:

    - Enter 0x8 at the beginning of the manual ring initialization to set the flag.

    - At the completion of the manual ring initialization, after the ring is stable, reset the flag.

    ODIN will prompt for the next field to be changed.

11. Depress the <CR> key to indicate that no other changes are desired on the page. ODIN will again display the operations prompt at the lower portion of the screen.

12. Enter u in response to update the form and inform ODIN that no other changes are required for this session.

13. The message FORM UPDATED will flash once at the upper right of the screen when the form is updated. ODIN will then return to page 1 of the form.

14. Return to the forms selection page by depressing the < key, and execute the TREND form.

ECD Values for Interframe Buffers

For interframe buffers that are upstream of RAC 0, set bits 0-3 of the ECD UCB HV field to the following values:

VALUE BUFFER TYPE

0 no IFB

1 TN918 (unpadded)

2 TN915 (padded 512 byte capacity)

3 TN1507 (fiber 256 byte capacity)


5 TN1508 (fast unpadded 16 byte capacity)

6 TN1509 (fast 4104 byte capacity)


For interframe buffers that are upstream of RAC 1, set bits 4-7 of the ECD UCB HV field to the following values:

VALUE BUFFER TYPE

0 no IFB

1 TN918 (unpadded)


3 TN1507 (fiber 256 byte capacity)


5 TN1508 (fast unpadded 16 byte capacity)

6 TN1509 (fast 4104 byte capacity)
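
The two 4-bit codes share a single HV field value, so the arithmetic can be sketched as below. This is an illustration only. It assumes that bit 0 is the least significant bit and that the HV field can be treated as an 8-bit integer; neither is stated in this section. The function names are hypothetical, and the accepted codes are copied from the two tables exactly as they appear above.

```python
# Buffer codes for the IFB upstream of RAC 0 (bits 0-3), from the first table.
RAC0_IFB_CODES = {
    0: "no IFB",
    1: "TN918 (unpadded)",
    2: "TN915 (padded 512 byte capacity)",
    3: "TN1507 (fiber 256 byte capacity)",
    5: "TN1508 (fast unpadded 16 byte capacity)",
    6: "TN1509 (fast 4104 byte capacity)",
}

# Buffer codes for the IFB upstream of RAC 1 (bits 4-7), from the second
# table, which does not list value 2.
RAC1_IFB_CODES = {k: v for k, v in RAC0_IFB_CODES.items() if k != 2}


def pack_hv(rac0_code, rac1_code):
    """Combine the two buffer codes into one HV field value."""
    if rac0_code not in RAC0_IFB_CODES:
        raise ValueError(f"code {rac0_code} is not listed for RAC 0")
    if rac1_code not in RAC1_IFB_CODES:
        raise ValueError(f"code {rac1_code} is not listed for RAC 1")
    # Assumption: bit 0 is the least significant bit.
    return (rac1_code << 4) | rac0_code


def unpack_hv(hv):
    """Split an HV field value into (RAC 0 code, RAC 1 code)."""
    return hv & 0x0F, (hv >> 4) & 0x0F
```

Under these assumptions, a TN1507 upstream of RAC 0 and a TN1509 upstream of RAC 1 would give `pack_hv(3, 6)`, which is 0x63.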



Abbreviations

For definitions of terms used in this acronym list, see the Glossary or consult the Index for text references.

Numerics

3B20D

AT&T 3B20 Duplex Real Time Reliable computer

3B21D

A new version of the existing 3B20D processor

5ESS®

Registered trademark of Lucent Technologies for its premier electronic switching system

A

ACCH

Associated control channel

ACDN

Administrative Call Processing/Database Node

ACT

Active state

ACTS

Automated Cellular Test System

ACU

Analog conversion unit

AIF

Antenna Interface Frame (Series II Cell)

AMA

Automatic Message Accounting

AMASE

Automatic Message Accounting Standard Entries

AMPS

Advanced Mobile Phone Service

AP

Attached Processor - Another name for the Ring Application/Attached Processor.



ATP

All Tests Passed

AUTOPLEX

AT&T Registered Trademark for its Cellular Switching Systems

AutoPACE

Performance Analysis and Cellular Engineering

B

BBA

Bus Interface Unit + Baseband Combiner & Radio + Analog Conversion Unit (BIU+BCR+ACU)

BCR

Baseband Combiner & Radio

BER

Bit Error Rate

BIU

Bus Interface Unit

BWM

Broadcast Warning Message

C

CCC

CDMA Cluster Controller

CCCEQ

CDMA Cluster equipage form

CCFDB

Custom Calling Features Database

CCU

CDMA Channel Unit

CDMA

Code Division Multiple Access

CDN

Call Processing/Database Node

CDN-II

Call Processing/Database Node - II





CDN-IIX

Call processing/database node - IIX

CE

Channel Element

CELLDB

Cell Site Database

CEQCOM1

Series I Cell Equipage Common form

CEQCOM2

The Series II Cell Equipage RC/V Form

CEQFACE

Cell Equipage Face

CGSADB

Cellular Geographic Service Area Database

CNI

Common Network Interface

CNI/IMS

Common Network Interface/Interprocess Message Switch

CPI

Communication processor interface

CPU

Core processor unit

CSC

Cell Site Controller

CU

Control unit

D

DAT

Digital Audio Tape

DCCH

Digital Control Channel

DCI

Dual-Serial Channel (DSCH) Computer Interconnect

DCS

Digital Cellular Switch



DCSDB

Digital Cellular Switch Database

DFI

Digital Facility Interface

DRTU

Digital Radio Test Unit

DRU

Digital Radio Unit

DS-1

Digital Signal level 1

DS0

Digital Signal-0

DSN

Digital Switch Node

E

EA

Emergency Action Page

EA/NORM

Emergency Action/Normal Display Key on MCRT

ECD

Equipment Configuration Database

ECP

Executive Cellular Processor

ECPC

ECP Complex

ECPDB

Executive Cellular Processor Database

F

FAF

Feature Activation File

FDMA

Frequency Division Multiple Access





FER

Frame Error Rate

G

GPS

Global Positioning System

H

HO

Handoff

Hz

Hertz

I

IMS

Interprocess Message Switch

IRN

Integrated Ring Node

IRN2

Integrated ring node version 2



J

K

L

LAF

Linear Amplifier Frame

LAN

Local Area Network

M

MAHO

Mobile Assisted Handoff

MB

Mega Byte

MCRT

Maintenance Cathode Ray Tube/Terminal

MHD

Moving Head Disk

MHz

Megahertz

MSC

Mobile Switching Center (formerly MTSO)

MSO

Multiple Size Option for Subscriber Database

MUFDB

Mobile Unit Features Data Base





N

N/A

Not Applicable

NVM

Non-Volatile Memory

O

OA&M

Operations, Administration & Maintenance

ODA

Office Data Assembler

ODD

Office Dependent Data

OMP

Operations Mgmt Platform, previously Operations and Maintenance Processor

OOS

Out-Of-Service

P

PC

Personal Computer

PM

Plant Measurements

PSTN

Public switched telephone network

PSU

Packet Switching Unit



Q

R

RAM

Random Access Memory

RCC

Radio Control Complex

RCU

Radio Channel Unit

RCV

Recent Change & Verify

RF

Radio Frequency

RFTG

Reference frequency and timing generator

RN

Ring Node

ROP

Read/Receive-Only Printer

RPC

Ring Peripheral Controller (node)

RPCN

Ring Peripheral Controller Node

RTR

Real Time Reliable

RTU

Radio Test Unit

S

SC

Stable Clear





SCSI

Small Computer System Interface

SCT

Synchronous Clock and Tone

SH

Speech Handler

SII

Series II Cell Site

SM

Service Measurements

SMS

Short Message Service

SS7

Signaling System 7

STBY

Standby

SU

Software Update

T

TDMA

Time Division Multiple Access

TEA

Translations Entry Assistant

TRKGRP

Trunk group

TRTU

TDMA Radio Test Unit



U

V

VCSA

Voice Channel Selection Activity

W

WTSC

Wireless Technical Support Center (formerly CTSO)



Glossary

A

Attached Processor (AP)

A circuit pack used with the direct link node (DLN) that provides

expanded storage for added processing capacity on the ring.

B

Basic Error Correction (BEC)

BEC or “Basic” is an algorithm for Level 2 error correction on

signaling links with “short” one-way propagation delay. In normal

operation, BEC ensures correct transfer of message signal units

over CCS7 and CCITT7 signaling links, in sequence and with no

double delivery. Positive acknowledgments indicate correct transfer

of message signal units. Negative acknowledgments request a

retransmission of those signal units because they were received in

a corrupt form.

C

Call Processor/Database Node (CDN)

A CNI node that handles the call processing functions of the

FLEXENT™/AUTOPLEX® Wireless Network Systems. A CDN is a

two-part unit consisting of a node and ring application processor

(RAP). There are several versions of CDNs: CDN, CDN-I, CDN-II,

and CDN-IIx.

CCITT

Consultative Committee International Telegraph and Telephone

(Comite Consultatif International Telegraphique et Telephonique).

An international body that controls the standards of communications protocols.

CDN

Call Processor/Database Node



CDN-I

A CDN that is comprised of an IRN and a 3B15-based computer.

This is sometimes referred to as a SMART Node (SN).

CDN-II

A CDN that is comprised of an IRN2 and an AP30’. This is

sometimes referred to as a Turbo CDN.

CDN-IIx

A CDN that is comprised of an IRN2B and a modified AP30’.

CNCE

CCS Network Critical Events

Common Network Interface (CNI)

A common subsystem software component supplied to various

network components whose primary function is providing CCS

network access and CCS message routing.

Computer Congestion Control

The 3B20D/3B21D computer congestion control feature enables a

craft to reduce real-time congestion by reducing CNI’s activity on

the 3B20D/3B21D computer. If not used by a craft, it remains

inactive.

Critical Node Restore/Monitor

CNI’s critical node monitor looks for configurations of out-of-service

link nodes and direct link nodes (DLNs) that have cut its ring off

from the outside world. To restore these nodes quickly, it tells

Interprocess Message Switch (IMS) to give them a “user critical”

priority on its automatic ring recovery (ARR) priority list. The

monitor also permits its ring’s application to nominate nodes to this

priority.

CSN

Cell Site Node

D

DCS

Digital Cellular Switch

Destination Point Code (DPC)

A unique value associated with every network component that is

used for routing.

Direct Link Node (DLN)

A DLN is basically an RPCN equipped with an AP. A DLN routes the

data link message traffic between cellular systems for both X.25

and SS7 messaging.



Direct Link Node 30 (DLN30)

The DLN30 has IRN2B, AP30, 3BI, and DDSBS boards. The

IRN2B board provides increased performance and higher reliability.

Direct Link Node Enhanced (DLNE)

The DLNE has IRNB, AP30, 3BI and DDSBS boards.

DSN

Digital Switch Node

E

EAI

Emergency Action Interface

EAR

Error Analysis and Recovery

Extended Access Links (E-Links) and Full Point Code Routing (FPCR)

The ELINKS/FPCR features allow LECs to achieve the following

benefits in their networks: provides additional routes to destinations

which further minimizes signaling end point (SEP) isolation; forces

traffic to be directly routed (thus using fewer intermediate STPs) to

more efficient and less problematic routes which improves network

performance; and allows switching traffic between Access Links

(A-Links) and E-Links which makes network reconfiguration easier.

F

Full Process Initialization (FPI)

FPI will reduce failed and abandoned initializations. It is a faster and

more reliable initialization response than the “abort and boot”

initialization.



G

H

I

ICN

Inter-Cellular Node

IFB

Interframe Buffer Board

IMS User Node (IUN)

An IMS-provided node on the ring that, with the addition of CNI

hardware, provides an interface between the ring and the

transmission facility. This includes all non-RPCNs.

Integrated Ring Node (IRN)

A ring node that uses very large scale integration to combine the

node processor and both ring interfaces into one circuit pack. There

are several versions of the IRN referenced in this document: the

IRN (UN303), the IRNB (UN303B), the IRN2 (UN304), and IRN2B

(UN304B). Functionally, they all serve the same purpose, but

different IRN versions are used in different node types.

Interprocess Message Switch (IMS)

A common subsystem software component that provides a ring

based interfunction, interprocessor transport mechanism.

IUN Init with Optional Pump

This restores the node without repumping the node. It increases the

system’s availability through reduced down time.



J

K

L

LI

Link Interface

LIN-E

Link Interface Node - Encrypted

LIN-NE

Link Interface Node - Nonencrypted

Link Node (LN)

A node on the ring where digital information enters from or exits to

the transmission facility.

M

MCRT

Maintenance Cathode Ray Tube

MDL

Memory Data Link

Message Switch

The portion of the IMS software that handles the sending and

receiving of internal messages. There are portions of the message

switch in all ring nodes and in the central processor.

Message Transfer Part (MTP)

The functional part of CCS7 that transfers signaling messages as

required by all the users and also performs the necessary

subsidiary functions (for example, error control and signaling

security).



N

Network Interconnect (NI)

NI is used to interconnect signaling points in different North

American networks which adhere to the ANSI standard

specifications for the CCS7 protocol. It provides: MTP and SCCP

routing to PCs in nonlocal networks, SNM and SCMG for nonlocal

network PCs, administration of the associated nonlocal network

routing data, new routing types to support routing to small networks

and cluster-level-only routing to populated clusters, and NID only

routing.

Node Processor (NP)

The NP is the central processing unit (CPU) portion of a ring node.

It controls and schedules the processes in the ring node.

Nonlocal Point Code

Any signaling point code which has a network identifier value that is

different from the network identifier value of the local point code.

NRM

Node Recovery Monitor

O

Offline Boot (OFLBOOT)

The OFLBOOT feature allows the 3B20D/3B21D duplex processor

of a 5ESS-2000 switch to be logically separated into two simplex machines: the ONLINE side and the OFFLINE side. This allows

personnel at a 5ESS-2000 switch to cut over to a new software

release with a minimum of downtime.

P

Peripheral Routing

Provides the capability to do CCS7 MTP and SCCP routing in a

node on the ring.

Preventive Cyclic Retransmission (PCR)

PCR is an algorithm for Level 2 error correction on CCS7 or

CCITT7 signaling links with a “long” one-way propagation delay.

Each message signal unit must be retained at the transmitting

signaling link terminal until a positive acknowledgement arrives



from the receiving signaling link terminal. During the period when

there are no new signal units to be transmitted, all the signal units

which have not yet been positively acknowledged are retransmitted

cyclically.

Protected Applications Segment (PAS)

CNI data that rarely changes is referred to as static data, and is

preserved in the protected applications segment area of 3B

memory. CNI can reuse this data from PAS during CNI init level 2,

saving time that would have been wasted downloading the data

from disk. To ensure PAS data is safe, it must be protected from

accidental writes. For this purpose, CNI has improved protection of

the PAS area.

Q

R

Ring

Refers collectively to the RPCNs and IUNs which are serially

connected to one of two circular busses. The ring provides 4

megabyte data paths in both directions between adjacent nodes

and can uniquely address up to 1,024 nodes.

Ring Application Processor

A modified 3B15 computer used in the standard multiapplication

real time node that performs processing on the ring.

Ring Configuration

For various reasons, the ring is reconfigured under control of the

3B20D/3B21D computer to isolate the faulty segment.

Ring Generic Access Package (RGRASP)

RGRASP is a debugging tool for CNI ring nodes.

Ring Interface (RI)

A RI is one of two circuits in a ring node that interfaces the node

processor to the ring. Each RI can access either ring 0 or ring 1 to

insert messages onto, or remove messages from, the active ring.

The heart of the circuit is a first-in first-out (FIFO) buffer that

provides access to the ring yet allows messages to circulate in the

ring independent of the node.

Ring Isolation

A ring configuration where ring nodes are isolated from the active

ring.



Ring Peripheral Controller Node (RPCN)

A node on the ring where digital information is removed from the

ring for transferral to the 3B20D/3B21D computer for processing or,

after processing, reenters the ring.

S

Signaling Connection Control Part (SCCP)

An adjunct to the MTP layer of CCS7 which performs interpoint

code subsystem status.

Signaling End Point (SEP) Dual Point Code (DUALPC)

The DUALPC feature allows Signaling End Points (SEPs) to

support a two point code assignment to facilitate the change of the

point code for resectoring of the SEP with minimal Signaling

System Number 7 (SS7) service disruption.

SMART Node (SN)

Standard Multi-Application Real Time node. See CDN-I.

SS7

Signaling System 7

T

Turbo CDN

See CDN-II.



U

V

W

WTSC

Wireless Technical Support Center

X

Y

Z



Index

A

About this document, xv

comments, xix

Automatic Diagnostics and Restorals, 6-55

C

Circuit Pack Trouble Location, 6-24

D

Diagnostic Listings, 6-41

Diagnostic Message Structure, 6-6

E

Equipment Description, 7-1

G

Global Positioning System, AC-5

H

Handling Precautions, 7-1

Hardware and Interfaces, 6-2

I

Interactive Diagnostics, 6-70

IRN CDN-I Diagnostic Phases, 6-18

IRN DLNE Node Diagnostic Phases, 6-14

IRN LN (LI4S/SS7) Node Diagnostic Phases, 6-12

IRN LN (LIN-E/SS7) Node Diagnostic Phases, 6-11

IRN2 CDN-II/CDN-IIx Diagnostic Phases, 6-20

IRN2 CDN-III Diagnostic Phases, 6-22

IRN2 DLN30 Node Diagnostic Phases, 6-15

IRN2 DLN60 Node Diagnostic Phases, 6-17

L

LNs with Unequipped LI Boards - MV Updates, 6-42

M

Manual (Unit) Diagnostics, 6-56

Manual Diagnostics Using the 1106 Display Page, 6-59

Manual Diagnostics Using the DGN Command, 6-61

N

Node Diagnostic Phases

IRN CDN-I, 6-18

IRN DLNE, 6-14

IRN LN (LI4S/SS7), 6-12

IRN LN (LIN-E/SS7), 6-11

IRN2 CDN-II/CDN-IIx, 6-20

IRN2 CDN-III, 6-22

IRN2 DLN30, 6-15

IRN2 DLN60, 6-17

Node Phase Descriptions, 6-9


O

Operating System Diagnostics, 6-75

P

Performing Diagnostics, 6-6

Power Packs and Fusing, 7-2

R

RAP Diagnostic Firmware, 6-69

Ring Node Addressing, 6-43

S

System Diagnostics, 6-8

System Maintenance Interfaces, 6-5

U

Unexplained Loss of Token, B-5
