25
Volume 29, Issue 1 | Winter 2016 ISSN 1709-2574 andrewjohnpublishing.com Physical Site Security Safety Checks Enhanced Simulation, Social Media and Multi-Jurisdictional Training among Top New Trends

Volume 29, Issue 1 | Winter 2016 - Calian · PDF fileVolume 29, Issue 1 | Winter 2016 ... work with our allied associations; iCERT, NENA Ontario, ... If you have a presentation that

  • Upload
    builiem

  • View
    213

  • Download
    0

Embed Size (px)

Citation preview

Volume 29 Issue 1 | Winter 2016ISSN 1709-2574 andrewjohnpublishingcom

Physical Site Security

Safety Checks

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends

Volume 29 Issue 1| Winter 2016

Contents

Wavelength is published four times per year by Andrew John Publishing Inc with offices at 115 King St W Dundas ON Canada L9H 1V1 We welcome editorial submissions but cannot assume responsibility for commitment for unsolicited material Any editorial materials including photographs that are accepted from an unsolicited contributor will become the property of Andrew John Publishing Inc

FEEDBACK We welcome your views and comments Please send them to Andrew John Publishing Inc 115 King St W Dundas ON Canada L9H 1V1 Copyright 2015 by Andrew John Publishing All rights reserved Reprinting in whole or in part is forbidden without express written consent from the publisher

Publication Agreement Number 40025049 | ISSN1709-2574

Return undeliverable Canadian addresses to

115 King St W Dundas ON Canada L9H 1V1

Wavelength Editorial Board

Editor-in-ChiefTheresa Virgin theresavirginapcoca

Technical EditorRonald Williscroft ronwilliscroftapcoca

LeadershipSupervisionManagement EditorsRyan Lawson ryanlawsonapcoca

Robert Stewart robertstewartapcoca

Training EditorJoel MacDonald joelmcdonaldapcoca

Front Line EditorJoel MacDonald joelmcdonaldapcoca

Conference EditorGavin Hayes gavinhayesapcoca

ContributorsMark Graham Gavin Hayes

Daniel Morelos Thomas PuttMike Reschny Ron Williscroft

Theresa Virgin

Managing EditorScott Bryant scottbryantandrewjohnpublishingcom

Art DirectorDesignAmanda Zylstra designstudio19ca

AdvertisingJohn Birkby

Ph 905-628-4309 jbirkbyandrewjohnpublishingcom

Sales amp Circulation CoordinatorBrenda Robinson brobinsonandrewjohnpublishingcom

AccountingSusan McClung

Group PublisherJohn D Birkby jbirkbyandrewjohnpublishingcom

For more infocontact us at

(514) 426-7879salescvdscomwwwcvdscom

Inc

When Reliability is Key

The Communications Recorder

ComLogProduct of CVDS

TM

bull Recording of analogdigitalIP inputs from conventional sources

such as telephones radios consoles etc

bull 4-240 Recording Channels

bull Integrated AMBE+2 IMBE DVSI vocoder for P25 NEXEDGE

bull Search by Date Time Radio ID and other metadata

bull Real-Time Monitor

bull Powerful Incident Recreation

bull NG-9-1-1 Ready

For more infocontact us at

AMBE+2trade and IMBEtrade are trademarks of Digital Voice Systems Inc

TM TM

Volume 29 Issue 1| Winter 2016

Contents

Wavelength is published four times per year by Andrew John Publishing Inc with offices at 115 King St W Dundas ON Canada L9H 1V1 We welcome editorial submissions but cannot assume responsibility for commitment for unsolicited material Any editorial materials including photographs that are accepted from an unsolicited contributor will become the property of Andrew John Publishing Inc

FEEDBACK We welcome your views and comments Please send them to Andrew John Publishing Inc 115 King St W Dundas ON Canada L9H 1V1 Copyright 2015 by Andrew John Publishing All rights reserved Reprinting in whole or in part is forbidden without express written consent from the publisher

Publication Agreement Number 40025049 | ISSN1709-2574

Return undeliverable Canadian addresses to

115 King St W Dundas ON Canada L9H 1V1

Wavelength Editorial Board

Editor-in-ChiefTheresa Virgin theresavirginapcoca

Technical EditorRonald Williscroft ronwilliscroftapcoca

LeadershipSupervisionManagement EditorsRyan Lawson ryanlawsonapcoca

Robert Stewart robertstewartapcoca

Training EditorJoel MacDonald joelmcdonaldapcoca

Front Line EditorJoel MacDonald joelmcdonaldapcoca

Conference EditorGavin Hayes gavinhayesapcoca

ContributorsMark Graham Gavin Hayes

Daniel Morelos Thomas PuttMike Reschny Ron Williscroft

Theresa Virgin

Managing EditorScott Bryant scottbryantandrewjohnpublishingcom

Art DirectorDesignAmanda Zylstra designstudio19ca

AdvertisingJohn Birkby

Ph 905-628-4309 jbirkbyandrewjohnpublishingcom

Sales amp Circulation CoordinatorBrenda Robinson brobinsonandrewjohnpublishingcom

AccountingSusan McClung

Group PublisherJohn D Birkby jbirkbyandrewjohnpublishingcom

DEPARTMENTS4 Message From The Editor-In-Chief By Theresa Virgin

5 Message from the President By Ronald E Williscroft ENP

APCO CANADA NEWS6 The Conference Corner By Gavin Hayes

7 Update and Requalification Required for Instructing APCO Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

FEATURE8 Enhanced Simulation Social Media

and Multi-Jurisdictional Training Among Top New Trends in Emergency Management Sector

By Thomas E Putt MSM LOL CD BMASc

TECHNOVATIONS12 Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks By Mark Graham

TRAINING20 Safety Checks

By Mike Reschny EFD EPD EMD

MANAGEMENT23 Physical Site Security A Back to Basics

Approach By Daniel Morelos

12

8

20

23

3 Wavelength | wwwapcoca

MESSAGE FROM THE EDITOR-IN-CHIEF

Hope you all had a safe and healthy New Year Our thoughts and condolences go out to the community of La Loche Saskatchewan and the emergency service providers that took the calls and responded to the scene and provided subsequent assistance It makes me wonder what the local provincial and federal governments are doing to prevent this type of situation particularly in the remote norther communities of our country

At my house this is the year of the ldquoThe Weddingrdquo My son is getting married in August and as the MOG (Mother of the Groom) I get to participate (albeit more on the sidelines) of this event It takes me back to our wedding and the input my then future mother-in-law gave us With that in mind I have resolutely vowed not to make any comments and to only accept their decisions and support them in any way I can

Weddings (and funerals) bring out the best and the worst in families These are the things that family grudges and feuds start out at and only escalate over time Why is that The decisions made at these things can alter the course of a family historymdashwhy There are advantages to not being invited to a wedding There are advantages for not being picked to be in the wedding party There are advantages to just sit on the sidelines and watch things develop around you Why all the drama If you are invited to the event just enjoy the day and all that it brings If you are not invitedmdashconsider the amount of money you have been saved and what you can do with it Weddings should not be the things that break a family apartmdashthat is just wrong

My goal for this event and their future together is to be the type of mother-in-law I wanted One who would give advice if asked but not offer it unsolicited Provide assistance and support always and enjoy the relationship that my son has with his future wife and be joyous in the fact that they bring out the best in each other I hope that I can live up to this goalmdashtime with tell MOG

Theresa Virgin ENPDirector APCO Canada

Editor-in-Chief

Board o f D i rec to rs

PRESIDENT

RON WILLISCROFT

City of Winnipeg Fire Paramedic Service

ronwilliscroftapcoca

VICE PRESIDENT

ROBERT STEWART

robertstewartapcoca

PAST PRESIDENT

GAVIN HAYES

Halton Regional Police Service (retired)

gavinhayesapcoca

DIRECTOR

RYAN LAWSON

Operations Manager E-Comm

ryanlawsonapcoca

DIRECTOR

THERESA VIRGIN

Durham Regional Police Service (retired)

theresavirginapcoca

DIRECTOR

JOEL MCDONALD

Communications Specialist

City of Lethbridge

joelmcdonaldapcoca

DIRECTOR

CINDY SPARROW

Assistant Deputy Chief of the 9-1-1

Emergency Communications Centre

Red Deer

cindysparrowapcoca

4 Wavelength | wwwapcoca

wwwapcoca | Wavelength 5

MESSAGE FROM THE PRESIDENT

Happy New Year and welcome to 2016 I hope that all of you great holiday season and welcome 2016 for the potential to be a great year for all

As we look towards 2016 I am enthusiastic to continue our work with our allied associations iCERT NENA Ontario CITIG and iCERT (Industry Council of Emergency Response Technology) moving key agendas forward such as NG911 Emergency Telecommunicator Health and Wellness the National Public Safety Broadband Network and go on with our work on LMR Interoperability and P25 standards I also look forward to building new working relationships and explore how we can best maintain our representation of public safety communications professionalsrsquo interests in Canada in this ever changing process with the evolution of new technologies and better industry practices in addressing the needs and expectations of the public

APCO Canada and the other APCO associations and partners will carry on in their collaborative work to ensure that these items and others from abroad are shared to enrich our collective memberships as we venture along this evolutionary path together

I am truly looking forward to this yearsrsquo APCO Canada annual conference and trade show to be held in breathtaking Banff Alberta at the ldquoCastle in The Rockiesrdquo the beautiful Fairmont Banff Springs Hotel Located in the Banff National Park a UNESCO World Heritage Site there will be much to do and see apart from the benefits of attending the fantastic networking and education opportunities as well as the largest public safety communications trade show in Canada We hope to see you there and hope you register early as Irsquom sure the event will fill up quickly more to come from Gavin Hayes in his Conference Corner as things progress

We still have a lot of work ahead of us and many changes and challenges to come Our executive team and I will continue to dedicate our work to ensure that APCO Canada continues to be your ldquoVoice of Public Safety Communications in Canadardquo

Yours in ServiceRonald E Williscroft ENP

President APCO Canada

APCO CANADA NEWS

The Conference CornerAs always I want to again recognize the efforts of the public safety telecommunicators from across this country that have and continue to be involved in major disasters Your tireless dedication to your professions is what makes a difference to Canadians in their time of need You truly are the ldquoFirst First Respondershelliprdquo

Over the past few months I have been sifting through the evaluations that were completed by both delegates and our vendors From the feedback we have received we seem to have met the mark in 2015 but as always we are looking to improve our event to ensure the membership and our supporters are not only getting great value for their money but we are providing a detailed educational event

We have extended our contract with our current conference management company Events by Spark through 2017 Anh and her team have worked tirelessly with me and the board to produce a very tight professional show that people want to attend

I was selected to attend the hosted buyers program at the Tete-a-Tete event in Ottawa last week This event brings travel industry professionals together in one space to network and discover what the various destinations in Canada can give to our association should we wish to hold our event there I attended a full trade show and discussed the various needs we have when producing an event of our size Again these conversations took place to ensure we are getting the best value and service for the money we invest in our national event

APCO Canada has been working with HelmsBricoe a hotel procurement company since 2007 Leanne Calderwood is the global accounts manager who has an in-depth knowledge of our program and is always looking for locations in Canada that could be a ldquofitrdquo for our program I will be heading to Halifax and Charlottetown in late February to conduct a formal site visit of the various spaces that could fit our event

The 2016 event is already in the planning stages and the calls for papers will be listed on the conference site very soon If you have a presentation that would fit into our educational tracks please submit it by the deadline Our event is headed to Banff Alberta this year and is being held at the Fairmont Banff Springs Hotel from November 7-10

Please continue to monitor both the APCO Canada Website and the APCO Canada Conference Website for a number of updates about important information from your board

Gavin R HayesImmediate Past-

President APCO Canada

National Events Coordinator

6 Wavelength | wwwapcoca

wwwapcoca | Wavelength 7

APCO CANADA NEWS

Folks the currently version of Public Safety Telecommunicator 1 Canada Edition has been updated to mirror the changes in the PST 1 7th Edition The following is information for all members who are verified APCO trainers or who have taken the PST1 Course In response to the approval of new standards for Public Safety Telecommunicators (PST) and PST instructors APCO has updated its PST1 and PST1 Instructor courses to meet those standards The new 7th edition courses replace the current 6th edition of the classes

APCO Institute will accept PST 6th Edition paperwork from agency instructors until June 30 2016 The change will require individuals certified in the PST 6th Edition student course to upgrade their certificates to the new edition by December 31 2017

WHAT YOU NEED TO KNOWIf you are certified as a PST1 6th Edition instructorCertification Update CourseBeginning January 13 2016 and running through December 31 2017 current PST1 6th Edition instructors

will be provided with a self-paced online update at no charge Initially instructor update classes will be offered on a weekly basis available 247 Each class will remain open for three weeks and students should anticipate two hours of commitment to complete this update Two CDEs will be provided upon successful completion To register log into APCOIntlorg Training and Certification Training Courses Institute Online registration then scroll down to Public Safety Telecommunicator 1 7th Edition Update Canada or scroll down to Public Safety Telecommunicator 1 Instructor 7th Edition Update

Book ExchangePST1 6th Edition student manuals may be exchanged through December 31 2016 An exchange order form with additional information will be available in the PST1 7th Edition Instructor PSConnect community Cost for the exchange is $20 per manual plus shipping and handling

If you hold a certification for PST1 6th Edition and are not an instructorBeginning January 13 2016 through December 31 2017 a student online update will be available Individuals that received their PST certification in calendar year 2015 can update at no cost Anyone who has successfully completed an APCO basic course prior to 2015 can update to PST1 7th Edition for $30 This update is also accessible 247 and provides two CDEs for successful completion

RECERTIFICATION FOR PST1 7TH EDITIONRecertification for PST1 7th Edition is required every two years (24 hours per certification year) The cost to recertify is $30 The cost is only $15 if you submit multiple recertifications at the same time (ie PST and EMD)

WHATrsquoS NEW IN THE 7TH EDITIONSome of the new topics included in the 7th edition courses includebull 9-1-1 Disaster Preparedness in the PSAPbull Telecommunicator Emergency

Response Taskforces (TERT)bull Quality AssuranceQuality

Improvement programsbull Guide cards the Design and Use in

the PSAPbull NextGen and Other Emerging

Technologies (Module Five ndash all new)bull Record Management Systemsbull Managing SpecialtyHigh-Risk Callsbull The Need for Continuing Education

and Professional Developmentbull The Public Safety Telecommunicator 1

7th Edition meets or exceeds American National Standards as contained in the ANSI approved Minimum Training Standard for Public Safety Telecommunicators (APCO ANS 310322015)

Update and Requalification Required for Instructing Apco Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

For answers to questions or more information contact APCO Institute at 888-272-6911 or instituteapcointlorg

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

Volume 29 Issue 1| Winter 2016

Contents

Wavelength is published four times per year by Andrew John Publishing Inc with offices at 115 King St W Dundas ON Canada L9H 1V1 We welcome editorial submissions but cannot assume responsibility for commitment for unsolicited material Any editorial materials including photographs that are accepted from an unsolicited contributor will become the property of Andrew John Publishing Inc

FEEDBACK We welcome your views and comments Please send them to Andrew John Publishing Inc 115 King St W Dundas ON Canada L9H 1V1 Copyright 2015 by Andrew John Publishing All rights reserved Reprinting in whole or in part is forbidden without express written consent from the publisher

Publication Agreement Number 40025049 | ISSN1709-2574

Return undeliverable Canadian addresses to

115 King St W Dundas ON Canada L9H 1V1

Wavelength Editorial Board

Editor-in-ChiefTheresa Virgin theresavirginapcoca

Technical EditorRonald Williscroft ronwilliscroftapcoca

LeadershipSupervisionManagement EditorsRyan Lawson ryanlawsonapcoca

Robert Stewart robertstewartapcoca

Training EditorJoel MacDonald joelmcdonaldapcoca

Front Line EditorJoel MacDonald joelmcdonaldapcoca

Conference EditorGavin Hayes gavinhayesapcoca

ContributorsMark Graham Gavin Hayes

Daniel Morelos Thomas PuttMike Reschny Ron Williscroft

Theresa Virgin

Managing EditorScott Bryant scottbryantandrewjohnpublishingcom

Art DirectorDesignAmanda Zylstra designstudio19ca

AdvertisingJohn Birkby

Ph 905-628-4309 jbirkbyandrewjohnpublishingcom

Sales amp Circulation CoordinatorBrenda Robinson brobinsonandrewjohnpublishingcom

AccountingSusan McClung

Group PublisherJohn D Birkby jbirkbyandrewjohnpublishingcom

For more infocontact us at

(514) 426-7879salescvdscomwwwcvdscom

Inc

When Reliability is Key

The Communications Recorder

ComLogProduct of CVDS

TM

bull Recording of analogdigitalIP inputs from conventional sources

such as telephones radios consoles etc

bull 4-240 Recording Channels

bull Integrated AMBE+2 IMBE DVSI vocoder for P25 NEXEDGE

bull Search by Date Time Radio ID and other metadata

bull Real-Time Monitor

bull Powerful Incident Recreation

bull NG-9-1-1 Ready

For more infocontact us at

AMBE+2trade and IMBEtrade are trademarks of Digital Voice Systems Inc

TM TM

Volume 29 Issue 1| Winter 2016

Contents

Wavelength is published four times per year by Andrew John Publishing Inc with offices at 115 King St W Dundas ON Canada L9H 1V1 We welcome editorial submissions but cannot assume responsibility for commitment for unsolicited material Any editorial materials including photographs that are accepted from an unsolicited contributor will become the property of Andrew John Publishing Inc

FEEDBACK We welcome your views and comments Please send them to Andrew John Publishing Inc 115 King St W Dundas ON Canada L9H 1V1 Copyright 2015 by Andrew John Publishing All rights reserved Reprinting in whole or in part is forbidden without express written consent from the publisher

Publication Agreement Number 40025049 | ISSN1709-2574

Return undeliverable Canadian addresses to

115 King St W Dundas ON Canada L9H 1V1

Wavelength Editorial Board

Editor-in-ChiefTheresa Virgin theresavirginapcoca

Technical EditorRonald Williscroft ronwilliscroftapcoca

LeadershipSupervisionManagement EditorsRyan Lawson ryanlawsonapcoca

Robert Stewart robertstewartapcoca

Training EditorJoel MacDonald joelmcdonaldapcoca

Front Line EditorJoel MacDonald joelmcdonaldapcoca

Conference EditorGavin Hayes gavinhayesapcoca

ContributorsMark Graham Gavin Hayes

Daniel Morelos Thomas PuttMike Reschny Ron Williscroft

Theresa Virgin

Managing EditorScott Bryant scottbryantandrewjohnpublishingcom

Art DirectorDesignAmanda Zylstra designstudio19ca

AdvertisingJohn Birkby

Ph 905-628-4309 jbirkbyandrewjohnpublishingcom

Sales amp Circulation CoordinatorBrenda Robinson brobinsonandrewjohnpublishingcom

AccountingSusan McClung

Group PublisherJohn D Birkby jbirkbyandrewjohnpublishingcom

DEPARTMENTS4 Message From The Editor-In-Chief By Theresa Virgin

5 Message from the President By Ronald E Williscroft ENP

APCO CANADA NEWS6 The Conference Corner By Gavin Hayes

7 Update and Requalification Required for Instructing APCO Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

FEATURE8 Enhanced Simulation Social Media

and Multi-Jurisdictional Training Among Top New Trends in Emergency Management Sector

By Thomas E Putt MSM LOL CD BMASc

TECHNOVATIONS12 Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks By Mark Graham

TRAINING20 Safety Checks

By Mike Reschny EFD EPD EMD

MANAGEMENT23 Physical Site Security A Back to Basics

Approach By Daniel Morelos

12

8

20

23

3 Wavelength | wwwapcoca

MESSAGE FROM THE EDITOR-IN-CHIEF

Hope you all had a safe and healthy New Year Our thoughts and condolences go out to the community of La Loche Saskatchewan and the emergency service providers that took the calls and responded to the scene and provided subsequent assistance It makes me wonder what the local provincial and federal governments are doing to prevent this type of situation particularly in the remote norther communities of our country

At my house this is the year of the ldquoThe Weddingrdquo My son is getting married in August and as the MOG (Mother of the Groom) I get to participate (albeit more on the sidelines) of this event It takes me back to our wedding and the input my then future mother-in-law gave us With that in mind I have resolutely vowed not to make any comments and to only accept their decisions and support them in any way I can

Weddings (and funerals) bring out the best and the worst in families These are the things that family grudges and feuds start out at and only escalate over time Why is that The decisions made at these things can alter the course of a family historymdashwhy There are advantages to not being invited to a wedding There are advantages for not being picked to be in the wedding party There are advantages to just sit on the sidelines and watch things develop around you Why all the drama If you are invited to the event just enjoy the day and all that it brings If you are not invitedmdashconsider the amount of money you have been saved and what you can do with it Weddings should not be the things that break a family apartmdashthat is just wrong

My goal for this event and their future together is to be the type of mother-in-law I wanted One who would give advice if asked but not offer it unsolicited Provide assistance and support always and enjoy the relationship that my son has with his future wife and be joyous in the fact that they bring out the best in each other I hope that I can live up to this goalmdashtime with tell MOG

Theresa Virgin ENPDirector APCO Canada

Editor-in-Chief

Board o f D i rec to rs

PRESIDENT

RON WILLISCROFT

City of Winnipeg Fire Paramedic Service

ronwilliscroftapcoca

VICE PRESIDENT

ROBERT STEWART

robertstewartapcoca

PAST PRESIDENT

GAVIN HAYES

Halton Regional Police Service (retired)

gavinhayesapcoca

DIRECTOR

RYAN LAWSON

Operations Manager E-Comm

ryanlawsonapcoca

DIRECTOR

THERESA VIRGIN

Durham Regional Police Service (retired)

theresavirginapcoca

DIRECTOR

JOEL MCDONALD

Communications Specialist

City of Lethbridge

joelmcdonaldapcoca

DIRECTOR

CINDY SPARROW

Assistant Deputy Chief of the 9-1-1

Emergency Communications Centre

Red Deer

cindysparrowapcoca

4 Wavelength | wwwapcoca

wwwapcoca | Wavelength 5

MESSAGE FROM THE PRESIDENT

Happy New Year and welcome to 2016 I hope that all of you great holiday season and welcome 2016 for the potential to be a great year for all

As we look towards 2016 I am enthusiastic to continue our work with our allied associations iCERT NENA Ontario CITIG and iCERT (Industry Council of Emergency Response Technology) moving key agendas forward such as NG911 Emergency Telecommunicator Health and Wellness the National Public Safety Broadband Network and go on with our work on LMR Interoperability and P25 standards I also look forward to building new working relationships and explore how we can best maintain our representation of public safety communications professionalsrsquo interests in Canada in this ever changing process with the evolution of new technologies and better industry practices in addressing the needs and expectations of the public

APCO Canada and the other APCO associations and partners will carry on in their collaborative work to ensure that these items and others from abroad are shared to enrich our collective memberships as we venture along this evolutionary path together

I am truly looking forward to this yearsrsquo APCO Canada annual conference and trade show to be held in breathtaking Banff Alberta at the ldquoCastle in The Rockiesrdquo the beautiful Fairmont Banff Springs Hotel Located in the Banff National Park a UNESCO World Heritage Site there will be much to do and see apart from the benefits of attending the fantastic networking and education opportunities as well as the largest public safety communications trade show in Canada We hope to see you there and hope you register early as Irsquom sure the event will fill up quickly more to come from Gavin Hayes in his Conference Corner as things progress

We still have a lot of work ahead of us and many changes and challenges to come Our executive team and I will continue to dedicate our work to ensure that APCO Canada continues to be your ldquoVoice of Public Safety Communications in Canadardquo

Yours in ServiceRonald E Williscroft ENP

President APCO Canada

APCO CANADA NEWS

The Conference CornerAs always I want to again recognize the efforts of the public safety telecommunicators from across this country that have and continue to be involved in major disasters Your tireless dedication to your professions is what makes a difference to Canadians in their time of need You truly are the ldquoFirst First Respondershelliprdquo

Over the past few months I have been sifting through the evaluations that were completed by both delegates and our vendors From the feedback we have received we seem to have met the mark in 2015 but as always we are looking to improve our event to ensure the membership and our supporters are not only getting great value for their money but we are providing a detailed educational event

We have extended our contract with our current conference management company Events by Spark through 2017 Anh and her team have worked tirelessly with me and the board to produce a very tight professional show that people want to attend

I was selected to attend the hosted buyers program at the Tete-a-Tete event in Ottawa last week This event brings travel industry professionals together in one space to network and discover what the various destinations in Canada can give to our association should we wish to hold our event there I attended a full trade show and discussed the various needs we have when producing an event of our size Again these conversations took place to ensure we are getting the best value and service for the money we invest in our national event

APCO Canada has been working with HelmsBricoe a hotel procurement company since 2007 Leanne Calderwood is the global accounts manager who has an in-depth knowledge of our program and is always looking for locations in Canada that could be a ldquofitrdquo for our program I will be heading to Halifax and Charlottetown in late February to conduct a formal site visit of the various spaces that could fit our event

The 2016 event is already in the planning stages and the calls for papers will be listed on the conference site very soon If you have a presentation that would fit into our educational tracks please submit it by the deadline Our event is headed to Banff Alberta this year and is being held at the Fairmont Banff Springs Hotel from November 7-10

Please continue to monitor both the APCO Canada Website and the APCO Canada Conference Website for a number of updates about important information from your board

Gavin R HayesImmediate Past-

President APCO Canada

National Events Coordinator

6 Wavelength | wwwapcoca

wwwapcoca | Wavelength 7

APCO CANADA NEWS

Folks the currently version of Public Safety Telecommunicator 1 Canada Edition has been updated to mirror the changes in the PST 1 7th Edition The following is information for all members who are verified APCO trainers or who have taken the PST1 Course In response to the approval of new standards for Public Safety Telecommunicators (PST) and PST instructors APCO has updated its PST1 and PST1 Instructor courses to meet those standards The new 7th edition courses replace the current 6th edition of the classes

APCO Institute will accept PST 6th Edition paperwork from agency instructors until June 30 2016 The change will require individuals certified in the PST 6th Edition student course to upgrade their certificates to the new edition by December 31 2017

WHAT YOU NEED TO KNOWIf you are certified as a PST1 6th Edition instructorCertification Update CourseBeginning January 13 2016 and running through December 31 2017 current PST1 6th Edition instructors

will be provided with a self-paced online update at no charge Initially instructor update classes will be offered on a weekly basis available 247 Each class will remain open for three weeks and students should anticipate two hours of commitment to complete this update Two CDEs will be provided upon successful completion To register log into APCOIntlorg Training and Certification Training Courses Institute Online registration then scroll down to Public Safety Telecommunicator 1 7th Edition Update Canada or scroll down to Public Safety Telecommunicator 1 Instructor 7th Edition Update

Book ExchangePST1 6th Edition student manuals may be exchanged through December 31 2016 An exchange order form with additional information will be available in the PST1 7th Edition Instructor PSConnect community Cost for the exchange is $20 per manual plus shipping and handling

If you hold a certification for PST1 6th Edition and are not an instructorBeginning January 13 2016 through December 31 2017 a student online update will be available Individuals that received their PST certification in calendar year 2015 can update at no cost Anyone who has successfully completed an APCO basic course prior to 2015 can update to PST1 7th Edition for $30 This update is also accessible 247 and provides two CDEs for successful completion

RECERTIFICATION FOR PST1 7TH EDITIONRecertification for PST1 7th Edition is required every two years (24 hours per certification year) The cost to recertify is $30 The cost is only $15 if you submit multiple recertifications at the same time (ie PST and EMD)

WHATrsquoS NEW IN THE 7TH EDITIONSome of the new topics included in the 7th edition courses includebull 9-1-1 Disaster Preparedness in the PSAPbull Telecommunicator Emergency

Response Taskforces (TERT)bull Quality AssuranceQuality

Improvement programsbull Guide cards the Design and Use in

the PSAPbull NextGen and Other Emerging

Technologies (Module Five ndash all new)bull Record Management Systemsbull Managing SpecialtyHigh-Risk Callsbull The Need for Continuing Education

and Professional Developmentbull The Public Safety Telecommunicator 1

7th Edition meets or exceeds American National Standards as contained in the ANSI approved Minimum Training Standard for Public Safety Telecommunicators (APCO ANS 310322015)

Update and Requalification Required for Instructing Apco Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

For answers to questions or more information contact APCO Institute at 888-272-6911 or instituteapcointlorg

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

Volume 29 Issue 1| Winter 2016

Contents

Wavelength is published four times per year by Andrew John Publishing Inc with offices at 115 King St W Dundas ON Canada L9H 1V1 We welcome editorial submissions but cannot assume responsibility for commitment for unsolicited material Any editorial materials including photographs that are accepted from an unsolicited contributor will become the property of Andrew John Publishing Inc

FEEDBACK We welcome your views and comments Please send them to Andrew John Publishing Inc 115 King St W Dundas ON Canada L9H 1V1 Copyright 2015 by Andrew John Publishing All rights reserved Reprinting in whole or in part is forbidden without express written consent from the publisher

Publication Agreement Number 40025049 | ISSN1709-2574

Return undeliverable Canadian addresses to

115 King St W Dundas ON Canada L9H 1V1

Wavelength Editorial Board

Editor-in-ChiefTheresa Virgin theresavirginapcoca

Technical EditorRonald Williscroft ronwilliscroftapcoca

LeadershipSupervisionManagement EditorsRyan Lawson ryanlawsonapcoca

Robert Stewart robertstewartapcoca

Training EditorJoel MacDonald joelmcdonaldapcoca

Front Line EditorJoel MacDonald joelmcdonaldapcoca

Conference EditorGavin Hayes gavinhayesapcoca

ContributorsMark Graham Gavin Hayes

Daniel Morelos Thomas PuttMike Reschny Ron Williscroft

Theresa Virgin

Managing EditorScott Bryant scottbryantandrewjohnpublishingcom

Art DirectorDesignAmanda Zylstra designstudio19ca

AdvertisingJohn Birkby

Ph 905-628-4309 jbirkbyandrewjohnpublishingcom

Sales amp Circulation CoordinatorBrenda Robinson brobinsonandrewjohnpublishingcom

AccountingSusan McClung

Group PublisherJohn D Birkby jbirkbyandrewjohnpublishingcom

DEPARTMENTS4 Message From The Editor-In-Chief By Theresa Virgin

5 Message from the President By Ronald E Williscroft ENP

APCO CANADA NEWS6 The Conference Corner By Gavin Hayes

7 Update and Requalification Required for Instructing APCO Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

FEATURE8 Enhanced Simulation Social Media

and Multi-Jurisdictional Training Among Top New Trends in Emergency Management Sector

By Thomas E Putt MSM LOL CD BMASc

TECHNOVATIONS12 Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks By Mark Graham

TRAINING20 Safety Checks

By Mike Reschny EFD EPD EMD

MANAGEMENT23 Physical Site Security A Back to Basics

Approach By Daniel Morelos

12

8

20

23

3 Wavelength | wwwapcoca

MESSAGE FROM THE EDITOR-IN-CHIEF

Hope you all had a safe and healthy New Year Our thoughts and condolences go out to the community of La Loche Saskatchewan and the emergency service providers that took the calls and responded to the scene and provided subsequent assistance It makes me wonder what the local provincial and federal governments are doing to prevent this type of situation particularly in the remote norther communities of our country

At my house this is the year of the ldquoThe Weddingrdquo My son is getting married in August and as the MOG (Mother of the Groom) I get to participate (albeit more on the sidelines) of this event It takes me back to our wedding and the input my then future mother-in-law gave us With that in mind I have resolutely vowed not to make any comments and to only accept their decisions and support them in any way I can

Weddings (and funerals) bring out the best and the worst in families These are the things that family grudges and feuds start out at and only escalate over time Why is that The decisions made at these things can alter the course of a family historymdashwhy There are advantages to not being invited to a wedding There are advantages for not being picked to be in the wedding party There are advantages to just sit on the sidelines and watch things develop around you Why all the drama If you are invited to the event just enjoy the day and all that it brings If you are not invitedmdashconsider the amount of money you have been saved and what you can do with it Weddings should not be the things that break a family apartmdashthat is just wrong

My goal for this event and their future together is to be the type of mother-in-law I wanted One who would give advice if asked but not offer it unsolicited Provide assistance and support always and enjoy the relationship that my son has with his future wife and be joyous in the fact that they bring out the best in each other I hope that I can live up to this goalmdashtime with tell MOG

Theresa Virgin ENPDirector APCO Canada

Editor-in-Chief

Board o f D i rec to rs

PRESIDENT

RON WILLISCROFT

City of Winnipeg Fire Paramedic Service

ronwilliscroftapcoca

VICE PRESIDENT

ROBERT STEWART

robertstewartapcoca

PAST PRESIDENT

GAVIN HAYES

Halton Regional Police Service (retired)

gavinhayesapcoca

DIRECTOR

RYAN LAWSON

Operations Manager E-Comm

ryanlawsonapcoca

DIRECTOR

THERESA VIRGIN

Durham Regional Police Service (retired)

theresavirginapcoca

DIRECTOR

JOEL MCDONALD

Communications Specialist

City of Lethbridge

joelmcdonaldapcoca

DIRECTOR

CINDY SPARROW

Assistant Deputy Chief of the 9-1-1

Emergency Communications Centre

Red Deer

cindysparrowapcoca

4 Wavelength | wwwapcoca

wwwapcoca | Wavelength 5

MESSAGE FROM THE PRESIDENT

Happy New Year and welcome to 2016 I hope that all of you great holiday season and welcome 2016 for the potential to be a great year for all

As we look towards 2016 I am enthusiastic to continue our work with our allied associations iCERT NENA Ontario CITIG and iCERT (Industry Council of Emergency Response Technology) moving key agendas forward such as NG911 Emergency Telecommunicator Health and Wellness the National Public Safety Broadband Network and go on with our work on LMR Interoperability and P25 standards I also look forward to building new working relationships and explore how we can best maintain our representation of public safety communications professionalsrsquo interests in Canada in this ever changing process with the evolution of new technologies and better industry practices in addressing the needs and expectations of the public

APCO Canada and the other APCO associations and partners will carry on in their collaborative work to ensure that these items and others from abroad are shared to enrich our collective memberships as we venture along this evolutionary path together

I am truly looking forward to this yearsrsquo APCO Canada annual conference and trade show to be held in breathtaking Banff Alberta at the ldquoCastle in The Rockiesrdquo the beautiful Fairmont Banff Springs Hotel Located in the Banff National Park a UNESCO World Heritage Site there will be much to do and see apart from the benefits of attending the fantastic networking and education opportunities as well as the largest public safety communications trade show in Canada We hope to see you there and hope you register early as Irsquom sure the event will fill up quickly more to come from Gavin Hayes in his Conference Corner as things progress

We still have a lot of work ahead of us and many changes and challenges to come Our executive team and I will continue to dedicate our work to ensure that APCO Canada continues to be your ldquoVoice of Public Safety Communications in Canadardquo

Yours in ServiceRonald E Williscroft ENP

President APCO Canada

APCO CANADA NEWS

The Conference CornerAs always I want to again recognize the efforts of the public safety telecommunicators from across this country that have and continue to be involved in major disasters Your tireless dedication to your professions is what makes a difference to Canadians in their time of need You truly are the ldquoFirst First Respondershelliprdquo

Over the past few months I have been sifting through the evaluations that were completed by both delegates and our vendors From the feedback we have received we seem to have met the mark in 2015 but as always we are looking to improve our event to ensure the membership and our supporters are not only getting great value for their money but we are providing a detailed educational event

We have extended our contract with our current conference management company Events by Spark through 2017 Anh and her team have worked tirelessly with me and the board to produce a very tight professional show that people want to attend

I was selected to attend the hosted buyers program at the Tete-a-Tete event in Ottawa last week This event brings travel industry professionals together in one space to network and discover what the various destinations in Canada can give to our association should we wish to hold our event there I attended a full trade show and discussed the various needs we have when producing an event of our size Again these conversations took place to ensure we are getting the best value and service for the money we invest in our national event

APCO Canada has been working with HelmsBricoe a hotel procurement company since 2007 Leanne Calderwood is the global accounts manager who has an in-depth knowledge of our program and is always looking for locations in Canada that could be a ldquofitrdquo for our program I will be heading to Halifax and Charlottetown in late February to conduct a formal site visit of the various spaces that could fit our event

The 2016 event is already in the planning stages and the calls for papers will be listed on the conference site very soon If you have a presentation that would fit into our educational tracks please submit it by the deadline Our event is headed to Banff Alberta this year and is being held at the Fairmont Banff Springs Hotel from November 7-10

Please continue to monitor both the APCO Canada Website and the APCO Canada Conference Website for a number of updates about important information from your board

Gavin R HayesImmediate Past-

President APCO Canada

National Events Coordinator

6 Wavelength | wwwapcoca

wwwapcoca | Wavelength 7

APCO CANADA NEWS

Folks the currently version of Public Safety Telecommunicator 1 Canada Edition has been updated to mirror the changes in the PST 1 7th Edition The following is information for all members who are verified APCO trainers or who have taken the PST1 Course In response to the approval of new standards for Public Safety Telecommunicators (PST) and PST instructors APCO has updated its PST1 and PST1 Instructor courses to meet those standards The new 7th edition courses replace the current 6th edition of the classes

APCO Institute will accept PST 6th Edition paperwork from agency instructors until June 30 2016 The change will require individuals certified in the PST 6th Edition student course to upgrade their certificates to the new edition by December 31 2017

WHAT YOU NEED TO KNOWIf you are certified as a PST1 6th Edition instructorCertification Update CourseBeginning January 13 2016 and running through December 31 2017 current PST1 6th Edition instructors

will be provided with a self-paced online update at no charge Initially instructor update classes will be offered on a weekly basis available 247 Each class will remain open for three weeks and students should anticipate two hours of commitment to complete this update Two CDEs will be provided upon successful completion To register log into APCOIntlorg Training and Certification Training Courses Institute Online registration then scroll down to Public Safety Telecommunicator 1 7th Edition Update Canada or scroll down to Public Safety Telecommunicator 1 Instructor 7th Edition Update

Book ExchangePST1 6th Edition student manuals may be exchanged through December 31 2016 An exchange order form with additional information will be available in the PST1 7th Edition Instructor PSConnect community Cost for the exchange is $20 per manual plus shipping and handling

If you hold a certification for PST1 6th Edition and are not an instructorBeginning January 13 2016 through December 31 2017 a student online update will be available Individuals that received their PST certification in calendar year 2015 can update at no cost Anyone who has successfully completed an APCO basic course prior to 2015 can update to PST1 7th Edition for $30 This update is also accessible 247 and provides two CDEs for successful completion

RECERTIFICATION FOR PST1 7TH EDITIONRecertification for PST1 7th Edition is required every two years (24 hours per certification year) The cost to recertify is $30 The cost is only $15 if you submit multiple recertifications at the same time (ie PST and EMD)

WHATrsquoS NEW IN THE 7TH EDITIONSome of the new topics included in the 7th edition courses includebull 9-1-1 Disaster Preparedness in the PSAPbull Telecommunicator Emergency

Response Taskforces (TERT)bull Quality AssuranceQuality

Improvement programsbull Guide cards the Design and Use in

the PSAPbull NextGen and Other Emerging

Technologies (Module Five ndash all new)bull Record Management Systemsbull Managing SpecialtyHigh-Risk Callsbull The Need for Continuing Education

and Professional Developmentbull The Public Safety Telecommunicator 1

7th Edition meets or exceeds American National Standards as contained in the ANSI approved Minimum Training Standard for Public Safety Telecommunicators (APCO ANS 310322015)

Update and Requalification Required for Instructing Apco Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

For answers to questions or more information contact APCO Institute at 888-272-6911 or instituteapcointlorg

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

MESSAGE FROM THE EDITOR-IN-CHIEF

Hope you all had a safe and healthy New Year Our thoughts and condolences go out to the community of La Loche Saskatchewan and the emergency service providers that took the calls and responded to the scene and provided subsequent assistance It makes me wonder what the local provincial and federal governments are doing to prevent this type of situation particularly in the remote norther communities of our country

At my house this is the year of the ldquoThe Weddingrdquo My son is getting married in August and as the MOG (Mother of the Groom) I get to participate (albeit more on the sidelines) of this event It takes me back to our wedding and the input my then future mother-in-law gave us With that in mind I have resolutely vowed not to make any comments and to only accept their decisions and support them in any way I can

Weddings (and funerals) bring out the best and the worst in families These are the things that family grudges and feuds start out at and only escalate over time Why is that The decisions made at these things can alter the course of a family historymdashwhy There are advantages to not being invited to a wedding There are advantages for not being picked to be in the wedding party There are advantages to just sit on the sidelines and watch things develop around you Why all the drama If you are invited to the event just enjoy the day and all that it brings If you are not invitedmdashconsider the amount of money you have been saved and what you can do with it Weddings should not be the things that break a family apartmdashthat is just wrong

My goal for this event and their future together is to be the type of mother-in-law I wanted One who would give advice if asked but not offer it unsolicited Provide assistance and support always and enjoy the relationship that my son has with his future wife and be joyous in the fact that they bring out the best in each other I hope that I can live up to this goalmdashtime with tell MOG

Theresa Virgin ENPDirector APCO Canada

Editor-in-Chief

Board o f D i rec to rs

PRESIDENT

RON WILLISCROFT

City of Winnipeg Fire Paramedic Service

ronwilliscroftapcoca

VICE PRESIDENT

ROBERT STEWART

robertstewartapcoca

PAST PRESIDENT

GAVIN HAYES

Halton Regional Police Service (retired)

gavinhayesapcoca

DIRECTOR

RYAN LAWSON

Operations Manager E-Comm

ryanlawsonapcoca

DIRECTOR

THERESA VIRGIN

Durham Regional Police Service (retired)

theresavirginapcoca

DIRECTOR

JOEL MCDONALD

Communications Specialist

City of Lethbridge

joelmcdonaldapcoca

DIRECTOR

CINDY SPARROW

Assistant Deputy Chief of the 9-1-1

Emergency Communications Centre

Red Deer

cindysparrowapcoca

4 Wavelength | wwwapcoca

wwwapcoca | Wavelength 5

MESSAGE FROM THE PRESIDENT

Happy New Year and welcome to 2016 I hope that all of you great holiday season and welcome 2016 for the potential to be a great year for all

As we look towards 2016 I am enthusiastic to continue our work with our allied associations iCERT NENA Ontario CITIG and iCERT (Industry Council of Emergency Response Technology) moving key agendas forward such as NG911 Emergency Telecommunicator Health and Wellness the National Public Safety Broadband Network and go on with our work on LMR Interoperability and P25 standards I also look forward to building new working relationships and explore how we can best maintain our representation of public safety communications professionalsrsquo interests in Canada in this ever changing process with the evolution of new technologies and better industry practices in addressing the needs and expectations of the public

APCO Canada and the other APCO associations and partners will carry on in their collaborative work to ensure that these items and others from abroad are shared to enrich our collective memberships as we venture along this evolutionary path together

I am truly looking forward to this yearsrsquo APCO Canada annual conference and trade show to be held in breathtaking Banff Alberta at the ldquoCastle in The Rockiesrdquo the beautiful Fairmont Banff Springs Hotel Located in the Banff National Park a UNESCO World Heritage Site there will be much to do and see apart from the benefits of attending the fantastic networking and education opportunities as well as the largest public safety communications trade show in Canada We hope to see you there and hope you register early as Irsquom sure the event will fill up quickly more to come from Gavin Hayes in his Conference Corner as things progress

We still have a lot of work ahead of us and many changes and challenges to come Our executive team and I will continue to dedicate our work to ensure that APCO Canada continues to be your ldquoVoice of Public Safety Communications in Canadardquo

Yours in ServiceRonald E Williscroft ENP

President APCO Canada

APCO CANADA NEWS

The Conference CornerAs always I want to again recognize the efforts of the public safety telecommunicators from across this country that have and continue to be involved in major disasters Your tireless dedication to your professions is what makes a difference to Canadians in their time of need You truly are the ldquoFirst First Respondershelliprdquo

Over the past few months I have been sifting through the evaluations that were completed by both delegates and our vendors From the feedback we have received we seem to have met the mark in 2015 but as always we are looking to improve our event to ensure the membership and our supporters are not only getting great value for their money but we are providing a detailed educational event

We have extended our contract with our current conference management company Events by Spark through 2017 Anh and her team have worked tirelessly with me and the board to produce a very tight professional show that people want to attend

I was selected to attend the hosted buyers program at the Tete-a-Tete event in Ottawa last week This event brings travel industry professionals together in one space to network and discover what the various destinations in Canada can give to our association should we wish to hold our event there I attended a full trade show and discussed the various needs we have when producing an event of our size Again these conversations took place to ensure we are getting the best value and service for the money we invest in our national event

APCO Canada has been working with HelmsBricoe a hotel procurement company since 2007 Leanne Calderwood is the global accounts manager who has an in-depth knowledge of our program and is always looking for locations in Canada that could be a ldquofitrdquo for our program I will be heading to Halifax and Charlottetown in late February to conduct a formal site visit of the various spaces that could fit our event

The 2016 event is already in the planning stages and the calls for papers will be listed on the conference site very soon If you have a presentation that would fit into our educational tracks please submit it by the deadline Our event is headed to Banff Alberta this year and is being held at the Fairmont Banff Springs Hotel from November 7-10

Please continue to monitor both the APCO Canada Website and the APCO Canada Conference Website for a number of updates about important information from your board

Gavin R HayesImmediate Past-

President APCO Canada

National Events Coordinator

6 Wavelength | wwwapcoca

wwwapcoca | Wavelength 7

APCO CANADA NEWS

Folks the currently version of Public Safety Telecommunicator 1 Canada Edition has been updated to mirror the changes in the PST 1 7th Edition The following is information for all members who are verified APCO trainers or who have taken the PST1 Course In response to the approval of new standards for Public Safety Telecommunicators (PST) and PST instructors APCO has updated its PST1 and PST1 Instructor courses to meet those standards The new 7th edition courses replace the current 6th edition of the classes

APCO Institute will accept PST 6th Edition paperwork from agency instructors until June 30 2016 The change will require individuals certified in the PST 6th Edition student course to upgrade their certificates to the new edition by December 31 2017

WHAT YOU NEED TO KNOWIf you are certified as a PST1 6th Edition instructorCertification Update CourseBeginning January 13 2016 and running through December 31 2017 current PST1 6th Edition instructors

will be provided with a self-paced online update at no charge Initially instructor update classes will be offered on a weekly basis available 247 Each class will remain open for three weeks and students should anticipate two hours of commitment to complete this update Two CDEs will be provided upon successful completion To register log into APCOIntlorg Training and Certification Training Courses Institute Online registration then scroll down to Public Safety Telecommunicator 1 7th Edition Update Canada or scroll down to Public Safety Telecommunicator 1 Instructor 7th Edition Update

Book ExchangePST1 6th Edition student manuals may be exchanged through December 31 2016 An exchange order form with additional information will be available in the PST1 7th Edition Instructor PSConnect community Cost for the exchange is $20 per manual plus shipping and handling

If you hold a certification for PST1 6th Edition and are not an instructorBeginning January 13 2016 through December 31 2017 a student online update will be available Individuals that received their PST certification in calendar year 2015 can update at no cost Anyone who has successfully completed an APCO basic course prior to 2015 can update to PST1 7th Edition for $30 This update is also accessible 247 and provides two CDEs for successful completion

RECERTIFICATION FOR PST1 7TH EDITIONRecertification for PST1 7th Edition is required every two years (24 hours per certification year) The cost to recertify is $30 The cost is only $15 if you submit multiple recertifications at the same time (ie PST and EMD)

WHATrsquoS NEW IN THE 7TH EDITIONSome of the new topics included in the 7th edition courses includebull 9-1-1 Disaster Preparedness in the PSAPbull Telecommunicator Emergency

Response Taskforces (TERT)bull Quality AssuranceQuality

Improvement programsbull Guide cards the Design and Use in

the PSAPbull NextGen and Other Emerging

Technologies (Module Five ndash all new)bull Record Management Systemsbull Managing SpecialtyHigh-Risk Callsbull The Need for Continuing Education

and Professional Developmentbull The Public Safety Telecommunicator 1

7th Edition meets or exceeds American National Standards as contained in the ANSI approved Minimum Training Standard for Public Safety Telecommunicators (APCO ANS 310322015)

Update and Requalification Required for Instructing Apco Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

For answers to questions or more information contact APCO Institute at 888-272-6911 or instituteapcointlorg

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

wwwapcoca | Wavelength 5

MESSAGE FROM THE PRESIDENT

Happy New Year and welcome to 2016 I hope that all of you great holiday season and welcome 2016 for the potential to be a great year for all

As we look towards 2016 I am enthusiastic to continue our work with our allied associations iCERT NENA Ontario CITIG and iCERT (Industry Council of Emergency Response Technology) moving key agendas forward such as NG911 Emergency Telecommunicator Health and Wellness the National Public Safety Broadband Network and go on with our work on LMR Interoperability and P25 standards I also look forward to building new working relationships and explore how we can best maintain our representation of public safety communications professionalsrsquo interests in Canada in this ever changing process with the evolution of new technologies and better industry practices in addressing the needs and expectations of the public

APCO Canada and the other APCO associations and partners will carry on in their collaborative work to ensure that these items and others from abroad are shared to enrich our collective memberships as we venture along this evolutionary path together

I am truly looking forward to this yearsrsquo APCO Canada annual conference and trade show to be held in breathtaking Banff Alberta at the ldquoCastle in The Rockiesrdquo the beautiful Fairmont Banff Springs Hotel Located in the Banff National Park a UNESCO World Heritage Site there will be much to do and see apart from the benefits of attending the fantastic networking and education opportunities as well as the largest public safety communications trade show in Canada We hope to see you there and hope you register early as Irsquom sure the event will fill up quickly more to come from Gavin Hayes in his Conference Corner as things progress

We still have a lot of work ahead of us and many changes and challenges to come Our executive team and I will continue to dedicate our work to ensure that APCO Canada continues to be your ldquoVoice of Public Safety Communications in Canadardquo

Yours in ServiceRonald E Williscroft ENP

President APCO Canada

APCO CANADA NEWS

The Conference CornerAs always I want to again recognize the efforts of the public safety telecommunicators from across this country that have and continue to be involved in major disasters Your tireless dedication to your professions is what makes a difference to Canadians in their time of need You truly are the ldquoFirst First Respondershelliprdquo

Over the past few months I have been sifting through the evaluations that were completed by both delegates and our vendors From the feedback we have received we seem to have met the mark in 2015 but as always we are looking to improve our event to ensure the membership and our supporters are not only getting great value for their money but we are providing a detailed educational event

We have extended our contract with our current conference management company Events by Spark through 2017 Anh and her team have worked tirelessly with me and the board to produce a very tight professional show that people want to attend

I was selected to attend the hosted buyers program at the Tete-a-Tete event in Ottawa last week This event brings travel industry professionals together in one space to network and discover what the various destinations in Canada can give to our association should we wish to hold our event there I attended a full trade show and discussed the various needs we have when producing an event of our size Again these conversations took place to ensure we are getting the best value and service for the money we invest in our national event

APCO Canada has been working with HelmsBricoe a hotel procurement company since 2007 Leanne Calderwood is the global accounts manager who has an in-depth knowledge of our program and is always looking for locations in Canada that could be a ldquofitrdquo for our program I will be heading to Halifax and Charlottetown in late February to conduct a formal site visit of the various spaces that could fit our event

The 2016 event is already in the planning stages and the calls for papers will be listed on the conference site very soon If you have a presentation that would fit into our educational tracks please submit it by the deadline Our event is headed to Banff Alberta this year and is being held at the Fairmont Banff Springs Hotel from November 7-10

Please continue to monitor both the APCO Canada Website and the APCO Canada Conference Website for a number of updates about important information from your board

Gavin R HayesImmediate Past-

President APCO Canada

National Events Coordinator

6 Wavelength | wwwapcoca

wwwapcoca | Wavelength 7

APCO CANADA NEWS

Folks the currently version of Public Safety Telecommunicator 1 Canada Edition has been updated to mirror the changes in the PST 1 7th Edition The following is information for all members who are verified APCO trainers or who have taken the PST1 Course In response to the approval of new standards for Public Safety Telecommunicators (PST) and PST instructors APCO has updated its PST1 and PST1 Instructor courses to meet those standards The new 7th edition courses replace the current 6th edition of the classes

APCO Institute will accept PST 6th Edition paperwork from agency instructors until June 30 2016 The change will require individuals certified in the PST 6th Edition student course to upgrade their certificates to the new edition by December 31 2017

WHAT YOU NEED TO KNOWIf you are certified as a PST1 6th Edition instructorCertification Update CourseBeginning January 13 2016 and running through December 31 2017 current PST1 6th Edition instructors

will be provided with a self-paced online update at no charge Initially instructor update classes will be offered on a weekly basis available 247 Each class will remain open for three weeks and students should anticipate two hours of commitment to complete this update Two CDEs will be provided upon successful completion To register log into APCOIntlorg Training and Certification Training Courses Institute Online registration then scroll down to Public Safety Telecommunicator 1 7th Edition Update Canada or scroll down to Public Safety Telecommunicator 1 Instructor 7th Edition Update

Book ExchangePST1 6th Edition student manuals may be exchanged through December 31 2016 An exchange order form with additional information will be available in the PST1 7th Edition Instructor PSConnect community Cost for the exchange is $20 per manual plus shipping and handling

If you hold a certification for PST1 6th Edition and are not an instructorBeginning January 13 2016 through December 31 2017 a student online update will be available Individuals that received their PST certification in calendar year 2015 can update at no cost Anyone who has successfully completed an APCO basic course prior to 2015 can update to PST1 7th Edition for $30 This update is also accessible 247 and provides two CDEs for successful completion

RECERTIFICATION FOR PST1 7TH EDITIONRecertification for PST1 7th Edition is required every two years (24 hours per certification year) The cost to recertify is $30 The cost is only $15 if you submit multiple recertifications at the same time (ie PST and EMD)

WHATrsquoS NEW IN THE 7TH EDITIONSome of the new topics included in the 7th edition courses includebull 9-1-1 Disaster Preparedness in the PSAPbull Telecommunicator Emergency

Response Taskforces (TERT)bull Quality AssuranceQuality

Improvement programsbull Guide cards the Design and Use in

the PSAPbull NextGen and Other Emerging

Technologies (Module Five ndash all new)bull Record Management Systemsbull Managing SpecialtyHigh-Risk Callsbull The Need for Continuing Education

and Professional Developmentbull The Public Safety Telecommunicator 1

7th Edition meets or exceeds American National Standards as contained in the ANSI approved Minimum Training Standard for Public Safety Telecommunicators (APCO ANS 310322015)

Update and Requalification Required for Instructing Apco Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

For answers to questions or more information contact APCO Institute at 888-272-6911 or instituteapcointlorg

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

APCO CANADA NEWS

The Conference CornerAs always I want to again recognize the efforts of the public safety telecommunicators from across this country that have and continue to be involved in major disasters Your tireless dedication to your professions is what makes a difference to Canadians in their time of need You truly are the ldquoFirst First Respondershelliprdquo

Over the past few months I have been sifting through the evaluations that were completed by both delegates and our vendors From the feedback we have received we seem to have met the mark in 2015 but as always we are looking to improve our event to ensure the membership and our supporters are not only getting great value for their money but we are providing a detailed educational event

We have extended our contract with our current conference management company Events by Spark through 2017 Anh and her team have worked tirelessly with me and the board to produce a very tight professional show that people want to attend

I was selected to attend the hosted buyers program at the Tete-a-Tete event in Ottawa last week This event brings travel industry professionals together in one space to network and discover what the various destinations in Canada can give to our association should we wish to hold our event there I attended a full trade show and discussed the various needs we have when producing an event of our size Again these conversations took place to ensure we are getting the best value and service for the money we invest in our national event

APCO Canada has been working with HelmsBricoe a hotel procurement company since 2007 Leanne Calderwood is the global accounts manager who has an in-depth knowledge of our program and is always looking for locations in Canada that could be a ldquofitrdquo for our program I will be heading to Halifax and Charlottetown in late February to conduct a formal site visit of the various spaces that could fit our event

The 2016 event is already in the planning stages and the calls for papers will be listed on the conference site very soon If you have a presentation that would fit into our educational tracks please submit it by the deadline Our event is headed to Banff Alberta this year and is being held at the Fairmont Banff Springs Hotel from November 7-10

Please continue to monitor both the APCO Canada Website and the APCO Canada Conference Website for a number of updates about important information from your board

Gavin R HayesImmediate Past-

President APCO Canada

National Events Coordinator

6 Wavelength | wwwapcoca

wwwapcoca | Wavelength 7

APCO CANADA NEWS

Folks the currently version of Public Safety Telecommunicator 1 Canada Edition has been updated to mirror the changes in the PST 1 7th Edition The following is information for all members who are verified APCO trainers or who have taken the PST1 Course In response to the approval of new standards for Public Safety Telecommunicators (PST) and PST instructors APCO has updated its PST1 and PST1 Instructor courses to meet those standards The new 7th edition courses replace the current 6th edition of the classes

APCO Institute will accept PST 6th Edition paperwork from agency instructors until June 30 2016 The change will require individuals certified in the PST 6th Edition student course to upgrade their certificates to the new edition by December 31 2017

WHAT YOU NEED TO KNOWIf you are certified as a PST1 6th Edition instructorCertification Update CourseBeginning January 13 2016 and running through December 31 2017 current PST1 6th Edition instructors

will be provided with a self-paced online update at no charge Initially instructor update classes will be offered on a weekly basis available 247 Each class will remain open for three weeks and students should anticipate two hours of commitment to complete this update Two CDEs will be provided upon successful completion To register log into APCOIntlorg Training and Certification Training Courses Institute Online registration then scroll down to Public Safety Telecommunicator 1 7th Edition Update Canada or scroll down to Public Safety Telecommunicator 1 Instructor 7th Edition Update

Book ExchangePST1 6th Edition student manuals may be exchanged through December 31 2016 An exchange order form with additional information will be available in the PST1 7th Edition Instructor PSConnect community Cost for the exchange is $20 per manual plus shipping and handling

If you hold a certification for PST1 6th Edition and are not an instructorBeginning January 13 2016 through December 31 2017 a student online update will be available Individuals that received their PST certification in calendar year 2015 can update at no cost Anyone who has successfully completed an APCO basic course prior to 2015 can update to PST1 7th Edition for $30 This update is also accessible 247 and provides two CDEs for successful completion

RECERTIFICATION FOR PST1 7TH EDITIONRecertification for PST1 7th Edition is required every two years (24 hours per certification year) The cost to recertify is $30 The cost is only $15 if you submit multiple recertifications at the same time (ie PST and EMD)

WHATrsquoS NEW IN THE 7TH EDITIONSome of the new topics included in the 7th edition courses includebull 9-1-1 Disaster Preparedness in the PSAPbull Telecommunicator Emergency

Response Taskforces (TERT)bull Quality AssuranceQuality

Improvement programsbull Guide cards the Design and Use in

the PSAPbull NextGen and Other Emerging

Technologies (Module Five ndash all new)bull Record Management Systemsbull Managing SpecialtyHigh-Risk Callsbull The Need for Continuing Education

and Professional Developmentbull The Public Safety Telecommunicator 1

7th Edition meets or exceeds American National Standards as contained in the ANSI approved Minimum Training Standard for Public Safety Telecommunicators (APCO ANS 310322015)

Update and Requalification Required for Instructing Apco Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

For answers to questions or more information contact APCO Institute at 888-272-6911 or instituteapcointlorg

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

wwwapcoca | Wavelength 7

APCO CANADA NEWS

Folks the currently version of Public Safety Telecommunicator 1 Canada Edition has been updated to mirror the changes in the PST 1 7th Edition The following is information for all members who are verified APCO trainers or who have taken the PST1 Course In response to the approval of new standards for Public Safety Telecommunicators (PST) and PST instructors APCO has updated its PST1 and PST1 Instructor courses to meet those standards The new 7th edition courses replace the current 6th edition of the classes

APCO Institute will accept PST 6th Edition paperwork from agency instructors until June 30 2016 The change will require individuals certified in the PST 6th Edition student course to upgrade their certificates to the new edition by December 31 2017

WHAT YOU NEED TO KNOWIf you are certified as a PST1 6th Edition instructorCertification Update CourseBeginning January 13 2016 and running through December 31 2017 current PST1 6th Edition instructors

will be provided with a self-paced online update at no charge Initially instructor update classes will be offered on a weekly basis available 247 Each class will remain open for three weeks and students should anticipate two hours of commitment to complete this update Two CDEs will be provided upon successful completion To register log into APCOIntlorg Training and Certification Training Courses Institute Online registration then scroll down to Public Safety Telecommunicator 1 7th Edition Update Canada or scroll down to Public Safety Telecommunicator 1 Instructor 7th Edition Update

Book ExchangePST1 6th Edition student manuals may be exchanged through December 31 2016 An exchange order form with additional information will be available in the PST1 7th Edition Instructor PSConnect community Cost for the exchange is $20 per manual plus shipping and handling

If you hold a certification for PST1 6th Edition and are not an instructorBeginning January 13 2016 through December 31 2017 a student online update will be available Individuals that received their PST certification in calendar year 2015 can update at no cost Anyone who has successfully completed an APCO basic course prior to 2015 can update to PST1 7th Edition for $30 This update is also accessible 247 and provides two CDEs for successful completion

RECERTIFICATION FOR PST1 7TH EDITIONRecertification for PST1 7th Edition is required every two years (24 hours per certification year) The cost to recertify is $30 The cost is only $15 if you submit multiple recertifications at the same time (ie PST and EMD)

WHATrsquoS NEW IN THE 7TH EDITIONSome of the new topics included in the 7th edition courses includebull 9-1-1 Disaster Preparedness in the PSAPbull Telecommunicator Emergency

Response Taskforces (TERT)bull Quality AssuranceQuality

Improvement programsbull Guide cards the Design and Use in

the PSAPbull NextGen and Other Emerging

Technologies (Module Five ndash all new)bull Record Management Systemsbull Managing SpecialtyHigh-Risk Callsbull The Need for Continuing Education

and Professional Developmentbull The Public Safety Telecommunicator 1

7th Edition meets or exceeds American National Standards as contained in the ANSI approved Minimum Training Standard for Public Safety Telecommunicators (APCO ANS 310322015)

Update and Requalification Required for Instructing Apco Public Safety Telecommunicator 1mdashCanadian Version

By Theresa Virgin

For answers to questions or more information contact APCO Institute at 888-272-6911 or instituteapcointlorg

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

FEATURE

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

The emergency management sector continues to evolve at a rapid pace with new directives technologies and best practices Thomas Putt leads Calianrsquos emergency management training Western Canada operations In a Wavelength Magazine exclusive Putt takes a look at emerging trends in the sector and their impact going forward

As Calianrsquos Regional Director West Thomas E Putt MSM LOL CD BMASc has over three decades of experience in the security and emergency management field

8 Wavelength | wwwapcoca

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

wwwapcoca | Wavelength 9

Enhanced Simulation Social Media and Multi-Jurisdictional Training among Top New Trends in Emergency Management Sector

PUTT

1 Emergency management preparedness has become a priority of all levels of government and the private sectorEmergency management has changed incredibly in the past 10 to 15 years to include more involvement from municipal provincial and private sector organizations There is an increasing focus on the design development and delivery of a complete range of emergency management exercises targeting the most important issues from a local perspective These include ice storms national disasters and train derailments as well as floods fires earthquakes and other local emergency management priorities

For example Calian undertook a two-year work-up with educational vignettes and exercises for the 2010 Vancouver Olympics culminating with a major preparatory exercise to ensure the 200 participating agencies were synchronized and ready to respond to whatever happened at the games It is no longer just the purview of emergency management experts to be involved in the process ndash everyone has a responsibility in a crisis

2 Training has shifted to multi-day multi-jurisdictional exercisesThe emergency management community continues to evolve from comparatively simple one-day seminars or tabletop exercises to demanding multi-jurisdictional multi-day events aimed at evaluating emergency response and business continuity plans to make sure they are current relevant and operational This approach has much more value add for the organization and really helps prepare all key players to be ready to respond to even the worst-case scenario

3 Organizations are thinking long-term and involving all players not just first respondersA big trend in the industry is the focus on developing a long-term sustainable methodology approach to exercise delivery This includes making sure that all stakeholders are active participants and understand their role within the larger context of the plan through to the delivery and execution of the validation exercise

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

5 Social media is becoming increasingly embedded in all training applications Including a strong social media component in all modern emergency preparedness simulations is not just about the information but ensuring that decision makers understand how important it is to keep the public informed In the integrated simulation model each participant in a crisis is involved in this process and each have their own social media feeds to manage This allows all individuals involved to be comfortable with their roles and their responsibilities to effectively communicate to the public in an event of a crisis

4 Simulation is becoming essential to real-life training scenarios Simulation and simulation tools are becoming critical to effective emergency management training As a whole simulation creates a realistic picture of what is going on during the crisis The important part of simulation is that it provides for real-time and space in the way a tabletop exercise alone cannot Participants are forced to confront an issue head on as they would in real life rather than talk it through with colleagues

One of the big challenges of simulation years ago was that it was very expensive Now it is more attainable which allows smaller municipalities with fixed budgets to do emergency preparedness training in a more realistic way

ENHANCED SIMULATION SOCIAL MEDIA AND MULTI-JURISDICTIONAL TRAINING AMONG TOP NEW TRENDS

10 Wavelength | wwwapcoca

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

copy2015 Intergraph Corporation dba Hexagon Safety amp Infrastructure Hexagon Safety amp Infrastructure is part of Hexagon All rights reserved Hexagon Safety amp Infrastructure and the Hexagon Safety amp Infrastructure logo are trademarks of Hexagon or its subsidiaries in the United States and in other countries

INTERGRAPH IS NOWHEXAGON SAFETY amp INFRASTRUCTURE

We are global proven and innovative

Our integrated incident management solutions increase public safety and security performance and productivity while reducing the total cost of ownership for mission-critical IT investments

hexagonsafetyinfrastructurecom

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

12 Wavelength | wwwapcoca

Applying High Availability Design and

Parallel Redundancy Protocol (PRP) in

Safety Critical Wide Area Networks

TECHNOVATIONS

By Mark Graham

Harris Corporation Government Communications Systems Division Mission Critical Networks

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

AbstractNetworks which protect the safety of human lives place special emphasis on network availability and survivability The nationrsquos Air Traffic Control (ATC) and First Responder public safety networks used by police departments fire and rescue and emergency medical teams are examples of networks that require high availability and survivability The term mission critical network is often used to describe the characteristics of networks which protect the safety of human lives There is not a universally accepted standard definition of the term but much literature on the subject typically identifies three salient characteristics

bull Highly Securebull Highly Availablebull Highly Survivable

Highly secure is an important characteristic and needed to design a safety critical network but the focus of this paper is availability and survivability It should be noted that mission critical safety networks are private networks and should not be confused with the public Internet simply because they use IP A private network in itself does not constitute a mission critical network but it is a significant characteristic of a mission critical network due to the security and performance benefits it supports The security benefit is risk mitigation from external threats because only authorized internal users can access the network The performance benefit is similar in that only authorized users have access to the network and their network usage does not have to compete for bandwidth with other external users

Availability and Survivability are related but they are not the same thing Availability is simply a measure of the time the network is operating compared to the total time it should be operating Availability is defined as Uptime divided by Uptime plus Downtime This same reference defines Survivability as the capability of a system (or network in this case) to perform its mission recognizing

that failures are going to occur As will be explained later in this paper survivability considers catastrophic events that cannot be easily predicted in an inherent availability model

Specifically this paper focuses on the availability and survivability of the Wide Area Network (WAN) terrestrial core backbone component of safety critical networks Much literature on public safety networks for First Responders is devoted to the wireless radio networks including Land Mobile Radio (LMR) P25 packet radio cellular telephony and evolution towards broadband 4G Long Term Evolution (LTE) wireless networks Air Traffic Control networks rely on other wireless forms of communication including narrow-band Air-to- Ground (aircraft to ground based controller) voice and data links in the Very High Frequency (VHF) spectrum All of these wireless forms of communication rely on a terrestrial core backbone for backhauling and distributing information to the right place The terrestrial core backbone is a foundational building block for other safety critical network components

This paper also describes some of the differences between legacy Time Division Multiplexing (TDM) technology and modern Internet Protocol (IP) packet switched technology Historically networks such as the nationrsquos Air Traffic Control (ATC) network have relied on point-to-point TDM technology

Introduction Mission critical networks that provide voice video and data communication services for safety critical applications require special consideration to ensure they are highly available and survivable Most network systems and services desire a highly available network design because of the economic impact associated with lost revenue opportunity and the cost of doing business Safety critical networks need to place special emphasis on availability and survivability of the network because not only is there an economic

impact when the network fails but there is also the risk of safety impact which could potentially affect human lives

Modern networks use routing and switching architectures to improve flexibility and efficiency Flexibility is improved through the ability to dynamically route network traffic over multiple paths to avoid failures and improve network throughput performance These routed and switched networks are more adaptable and affordable compared to traditional TDM networks because they are built over a shared routing infrastructure A shared routing infrastructure uses packet switching and statistical multiplexing to ldquosharerdquo network resources for all traffic In contrast a TDM network infrastructure separates network traffic into separate timeslots and dedicates the timeslots to specific users and traffic The downside of a TDM implementation is that the dedicated timeslots are unused and wasted when the specific user or traffic which the timeslot is assigned is idle A modern routed network overcomes this shortcoming of TDM networks with statistical multiplexing but the improvement creates new vulnerabilities that did not have to be considered in a TDM network In a point-to-point TDM network physical equipment redundancy and physical circuit diversity are enough to overcome most types of network failures In a modern routed network the routing function is a logical function that can also fail Logical routing failures are not common but even infrequent failures cannot be tolerated in mission critical networks where public safety is at risk These uncommon failures are characterized as ldquosix sigmardquo events because of their infrequent occurrence An example of six sigma event is a logical routing protocol phenomenon known as a black hole where packets are dropped and data is lost even though no apparent or obvious failure event has occurred Although these events are rare the sinister nature of them makes them difficult to be detected and repaired And

wwwapcoca | Wavelength 13

GRAHAM

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

14 Wavelength | wwwapcoca

unlike point-to-point TDM circuits which affect only sites and services at each end of a circuit when failures occur a shared routing network infrastructure problem can be widespread effecting multiple sites services and applications

Network survivability goes beyond traditional availability modelling to address the challenge of overcoming these unpredictable six sigma events Parallel Redundancy Protocol (PRP) was designed for mission critical environments where high survivability is required Initially designed for dual core Local Area Networks (LAN) PRP has been enhanced for dual core WAN environments and is part of the solution needed for providing a highly survivable WAN for public safety A dual core network is exactly what the term implies two separate core networks The separation can be physical logical or both High availability requires the two networks to be physically diverse from one another with physically separate equipment and physically separate circuit paths High survivability requires them to be logically diverse with separate routing domains Furthermore logically separate means the two networks cannot be interconnected with a routing protocol such as Border Gateway Protocol (BGP) because of the potential for routing anomalies from one network affecting the other Note that more than two networks can be used for even greater survivability and this capability is currently being evaluated in the Critical Network labs of Harris Corporation The dual core WAN implementation has already been evaluated and deployed and is in operation today

This paper focuses on availability and survivability considerations describing how PRP works in conjunction with a dual core network and how it has been enhanced to provide the needed survivability for mission critical public safety networks

Modern networks being used and deployed for these safety critical services are based on shared routing

infrastructures using Internet Protocol (IP) technology The flexibility and economies of scale afforded by IP technology cannot be ignored even though they present new challenges to address for safety critical networks

Furthermore almost all researches and development activities in industry are focused on the more flexible and cost effective IP technologies as many TDM technologies are becoming obsolete Technology obsolescence is a serious telecommunications infrastructure concern that will have to be addressed in coming years [1-3]

High Availability DesignAvailability modelling is a proven science which has been adapted for networks and telecommunications It uses Reliability engineering to calculate Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR) parameter values and proven mathematical formulas to predict the expected availability of a network service

There are multiple aspects to consider when designing a high availability network These aspects are

bull Physically diverse and redundant equipment strings

bull Protection switching and routingbull Circuit and fibre path diversity and

redundancy

Physically redundant and diverse equipment strings as well as circuit and fibre path redundancy rely on proven mathematical formulas for serial and parallel components for calculating the service availability between a core node in the network to any other core node in the network The ITU [4] defines availability as follows

ldquoAvailability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval assuming that the external resources if required are provided [4]rdquo

In our model we use the ITU definition and further define the ldquostate to perform a required functionrdquo the ability to provide communication service between core nodes in the network Bit errors excessive latency bandwidth congestion and other forms of degraded signal transmission are not considered in these calculations These degraded forms of communications are handled by ldquohigher levelrdquo functions By higher level we mean protocols andor applications which detect degraded signals and correct them through other means such as re-transmissions or Forward Error Correction (FEC) Examples of failures considered in the availability model are failures that cause a complete loss of communication service such as nodal equipment failure andor a fiber cut The term service outage is often used to describe this condition

For illustration we use the ITU definition and define the interval of time we want to measure as one day (24 hours)

Since the item we want to perform the required function is the network we define the ability of the network to communicate to be Network Uptime and Availability is calculated as

Availability=Network Uptime Network Uptime + Network Downtime

In other words if the network is designed to operate 24 hours a day but it is only communicating 23 hours in a given day then the network availability for that day is 9583 (2324)

Note that in practice availability is usually measured over longer intervals (monthly and annually) providing a larger population sample of data points to use in the Availability calculation Service Level Agreements (SLA) where a network service provider contracts to meet a certain availability threshold are based on these longer intervals [5]

When designing a network the actual network uptime is unknown so the

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

wwwapcoca | Wavelength 15

GRAHAM

Availability of specific components is predicted using Mean Time Between Failure (MTBF) and Mean Time to Restore (MTTR) metrics also defined by the ITU MTBF and MTTR are Reliability parameters [1] When using these metrics Availability is predicted by the following formula

This formula is used to predict the availability of network nodal equipment and power subsystems but we also need the ability to predict failures in the transmission medium Metrics for calculating the MTBF and MTTR for the transmission medium is shown in the IEEE journal on page 101 with references to other Telcordia standards6 (formerly known as Bellcore) The key metrics identified are an expected failure rate of 439 failures per year per 1000 miles of sheath The predicted restoral rate for each one of these failures is 12 hours

Once the Availability of the specific components is known the actual availability for a specific network design using redundant and parallel equipment and circuit paths (also shown in reference 1) can be calculated using the formula below

Aparallel=1 ndash ((1-A1) (1-A2) hellip (1-An))

Since there is more than one piece of equipment in a string the availability of each of the parallel components is combined using the following formula for serial components (also shown in reference 1)

Astring=Aserial-1 Aserial-2 Aparallel-1 An

As shown the formulas for parallel and serial components can be repeated ldquon timesrdquo or as much as necessary to account for strings and sub-strings of equipment

Protection switching and routing is also very important because there is a need to be able to switch to a redundant path when one path fails The protection switch itself can be a single point of failure if it is not designed properly An example of

a properly designed protection switch is the Cisco Optical Network Server (model 15454) which is a Synchronous Optical Network (SONET) Add-Drop Multiplexer (ADM) Other well-known manufacturers of SONET ADMs are Alcatel-Lucent and Fujitsu SONET ADMrsquos are proven reliable protection switches and have a long and successful track record

Uni-directional Path Switched Ring (UPSR) SONET technology is preferred for high availability design because of its simpler design but Bi-directional Line Switched Ring (BLSR) technology can also be used if necessary UPSR rings are implemented with two fibers in a ring configuration [6] One fibre transmits in the clockwise direction and the other transmits in the counter-clockwise direction Each node in the ring is initialized to receive the signal on the fibre transmitting in the clockwise direction and when a failure is detected it switches to the other fibre transmitting in the counter-clockwise direction The two fibre paths are referred to a ldquoworkingrdquo which is the active path in use and ldquoprotectrdquo which is the redundant path ready to be used when a failure occurs on the working path This feature is often described as ldquoself-healingrdquo because it can withstand a fibre cut

BLSR rings can be implemented with two fiber or four fibre configurations Unlike the simpler UPSR implementation each fibre can be configured to transmit in either direction BLSR ring configurations offer capacity benefits to service providers (relative to UPSR implementations) because they do not need the entire ring to transmit between two nodes on the ring

Modern core network backbone topologies are designed with mesh backbones The term stems from the idea that multiple paths exist from one core node to other core nodes in the network backbone forming a mesh A full mesh core backbone is where every core node has a direct connection to every other core node in the network as shown below (Figure 1)

Figure 1 Full mesh core backbone

A partial mesh backbone is when every core node is not directly connected to every other node The figure above would become a partial mesh backbone if one or more of the connections in the figure above were removed with the caveat that all nodes remain connected just not necessarily with direct connections

Mesh backbone technologies are also difficult to design to meet an availability objective Over time methods and tools have been developed to solve this problem In general Telcordia standard metrics are used to estimate the expected number of failures on a circuitfibre route Fundamentally the longer the route the lower the availability because of the higher probability that something will fail (eg equipment failure fibre cut etc) The other problem is identifying the number of parallel and serial paths between any two end points in a mesh topology For example the simple full mesh topology shown below has five possible paths from every node to every other node (Figure 2)

Figure 2 Simple fullmesh topology

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

16 Wavelength | wwwapcoca

The problem becomes much more complex in a large wide area network where tens of thousands of paths have to be evaluated Tools which use Dijkstrarsquos algorithm10 have been developed to identify all of the possible paths as well as which ones overlap with one another

Once all the possible routing paths are known the predicted availability from each core node to every other core node can be calculated and predicted The ldquoNxNrdquo matrix below shows the predicted availability from each core node in the Simple Full Mesh Topology shown above (Figure 3) A shorthand notation is used in the matrix to improve readability

Figure 3 ldquoNxNrdquo matrix

For example the availability between the SFO and ATL node is shown as 9(5)2 which represents a predicted availability of 999992 (stated as five nines two) The availability calculation was based on the distances shown in the diagram using the metrics identified above [7] The matrix shows the predicted availability for communication service between every core node pair but it is common to be conservative and report the predicted availability of a given core node backbone using the lowest predicted availability in the matrix for a specific core node pair In this example we would predict the availability for any communication service over this simple full mesh core node backbone topology to be 99999 (five nines zero) because it is the lowest predicted availability between any two core nodes as highlighted in the matrix (SFO to NYC)

SurvivabilityThe other component needed for mission critical safety design is network survivability Survivability is often confused with availability but it is quite different in that availability predicts failures using

reliability engineering while survivability is concerned with the behaviour of the network when failures occur Survivability also considers ldquoaccidentsrdquo or failures that cannot be predicted by an inherent availability model based on reliability parameters Routing anomalies such as black holes are logical failures and occur less frequently than physical failures Because of their infrequent occurrence they are not always addressed by network architects but they can have widespread and devastating impact to routed networks and are unacceptable to mission critical networks where public safety is at stake

The dual core architecture is a solution for overcoming these unpredictable six sigma events that cannot be predicted by traditional availability modelling A dual core architecture uses two logically separate (separate routing domains with separate Autonomous System (AS) numbers) backbones It preserves the flexibility that dynamic routing can provide along with the cost efficiencies of using a shared infrastructure This is where Parallel Redundancy Protocol (PRP) technology comes into play and in particular the enhancements to PRP known as PRP-1+ which provide the ability to seamlessly use either one of the core networks in a dual core WAN architecture

PRP was introduced as an industrial engineering standard for LANs by the International Electrotechnical Commission (IEC) an international standards body that develops and publishes standards for electrical electronic and related technologies used in industry today The original standard was published as IEC

62439-3 in 2010 and is often referred to as PRP-0 The standard was updated in July 2012 to be compatible with High-availability Seamless Redundancy (HSR) protocol IEC 62439-3 (2012) has superseded IEC 62439-3 (2010) and is referred to as PRP-111 [8]

PRP was motivated by the energy industry where automated power substations require a highly reliable operation with seamless failover PRP-0 required separate parallel LANs of similar design12 PRP-1 improved upon the original PRP-0 by making it more flexible in its ability to deal with multi-protocol environments and more network topologies especially ring topologies implemented using High-availability Seamless Redundancy (HSR) Harris Corporation enhanced the PRP-1 implementation so that it could operate in a WAN environment hence the name PRP-1+

How PRP WorksA functional diagram of dual core network architecture is shown below The diagram shows two separate networks Blue and Red and a PRP Network Redundancy Box referred to as a Red Box on either side running PRP1+ (Figure 4)

Starting from the left side the Ethernet frames entering the Red Box are duplicated and each packet is transmitted over the Red and Blue Core networks simultaneously to the Red Box on the right side The Red Box on the right side uses a Discard Duplicate (DD) algorithm to identify duplicate packets and discard them when they are found The operation is transparent to the user applications sending and receiving packets whereby

Figure 4 Red Box on either side running PRP1+

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

GRAHAM

wwwapcoca | Wavelength 17

the user is unaware which network (Red or Blue) is carrying the packets they receive

How PRP was Enhanced to Work over a WAN PRP was originally designed to work in a Local Area Network (LAN) and was limited to this environment Limitations were due to the design of the Discard Duplicate algorithm which uses specific information in the Layer 2 frame to identify duplicates and discard them when they are found Specifically PRP uses the source Medium Access Control (MAC) address and a sequence identifier which PRP adds to a Redundancy Control Trailer (RCT) at the end of each frame when the sender duplicates a frame With this approach the PRP receiver stores the source MAC address and sequence identifier from the first frame it receives and passes the frame through Each subsequent frame received is compared to these two fields and if they match the Discard Duplicate process will discard the duplicate frames it finds Frame information is buffered for a specific duration of time so that the sequence number does not wrap causing duplicate frames to erroneously be missed Note that early PRP implementations required other fields for identifying duplicates (eg Lane Identifier) but PRP-1 eliminated the requirement of these fields in order to support improved flexibility in the types of networks where PRP can be used PRP-1 simply provides guidance for how the Discard Duplicate algorithm should work instead of specifying the actual design

PRP is ideal for a dual LAN architecture used in the energy industry for both its high reliability and seamless failover characteristics When considered for use in a WAN environment it did not seem applicable at first glance

There were some technical challenges associated with getting PRP to work in the WAN environment The first challenge was figuring out how to get the Discard Duplicate algorithm to work in the WAN environment The reason is the key layer 2 data elements source MAC address and layer 2 sequence number would

not be retained as the layer 2 link level information is updated at every network node hop in a WAN To overcome this it was determined that the PRP receiver could determine if a frame contained Internet Protocol (IP) packets in the frame by the layer 2 protocol identifier field Once it is known that the frame contains an IP packet then fields in the layer 3 IP packet header that correspond to similar fields used by PRP at layer 2 could be used by the Discard Duplicate algorithm In the IP packet header the corresponding fields to use are the source IP packet address and packet Identification field which are similar to the frame sequence number field used by PRP The structure of an IP packet is shown on the right

Another challenge to overcome in the WAN environment was the relative difference in latency between the two networks In a LAN environment duplicate frames arriving on two separate networks will typically be microseconds apart In a WAN environment this disparity of arriving duplicate packets on two separate WANs will be milliseconds apart so efficient buffering and processing techniques are required to identify and discard duplicates Efficient buffering and processing is required because of the high data rates used in todayrsquos networks and the number of packets that will arrive on the two networks that have to be checked for duplicates (Figure 5)

Figure 5 IP header structure

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

APPLYING HIGH AVAILABILITY DESIGN AND PRP IN SAFETY CRITICAL WIDE AREA NETWORKS

18 Wavelength | wwwapcoca

For example using the smallest packet size of a 46 octet IP packet and a 1Gbps line rate requires approximately 1488M packets per second that have to be examined for duplicates This calculation takes into account the following characteristics for IEEE 8021q Ethernet frames (Table 1)

The 46 octet layer 3 packet is encapsulated in the data field of the layer 2 Ethernet frame and each packet will require 672 bit times as shown below

bull 46 octets + 8 octets Preamble and SFD + 12 octets of Destination and Source address + 2 octets of Frame type + 4 octets of CRC + 12 octets of IFG=84 octet packet times=84 8=672 bit times

At a line rate of 1Gbps the number of packets that have to be examined is

bull 1Gbps 672 bits per packet= 1488095 ps

It is also important to consider the possibility of the packet Identification field wrapping in the WAN environment The Identification field is 16 bits in length and the analysis below shows that when using the small packet size of 46 octets at a line rate of 1Gbps the Identification Field could theoretically roll over in 44ms

bull 1488095 ps= 1 packet every 0000006720001 seconds=672 us

bull a 16 bit sequence number incrementing once per packet would roll over in

65536 6720001 us=4404 ms

If we relied strictly on the Identification field and the latency of arrival of one network relative to the other was greater than 44ms then the Discard Duplicate algorithm would not detect and discard all duplicate packets

It should be pointed out that these are extreme c o n d i t i o n s because most networks use larger size packets and the small size 46 octet packets are worst case In other words the problem is not as difficult with

larger packet sizes because fewer packets are received in a given time interval and the number of packets that need to be checked by the Discard Duplicate algorithm becomes smaller However it does point out the need for efficient buffering and processing in the Discard Duplicate algorithm

Although the probability of this wrapping phenomenon is mathematically small the probability of its occurrence can be reduced by considering other fields in the IP packet when identifying and discarding duplicates Considering and comparing other fields such as the Destination IP address the Flags and Fragment Offset and the Data field improve the robustness of the Discard Duplicate algorithm

With these enhancements PRP works in the WAN environment the same way it was designed to work in the LAN environment Data is transmitted over both core networks by a sending Red Box to its destination where a receiving Red Box looks for duplicates If one of the two networks were to fail for any reason then the packet will make its way to the destination over the other network In a routed IP network this benefit is especially useful when unpredictable six sigma events such as a black hole occur

Putting it All Together and the Benefits of PRP1+Engineers have been running predicted availability models for years and the science and mathematics are well understood The process involves breaking down components into serial and parallel paths and using proven formulas to calculate predicted availability based on MTBF and MTTR parameters When designing a core network availability is improved by creating parallel paths with physically separate equipment and circuit paths

Automatic Protection Switching (APS) or dynamic routing is also required to use physically separate paths It may seem the use of Red Boxes and PRP is all that is needed and can provide both physical and logical diversity as long as each core network is both physically and logically diverse However the robustness of the design can be improved if each one of the core networks is designed to be physically diverse within it This design implies that high availability through physical diversity is achieved on each core network of the dual core network and PRP is used to add the logical diversity feature The benefit of this added robustness is that no single high availability design is perfect and the added protection is warranted when it comes to public safety

The physical diversity in the equipment and circuit paths is extremely important but routed networks are also susceptible to logical failures and physical diversity alone is not enough to protect a mission critical safety infrastructure Adding logical diversity to high availability architecture is known as survivability

Survivability adds logical routing diversity to a network architecture by creating a logically separate routing domain Route domain failures do not occur often but when they do the impact can be devastating in that multiple sites and services are affected Routing failures can be caused by unusual and unexpected equipment failures deliberate cyber-attacks or unintended human error These types of failures are not anticipated

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

wwwapcoca | Wavelength 19

by a typical availability model based on hardware component failure rates andor fibrecopper cuts in the circuit paths Although they do not occur often widespread logical failures are unacceptable to mission safety critical services [9-12]

The principal survivability benefit of PRP is that it supports the use of both of the logically separate routing domains And since PRP is a layer 2 switched protocol (as opposed to a layer 3 routing protocol) it has no reaction to routing anomalies that may occur in one of the two route domains This point is significant because many dual core network architectures use routing protocols like Border Gateway Protocol (BGP) or multicast protocols to take advantage of the separate networks However a failure in routing can affect both networks with this type of design

Another benefit of PRP is its seamless operation PRP is similar in concept to Uni-directional Path Switched Ring (UPSR) technology which has been in use in the Synchronous Optical NETwork (SONET) world making disruptions in the network unnoticeable This is especially important to latency and jitter sensitive services like mission critical voice In a network that does not use a dual core architecture with PRP a failure in either network will normally cause a momentary disruption while routing protocols figure out how to route around the failure This phenomenon is known as route convergence and depends on many variables causing seconds (and possibly minutes) of disruption in a potentially critical voice stream When human lives are at stake a few seconds of disruption for such a critical service can be significant

Related WorkOther concepts for enhancing PRP for IP networks were presented at the IEEE international conference in Cape Town in 2013 [13] The IEEE reference addresses some of the differences between IPv4 and

IPv6 packets where additional research is needed This paper focuses on IPv4 and the capabilities described in this paper are proven and currently deployed in mission critical networks More work is needed to address networks and applications that use IPv6 Harris Corporation is also working on PRP solutions for networks using IPv6

SummaryMission critical networks used to support safety critical applications require special design considerations for high availability performance and survivability

High availability design uses proven reliability engineering with equipment redundancy and physical diversity routing of copper and circuit paths Using proven mathematical formulas and parameters a high availability objective can be accurately predicted and achieved

Historically in legacy point-to-point networks based on TDM technology high availability design was all that was needed to meet the needs for mission critical safety networks To take advantage of more flexible and more cost effective IP routed and switched networks network survivability must also be addressed Whereas availability addresses the physical aspects of the design survivability addresses the logical aspects and six sigma events that are not anticipated by the availability model Both components are needed for robust mission critical safety network architecture design

The Dual Core network architecture is a proven highly available and highly survivable design and mitigates risks associated with unpredictable logical routing anomalies Using Layer 2 PRP to connect logically separate core networks avoids the failures that can occur when the two networks are connected with Layer 3 routing protocols

References1 Murphy J Morgan TW (2006 ) Availability

Reliability and Survivability An Introduction and Some Contractual Implications The Journal of Defense Software Engineering

2 httpgcncomarticles20150213faa-fti-2aspx

3 httpswwwanpicomend-of-life-plans-for-tdm

4 (2008) Recommendation ITU-T Definition of Terms Related to Quality of Service E800

5 To M Neusy P (1994) Unavailability Analysis of Long-haul Selected Areas in Communications 12 100-109

6 Kobayashi DS Tesfaye M (1991) Availability of bi-directional line switched rings Rep

7 httpwwwsonetcomEDUupsrhtm 8 (2012) ldquoIndustrial communication

networks - High availability automation networks - Part 3 Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR)rdquo (3rdedn) IEC 62439

9 httpwwwsonetcomEDUblsrhtm 10 httpwwwhill2dot0comwikiindex

phptitle=4F-BLSR 11 Cormen TH Leiserson CE Rivest RL

Stein C (2001) Dijkstrarsquos algorithm In Introduction to Algorithms MIT Press Massachusetts USA

12 Weibel H (2011) Tutorial on Parallel Redundancy Protocol (PRP) Zurich University of Applied Sciences

13 Rentschler M Heine H (2013) The Parallel Redundancy Protocol for industrial IP networks IEEE International Conference on Industrial Technology (ICIT)

Citation Graham M (2015) Applying High Availability Design and Parallel Redundancy Protocol (PRP) in Safety Critical Wide Area Networks J Telecommun Syst Manage 4 120 doi1041722167-09191000120 Copyright copy 2015 Graham M This is an open-access article distributed under the terms of the Creative Commons Attribution License which permits unrestricted use distribution and reproduction in any medium provided the original author and source are credited

GRAHAM

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

20 Wavelength | wwwapcoca

Safety Checks

TRAINING

By Mike Reschny EFD EPD EMD APCO Institute Adjunct Instructor

About the AuthorMike Reschny has 34 years of experience in Emer-gency Services a former Advanced Care Paramed-ic and EMT instructor as well as 20 years of service in Fire Communications

An APCO instructor for over 10 years has a pas-sion for learning andor sharing with others with different experiences and ideas

Whether we call them ldquoSafety Checksrdquo ldquoTime Checksrdquo or ldquoPAR Checksrdquo (Personnel Accountability Report) these can be a critical part of what we do and should be recorded

Most agencies are probably doing this function at some level So a little reminder of its importance that it not only impacts the incident as the units respond but also as the incident unfolds as well Increases safety decreases liabilities and can be used as time stamped evidence

Doing ldquoTime Checksrdquo may sound like a lot of extra work but it really isnrsquot compared to the benefits to the crews the appreciation it generates from Command and the firefighters if implemented right and if tied into your CAD Oh yes but it does mean some ldquoCHANGErdquo

Having well-researched well-written and well-trained SOPs on this is well worth it

Time Checks can or should be implanted for law enforcement ldquoTactical Operationsrdquo standby with allied agencies or routine calls of longer durations

For fire service working incidents structure fires hazmat and technical rescue and can be asked for and cancelled by incident command at anytime

Time checks can be used for EMS incidents of longer duration where safety is a concern or MCI incidents

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

wwwapcoca | Wavelength 21

RESCHNY

It will take adjusting them to your agency needs as per what is deemed a ldquoworkingrdquo incident For example a small shed fire with only one engine-company involved probably isnrsquot needed or for a vehicle leaking fuel is technically a hazmat situation but wonrsquot activate the time checks Mind you having SOPs in place is the key

A house or apartment building with smoke and flames or a semi tanker leaking methethldeath will by default start the process Command can stop them as needed based on what they are dealing with on scene or usually when ldquoloss stoppedrdquo is given at a working fire

This Entire Function is in Support of Incident Comment and Crew SafetyDecent working incidents with multiple

crewscompanies with life threatening operations potential spread or exposures and environmental hazardsissues can be a lot to handle for incident command andor sector officers As an emergency communications specialist this is where we shine Providing reminders of time lapses crew accountability changes in environmental conditions and recorded in the CAD

Safety Starts and Ends with Us In CommunicationsTime flies when we are having fun and it can get away on Incident Command and sector officers when they are dealing with serious issues right in front of them Ensure they are armed with great response information updates pre-plans or location incident history

Points to Consider

Start Timebull Time of call or time of response This

gives incident command an idea of duration so they can have a time mark for how long it has been burning to estimate structural integrity fire spread heatsmoke etc and not limited to the interior crew time or attack time

10-Minute Time Checksbull Dispatch notifies Incident Command at

ldquo10 minuterdquo intervals There is no action taken or required it is simply a ldquoTIMErdquo check to assist Command as they can get very busy in large incidents When we lose track of time we can lose track of operations and crews

bull It may be reported to Incident Command but all crews on scene hear it and can do a quick mental check of their SCBA air time burn time interior sector working in the fire in the hot zone divers in the water or crews entering a trench or cave-in etc

bull The ldquoTime Checkrdquo is entered into the CAD and time stamped as a 10 min time check 20 min time check 40 50 70 80 etc Communications calls command waits for a reply and then announces ldquoCommand 40 min time checkrdquo or ldquoCommand 80 min time checkrdquo and records it in the CAD

bull It becomes a part of the incident documentation for incident reviewcritique liability protection and NFIR reports to the Office of the Fire Commission

bull If you noticed that 30 60 90 were missinghellip good

30-Minute Accountability Checksbull Communications also calls Incident

Command at ldquo30 minuterdquo intervals and calls for an ldquoAccountability Checkrdquo or ldquoPAR Checksrdquo In turn Command calls each company officer to ldquophysicallyrdquo account for their company and report back to command

bull Sounds like a lot but it isnrsquot since everyone heard the 30 minute notification from Communications each company officer is expecting to hear the request from Command Command asks Company or Sector Officers for all an ldquoAccountability Checkrdquo Each CompanySector Officer physically accounts for their crew and reports it back to Command as each officer reports back to Command communications records it in the CAD For examplendash Ladder 8 acct Engine 6 accthellip

bull Communications will ldquoANNOUNCErdquo which accountability check it is 30 60 90 120 etc ldquoCommand this is your 60-Minute accountability checkrdquo and record it in the CAD Command can

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

22 Wavelength | wwwapcoca

SAFETY CHECKS

track times not only for accountability but also if crews need to be switched out structural issues and of course is anyone missing Remember hindsight is 2020 and ldquoOopsrdquo is a four-lettered word

Think about Adding the Following

Weather Report

bull A weather report should not be something Command has to ask forhellip take the imitative and provide it For these types of incidents dispatch can include a weather report in the CAD notes as close to the start of the incident as possible However if itrsquos a hazmat incident the wind speed and direction

should be given in the dispatch if possible and at least in the post dispatch update and certainly before the responders arrive

bull A weather report should be recorded in the CAD at the start of these incidents including the time wind speed and direction temp and humidity

bull A weather update can be added to your CAD notes at each 30-minute accountability check but should be updated to Incident Command if there are any significant changes

o If the crews are doing a high angle rescue on a metal tower they might like to know that lighting storms are moving in or if the wind is picking up

o SWATERT teams should be informed of weather changes or Watches and Warnings

o Fire crews fighting a grass fire and the temp or winds are increasing or changing

o Hazmats with vapor clouds or flowing liquidshellip

bull This may sound like a lot of work but how many incidents do we get that run for multiple hourshellip and then it would be worth it

bull An important benefit is if you ever have to deal with liability or presumptive legislation issues it can be a real life-saver to the department the crews and their family You track the incidents where crews were exposed to hazards but more importantly ldquodispatchrdquo proves that it was tracked the time accountability checks provided weather reports and documented it all You know the saying ldquoif itrsquos not written downhellip it never happenedrdquo

We provide more than just service to the public and our communities we protect the crews regardless of agency or discipline We protect our department and our division

We are an ldquoInvisible Crewrdquo They may not see us but we are therehellip

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

MANAGEMENT

Physical Site Security By Daniel Morelos

A Back to Basics Approach

About the AuthorDaniel Morelos is the Director of Safety Programs with the Tucson Airport Authority Prior to his current position he served as the Director of CommunicationsDispatch Under his leadership his agency had the distinction of becoming the first airport communications center accredited by the Commission on Accreditation for Law Enforcement Agencies (CALEA) He has 28 years of service in the public safety communications industry and serves as an active member of APCO

wwwapcoca | Wavelength 23

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

When I took on the task of writing about physical site security for public safety communications centers I thought piece of cake right WRONG

Instantly thoughts of the types of information I needed to provide began to race in my head not to mention how I was going to fit it all into such a short piece There is so much information on this topic that finding the right mix is tough For instance how specific did I need to be without getting too deep into the weeds There are no cookie-cutter options available as all communications center facilities are different and one size will not fit all Did I need to be progressive by offering some whiz-bang hypothesis or new process in physical security design On the other hand did I need to show my competence by writing more of a technical related article After a few shakes of the head and a couple of deep breaths realizing I am not an expert but know just enough to be dangerous reason quickly set in and I knew immediately what I wanted to convey What I hope to communicate are a few simple concepts in a physical site security project one might consider whether this is their first time involved in the process or one of many

WHAT IS PHYSICAL SITE SECURITYBefore proceeding let us ensure we have an understanding of the term physical site security I will break the term into two distinct parts First plainly stated physical

security is the part of security concerned with measures and concepts designed to safeguard personnel It is also to prevent unauthorized access to equipment installations material and documents to safeguard them against espionage sabotage damage theft or terrorist attacks1 Physical security involves the use of multiple layers of interdependent systems which may include CCTV surveillance security guards protective barriers locks access control protocols and many other techniques Second the term lsquositersquo simply means the physical land area controlled by an entity by right of ownership leasehold interest permit or other legal conveyance upon which a facility is placed2

ENHANCE YOUR UNDERSTANDING

Whether you are part of a design team tasked with renovating the communications center or fortunate enough to be building a new one security of the facilityrsquos interior and exterior will be front and center from the initial planning and design phase through construction

all the way to final acceptance For the communications center manager it is imperative to increase knowledge in these areas 1) your agencyrsquos physical security programs 2) impacts to communications center operations 3) proper design basics and ergonomics 4) technology applications and 5) basics of project management Having gone through a few communications center builds and renovations acquiring this knowledge base was critical Ensuring my team was able to handle the demands brought by additional technology and duties was essential At the end of the day we successfully integrated physical security requirements into communications center operations A good starting point is with an easy to read guidebook Consider investing a few dollars in the Project Management Institutersquos (PMI) A Guide to the Project Management Body of Knowledge3 This easy to read guide provides a compilation of the fundamentals of project management as they apply to a wide range of projects A key point to remember is you do not have to become an expert in project management Gaining an understanding of the whole process from the project managerrsquos perspective may fill many gaps in your understanding allow you to be more effective and possibly reduce some frustration as well

STANDARDS AND BEST PRACTICES

Physical site security is a constant challenge for managers of high profile facilities This includes public safety communications centers As we know it is common practice for communications

PHYSICAL SITE SECURITY A BACK TO BASICS APPROACH

24 Wavelength | wwwapcoca

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25

MORELOS

centers to be controlled environments Many are limited access areas due to the nature of the mission and types of sensitive work performed For the most part many get it right but in our business to rest on our laurels would not be a good thing A wise man once said ldquostandards protect the foolish advance the knowledge of the wise and give good organizations baselines from which to improverdquo Aligning a physical site security program with industry standards and best practices is a prudent move There are a number of standards development bodies available to obtain standards and practices fitting for onersquos agency One source for this type of information is the Whole Building Design Guide which is a program of the National Institute of Building Sciences Through their website (wwwwbdgorg) one can find an abundance of information on integrating safesecure design standards for building occupants and assets ranging from fire and security protection occupant safety and health to integrated security technology and sustainability to name a few Another resource for which APCO is a contributing member is the Commission on Accreditation for Law

Enforcement Agencies (CALEA) which has developed basic standards related to communications center facility and equipment security4 The National Fire Protection Association (NFPA) through its 1221 standard has a section dedicated to communications center security5 As you go through the journey of finding those standards and best practices applicable to your agency be ever mindful of the fact there are many choices for one to consider Understand there is no ldquobestrdquo solution that will satisfy a broad class of situations so you have to do your homework

SECURITY ASSESSMENTSun-Tzu author of the manuscript the Art of War made the following statement ldquoIf you know the enemy and know yourself you need not fear the results of a hundred battlesrdquo This is the essence of a security assessment Having a tool of this magnitude which contains a compilation of all known and potential threats to your agency and recommendations on the layers of security necessary to thwart and protect against them is an investment worth the time money and effort If your agency has not gone through the process

of developing a security assessment you will want to push the effort It is the smart and responsible thing to do Depending on your agencyrsquos situation security assessments are performed every three to five years or as situations change

References 1 Physical Security Professional (PSP) Study

Guide -- 2007 (ASIS International) 2 Glossary of Terms Facility Security Plan An

Interagency Security Committee -- 2015 First Ed

3 Project Management Institute (PMI) httpmarketplacepmiorgPages ProductDetailaspxGMProduct=00101388701

4 CALEA Public Safety Communications Accreditation program(Section 64 in its entirety) httpwwwcaleaorgcontentstandards-titltes

5 NFPA 1221 Standard for the Installation Maintenance and Use of Emergency Services Communications Systems (2013) (Section 46 in its entirety)

This article has been reprinted from the PSC Public Safety Communications journal with the approval of APCO International and Naylor Publishing

wwwapcoca | Wavelength 25