STRATEGIC WHITE PAPER

Best practices for achieving service assurance excellence
Climbing the operational maturity ladder

This paper covers the most effective way to turn your network-focused assurance operations into a successful service assurance operation. To achieve this operational transformation, Communication Service Providers (CSPs) need to climb the operations maturity ladder. By moving from a technology or network focus to a service-oriented delivery model, they can gain operational efficiencies while improving the customer experience. Both result in improvements to their competitive position, return on investment and revenue.

Moving up the maturity ladder is a challenging and complex endeavour. That’s why CSPs are turning to standards-based frameworks for guidance to climb this ladder. Bell Labs Consulting’s Capability Reference Operating Model (CROM) defines an end-to-end framework for telecoms operations, including a comprehensive map of operational functions and capabilities. The outcome is an operations map of best practices that provides the fastest, most cost-efficient route up the maturity ladder.

This paper provides an abridged version of the operations map with selected best practices. Illustrations of outcomes and deliverables are also provided.

Table of contents

Introduction
Climbing the operational maturity ladder
Best practices for achieving service assurance excellence
Key service assurance best practices
1. Resource management excellence
2. Service Operations Center (SOC)
3. Manage incidents at ticket level
4. Inventory management and reconciliation
5. Problem management
6. SLA-driven metrics framework
7. Proactive KPI dashboard
8. Event management RACI and role interaction
9. Service modelling
Quantifiable results from client transformation engagements
The Bell Labs service assurance consulting value proposition
Conclusion
About the authors
Abbreviations

Bell Labs Consulting Strategic White Paper

Introduction

Ever since the liberalization of the telecommunications market, Communication Service Providers (CSPs) have evolved towards business models where customer centricity is paramount. In parallel, as technology has rapidly evolved and become widely available, communications services have become increasingly sophisticated, and customer experience and innovation are now the main differentiators. Service management is a CSP operation that centers on the services provided to customers, not on the resources that constitute those services.

The emergence of cloud and virtualization technologies has also shifted the focus of CSPs from infrastructure to services, while many CSPs are now implementing platform¹ business models. As a result, these CSPs can better differentiate their offers.

CSPs have been able to move into service management in operational areas, such as fulfilment/service activation and billing. However, service-level assurance has traditionally been elusive for many CSPs who have remained at the resource/technology level. As illustrated in Figure 1, these CSPs have not progressed from the Ad Hoc and Reactive levels to the Proactive, Service or Value levels of the service management maturity model. This is particularly the case in areas such as fault management.

Recently, operational capabilities such as Service Level Management (SLM) and Service Operation Centers (SOCs) have been established by pioneering CSPs. Meanwhile, others remain infrastructure focused at the Ad Hoc and Reactive levels.

Figure 1. Service Management Maturity Model drivers

[Figure not reproduced. Labels recovered from the figure: maturity levels (Ad hoc, Reactive, Proactive, Service, Value); Endpoints, Resources, Services, Communities; Partners, Customer experience, Change, Management; Assurance, Service assurance; Purpose-built solutions, Service-oriented architecture, Service provider network; residential, enterprise and wireless services (all-IP service over FTTx, PSTN modernization, business-aware communications, IP centrex, SIP trunking, VoLTE, video comms, RCS/RCSe).]

Moving up to the Service level of the Service Management Maturity Model is challenging. It has to be done on the fly without disrupting ongoing operations. Adding to the challenge of this transformation is the identification and application of the right operational best practices in these four domains: people, processes, platforms and metrics.

¹ A platform is a business model that creates value primarily by enabling direct interactions between two (or more) distinct types of affiliated customer groups who value each other’s participation, generally tapping into a network effect.

Climbing the operational maturity ladder

The operational maturity ladder establishes a series of levels that define the status of an organization’s operational performance in terms of productivity and customer experience. Figure 2 depicts the levels on this ladder. Fully-fledged service management sits at level 3. At this level, operations are focused on end-to-end services, which are enabled by end-to-end integrated processes supported by defined service models.

CSPs need to fully complete the prior level before climbing to the next one. For example, this means that a CSP must achieve effective resource management (i.e., resource inventory, resource provisioning and mediation) before rising to the Service level. Similarly, CSPs should have mastered the key components of the Reactive level before stepping up to the Proactive level.

Figure 2. The operations maturity ladder

• Level 0, Ad Hoc: informal; unpredictable; non-repeatable
• Level 1, Reactive: formal; repeatable, BUT present-action focus; not planned; no contingency anticipation
• Level 2, Proactive: underlying driver focus; anticipative; planned; focused on network resources
• Level 3, Service: service-centric; end-to-end service focus; defined service models; end-to-end integrated processes
• Level 4, Value: customer-centric; end-to-end value chain perspective; supported by business intelligence; creates value add; differentiated value proposition

Service operations require climbing the maturity ladder: to reach the Service level, a CSP must first attain the preceding maturity levels (Reactive and Proactive).

The key difference between level 3 and the levels below it is that processes are based on instances of services as opposed to instances of resources. Additionally, climbing the maturity ladder requires specific investment at each level. Each level incrementally improves operational efficiency, which has a direct impact on OPEX. Because maturity investments follow the law of diminishing marginal returns, it is important to focus on areas that yield the most significant maturity improvements.

Best practices for achieving service assurance excellence

As noted, moving up the maturity ladder is achieved through specific best practices covering four operational domains: people, processes, platforms and metrics. Bell Labs Consulting has helped CSPs worldwide to develop their service management operations. This is in addition to the vast experience of our Managed Services team, which has implemented service management as part of our operating model.

One critical success factor is being able to analyse CSP operations and determine high-yield focus areas. At Bell Labs Consulting, our proven Capability Reference Operating Model (CROM) defines an end-to-end framework for telecoms operations. As illustrated in Figure 3, the CROM provides a comprehensive and consistent map of operational capabilities.

The CROM is a standards-based framework. It blends the ITIL and eTOM frameworks with Bell Labs Consulting know-how, drawn from our extensive experience providing consultancy and managed services to operators worldwide. As a result, the CROM has both a strong theoretical foundation and a practical one. It applies at the strategic level as well as at the day-to-day tactical level in the management of outsourced networks. The higher levels of the CROM (levels 0 and 1) are depicted in Figure 3.

To systematically identify high-yield focus areas, we apply the CROM framework to real-world situations. For each customer, we go one by one through the comprehensive set of assurance capabilities in CROM, and select the areas of operational transformation relevant to the service assurance goals. Once these areas are selected, the set of best practices that provides the highest maturity improvement yield is identified within each area. This analysis is isolated from the contextual and idiosyncratic circumstances of the individual operators. The outcome is an operations map of best practices that provides the fastest, most cost-efficient route up the maturity ladder.

Figure 3. The Bell Labs Consulting Capability Reference Operating Model (CROM)

[Figure not reproduced. The figure arranges capabilities in columns (Strategy and design: Strategy, Design; Transition: Deployment; Operation: OS&R, Fulfillment, Assurance, Billing; Continual service improvement) and rows (Customer, Service, Resource, Supplier). Capabilities recovered from the figure, grouped by identifier prefix:]

• S: S-01 Market Strategy; S-02 Portfolio Planning; S-03 Technology Strategy; S-04 Vendor Strategy
• DS: DS-01 Product Management; DS-02 Demand Management; DS-03 Custom Product Design; DS-04 Request Management; DS-05 Service Design; DS-06 Change Management; DS-07 Resource Design; DS-08 Capacity Planning; DS-08 Vendor Procurement; DS-09 Vendor Integration Support; DS-10 Vendor Tender Management
• DP: DP-01 Service Development; DP-02 Service Testing; DP-03 Service Deployment; DP-04 Technology Development; DP-05 Technology Testing; DP-06 Technology Deployment; DP-07 FLS OSP
• O: O-01 Customer Information Database; O-02 Service Inventory and Reconciliation; O-03 Knowledge Management; O-04 Planned Events; O-05 Configuration Management; O-06 Release Management; O-07 Res. Physical Maintenance; O-08 Res. System Administration; O-09 Resource Inventory and Reconciliation; O-10 Workforce Management; O-11 Spares Management and Logistics
• F: F-01 Order Management; F-02 Subs. Provisioning; F-03 Order Test and Acceptance; F-04 Order Orchestration and Service Activation; F-05 Fallout Management; F-06 Resource Provisioning; F-07 Resource Commissioning; F-08 FLS - Commissioning; F-09 Resource Testing
• A: A-01 Customer Retention and Loyalty; A-02 Customer Incident Management; A-03 Incident Management; A-04 Problem Management; A-05 Event Management; A-06 Security; A-07 FLS - Assurance; A-08 Vendor Support Management; A-10 Customer Satisfaction; A-11 SLAM; A-12 Continuity Management; A-13 Service Quality Management; A-14 Predictive Fault Management; A-15 Resource Health Monitoring; A-16 Capacity Monitoring
• B: B-01 Account Receivables; B-02 Billing; B-03 Revenue Assurance; B-04 Tariff Management; B-05 Rating and Charging; B-06 Real Time Charging Rating; B-07 Billing Data Mediation; B-08 Wholesale/Interconnect Billing
• FAB: FAB-00 Customer Request and Complaint Management

The CROM has been used to provide consulting services to many operators worldwide including: a major fixed line operator in Italy; mobile operators in France and India; the incumbent operator in New Zealand; and an MSO in the United States. Using the CROM, our experience in service assurance consulting and our Managed Services team, we have identified the following focus areas illustrated in Figure 4:

1. Event Engineering

2. Policy Design and Maintenance

3. Change Management

4. Data Integrity

5. Service Management

6. Incident Management

7. Problem Management

8. Event Management

9. Performance/SLA Management

10. Proactive Fault Management

11. Governance

Figure 4. Focus areas mapped to the CROM operating model

[Figure not reproduced. It places the eleven focus areas listed above on the CROM grid of columns (Strategy and design, Transition, Operation with OS&R, Fulfillment, Assurance and Billing, Continual service improvement) and rows (Customer mgmt, Service mgmt, Resource mgmt).]

Within these areas, we have identified more than forty best practices, some of which have been placed in the operational map shown in Figure 5.

In the rest of this paper, we discuss an abridged version of the operations map of best practices. We present a diverse best-practices selection and provide details on each. We also describe how Bell Labs Consulting has assisted CSPs to implement them. Our best practice selection includes:

1. Resource management excellence

2. Service Operations Center (SOC)

3. Manage incidents at ticket level

4. Inventory management and reconciliation

5. Problem management

6. SLA-driven metrics framework

7. Proactive KPI dashboard

8. Event management RACI and role interaction

9. Service modelling

Figure 5. Service assurance operations map of best practices (abridged)

[Figure not reproduced. It relates the focus areas (Change mgmt, Data integrity, Event engineering, Event mgmt, Incident mgmt, Governance, Performance/SLA mgmt, Policy design and maint., Proactive fault mgmt, Problem mgmt, Service mgmt) to the selected best practices (Manage incidents at ticket level, SLA-driven metrics framework, Inventory management and reconciliation, Problem management, Service Operations Center, Service model, Resource management excellence, Event management RACI and role interaction).]

Key service assurance best practices

1. Resource management excellence

In practice, resource management excellence is an essential condition of service management excellence. The following examples show that this is not a mere theoretical consideration, but a practical imperative:

• Incident management is best handled at ticket level rather than alarm level. Automated ticket creation is required to effectively perform incident management at ticket level.

• Automatic ticket creation relies on sound alarm correlation at resource level to open tickets on a consolidated, single root cause alarm.

• Alarm correlation relies on an accurate resource inventory that reflects the network topology.

• The service impact assessment relies on a service inventory that accurately associates each service with its underpinning resources (i.e., implements a service topology).

• An accurate service inventory with a service topology relies on an accurate resource inventory.
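The dependency chain in the bullets above can be made explicit as a small graph. The following is a minimal Python sketch, assuming the capability labels below (they are illustrative names, not identifiers from the CROM or from any OSS product). It uses the standard-library `graphlib` module to derive an order in which every capability appears after the capabilities it relies on.

```python
from graphlib import TopologicalSorter

# Each capability maps to the capabilities it relies on, as stated in the
# bullets above. The labels are illustrative, not taken from any product.
DEPENDS_ON = {
    "incident_management_at_ticket_level": {"automated_ticket_creation"},
    "automated_ticket_creation": {"alarm_correlation"},
    "alarm_correlation": {"resource_inventory"},
    "service_impact_assessment": {"service_inventory"},
    "service_inventory": {"resource_inventory"},
    "resource_inventory": set(),
}


def build_order(depends_on):
    """Return the capabilities so that prerequisites always come first."""
    return list(TopologicalSorter(depends_on).static_order())


order = build_order(DEPENDS_ON)
print(order)
```

In any valid ordering the resource inventory comes first, which restates the point of this section: every service-level capability listed here ultimately rests on accurate resource data.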

2. Service Operations Center (SOC)

The concept of an SOC has recently become popular with more mature CSPs. An SOC is a function that sits between the infrastructure and resource assurance operational functions (i.e., NOCs, SMCs and data centres) on one side, and the customer operations and governance functions on the other. The SOC provides value-added service management (see Figure 6). It also acts as a service-centric operational hub for network operations, customer operations, business managers and partners.

Bell Labs Consulting has been contracted by an incumbent Western European CSP to implement its SOC, which oversees different outsourced technology operations, including access, core and transmission networks. Among other contracts, we have provided SOC-related consulting services to a new 4G operator in India and to a Western European multi-national CSP in many of its local affiliates.

Figure 6. Service Operations Center

[Figure not reproduced. It shows the Service Operations Centre ("Value add service management") with its functions (service delivery and rollout acceptance, incident service impact management, proactive end-to-end service management, service quality monitoring and analytics, partner assurance empowerment). Below it sit the Network Operations Centres (functional and regional), the Data Centre and the Security Management Centre. Above it sit the Customer Care Contact Centre ("Proactive outreach customer lifecycle management") and Corporate Governance ("Strategy and Key Business Objectives"). Side labels: Quality, Cost, Revenue, Customer relationship, Common technology assets.]

The SOC operates on service instances as distinct from resource instances, such as network elements, which are the basis of NOC operations.

The SOC performs the following activities:

• Proactive end-to-end service management

• Coordination of regional NOCs

• Management of SLAs, QoE, QoS monitoring and analytics

• Empowerment of partners

• Delivery of new service types and roll-out acceptance

• Management of incident service impacts

3. Manage incidents at ticket level

Our extensive consulting and managed services experience has shown that incident management is best performed at ticket level instead of at alarm level. Managing incidents at ticket level requires:

• Full automation of event management (i.e., suppression, correlation, enrichment)

• Automation of service impact assessment of root cause alarms

• Automated ticket creation

This means that all human interaction within the “trouble-to-resolve flow” takes place within an incident ticket update.
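As an illustration of the automation chain listed above (suppression, correlation, enrichment of root cause alarms with service impact, ticket creation), here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Alarm` and `Ticket` classes, the sample topology and service inventory, and the simple rule that an alarm is attributed to its top-most alarmed ancestor. A production correlation engine would be considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    resource: str
    kind: str


@dataclass
class Ticket:
    root_resource: str
    kind: str
    affected_services: list
    correlated_alarms: list = field(default_factory=list)


# Hypothetical sample data: PARENT is a slice of the resource inventory
# (child -> upstream resource); SERVICES is a slice of the service inventory.
PARENT = {"dslam-7": "agg-router-1", "dslam-8": "agg-router-1"}
SERVICES = {"agg-router-1": ["broadband-north", "iptv-north"]}
SUPPRESSED_KINDS = {"door_open"}


def root_of(resource, alarmed):
    """Walk up the topology for as long as the parent is also alarmed."""
    while PARENT.get(resource) in alarmed:
        resource = PARENT[resource]
    return resource


def open_tickets(alarms):
    # Suppression: drop alarm kinds that never warrant a ticket.
    kept = [a for a in alarms if a.kind not in SUPPRESSED_KINDS]
    alarmed = {a.resource for a in kept}
    tickets = {}
    for alarm in kept:
        # Correlation: attribute the alarm to its top-most alarmed ancestor.
        root = root_of(alarm.resource, alarmed)
        if root not in tickets:
            # Enrichment: attach the services underpinned by the root resource.
            root_kind = next(a.kind for a in kept if a.resource == root)
            tickets[root] = Ticket(root, root_kind, SERVICES.get(root, []))
        tickets[root].correlated_alarms.append(alarm)
    return list(tickets.values())


tickets = open_tickets([
    Alarm("dslam-7", "link_down"),
    Alarm("agg-router-1", "node_down"),
    Alarm("dslam-8", "link_down"),
    Alarm("dslam-8", "door_open"),
])
for t in tickets:
    print(t.root_resource, t.kind, t.affected_services, len(t.correlated_alarms))
```

With this sample input, four alarms collapse into a single ticket on the aggregation router, so the operator works on one enriched ticket rather than on individual alarms.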

When automating event management and ticket creation, the law of diminishing marginal returns must be taken into account. The benefits of ticketing automation follow a Pareto distribution.² Therefore, automation should concentrate on the faults that have the greatest impact. It should stop when the benefit of automating one more fault type no longer justifies the cost.
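The stopping rule can be illustrated with a short Python sketch. It assumes a flat, hypothetical automation cost per fault type and invented effort figures; neither comes from the engagements described in this paper. Fault types are ranked by the manual effort they consume, and automation stops at the first fault type whose effort does not exceed the cost of automating it.

```python
def select_faults_to_automate(effort_hours, automation_cost_hours):
    """Pick the fault types worth automating.

    effort_hours: fault type -> manual hours per year spent on event
    handling and ticket opening (illustrative figures).
    automation_cost_hours: assumed flat cost of automating one fault type.
    """
    ranked = sorted(effort_hours.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    for fault_type, hours in ranked:
        if hours <= automation_cost_hours:
            break  # every remaining fault type saves even less
        selected.append(fault_type)
    return selected


def share_of_effort(effort_hours, selected):
    """Fraction of the total manual effort covered by the selected faults."""
    total = sum(effort_hours.values())
    return sum(effort_hours[f] for f in selected) / total


# Invented, Pareto-like effort profile for illustration only.
EFFORT = {
    "link_down": 500,
    "power_fail": 250,
    "high_temp": 120,
    "fan_fail": 60,
    "config_mismatch": 40,
    "door_open": 30,
}

chosen = select_faults_to_automate(EFFORT, automation_cost_hours=100)
print(chosen, share_of_effort(EFFORT, chosen))
```

In this invented profile, automating three of the six fault types covers most of the manual effort, which is the kind of concentration a Pareto-like distribution implies.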

Our Managed Services team has successfully implemented such systems for several customers. We have reduced the CSP workload by about 20 percent by eliminating time-consuming alarm monitoring and ticket opening.

4. Inventory management and reconciliation

Effective alarm correlation and enrichment rely heavily on the accuracy of the network topology held in the resource inventory. Inaccurate inventory data renders correlation impossible. Consequently, automated ticket creation and other dependent service management functions fall short.

Data inaccuracy often originates from mishaps in fulfilment processes, such as commissioning and provisioning. Other inaccuracies can arise from uncontrolled changes during the operational lifecycle of resources. Even with sound fulfilment data and disciplined change management and control, processes occasionally fail. In addition, inaccuracies may arise during incident management, when service restoration draws attention away from keeping the inventory up to date after changes and workarounds.

Data reconciliation between the network and inventory is a best practice that addresses these operational problems. It improves resource inventory accuracy by detecting and resolving discrepancies with the network configuration at regular intervals. Information is collected by the reconciliation engine and compared with that contained in the inventory.

Any discrepancies are flagged to the NOC/Engineering team. If service-related, discrepancies are relayed to the SOC team instead. Each team decides which information source (i.e., inventory or network configuration) is correct and which one should be amended. In this respect, inventory reconciliation is not just a technical OSS capability but is also a process capability. This ensures all discrepancies are handled with due diligence.

Furthermore, inventory reconciliation is not a replacement for enforced or automated data capture at commissioning/provisioning or during incident management data updates. Rather, inventory reconciliation is an additional safeguard to prevent inconsistencies.
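The comparison step of reconciliation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: both the inventory and the discovered network configuration are represented as plain dictionaries, and the function names `reconcile` and `route` are hypothetical. The sketch only detects and routes discrepancies; in line with the process described above, deciding which source is correct remains a team decision.

```python
def reconcile(inventory, discovered):
    """Compare inventory records with the discovered network configuration.

    Both arguments map a resource identifier to a dict of attributes.
    Returns tuples of (resource, issue, inventory_value, network_value).
    """
    discrepancies = []
    for resource in sorted(inventory.keys() | discovered.keys()):
        inv, net = inventory.get(resource), discovered.get(resource)
        if inv is None:
            discrepancies.append((resource, "missing_in_inventory", None, net))
        elif net is None:
            discrepancies.append((resource, "missing_in_network", inv, None))
        else:
            for attr in sorted(inv.keys() | net.keys()):
                if inv.get(attr) != net.get(attr):
                    discrepancies.append(
                        (resource, f"mismatch:{attr}", inv.get(attr), net.get(attr))
                    )
    return discrepancies


def route(discrepancy, service_related_resources):
    """Service-related discrepancies go to the SOC, the rest to NOC/Engineering."""
    resource = discrepancy[0]
    return "SOC" if resource in service_related_resources else "NOC/Engineering"


# Invented sample data for illustration.
inventory = {"ne1": {"card": "A", "sw": "1.0"}, "ne2": {"card": "B"}}
discovered = {"ne1": {"card": "A", "sw": "1.1"}, "ne3": {"card": "C"}}

for d in reconcile(inventory, discovered):
    print(route(d, service_related_resources={"ne1"}), d)
```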

An additional best practice in data reconciliation is using an operational data store (ODS). The data store is a replicated slave inventory. To prevent performance issues, it offloads non-real-time data processing from the main inventory.³

² Our detailed analysis of processing effort (time spent on event management and ticket opening) per fault type shows that it can be approximated by a Pareto distribution. This is due mostly to frequency of occurrence and the distribution of effort per fault type.

³ The Bell Labs Consulting white paper, Optimizing enterprise fulfillment through operations best practices, describes techniques to achieve data accuracy without compromising the flexibility and modularity of the product portfolio and its enabling processes.

Figure 7 illustrates all the reconciliation functions described herein and their relationship with the rest of the processes and OSS platforms.

Figure 7. Inventory Reconciliation Architecture

[Figure not reproduced. Components recovered from the figure: Incident management (ticketing system, service impact assessment); Event management (alarm concentrator, event correlation, event enrichment); Resource inventory and reconciliation (inventory, inventory ODS, inventory update, network discovery and reconciliation, fallout process); Resource interface (Vendor 1 adaptation, Vendor 2 adaptation); Resources (per-vendor NMS and network elements NE1, NE2).]

5. Problem management

Formalized ITIL problem management⁴ enables the SOC to ascend the maturity ladder to the Proactive level. It does this by minimizing the impact of recurring incidents. This best practice splits the trouble-to-resolve (T2R) end-to-end process into two cycles: a short cycle, which usually takes hours to restore service from incidents, and a long cycle, which may take days or even a few months to resolve the root causes. Root-cause resolution might include fixing a design defect, installing a patch set or new version, or getting a fix from the vendor, using the standard change management/release management processes.

This dual-cycle process workflow includes the creation of a problem. A problem is a new entity in the operational data model, which is implemented as a problem ticket. The problem ticket is independent of the incident ticket, which has its own autonomous lifecycle. Typically, problem tickets are created after incident tickets, and one problem ticket can relate to many incident tickets.
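The relationship between the two ticket types can be sketched as a small data model. The Python below is an illustrative assumption, not the data model of any particular ticketing system: the class names, status values and the `link` helper are hypothetical. It shows the two properties described above: each ticket type has its own lifecycle, and one problem can be linked to many incidents.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IncidentTicket:
    ticket_id: str
    status: str = "open"  # short cycle: open -> restored
    problem_id: Optional[str] = None


@dataclass
class ProblemTicket:
    ticket_id: str
    status: str = "open"  # long cycle: open -> resolved
    incident_ids: list = field(default_factory=list)


def link(problem, incident):
    """Attach an incident to the problem that tracks its root cause."""
    problem.incident_ids.append(incident.ticket_id)
    incident.problem_id = problem.ticket_id


# Two recurring incidents share one underlying problem.
inc1, inc2 = IncidentTicket("INC-1"), IncidentTicket("INC-2")
problem = ProblemTicket("PRB-1")
link(problem, inc1)
link(problem, inc2)

# Short cycle: service is restored and both incidents close.
inc1.status = inc2.status = "restored"

# Long cycle: the problem stays open until the root cause is fixed.
print(problem.status, problem.incident_ids)
```

Keeping the problem ticket open after its incidents are restored is what allows the long-cycle work (vendor fix, patch, design change) to be tracked and measured separately from service restoration.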

⁴ In ITIL, a problem is an undesired and typically chronic issue associated with the root cause of one or more incidents. Problem management is the process responsible for managing the problem lifecycle.

Figure 8. Example of a dual-cycle service assurance process with problem management

[Figure not reproduced. Processes recovered from the figure: Customer Request and Complaint Management, Customer Incident Management, Event Management, Incident Management, Problem Management, Predictive Fault Management, FLS – Assurance, Vendor Support Management, Spares Management and Logistics, Configuration Management, Knowledge Management, Resource Inventory and Reconciliation, Resource Commissioning, Release Management, Change Management. Flow labels: Customer complaint, External NTT request, Alarm, Resource TCA (M2I), Problem Trend Threshold Breach (M2I), Request spare and dispatch, Update to configuration, Inventory and reconciliation, Known errors and restoration procedure, RFC, WO, and ticket types CTT, NTT, PTT, VTT, WTT.]

Alcatel-Lucent has applied this concept internally to our Managed Services Operating Model (MSOM) and to our OSS software solutions, which are used by European CSPs and in vertical markets. This approach has significantly improved operational cost and performance by:

• Better handling the difference between the short-cycle SLAs to restore service from incidents and the long-cycle SLAs for vendors and engineering and technology organizations to investigate and resolve the underlying defects that cause incidents.

• Moving from firefighting to prevention by focusing more strategically on root causes instead of working backwards from consequential impacts, thereby reducing operational waste.

• Helping operational staff focus on root causes, not only on service restoration, by monitoring problem resolution performance.

Problem management also simplifies incident resolution with the Known Errors Database (KEDB). This database captures all recognized faults along with their resolutions and workarounds. The result is reduced time to diagnose and restore.

Despite these advantages, many CSPs lack a formalized problem management process. Some fear the costs of implementing this process. Others are too busy deploying new services or technologies to focus on problem management.

6. SLA-driven metrics framework

Enterprise customers’ stringent SLAs and the demand for quality are driving CSPs to better align processes with strategic requirements. As illustrated in Figure 9, these can include contract clauses, regulations and marketing requirement sets. Meeting these demands is a matter of enforcing process compliance with the strategy and Key Business Objectives (KBOs). This is done by assessing process outcomes and ensuring alignment with the SLAs that underpin the requirements. Achieving these outcomes also requires making the organizations that are in charge of the processes accountable.

To effectively assess compliance, process outcomes should be measured and mapped to the KBOs. This is achieved by creating a hierarchical measurement framework. The framework maps low-level operational outcomes to high-level SLAs, and maps those SLAs in turn to strategic requirements and KBOs.

Figure 9. Hierarchical metrics framework

[Figure not reproduced. Levels recovered from the figure:]

• Level 1, Requirements: Contracts; Regulation; Marketing strategy
• Level 2, Key business objectives (KBOs): Quality objective; Time objective; Cost objective
• Metrics shown beneath the objectives: MTTRI; MTTR; MTTRI within SLA; Availability; Outage impact; MTBI; MTBF; MTBI/MTBF; Cost of assurance; Operational effort/service; Operational effort/NE; Incident repetition; Customers affected; Incident-to-problem ratio

However, developing this hierarchy can be a time-consuming and resource-intensive task. The following set of principles can guide its development:

• Top-down consistency: The end-to-end hierarchical structure must consistently tie KBOs to low-level process measurements across all levels (e.g., SLAs, KQIs and high-level KPIs).

• Simplicity: Consuming performance information can be onerous for senior management. The SLA set should be a small, easily maintained, comprehensive set of metrics that capture most of the KBO/strategic objectives. Re-using industry-standard metrics, wherever possible, is a best practice. For example, the TM Forum has created more than 100 metrics based on dozens of service provider inputs. These metrics can be found in the TM Forum’s BM1000 Business Performance Management System.
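A hierarchical roll-up of this kind can be sketched in Python as follows. The sketch is built on explicit assumptions: MTTR is taken here as mean time to restore, availability is computed for a single service with non-overlapping outages, and the assignment of each metric to an objective in `HIERARCHY` is hypothetical, chosen only to show the mechanism. The targets and incident durations are invented.

```python
from statistics import mean

# Hypothetical mapping of low-level metrics to key business objectives.
# Which metric reports to which objective is an assumption for this example.
HIERARCHY = {
    "Time objective": ["MTTR"],
    "Quality objective": ["Availability"],
}


def mttr_hours(restore_hours):
    """Mean time to restore, in hours, over a list of incident durations."""
    return mean(restore_hours)


def availability(restore_hours, period_hours):
    """Fraction of the period in service, assuming non-overlapping outages."""
    return 1 - sum(restore_hours) / period_hours


def evaluate(values, targets, lower_is_better):
    """Return objective -> {metric: target met?} following the hierarchy."""
    report = {}
    for objective, metrics in HIERARCHY.items():
        report[objective] = {}
        for metric in metrics:
            if metric in lower_is_better:
                met = values[metric] <= targets[metric]
            else:
                met = values[metric] >= targets[metric]
            report[objective][metric] = met
    return report


# Invented sample: three incidents in a 720-hour month.
durations = [2.0, 4.0, 6.0]
values = {
    "MTTR": mttr_hours(durations),
    "Availability": availability(durations, period_hours=720),
}
targets = {"MTTR": 4.5, "Availability": 0.99}
print(evaluate(values, targets, lower_is_better={"MTTR"}))
```

The same structure extends upward: each objective could in turn be listed under the contract, regulation or marketing requirement it supports, which is what keeps the reporting consistent from top to bottom.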

Alcatel-Lucent has incorporated these principles in our operations benchmarking framework. Using this framework, we have provided consulting services to several operators. These include a North American MSO and an Asia-Pacific wholesale provider.

7. Proactive KPI dashboard

The concept of a dashboard is akin to the Balanced Scorecard,⁵ a performance measurement framework that adds strategic perspectives (e.g., customer, internal excellence and innovation) to traditional financial metrics for managers. This concept has been refined for individual industries, including telecommunications.

⁵ Robert S. Kaplan, David P. Norton, 1992, ‘The Balanced Scorecard – Measures that Drive Performance’, Harvard Business Review, January-February 1992.


In the CSP service assurance context, the dashboard is the reporting/visual summary of the metrics framework. It helps all levels of management make better-informed decisions. We have distilled the attributes of a best-practice service assurance dashboard:

• It has a small, clearly defined set of metrics that supports most of the service assurance operational decision-making.

• It offers an effective visual representation to provide service assurance operational performance data in an easy-to-analyse, at-a-glance format.

• It ensures metrics are aligned with relevant perspectives (e.g., cost, quality, time, and excellence) and has a hierarchical report/chart link that follows the defined metrics framework hierarchy.

• It is viewable from a top-down and cross-organizational perspective (i.e., is accessible to NOCs, third-level support organizations and field-line support teams) so that the entire organization is aligned.

• It is available across all the service assurance organizations. Only if all the relevant employees have appropriate visibility of the scorecard is the maximum benefit achieved. In addition, it sends the right message: operational management and the organizations they lead will be measured by the metrics and scorecard targets.

Figure 10. Bell Labs dashboard developed for a North American MSO

[Figure: L1 assurance dashboard with an L2 alarm quality metrics rollup. Three L1 charts plot quarterly values from 2Q2014 to 4Q2015: average fulfillment and assurance cost per product (Cost), average incident resolution time in minutes (Time) and problems resolved before incident reoccurrence (Quality). Each chart is paired with L2 VoD alarm quality BP targets. Cost: 10% increase in actionable alarms, 10% increase in ticket-to-alarm ratio, 7% reduction in VoD alarms presented. Quality: 65% of incidents restored within target, 12% of incidents that resulted in re-work. Time: mean time to detect of 2-5 min., 75% improvement in mean time to restore incident, 30% increase in alarms filtered and correlated.]

At Bell Labs Consulting, we have successfully delivered KPI dashboards in consultancy engagements for many customers, including a North American MSO and an Australian access wholesaler.
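To make the link between the metrics framework and the dashboard concrete, the sketch below evaluates a few metrics against their targets and prints an at-a-glance summary grouped by perspective. It is a minimal illustration, not the delivered dashboard: the `DashboardMetric` structure, the status labels and all measured values are hypothetical. The targets are taken from Figure 10, with the upper bound of the 2-5 min. range used for mean time to detect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DashboardMetric:
    perspective: str         # e.g. "Cost", "Quality", "Time"
    name: str
    target: float
    actual: float            # hypothetical measured value
    higher_is_better: bool

    def status(self) -> str:
        """Return "ON TARGET" or "OFF TARGET" for this metric."""
        if self.higher_is_better:
            met = self.actual >= self.target
        else:
            met = self.actual <= self.target
        return "ON TARGET" if met else "OFF TARGET"


def render(metrics):
    """Build a plain-text summary with one block per perspective."""
    lines = []
    for perspective in sorted({m.perspective for m in metrics}):
        lines.append(perspective)
        for m in metrics:
            if m.perspective == perspective:
                lines.append(
                    f"  {m.name}: actual {m.actual:g}, "
                    f"target {m.target:g} [{m.status()}]"
                )
    return "\n".join(lines)


# Targets come from Figure 10; the actual values are invented for the example.
SAMPLE = [
    DashboardMetric("Quality", "% incidents restored within target", 65, 70, True),
    DashboardMetric("Quality", "% incidents that resulted in re-work", 12, 15, False),
    DashboardMetric("Time", "Mean time to detect (min)", 5, 4, False),
]

print(render(SAMPLE))
```

Keeping the direction of improvement (`higher_is_better`) next to each target avoids a common reporting error, where a falling value is flagged as a problem even though lower is the goal.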

8. Event management RACI and role interaction

Organizational accountability and strategic alignment with KBOs are also achieved through the implementation of RACI matrices. RACI is an acronym for a project management practice in which stakeholder roles are defined in terms of who is Responsible, who is Accountable, and who needs to be Consulted and Informed. The RACI matrix associates key service assurance functions and roles with processes, activities and KPIs.

Bell Labs Consulting has employed RACI matrices to establish role and function accountability for a North American MSO. This project consisted of detailed top-down design, formalizing processes, defining outcomes and KPIs to measure compliance with KBOs. The outcomes were mapped to the organization using a RACI matrix.
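A RACI matrix is straightforward to represent and validate as data. The sketch below is an illustration under stated assumptions rather than the matrix used in the engagement: the role abbreviations and activity names are borrowed from the fulfillment example in Figure 11, the assignments are invented, and the rule that every activity needs at least one Responsible role and exactly one Accountable role is a common RACI convention, not something this paper prescribes.

```python
# Keys are activities; each maps a role to a string of RACI codes.
# All assignments below are hypothetical.
RACI = {
    "Order generation": {"CC": "AR", "SFC": "C", "HoF": "I"},
    "Order fallout": {"CC": "I", "SFC": "AR", "OSS": "C"},
    "Field delivery": {"SFC": "C", "FLS": "R", "HoF": "A"},
    "Acceptance": {"CC": "R", "FLS": "C", "HoF": "I"},  # no Accountable role
}

VALID_CODES = set("RACI")


def roles_with(matrix, activity, code):
    """Return the sorted roles holding `code` (R, A, C or I) for an activity."""
    return sorted(
        role for role, codes in matrix[activity].items() if code in codes
    )


def validate(matrix):
    """Return a list of problems; an empty list means the matrix passes."""
    problems = []
    for activity, assignments in matrix.items():
        for role, codes in assignments.items():
            if not codes or not set(codes) <= VALID_CODES:
                problems.append(f"{activity}: invalid code {codes!r} for {role}")
        if not roles_with(matrix, activity, "R"):
            problems.append(f"{activity}: no Responsible role")
        accountable = roles_with(matrix, activity, "A")
        if len(accountable) != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable role, "
                f"found {len(accountable)}"
            )
    return problems


for problem in validate(RACI):
    print(problem)
```

Running the check on the sample data flags the Acceptance activity, which has a Responsible role but no Accountable one. The same structure can be extended to attach KPIs to each activity.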


Figure 11 illustrates a RACI matrix.6

Figure 11. RACI matrix

[Figure: simplified RACI chart for service fulfillment. The chart assigns R, A, C and I codes to six organizational functions, namely Customer Care (CC), Service Fulfillment Centre (SFC), Field Line Support (FLS), OSS engineering (OSS), Network engineering/IT (ENG) and Head of Fulfillment (HoF), across six activities: order handling implementation, order handling validation (ORT), order generation, fallout, field delivery and acceptance. A second panel shows role interaction across the Ops Readiness and Operate phases among CC, SFC, FLS, ENG, OSS, NOC/TSO and HoF, with flows labelled OSS/BSS tools implementation, network design, order handling recommendations, order flow, order generation, order fallout and order acceptance.]

9. Service modelling

Service modelling is, in essence, a value-chain analysis of services provided by a CSP. This value-chain analysis identifies the contribution of each component to cost, value and delivered service quality. Components can include network elements/resources and operations, both manual and OSS.

By performing service modelling, CSPs establish a sound analytical foundation to define:

• The hierarchy of KPI/KQIs to assess service performance covered by SLA management in all areas: operations excellence, customer experience and cost.

• The service topology in the service inventory that supports the service assurance impact assessment.

• The costs and benefits of each operational activity and resource component of the service, as well as their impact on the contribution margin of the service.
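As an illustration of how a service model can support both impact assessment and margin analysis, the sketch below stores a small service topology, then derives the services affected by a failed component and the contribution margin of each service. Every service name, component name, cost and revenue figure is hypothetical, and treating contribution margin as revenue minus the summed component costs is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    monthly_revenue: float
    # Component name -> monthly cost attributed to this service (hypothetical).
    component_costs: dict = field(default_factory=dict)

    def contribution_margin(self) -> float:
        """Revenue minus the costs of the components the service depends on."""
        return self.monthly_revenue - sum(self.component_costs.values())


def impacted_services(services, failed_component):
    """Return the names of services whose topology includes the component."""
    return sorted(
        s.name for s in services if failed_component in s.component_costs
    )


SERVICES = [
    Service("Business VPN", 1000.0, {"edge-router-1": 200.0, "NOC monitoring": 150.0}),
    Service("VoD", 800.0, {"edge-router-1": 100.0, "video-server-1": 300.0}),
    Service("Hosted voice", 500.0, {"softswitch-1": 250.0, "NOC monitoring": 50.0}),
]

# Which services are affected if a shared network element fails?
print(impacted_services(SERVICES, "edge-router-1"))

# What does each service contribute after its component costs?
for service in SERVICES:
    print(service.name, service.contribution_margin())
```

Because the same component list drives both questions, a change to the topology in the service inventory updates the impact assessment and the cost view together.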

Quantifiable results from client transformation engagements

Climbing the maturity ladder to the Service level leads to measurable improvements in cost and customer experience for the service assurance operator. This allows the operator to differentiate on cost and quality, giving it a competitive advantage and generating more revenue.

6 Details have been removed to comply with the Alcatel-Lucent customer information non-disclosure policy.


Using the Bell Labs Consulting approach of targeting high-yield focus areas minimizes the investment while assuring ROI. This is especially important for service operations, where capital rationing is tighter than for network-based operations. Figure 12 shows the benefits of implementing the Bell Labs approach. These outcomes are derived from the operators mentioned previously:

Figure 12. Outcomes of maturity improvement on Assurance

[Figure: outcomes at the Reactive, Proactive and Service maturity levels, in that order. Incidents repeated more than once*: 41.9%, 10.7%, 4.5%. Problems dealt with within SLA*: 49%, 65%, 94.4%. Customer incident resolution time*: 3942 min, 973 min, 444 min.]

The Bell Labs service assurance consulting value proposition

Bell Labs Consulting provides trusted guidance to take CSPs’ assurance operations to a higher level of maturity with the adoption of industry best practices. Supported by our transformation methodology, we have helped CSPs identify the most effective way to build a quality, cost-effective service operation to achieve competitive advantage in a market that demands customer experience and relentless innovation.

Figure 13 illustrates our transformation methodology, as well as its stages and deliverables. Each stage provides carefully defined and agreed-upon deliverables to ensure all stakeholders remain aligned. Assessment and definition processes are used at each stage. This enables CSPs to clearly understand and identify objectives, quantify results and mitigate risks.

Our methodology consists of the following stages:

• Identification of operational transformation KBOs and definition of key transformational operations metrics

• Assessment of enterprise fulfilment operations at the customer, service and resource levels to identify the Present Mode of Operation (PMO)

• Definition of the best-fitting Future Mode of Operation (FMO) to adopt the best practices that provide the highest improvement returns

• Definition of the path, if necessary, from PMO to FMO, including the intermediate stages of the journey as the Interim Mode of Operation (IMO)

• Evaluation of the changes’ impacts on cost and customer experience, including business processes, metrics and the OSS/IT infrastructure. Quantification of the transformation recommendations against the KBOs.


Figure 13. Bell Labs transformation consulting methodology

[Figure: Bell Labs transformation consulting methodology, five phases in four work streams.]

Phases: Vision and business requirements; PMO discovery; FMO design; IMO roadmap; Financial model benefits analysis

Work streams: WS 1: Business requirements; WS 2: Network transformation; WS 3: Operations transformation; WS 4: Financial analysis and transition plan

Objectives:

• Capture customer’s problems, issues and aspirations

• Identify business objectives and constraints

• Discover portfolio

• Discover current architecture

• Discover PMO (organization, processes, tools)

• Portfolio rationalization and target portfolio

• Design target network and systems architecture

• Design target operation model

• Define optimal transition plan

• Propose realization plan

• Finalize financial and technology model

Activities:

• Vision workshop

• Interviews with executive stakeholders

• Define key success factors and major risks foreseen

• Relevant market and industry trends

• Portfolio analysis

• Assess current capabilities

• Derive the CAPEX and OPEX model

• Review existing operating model, operations metrics and KPIs

• Compare applicable evolution scenarios and draw conclusions

• Baseline the required Future Mode of Operations based on Vision and Business requirements

• Develop transition blueprint and plan (network, systems, organization, metrics, processes)

• Develop TCO and business case

• Present financial model and business case information

Deliverables:

• Report: Strategic vision and business requirements

• Report: Assessment/audit report

• Report: Target solution description

• Report: Final transformation plan with business case and benefits analysis

The operational planning work streams are managed simultaneously to ensure coordination, as well as the speed of decision making. The final deliverable provides customers with a financially sound and market-aligned approach for the staged transformation to the Service level. This deliverable is detailed so that it can be implemented immediately. The duration of each stage ranges from one to three weeks, depending on the complexity of the operation.

Conclusion

The achievement of service assurance excellence can be a daunting challenge. In this white paper, we have shared selected best practices that offer CSPs guidance on the way to reach Service level maturity. The Capability Reference Operating Model (CROM) is a critical framework for CSPs to reach this level. Bell Labs Consulting uses this model to provide CSPs with an operations map and supporting best practices that result in the fastest, most cost-efficient route up the maturity ladder.

About the authors

Carlos Oliver, Principal Consultant, Bell Labs Consulting, is a London Business School EMBA candidate. He has 15 years of telecom experience in professional services and consulting. He has specialized in advising CXOs in technology strategy and operations. Mr. Oliver created the Capability Reference Operating Model (CROM).

John Leadley, Managing Principal, Bell Labs Consulting, leads the Customer Experience practice. Prior to this, he held several senior-level positions for the targeted delivery of large-scale business and operational solutions. Selected clients include: Bank of America, BBN, Boston Scientific, CSX, DEC, Esso International, Global Crossing, Kodak, NASDAQ, New York Times, One Communications, Sirius Satellite Radio, Telefonica, Windstream and Xerox.


www.bell-labs.com Alcatel, Lucent, Alcatel-Lucent and the Alcatel-Lucent logo are trademarks of Alcatel-Lucent. All other trademarks are the property of their respective owners. The information presented is subject to change without notice. Alcatel-Lucent assumes no responsibility for inaccuracies contained herein. Copyright © 2015 Alcatel-Lucent. All rights reserved. PR1502008420EN (March)

The authors have written the white paper, Optimizing enterprise fulfillment through operations best practices. The paper describes the challenges faced by CSPs aiming to provide cost-effective quality fulfilment to the enterprise market. It also presents the best practices to overcome these challenges.

Abbreviations

CROM Capability Reference Operating Model

CSP Communication Service Provider

eTOM Enhanced Telecom Operations Map

FMO Future Mode of Operations

IMO Interim Mode of Operations

ITIL Information Technology Infrastructure Library

KBO Key Business Objectives

KEDB Known Errors Database

KQI Key Quality Indicator

MSO Multiple System Operator

MSOM Managed Services Operating Model

MTBF Mean Time Between Failures

MTBI Mean Time Between Interruptions

MTTR Mean Time To Repair

MTTRI Mean Time to Restore Incident

NOC Network Operations Center

OSS Operational Support Systems

OPEX Operating Expenses

PMO Present Mode of Operation

QoE Quality of Experience

QoS Quality of Service

RACI Responsible, Accountable, Consulted, Informed

SLA Service Level Agreement

SLM Service Level Management

SMC Service Management Center

TCO Total Cost of Ownership

T2R Trouble-to-Resolve