Network Performance Optimisation: How Communications Service Providers (CSPs) can create new value from quality attenuation analytics

© Predictable Network Solutions 2013

Network performance optimisation using high-fidelity measures



Dr Neil Davies Co-founder and Chief Scientist

Ex: University of Bristol (23 years).

Former technical head of joint university/research institute (SRF/PACT).

PREDICTABLE NETWORK SOLUTIONS

The only network performance science company in the world.

• New mathematical performance measurement and analysis techniques.

• Performance assessment methodology.

• World’s first network contention management solution.

Peter Thompson CTO

Ex: GoS Networks, U4EA, SGS-Thomson, INMOS & Universities of Bristol, Warwick and Cambridge.

Authority on technical and commercial issues of converged networking.

Martin Geddes Associate Director of Business Development

Ex: BT, Telco 2.0, Sprint, Oracle, Oxford University.

Thought leader on the future of the telecommunications industry.

Presentation Outline

• CSPs are seeking to increase their profitability and return on assets.

• Predictable Network Solutions Ltd has the capability to support optimisation beyond traditional approaches to network data analytics.

– This capability is built around a robust scientific method.

• CSPs can benefit greatly from enhancing the fidelity of their measurements of critical aspects of network performance.

– Standard techniques fail to capture enough resolution.

• We have the missing leading-edge measurement capabilities that all CSPs need.

CSPS’ QOE AND COST DILEMMA

The need to manage to the right metrics

What are the network optimisation goals of every CSP?

Commercial

The CSP’s revenue is ultimately bounded by the value perceived by the final end user.

• User value is derived from applications delivering fit-for-purpose outcomes (FFPOs).

• Users value consistency

– The absence of failures of service

– Bad experiences must be rare

• Every CSP’s goal is to maximise the value of FFPOs (i.e. QoE) at the minimum input cost.

Technical

CSPs need to make bad user experiences sufficiently rare, at affordable cost.

• This creates a balancing act: running the network too hot vs too cold.

– For this they need to have good proxies for QoE.

• A good proxy is one that directly relates to the delivered QoE…

– …that can also be measured, managed and predicted…

– …and must also have low operational cost to gather.


Network performance measures

Single point, average: offered load and utilisation (mean values only)

Today’s key CSP QoE proxy. Is it a good one?

No! Reporting the number of packets on a 1Gb/s Ethernet link every five minutes is like counting cars on a six-lane highway for two years!

Might there be some important details about traffic conditions that are lost? (Yes!)
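As a rough sanity check on the highway analogy, the sketch below counts how many packets one five-minute counter can summarise on a saturated 1Gb/s link, and how many cars pass a point on a six-lane highway in two years. The frame sizes, the 20-byte per-frame wire overhead, and the flow of 2,000 vehicles per lane per hour are illustrative assumptions, not figures from the slides.

```python
# Back-of-envelope check of the highway analogy.
# All constants below are illustrative assumptions, not figures from the slides.

LINK_BITS_PER_SECOND = 1_000_000_000   # 1Gb/s Ethernet
REPORT_INTERVAL_SECONDS = 5 * 60       # one counter sample every five minutes
WIRE_OVERHEAD_BYTES = 20               # preamble + inter-frame gap (assumed)


def max_packets_per_report(frame_bytes: int) -> int:
    """Upper bound on frames sent in one reporting interval at line rate."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return LINK_BITS_PER_SECOND * REPORT_INTERVAL_SECONDS // bits_per_frame


def cars_in_two_years(lanes: int = 6, cars_per_lane_hour: int = 2000) -> int:
    """Cars passing a point in two years if every lane ran at the assumed flow."""
    hours = 2 * 365 * 24
    return lanes * cars_per_lane_hour * hours


large_frames = max_packets_per_report(1518)  # full-size frames
small_frames = max_packets_per_report(64)    # minimum-size frames
cars = cars_in_two_years()

print(f"packets per report (1518-byte frames): {large_frames:,}")
print(f"packets per report (64-byte frames):   {small_frames:,}")
print(f"cars in two years (assumed flow):      {cars:,}")
```

Under these assumptions both counts fall between tens and hundreds of millions of events, so a single five-minute packet count is about as coarse a summary as a single two-year car count.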

Need distributions, not averages: Same bandwidth, different QoE

The difference between these ISPs is the distribution of loss and delay. The one on the right has one third of the capability of the one on the left for carrying POTS-quality VoIP.

Comparison between two LLU broadband providers serving the same location in the UK.

1/3 THE VALUE, SAME ‘BANDWIDTH’

‘Bandwidth’ is an average. It fails to capture this non-stationarity.
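To make the averages-versus-distributions point concrete, here is a minimal sketch with two synthetic delay traces that share the same mean but differ sharply in their tails. The traces, the 50 ms bound, and the helper names are invented for illustration; they are not the measurements behind the comparison above.

```python
from statistics import mean

# Two synthetic one-way delay traces (ms), invented for illustration only.
steady = [20.0] * 100                  # every packet takes 20 ms
bursty = [10.0] * 90 + [110.0] * 10    # mostly fast, but one in ten is very slow


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceiling division
    return ordered[int(rank) - 1]


def breach_fraction(samples, bound_ms):
    """Fraction of packets whose delay exceeds the bound."""
    return sum(1 for s in samples if s > bound_ms) / len(samples)


DELAY_BOUND_MS = 50.0  # hypothetical application bound

for name, trace in (("steady", steady), ("bursty", bursty)):
    print(name, mean(trace), percentile(trace, 99),
          breach_fraction(trace, DELAY_BOUND_MS))
```

Both traces report a 20 ms mean, yet only the second one breaches the assumed bound, and it does so for one packet in ten.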

Utilisation is a poor proxy for QoE


This is (the first publishable) evidence comparing utilisation with a direct QoE measurement.

This is a well-run and well-managed network. Our engagements with CSPs have shown this to be a common phenomenon.

The data CSPs use: bandwidth

The data CSPs need: strong QoE proxy

High load, but no QoE breach

Low load (<0.01%), but QoE breach

Over-provisioning just wastes money

Over-provisioning doesn’t solve your QoE problem
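One mechanism that can produce a QoE breach at very low average load is a short burst in an otherwise idle measurement window. The slides do not say what caused the breaches in the measured network, so the sketch below is only an illustration, and the link speed, packet size and burst length are assumed values.

```python
# Illustrative numbers only: one short burst in an otherwise idle five-minute window.
LINK_BITS_PER_SECOND = 1_000_000_000  # 1Gb/s link
WINDOW_SECONDS = 300                  # five-minute utilisation sample
PACKET_BITS = 1500 * 8                # assumed packet size
BURST_PACKETS = 1000                  # assumed burst arriving back to back


def mean_utilisation(packets: int) -> float:
    """Average utilisation reported for the whole window."""
    return packets * PACKET_BITS / (LINK_BITS_PER_SECOND * WINDOW_SECONDS)


def last_packet_wait_ms(packets: int) -> float:
    """Queueing delay seen by the last packet if the burst arrives all at once."""
    return (packets - 1) * PACKET_BITS / LINK_BITS_PER_SECOND * 1000


utilisation = mean_utilisation(BURST_PACKETS)
wait_ms = last_packet_wait_ms(BURST_PACKETS)
print(f"mean utilisation: {utilisation:.4%}")
print(f"worst queueing delay in burst: {wait_ms:.2f} ms")
```

With these numbers the five-minute average stays below 0.01%, while the last packet of the burst still queues for roughly 12 ms. Adding capacity lowers the average further without removing the burst.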

The CSP QoE and cost problem

Commercial

The failure to appropriately measure QoE means there are unmanaged hazards in the current supply chains.

• These hazards can and do mature into application and network failures.

• FFPOs are dropping, and cost per FFPO is rising.

– This leads to premature upgrades, compared to the original capacity plan.

• Return on assets continues to drop…

– …so CSP share prices fall.

Technical

In-life management costs increase due to the inability to manage the QoE hazards, which appear as ‘faults’. So:

• CSPs turn to arbitrary traffic management to shed load which, in turn, increases tension between customers, legislators and CSPs;

• Or, CSPs regress to previous planning and design ratios by capping access speeds due to continuing failure;

• Or, stationarity continues to decrease, reducing FFPOs and QoE, which leads to less value-in-use and tarnishes every CSP’s reputation.


The CSP investment ‘cycle of doom’

[Figure: two charts, Service Quality against Time and Undepreciated Asset Value against Time.]

• Rising load makes service quality fall, forcing upgrades.

• Failure of technology to keep up with ever-rising demand forces shorter upgrade cycles.

• QoE declines faster than the capacity plan predicts.

• Upgrade before previous investment amortised.

• Death via unserviceable debt load.

HOW TO OBTAIN PERFORMANCE DATA WITH REAL VALUE?

All analytic approaches are limited by the fidelity of their inputs


FFPOs require bounded ‘quality attenuation’ (∆Q)

[Figure: median time to complete an HTTP transfer, in seconds, plotted against one-way delay (ms) and one-way loss rate (%).]

Different QoE implies different bounds on ∆Q

Need to manage network to a QoE goal

We care about both loss and delay

ΔQ accumulates along a path

Example: 3G round-trip cross-sectional analysis

[Figure: cross-sectional analysis of a 3G round trip; the plot includes a ‘(No service)’ label.]

We want visibility of how each network element contributes to ΔQ
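One way to read "accumulates" is sketched below: if each hop is described by a delay distribution and a loss probability, and hops are assumed to behave independently, then delays add (their distributions convolve) and delivery probabilities multiply. The slides give no formulas, so this composition rule, the independence assumption and all the per-hop numbers are assumptions for illustration.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical per-hop descriptions: delay PMF in ms (given delivery) and loss probability.
# The numbers are invented; independence between hops is an assumption.
hops = [
    {"delay_pmf": {5: 0.9, 50: 0.1}, "loss": 0.01},   # e.g. radio access
    {"delay_pmf": {2: 1.0}, "loss": 0.0},             # e.g. backhaul
    {"delay_pmf": {10: 0.5, 20: 0.5}, "loss": 0.02},  # e.g. core and peering
]


def convolve(pmf_a, pmf_b):
    """Delay distribution of two independent stages in series."""
    out = defaultdict(float)
    for da, pa in pmf_a.items():
        for db, pb in pmf_b.items():
            out[da + db] += pa * pb
    return dict(out)


def compose(hop_a, hop_b):
    """Combine two hops: delays add (convolution), delivery probabilities multiply."""
    delivered = (1 - hop_a["loss"]) * (1 - hop_b["loss"])
    return {
        "delay_pmf": convolve(hop_a["delay_pmf"], hop_b["delay_pmf"]),
        "loss": 1 - delivered,
    }


path = reduce(compose, hops)
print("end-to-end loss:", round(path["loss"], 4))
for delay_ms, prob in sorted(path["delay_pmf"].items()):
    print(f"{delay_ms} ms: {prob:.3f}")
```

Keeping the per-hop terms separate, rather than only the end-to-end total, is what lets each element's contribution be read off.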

Network performance measures

Single point, average: offered load and utilisation (mean values only)

PLUS

Multiple point, average: delay and loss (mean and variance)

To get loss and delay plus path decomposition we need multi-point measurements (and not just multiple single-point measurements).
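The difference between multi-point and multiple single-point measurement can be sketched as follows: the same packets are identified at several observation points, so per-segment delay and loss can be derived for each packet. The observation logs, the packet identifiers and the assumption of synchronised clocks are hypothetical; this is not a description of any particular measurement product.

```python
# Hypothetical observation logs: packet id -> timestamp in ms at each measurement point.
# Points A, B, C sit in order along the path; clocks are assumed to be synchronised.
seen_at_a = {1: 0.0, 2: 10.0, 3: 20.0, 4: 30.0}
seen_at_b = {1: 4.0, 2: 15.0, 3: 45.0, 4: 34.0}
seen_at_c = {1: 9.0, 2: 21.0, 4: 80.0}          # packet 3 never reached C


def segment_stats(upstream, downstream):
    """Per-packet delays and loss rate between two observation points."""
    delays = {
        pid: downstream[pid] - sent
        for pid, sent in upstream.items()
        if pid in downstream
    }
    lost = [pid for pid in upstream if pid not in downstream]
    return delays, len(lost) / len(upstream)


ab_delays, ab_loss = segment_stats(seen_at_a, seen_at_b)
bc_delays, bc_loss = segment_stats(seen_at_b, seen_at_c)
ac_delays, ac_loss = segment_stats(seen_at_a, seen_at_c)

print("A->B", ab_delays, ab_loss)
print("B->C", bc_delays, bc_loss)
print("A->C", ac_delays, ac_loss)
```

Independent counters at A, B and C would report packet totals, but could not show that packet 3 was delayed on the first segment and lost on the second, or that packet 4 picked up almost all of its delay between B and C.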

There is no ‘quality’ in averaged measurements

[Figure: ∆Q for 16kbit offered load at a busy international 3G location; one label reads ‘AVERAGE DELAY’.]

CSPs need high-fidelity data to see fast-varying QoE effects

FFPOs require strict bounds on loss and delay

[Figure: HTTP time to complete in seconds (95th percentile), plotted against one-way delay (ms) and one-way loss rate (%).]

CSPs need to manage their delivery to avoid these QoE ‘cliffs’

Just a few users falling over the ‘cliff’ generates churn, even if the average user is OK

Network performance measures

Single point, average: offered load and utilisation (mean values only)

Multiple point, average: delay and loss (mean and variance)

PLUS

Single point, distribution: arrival patterns

Capturing the ‘outliers’ of QoE means we need the distribution of packet arrival patterns.
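As a small illustration of why arrival patterns carry information that counts do not, the sketch below compares two invented packet traces with the same packet count and the same mean gap between packets. The traces and the choice of coefficient of variation as a burstiness summary are assumptions made for this example.

```python
from statistics import mean, pstdev

# Two invented arrival traces (timestamps in ms), each with 10 packets in 90 ms.
paced = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
bursty = [0, 1, 2, 3, 4, 86, 87, 88, 89, 90]


def inter_arrival_times(timestamps):
    """Gaps between consecutive packet arrivals."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def burstiness(timestamps):
    """Coefficient of variation of the gaps: 0 for perfectly paced traffic."""
    gaps = inter_arrival_times(timestamps)
    return pstdev(gaps) / mean(gaps)


for name, trace in (("paced", paced), ("bursty", bursty)):
    gaps = inter_arrival_times(trace)
    print(name, "mean gap:", mean(gaps), "cv:", round(burstiness(trace), 2))
```

A per-interval packet count cannot tell these two traces apart; the distribution of gaps can.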

Network performance measures

[Figure: measurement matrix with columns Average and Distribution and rows Single Point and Multiple Point; panels labelled ‘The data CSPs use’ and ‘The data CSPs need’.]

• <0.01% utilisation, yet QTA breach

• High loads, but no QTA breach

When you capture distributions via multi-point measurements you get the strong QoE proxy data you need.

EXPLOITING HIGH-FIDELITY MEASUREMENTS

How to measure the right things with a robust scientific method


High-fidelity data capture is the key enabler

Commercial

CSPs want to set a price floor for their services, and differentiate via network quality.

• This increases the focus on getting the trade-off between cost and QoE right.

• Current network management approaches focus on making the average experience better.

– The key is making bad experiences rare.

Performance data needs to enable CSPs to directly manage the cost/QoE trade-off.

Technical

QoE depends on ∆Q…and nothing else.

• QoE certainly does not depend on averages or peak bandwidths.

– Average or peak measures like ‘bandwidth’ at best allow CSPs to manage cost vs performance.

• The current capture processes lose critical information that impacts QoE.

– CSPs don’t measure ∆Q directly.

– Current approaches try to compensate by gathering more and more data, the volume of which itself degrades the network quality!


Network performance measures

Single point, average: limited predictive power

Single point, distribution: temporal predictive power (and localised assurance)

Multiple point, average: spatial predictive power

Multiple point, distribution (ΔQ): temporal and spatial predictive power

Network performance measures

Single point, average: limited predictive power (LOW FIDELITY, LOW VALUE)

Single point, distribution: temporal predictive power

Multiple point, average: spatial predictive power

Multiple point, distribution: represents all that can be known about a system by observation (HIGH FIDELITY, HIGH VALUE)

NetHealthCheck™ Process

Our service that embodies these ideas

1. Inject low-rate test streams

2. Measure test streams at multiple points

3. Analyse measurements to obtain distributions

4. Understand QoE/cost tradeoff

Example client outcomes

1. Major UK mobile network operator

• Was in 2nd/3rd place in its market (depending on location) for HTTP download key performance indicator (KPI).

• NetHealthCheck™ enabled a 100% improvement in this KPI without any additional capital expenditure.

• Placed MNO as definitive 1st in the market.

2. BT Operate

• Applied to delivery of wholesale broadband services…

– …on a mature, highly-optimised, well-managed network.

• Revealed flexibility to optimise planning rules.

• Potential for 30% increase in utilisation of key resources.

• Estimated savings value of £2.3M.

NetHealthCheck™ Benefits

• Structural capacity optimisation: 10% - 30%

• Scheduling optimisation: 25% - 75%

• QoE improvement: 50% - 100%

These all generate ‘slack’ to…

– …sweat assets to optimise CAPEX: get ‘free’ growth.

– …improve QoE at no cost: for all customers, or specific groups.

For more information

Visit our website for detailed case studies, presentations and white papers

www.pnsol.com

Contact us

[email protected]
