Aligning the Conflicting Needs of Privacy, Malware Detection and Network Protection


Slide 1

Aligning the Conflicting Needs of Privacy, Malware Detection and Network Protection

Ian Oliver, Silke Holtmanns

Security Research

Nokia Networks

TrustCom 2015, Helsinki, Finland

21 Aug 2015

Slide 2

Unjustified data collection is a major privacy problem


Slide 3

Except that we have some good reasons… if not good justifications

• Criminals, terrorists
• Malware
• Theft
• Today's latest moral and safety panic…
• Porn, illicit content, etc.

Monitoring for malware, network anomalies, etc. is needed, but the degree of surveillance should be reduced.

Slide 4

[Figure: nested data sets. "All available data" contains "Data collected", which contains "Required data". Annotations: minimise the collected set as much as possible; minimise (optimise) the data required for analysis; data collected has a habit of increasing to all data… just in case.]

Slide 5

Problems (for a network infrastructure provider)

• How to minimise data
• Without compromising necessary analytics
  • malware detection
  • DDoS detection
  • LI (lawful interception)
  • …
• How to justifiably increase data collection

Slide 6

• Calculate the privacy risk of a given data set
• Map this to a mode of system operation (see the sketch below)
• Operate at the lowest possible mode (implying lowest risk) based upon the current perceived level of danger
• Justified and targeted collection and analysis of data

Data → Risk → Mode
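A minimal sketch of the Data → Risk → Mode chain, assuming illustrative per-field risk weights and mode thresholds; none of the names or numbers below come from the paper:

```python
# Hypothetical sketch: score a data set's privacy risk from per-field
# weights and map the score onto an operating mode. Field names, weights
# and thresholds are illustrative assumptions, not values from the talk.
from enum import Enum


class Mode(Enum):
    LOW = 1    # heavily minimised / anonymised data
    MED = 2    # partially filtered data
    HIGH = 3   # (almost) everything collected


# Assumed per-field risk contributions (roughly sensitivity x identifiability).
FIELD_RISK = {
    "timestamp": 0.1,
    "ip_src": 0.4,
    "ip_dest": 0.3,
    "protocol": 0.1,
    "length": 0.05,
    "content": 0.9,
}


def privacy_risk(fields):
    """Risk of a data set = sum of the risks of the fields it exposes."""
    return sum(FIELD_RISK.get(f, 0.5) for f in fields)


def mode_for_risk(risk, low_max=0.5, med_max=1.0):
    """Map a risk score onto the coarse operating modes (thresholds assumed)."""
    if risk <= low_max:
        return Mode.LOW
    if risk <= med_max:
        return Mode.MED
    return Mode.HIGH


print(mode_for_risk(privacy_risk(["timestamp", "protocol"])))           # Mode.LOW
print(mode_for_risk(privacy_risk(["timestamp", "ip_src", "content"])))  # Mode.HIGH
```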

Slide 7: Simple Example

Filter → Analysis

Mode: LOW

Available fields: Timestamp, IPsrc, IPdest, Protocol, Content, Length, etc.

Filter output: Diff(Timestamp), K-Anon(IPsrc), K-Anon(IPdest)

Current mode of operation
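A rough illustration of what the LOW-mode filter could look like: timestamps reduced to inter-arrival differences (the Diff(Timestamp) step) and IP addresses generalised by masking the host part as a stand-in for K-Anon(IPsrc)/K-Anon(IPdest). A real deployment would verify the k-anonymity property over each batch rather than simply truncating:

```python
# Illustrative LOW-mode filter: inter-arrival times instead of timestamps,
# IP addresses generalised by dropping the host part.
def generalise_ip(ip, keep_octets=2):
    """Keep only the first `keep_octets` octets of an IPv4 address."""
    octets = ip.split(".")
    return ".".join(octets[:keep_octets] + ["*"] * (4 - keep_octets))


def low_mode_filter(records):
    """records: list of dicts with 'timestamp', 'ip_src', 'ip_dest' keys."""
    out = []
    prev_ts = None
    for r in sorted(records, key=lambda r: r["timestamp"]):
        out.append({
            "ts_diff": None if prev_ts is None else r["timestamp"] - prev_ts,
            "ip_src": generalise_ip(r["ip_src"]),
            "ip_dest": generalise_ip(r["ip_dest"]),
        })
        prev_ts = r["timestamp"]
    return out


print(low_mode_filter([
    {"timestamp": 10.0, "ip_src": "192.168.1.7", "ip_dest": "203.0.113.9"},
    {"timestamp": 10.4, "ip_src": "192.168.1.8", "ip_dest": "203.0.113.9"},
]))
```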

Slide 8: Simple Example

Filter → Analysis

Mode: LOW

Available fields: Timestamp, IPsrc, IPdest, Protocol, Content, Length, etc.

Filter output: Diff(Timestamp), K-Anon(IPsrc), K-Anon(IPdest)

Some anomaly is detected → request to change system mode

Slide 9: Simple Example

Filter → Analysis

Mode: MED

Available fields: Timestamp, IPsrc, IPdest, Protocol, Content, Length, etc.

Filter output: Timestamp, IPsrc, IPdest [190-210.*.*.*], Protocol = HTTP, Domain(Content_URL)

Slide 10: Simple Example

Filter → Analysis

Available fields: Timestamp, IPsrc, IPdest, Protocol, Content, Length, etc.

Filter output: Timestamp, IPsrc, IPdest [190-210.*.*.*], Protocol = HTTP, Domain(Content_URL)

Anomaly is not detected (false positive?) → request to change system mode back to LOW

Slide 11: Simple Example

Filter → Analysis

Available fields: Timestamp, IPsrc, IPdest, Protocol, Content, Length, etc.

Filter output: Everything!

Anomaly is detected → request to change system mode to HIGH
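The escalation logic implied by slides 7 through 11, sketched as a small controller; the one-step-at-a-time policy and the string mode names are assumptions rather than the paper's rules:

```python
# Hedged sketch of the mode-change protocol: the analysis side requests
# escalation when it sees an anomaly and a step back down when a suspected
# anomaly turns out to be a false positive.
class ModeController:
    ORDER = ["LOW", "MED", "HIGH"]

    def __init__(self):
        self.current = "LOW"  # start at the least privacy-invasive mode

    def report(self, anomaly_detected):
        """Escalate one step on an anomaly, fall back one step otherwise."""
        i = self.ORDER.index(self.current)
        i = min(i + 1, len(self.ORDER) - 1) if anomaly_detected else max(i - 1, 0)
        self.current = self.ORDER[i]
        return self.current


ctl = ModeController()
print(ctl.report(anomaly_detected=True))   # "MED": anomaly seen in LOW (slides 8-9)
print(ctl.report(anomaly_detected=False))  # "LOW": false positive in MED (slide 10)
```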

Slide 12

Hypothesis: if we can calculate the risk from a number of parameters, e.g. sensitivity, identifiability, etc., then we can [partially] order datasets based on their content
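One way to read the hypothesis concretely: describe each (filtered) dataset by the set of fields it exposes and order descriptions by inclusion, which yields a partial order because some pairs are incomparable. The field names below are illustrative:

```python
# Illustrative partial order: one dataset description is "at most as risky"
# as another if every field it exposes is also exposed by the other.
def at_most_as_risky(fields_a, fields_b):
    """Partial order by field exposure: A <= B iff fields(A) is a subset of fields(B)."""
    return set(fields_a) <= set(fields_b)


low   = {"ts_diff", "ip_src_kanon", "ip_dest_kanon"}
med   = {"ts_diff", "ip_src_kanon", "ip_dest_kanon", "protocol", "url_domain"}
other = {"ts_diff", "length"}

print(at_most_as_risky(low, med))    # True: MED exposes everything LOW does
print(at_most_as_risky(other, med))  # False
print(at_most_as_risky(med, other))  # False: incomparable, hence only a partial order
```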

Slides 13–15

Metricisation of privacy

Slide 16

Successively minimised views of the same flow data:

• timestamp, IPsrc, IPtarg, protocol, length, [port, verb, URL, etc.]
• timestamp, IPsrc, IPtarg
• timestamp, f(IPsrc), f(IPtarg), protocol
• timestamp, f(IPsrc), f(IPtarg), l-diverse(protocol)
• DP(timestamp, e=0.01), f(IPsrc), f(IPtarg), l-diverse(protocol)
• DP(timestamp, e=0.001)

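A hedged sketch of two of the minimisation steps named above: f(IP) as a keyed pseudonym of the address and DP(timestamp, e) as Laplace noise scaled by epsilon. The key, the sensitivity and the epsilon values are illustrative assumptions; the slides do not fix a concrete mechanism:

```python
# Illustrative pseudonymisation and Laplace-noise steps.
import hashlib
import hmac
import random

PSEUDONYM_KEY = b"rotate-me-per-collection-period"  # assumed secret key


def f_ip(ip):
    """Keyed pseudonym for an IP address (consistent within one key period)."""
    return hmac.new(PSEUDONYM_KEY, ip.encode(), hashlib.sha256).hexdigest()[:12]


def dp_timestamp(ts, epsilon, sensitivity=1.0):
    """Laplace mechanism: smaller epsilon means more noise and lower risk."""
    scale = sensitivity / epsilon
    # The difference of two iid exponentials with mean `scale` is Laplace(0, scale).
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return ts + noise


record = {"timestamp": 1439164800.0, "ip_src": "192.0.2.17", "protocol": "HTTP"}
minimised = {
    "timestamp": dp_timestamp(record["timestamp"], epsilon=0.01),
    "ip_src": f_ip(record["ip_src"]),
    "protocol": record["protocol"],  # l-diverse(protocol) would be enforced per batch
}
print(minimised)
```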

Slide 18

Risk and Justification

• Prior justification actually lowers risk
  • from a legal point of view
• The set of criteria for a mode change can be better reasoned about
  • the choice of filtering or anonymisation technology can be better made
  • the degree of anonymisation can be rationalised (see the sketch below)

Data → Risk → Mode
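Purely illustrative: one way the degree of anonymisation could be rationalised is to bind a reviewable parameter set to each mode, so that every mode change carries an explicit justification. All parameter values below are assumptions:

```python
# Assumed per-mode anonymisation policy; not the paper's values.
ANONYMISATION_POLICY = {
    "LOW":  {"timestamp_epsilon": 0.001, "ip_kept_octets": 1, "keep_content": False},
    "MED":  {"timestamp_epsilon": 0.01,  "ip_kept_octets": 2, "keep_content": False},
    "HIGH": {"timestamp_epsilon": None,  "ip_kept_octets": 4, "keep_content": True},
}


def policy_for(mode, justification):
    """Return the anonymisation parameters for a mode, recording the justification."""
    print(f"mode={mode} justified by: {justification}")
    return ANONYMISATION_POLICY[mode]


print(policy_for("MED", "anomaly score exceeded threshold in LOW mode"))
```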

Slide 19

• Metricisation
• Proper ontological support
  • aspects of information, e.g. type, provenance, purpose, usage, risk, requirements
  • compositional problems
• Library of techniques
  • differential privacy (and its suitable parameters)
  • k-anonymity, l-diversity, t-closeness
  • dynamic data pipeline construction and ordering (see the sketch below)
  • quasi-identifiers

Data → Risk → Mode
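A possible shape for dynamic pipeline construction and ordering: the filter chain for the current mode is composed from a library of minimisation steps. The steps and their per-mode ordering are illustrative, not taken from the paper:

```python
# Illustrative library of minimisation steps, composed differently per mode.
def drop_content(record):
    return {k: v for k, v in record.items() if k != "content"}


def truncate_ips(record):
    out = dict(record)
    for key in ("ip_src", "ip_dest"):
        if key in out:
            out[key] = ".".join(out[key].split(".")[:2] + ["*", "*"])
    return out


PIPELINES = {
    "LOW":  [drop_content, truncate_ips],
    "MED":  [drop_content],
    "HIGH": [],  # pass everything through
}


def run_pipeline(mode, record):
    for step in PIPELINES[mode]:
        record = step(record)
    return record


pkt = {"timestamp": 3.2, "ip_src": "198.51.100.4", "ip_dest": "203.0.113.7",
       "protocol": "HTTP", "content": "GET /index.html"}
print(run_pipeline("LOW", pkt))
```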

Slide 20: Summary

• A metrics-based framework for justified data collection can be constructed
• Requires
  • processing methods (and infrastructure)
  • a metric of risk
  • ontological support
• Analysis and mode change
  • suitable analysis techniques (ML) that can operate over 'noisy' or 'low semantic content' data
• Implementation
  • NFV, 5G

Data → Risk → Mode

Slide 24

The three sets from slide 4, spelled out:

• All available data: everything that flows over the network, IP addresses, etc.
• Data collected: a selected sub-set of that information, according to a given signature
• Required data: a description of what is required for a given use-case, e.g. malware signatures

Slides 25–26

• This is the process of data minimisation
• Privacy risk decreases as we employ data minimisation (differential privacy, k-anonymity, etc.)
• The risk that we miss the malware, terrorist, etc. increases
• Increase in happiness of privacy lawyers
• Increase in difficulty of monitoring


Slide 28

Privacy Metric (one of potentially many)

Slide 29

Other metrics can be constructed too

Slide 30

Combined Privacy Metric
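The metric slides are figure-only in this extraction, so the following is a stand-in rather than the authors' formula: combine per-field sensitivity and identifiability scores into a single value with a weighted sum. All weights and scores are assumptions:

```python
# Purely illustrative combined privacy metric; not the metric from the talk.
def combined_privacy_metric(fields, weights=(0.6, 0.4)):
    """fields: dict name -> (sensitivity, identifiability), both in [0, 1]."""
    w_sens, w_ident = weights
    return sum(w_sens * s + w_ident * i for s, i in fields.values())


dataset = {
    "ts_diff":       (0.1, 0.1),
    "ip_src_pseudo": (0.3, 0.4),
    "url_domain":    (0.6, 0.5),
}
print(round(combined_privacy_metric(dataset), 2))
```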

Slide 31

Hypothesis: if we can calculate the risk from a number of parameters, e.g. sensitivity, identifiability, etc., then we can [partially] order datasets based on their content
