Connect. Communicate. Collaborate

A Network Security Service for GÉANT2 (and beyond…)

Maurizio Molina, DANTE

TNC 08, Bruges, 20th May 2008

Outline

• The vision
• Proof of concept
• Supporting tools
• Service Outlook

The vision: enhance NRENs' security

• NRENs have their CERTs to deal with security
• and collaborate with each other
  – Trusted Introducer
  – GN2 JRA2
• and DANTE can filter traffic on GN2 if NRENs request it…

! BUT !

• Can we be more proactive towards NREN CERTs by exploiting the visibility of the GN2 core?

The vision (cont.): enhance NRENs' security

• To spot security anomalies in the GN2 core you need data
• Good old SNMP? Too coarse!
• Router logs? OK, but you need to know what you’re looking for
• Run a darknet?
  – It’s not where a core network makes a difference
  – Others already do it
• NetFlow? Yes, but you need good tools!
• Routing data? Only as a complement to NetFlow

Proof of concept: what can we see with NetFlow data?

• NfSen, enhanced with self-written Anomaly Detection extensions
• NetFlow collected on all peering interfaces
• 1/1,000 sampling
• 3k flows/s

Bits, Packets or Flows? What to use?

• Flows/s are more indicative of security incidents
• But with fixed thresholds, small interesting peaks will disappear in daily cycles!
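A small numeric illustration of why a fixed threshold misses small peaks, as a sketch only: the 5-minute bins, the sinusoidal daily cycle around 3,000 flows/s and the 300 flows/s night-time peak are hypothetical values chosen for the example, not GN2 measurements.

```python
import math

BINS_PER_DAY = 288  # hypothetical: one sample every 5 minutes


def daily_cycle(n_bins, mean=3000.0, swing=1000.0):
    """Synthetic flows/s series following a sinusoidal daily cycle."""
    return [mean + swing * math.sin(2 * math.pi * t / BINS_PER_DAY)
            for t in range(n_bins)]


def fixed_threshold_alerts(series, threshold):
    """Indices of the samples that exceed a fixed threshold."""
    return [t for t, v in enumerate(series) if v > threshold]


series = daily_cycle(BINS_PER_DAY)
night = 216              # bin at the bottom of the cycle (sine equals -1)
series[night] += 300.0   # small peak injected during the quiet hours

# The threshold has to sit above the normal daily maximum (about 4000)
# to avoid firing every day, so the night-time peak (about 2300)
# never crosses it.
alerts = fixed_threshold_alerts(series, threshold=4100.0)
```

Lowering the threshold far enough to catch the night-time peak would make it fire for most of the day, which is the motivation for removing the baseline before thresholding.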

OK, we’re smart… let’s filter!

[Figure: block diagram of the filter, showing the Input, the states X1 and X2, the gains K1 and K2, and the Forecast error]

k1 = (1-p1)*(1-p2) ; k2 = (1-p1)+(1-p2)

Choice: p1 = p2 = 0.9

• The “error” is used in control loops
• Here we use it to spot a deviation from a baseline
• It’s an “observer”
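The gain formulas are consistent with a two-state observer (a level X1 plus a slope X2) whose two poles are placed at p1 and p2. A minimal Python sketch under that assumption follows. The update equations and the name `observer_errors` are assumptions made for illustration; the slide itself only gives the diagram labels and the gains.

```python
def observer_errors(samples, p1=0.9, p2=0.9):
    """Return the forecast error for each input sample.

    Assumed model (not stated on the slide): x1 is the forecast level,
    x2 its slope, and both are corrected by the forecast error.
    """
    if not samples:
        return []
    k1 = (1 - p1) * (1 - p2)    # gain acting on the slope state x2
    k2 = (1 - p1) + (1 - p2)    # gain acting on the level state x1
    x1 = float(samples[0])      # start the baseline at the first sample
    x2 = 0.0
    errors = []
    for u in samples:
        e = u - x1              # forecast error: input minus forecast
        errors.append(e)
        # Predict the next sample and correct both states with the error.
        x1, x2 = x1 + x2 + k2 * e, x2 + k1 * e
    return errors


# A flat series with a single spike: the error jumps at the spike and
# then swings to the opposite sign while the filter settles again.
errors = observer_errors([100.0] * 50 + [200.0] + [100.0] * 50)
```

With this structure the error stays at zero while the input follows the baseline. A sudden peak shows up in full in the error, followed by a smaller excursion of opposite sign as the states settle.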

Does it help? Not if we stick to volumes (e.g. flows/s)…

[Figure: two panels of filtered time series: UDP flows (filtered) and TCP flows (filtered)]

Are there other more “security sensitive” features?

• Recent work on Anomaly Detection suggests focusing on the concentration or dispersion of
  – Flows per IP source address
  – Flows per IP destination address
  – Flows per IP source port
  – Flows per IP destination port
• AKA “IP features entropies”

Explanation of IP feature entropy

[Figure: two bar charts of the fraction of total flows received per IP address (y axis, 0 to 0.25) against IP (ranked) (x axis, 1 to 26). Left panel: Normal. Right panel: Traffic more focused towards a few hosts]

The Entropy H is:

$H(x) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}$

where n_i is the number of flows seen for the i-th distinct value of the feature x (e.g. the i-th IP address), S is the total number of flows and N is the number of distinct values.

H varies between 0 (“one point takes all”) and log2 N (uniform distribution)
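The entropy of one feature can be computed directly from a list of observed values, one per flow. The sketch below is illustrative only: the function name `feature_entropy` and the sample values are made up, and no NfSen API is involved.

```python
import math
from collections import Counter


def feature_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`.

    `values` holds one entry per flow, e.g. the destination IP address
    of each flow record.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) is the same as -p * log2(p), written this way to
    # avoid returning a negative zero for a single-valued input.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# One destination takes all the flows: entropy is 0.
concentrated = feature_entropy(["192.0.2.1"] * 5)

# Eight destinations with one flow each: entropy is log2(8) = 3 bits.
uniform = feature_entropy([f"192.0.2.{i}" for i in range(1, 9)])
```

A drop in the destination-address entropy therefore signals traffic concentrating on a few hosts, while a rise signals dispersion.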

IP feature entropy (simplified)

[Figure: two bar charts of the fraction of total flows received per IP address (y axis, 0 to 0.25) against IP (ranked) (x axis, 1 to 26). Left panel: Normal, f=0.6. Right panel: Traffic more focused towards a few hosts, f=0.81]

• Percentage of flows associated to the top N src IPs, dst IPs, src ports, dst ports
• We tried N = 1, 10, 100, 500
• N=10 was the best choice (anomalies appear more evident)
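The simplified measure can be sketched in a few lines of Python. The function name `top_n_fraction` and the sample data are hypothetical; the only thing taken from the slide is the definition (share of flows associated with the top N values of a feature).

```python
from collections import Counter


def top_n_fraction(values, n=10):
    """Fraction of flows associated with the n most frequent values.

    `values` holds one entry per flow, e.g. the source port of each
    flow record.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Ten flows spread over three destinations (6, 3 and 1 flows).
flows = ["a"] * 6 + ["b"] * 3 + ["c"]
f_top1 = top_n_fraction(flows, n=1)   # 6 of 10 flows
f_top2 = top_n_fraction(flows, n=2)   # 9 of 10 flows
```

Unlike the entropy, this measure grows when traffic concentrates on a few values, so the direction of a peak is reversed with respect to H.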

IP features entropies (after “observer” filtering)

[Figure: 10 days of GN2 traffic in two panels: UDP features entropies and TCP features entropies]

Drilling down on a TCP peak

- Concentration of DST IPs and DST ports receiving flows
- Dispersion of SRC IPs and SRC ports

• IRC server in Slovenia, receiving a lot of 60-byte SYN packets on port 6667, mainly from a /16 subnetwork of a university in the Netherlands
• Likely a “BotNet war”?

- The “bounce” is due to the filter, and needs a state machine to be correctly interpreted!
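One way to picture such a state machine is sketched below. It is a hypothetical design, not the one used in the NfSen extensions: the labels, the threshold and the `rebound_bins` window are all assumptions. The idea is that the first threshold crossing is reported as an anomaly, and crossings of the opposite sign shortly afterwards are attributed to the filter settling.

```python
def classify_peaks(errors, threshold, rebound_bins=10):
    """Label each filtered sample as 'idle', 'anomaly' or 'bounce'."""
    labels = []
    sign = 0        # sign of the last reported anomaly (0 = none yet)
    remaining = 0   # samples left in the rebound window
    for e in errors:
        crossing = abs(e) > threshold
        e_sign = 1 if e > 0 else -1
        if remaining > 0 and crossing and e_sign == -sign:
            # Opposite-sign excursion right after an anomaly.
            labels.append("bounce")
        elif crossing:
            labels.append("anomaly")
            sign = e_sign
            remaining = rebound_bins + 1  # +1: decremented just below
        else:
            labels.append("idle")
        if remaining > 0:
            remaining -= 1
    return labels


# A peak, its bounce, and a later independent negative peak.
labels = classify_peaks([0.0, 100.0, -20.0, -17.0, 0.0, 0.0, -30.0],
                        threshold=10.0, rebound_bins=3)
```

In this sketch the negative excursion at the end is outside the rebound window, so it is reported as a new anomaly rather than as a bounce.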

Drilling down on a UDP peak

- Concentration of SRC and DST IPs and SRC ports
- Dispersion of DST ports

• Port scan of a host in CARNET, from 4 hosts, with 29-byte packets

- Observe again the “bounce”!

And on smaller aggregates? “DWS” NREN example

- Concentration of SRC and DST IPs and SRC ports
- Dispersion of DST ports

• A routing shift event lasting a few hours (primary to backup access) triggers a lot of “noise”
• One MUST be able to correlate feature entropies and traffic shifts!
• Other than that, peaks are still very clear!

- Observe again the “bounce”!

And on smaller aggregates? “NON DWS” NREN example

• Fewer peaks, but still evident

Lessons learnt so far

• IP feature entropies also reveal low-volume anomalies, and can give an initial hint on the anomaly type, but:
  – they need a state machine to be interpreted
  – fully automatic conclusions are difficult
  – one must not be oblivious of big volume shifts and macroscopic events!
• A lot of anomalies are “observable” on DWS connectivity
  – Good reason for having a security service protecting DWS customers!
  – But we’ve seen attacks/scans between NRENs as well

Moving forward

• With NfSen and self-written extensions we have enough evidence that:
  – anomalies are observable in the GÉANT2 core
  – novel automatic methodologies for their classification are applicable
• However, we are looking at commercial tools for moving to a service
  – To reduce the effort to engineer / maintain / evolve code
  – Scalability and tool support are an issue for a service

Tools requirements

• Detection of both low and high volume anomalies
  – (DoS, DDoS, host and network scans, worms, phishing sites, etc.)
• Automatic classification, collection of evidence
• Detection of anomaly entry points, suggestion of ACLs
• Give correct indications also in the presence of sudden traffic shifts due to routing changes/network outages
• Robustness to occasional loss of NetFlow records
• Work well also with sampled NetFlow

Tools’ comparison

• Work just started, no conclusion yet
• We just report the “lessons learnt” so far
  – On-paper analysis of some tools (four in some detail)
  – Interaction with vendors

Tools approaches

• IP features entropy + volume
  – Pros: no additional info needed, works with low sampling rate, can catch a wide range of anomalies
  – Cons: needs drill down after “alert”
• Volume + “fingerprints”
  – Pros: precise, an alert is already “a conclusion”
  – Cons: won’t catch what you don’t look for
• Per host behavioural analysis
  – Pros: precise
  – Cons: scalability? robustness to low sampling?

Tools common features

• Require NetFlow on ingress links only
• Capable of handling NetFlow v5 and v9
• Require SNMP access to routers to read configuration data

Tools distinguishing features

• BGP processing
  – create POP to POP (or even prefix to prefix) matrices
  – correlate big volume shifts to routing changes
• Internal routing (e.g. IS-IS) processing
  – traffic split of peers on internal (backbone) links
• NetFlow collection on multiple points (routing tracing)
  – But this is not really a plus, rather an additional burden for NOT using routing data!

Tools distinguishing features (cont.)

• Different approaches to distinguish “normal” from “not normal” behaviour
  – Principal Component Analysis
  – Host type classification & rather complex “scoring” system
  – Moving averages
  – Fixed thresholds

Service Outlook

• Primary recipients: NREN CERTs
• Info provided:
  – Security alerts about all types of discovered anomalies
  – Collected evidence
  – Suggested mitigation actions
  – Periodic summary reports
• Other recipients: APMs, NREN PC, EU Commission
  – For strategic decisions

Acknowledgements

• Prof. Francesco Donati and Dott.ssa Gabriella Caporaletti, EICAS Automazione S.p.A., for the useful hints on the “observer” design and tuning
• Peter Haag from SWITCH for the development of NfSen

Thank you! – Questions?

[email protected]