Detecting and Correcting Malicious Data in VANETs Philippe Golle, Dan Greene, Jessica Staddon Palo Alto Research Center Presented by: Jacob Lynch

Table of Contents

Introduction
Related Work
Classification of Attacks
Distinguishability
Model
Example
Conclusion

Introduction

Vehicular ad hoc networks (VANETs) rely heavily on node-to-node communication, which creates the potential for malicious data.

VANETs need a method for evaluating the validity of data:
- Nodes search for explanations for the data they receive and accept the data according to the highest-scoring explanation
- Nodes can tell “at least some” other nodes apart from one another
- A parsimony argument accurately reflects adversarial behavior in a VANET

Introduction (2)

Each node builds a world view in an offline mode, using:
- Rules: two vehicles cannot occupy the same position at the same time, etc.
- Statistics: vehicles rarely travel faster than 100 MPH, etc.

Density combined with mobility supports parsimony
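The distinction between rules and statistics can be sketched in a few lines of Python. This is only an illustration: the `Report` record, its fields, and the function names are assumptions made for this example and do not come from the paper. A rule violation makes a set of reports impossible, while a statistical outlier only makes it less likely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical position report; field names are illustrative."""
    vehicle: str
    time: float          # seconds
    position: tuple      # (x, y) in meters
    speed_mph: float


def violates_rule(reports):
    """Rule: two distinct vehicles cannot occupy the same position at the same time."""
    occupied = {}
    for r in reports:
        key = (r.time, r.position)
        # setdefault stores the first vehicle seen at this time and place
        if occupied.setdefault(key, r.vehicle) != r.vehicle:
            return True
    return False


def unlikely_count(reports, threshold_mph=100.0):
    """Statistic: count reports faster than a speed that vehicles rarely exceed."""
    return sum(1 for r in reports if r.speed_mph > threshold_mph)


reports = [
    Report("car-1", 0.0, (10, 0), 55.0),
    Report("car-2", 0.0, (10, 0), 60.0),   # same place, same time as car-1
    Report("car-3", 0.0, (50, 0), 130.0),  # possible, but statistically rare
]
print(violates_rule(reports), unlikely_count(reports))
```

A world view built this way would reject any explanation that breaks a rule outright and would merely rank explanations with many statistical outliers lower.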

Related Work

Sybil attacks can foil many algorithms:
- Resource testing (storage, computation, communication) has been proposed for MANETs
- It is not appropriate for VANETs, because attackers may cheaply acquire resources

Node registration does not scale well.

Position verification can identify messages coming from the same source.

Classification of Attacks

Decisions are based on the likelihood of attack scenarios in a VANET, not on the accumulation of agreeing data.

Attacks are distinguished based on:
- Nature
- Target
- Scope
- Impact

Attack Nature

An adversary may report:
- False information about other parts of the VANET
- False information about itself

Some attacks may be unpreventable:
- If a node can only sense distance instead of precise location, this leaves an area in which a single node may successfully mount Sybil attacks

Attack Target

Local targets:
- In close proximity to the attacker
- Better for the adversary because the likelihood of conflicting data from neighbors is reduced
- Harder to maintain proximity, so less likely

Remote targets:
- Farther away
- Data received from neighboring nodes may be conflicting
- Easier for an adversary to set up

Attack Scope

Scope is measured by the area of nodes that have data of uncertain validity

Scope is limited if the area of affected nodes is small:
- The area may be local or remote to the malicious nodes

Extended attack if larger area of nodes is affected

The approach used is designed to slow the growth of local attacks into extended attacks by using information propagation

Attack Impact

Three outcomes of an attack:

Undetected
- The attack is completely successful
- May occur when a node is alone or completely surrounded by malicious nodes

Detected
- The attack is detected, but uncertain data remains
- Nodes have access to honest nodes, but insufficient information to justify the risk of attempting to correct the data

Corrected
- The attack is detected and corrected with no remaining uncertain data
- Many honest nodes are available, with enough information to identify the false information and correct the attack

Model Exploitation

An attacker may choose an attack whose effects are hidden by other, incorrect explanations that the model's ordering relation ranks as more likely.

Two ways to help prevent this:
- Design the model so that these hidden attacks are more costly than simpler attacks
- Allow the model to be changed, so that it adjusts to short-term and long-term changes

Even though the possibility of a complicated attack is included in the model, most attackers will use simple attacks, which makes the sophisticated attacker’s job easier.

Distinguishability

In order to tell nodes apart, four assumptions are made:
- A node can bind observations of its local environment with the communication it receives
- A node can tell its local neighbors apart
- The network is sufficiently dense
- Nodes can authenticate communication to one another after coming close enough

Local Distinguishability

A node can distinguish local neighbors:
- The node can associate a message with the physical source of that message
- The node can measure the relative position of the source of a message

Example setup:
- Equip nodes with cameras and exchange messages using visible or infrared light
- Estimate position by analyzing the light; the message is tied to its source because the node can tell where it came from
- Also use time of arrival, angle of arrival, and received signal strength, which may be tampered with

Extended Distinguishability

Nodes will communicate local observations to nodes farther away

If multiple trusted nodes verify that more distant nodes are distinct, those nodes may be included in the world view as distinct

Use private/public keys, refreshed constantly, to authenticate communication:
- Distinguishability is lost once a key is refreshed if the node has moved out of the local neighborhood

Privacy

There is a trade-off between privacy and the ability to detect and correct malicious data:
- Changing keys increases privacy but hinders detection and correction of malicious data
- Example: an isolated node that regularly reports its position changes its key; it is easy to infer from the trajectories that the new key belongs to the same node

Suggestions for changing keys:
- Change keys at synchronized times
- Introduce gaps in the data reported near key changes
- Change keys when nodes are near one another
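The isolated-node example can be made concrete with a small sketch. It assumes constant-velocity motion, reports of the form `(time, x, y)`, and a distance tolerance; all of these, along with the function names, are illustrative assumptions rather than anything specified in the paper.

```python
def predict(p1, p2, t):
    """Linearly extrapolate from two timed reports (time, x, y) to time t."""
    (t1, x1, y1), (t2, x2, y2) = p1, p2
    vx = (x2 - x1) / (t2 - t1)
    vy = (y2 - y1) / (t2 - t1)
    return (x2 + vx * (t - t2), y2 + vy * (t - t2))


def linkable(old_track, new_report, tolerance_m=5.0):
    """True if the first report under a new key continues the old trajectory."""
    t, x, y = new_report
    px, py = predict(old_track[-2], old_track[-1], t)
    distance = ((px - x) ** 2 + (py - y) ** 2) ** 0.5
    return distance <= tolerance_m


# Reports signed with the old key: the node moves 20 m/s along the x axis.
old_track = [(0.0, 0.0, 0.0), (1.0, 20.0, 0.0)]

print(linkable(old_track, (2.0, 40.5, 0.0)))   # continues the old path
print(linkable(old_track, (2.0, 80.0, 30.0)))  # far from the predicted position
```

The three suggestions above all work against this kind of matching: synchronized changes and nearby nodes create several candidate continuations, and gaps in the reported data make the extrapolation less precise.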

Model

Nodes may record an observation if the location of the event is within their observation range for the entire duration of the event

Assertions recorded by a node are assumed to be instantaneously available to all other nodes:
- The value of data declines the farther from the event it is transmitted, so only a small area is involved

Model (2)

To explain a set of events at a node:
- Each event must be tagged with a hypothesis
- Hypotheses are chosen from a set of hypotheses
- The set of hypotheses is partitioned into valid and invalid hypotheses
- If all the hypotheses matched to the set of events are valid, then the explanation is valid
- Explanations are ordered based on statistical methods, for example, Occam’s razor

Example

Assume nodes are able to precisely sense the location of neighbors within communication range

There is a set of observed events K, which can include observations that nodes make about themselves (reflexive observations)

A set of events is valid under the VANET model if there is a reflexive observation for every node, and every non-reflexive observation agrees with the reflexive observations
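The validity condition for this example can be sketched as a short check. The representation of an observation as an `(observer, subject, location)` tuple, the use of exact equality for "agrees", and the function name are assumptions made for illustration.

```python
def is_valid(observations):
    """Check a set of (observer, subject, location) observations.

    Valid means: every node has a reflexive observation, and every
    non-reflexive observation agrees with the subject's reflexive one.
    """
    reflexive = {}
    for observer, subject, location in observations:
        if observer == subject:
            # A node reporting two different locations for itself is inconsistent.
            if reflexive.setdefault(subject, location) != location:
                return False

    nodes = {n for observer, subject, _ in observations for n in (observer, subject)}
    if not all(n in reflexive for n in nodes):
        return False

    return all(
        reflexive[subject] == location
        for observer, subject, location in observations
        if observer != subject
    )


honest = [
    ("A", "A", (0, 0)),
    ("B", "B", (1, 0)),
    ("A", "B", (1, 0)),
    ("B", "A", (0, 0)),
]
lying = honest + [("M", "M", (5, 0)), ("B", "M", (2, 0))]

print(is_valid(honest), is_valid(lying))
```

In the second set, B places M at (2, 0) while M claims (5, 0), so the events cannot all be truthful and some explanation involving malicious or spoofed observers is needed.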

Example (2)

Each node comes up with an explanation:
- Label each observation in the set of events as truthful, malicious, or spoof
- The observations made by the node constructing the explanation are truthful
- Observers labeled as spoofs should not have any of their observations recorded as truthful
- One added observation per reflexive observation may be made that supplies correct location information consistent with the other truthful observations

Example (3)

Score each explanation according to the number of distinct observers that are labeled malicious

The valid explanation with the fewest malicious nodes is considered the simplest and most plausible explanation

There may be enough information in the set of events to identify all the truthful and malicious nodes
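The scoring step can be sketched as follows. The sketch assumes an explanation is a mapping from an `(observer, subject)` pair to one of the three labels, and that the candidate explanations passed in have already been found valid; the names are illustrative.

```python
def malicious_count(explanation):
    """Score: number of distinct observers with an observation labeled malicious."""
    return len({
        observer
        for (observer, _subject), label in explanation.items()
        if label == "malicious"
    })


def most_plausible(valid_explanations):
    """Pick the valid explanation that needs the fewest malicious nodes."""
    return min(valid_explanations, key=malicious_count)


# Two hypothetical valid explanations of the same events.
one_liar = {
    ("B", "M"): "truthful",
    ("M", "M"): "malicious",
    ("M", "S"): "malicious",   # same observer, counted once
    ("S", "S"): "spoof",
}
two_liars = {
    ("B", "M"): "malicious",
    ("C", "M"): "malicious",
    ("M", "M"): "truthful",
}

print(malicious_count(one_liar), malicious_count(two_liars))
print(most_plausible([two_liars, one_liar]) is one_liar)
```

Counting distinct observers rather than observations means a single attacker who emits many false observations, or who creates spoofed nodes, is still charged only once.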

Example (4)

When there are only a few malicious nodes, explanations can be computed as follows:
- Treat truthful observations as arcs in a graph
- Begin a breadth-first search at the node's own location and traverse arcs as long as the next node has not been labeled malicious
- Label all unreached nodes as spoofs

The algorithm terminates when it has found the explanations consistent with the VANET model that have the fewest malicious nodes
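The search can be sketched as below. This is a brute-force illustration, not the authors' implementation: it assumes observations are `(observer, subject, location)` tuples, tries candidate sets of malicious observers in order of increasing size, and treats an explanation as consistent when all truthful observers agree on each subject's location. The function names and the sample data are hypothetical.

```python
from collections import deque
from itertools import combinations


def reach_truthful(me, observations, malicious):
    """Breadth-first search from `me`, never stepping onto a malicious node."""
    arcs = {}
    for observer, subject, _ in observations:
        arcs.setdefault(observer, set()).add(subject)
    reached, queue = {me}, deque([me])
    while queue:
        node = queue.popleft()
        for nxt in arcs.get(node, ()):
            if nxt not in reached and nxt not in malicious:
                reached.add(nxt)
                queue.append(nxt)
    return reached


def consistent(observations, truthful):
    """Truthful observers must agree on the location of every subject."""
    seen = {}
    for observer, subject, location in observations:
        if observer in truthful and seen.setdefault(subject, location) != location:
            return False
    return True


def simplest_explanations(me, observations):
    """Return (malicious, spoofs) pairs that use the fewest malicious nodes."""
    observers = sorted({o for o, _, _ in observations} - {me})
    everyone = {n for o, s, _ in observations for n in (o, s)}
    for size in range(len(observers) + 1):
        found = []
        for combo in combinations(observers, size):
            malicious = set(combo)
            truthful = reach_truthful(me, observations, malicious)
            if consistent(observations, truthful):
                found.append((malicious, everyone - truthful - malicious))
        if found:
            return found
    return []


# M really sits at (2, 0) but claims (5, 0) and backs the claim with spoof S.
observations = [
    ("A", "A", (0, 0)), ("A", "B", (1, 0)), ("A", "C", (0, 1)),
    ("B", "B", (1, 0)), ("B", "A", (0, 0)), ("B", "M", (2, 0)),
    ("C", "C", (0, 1)), ("C", "M", (2, 0)),
    ("M", "M", (5, 0)), ("M", "S", (6, 0)),
    ("S", "S", (6, 0)), ("S", "M", (5, 0)),
]
print(simplest_explanations("A", observations))
```

Enumerating candidate sets grows exponentially with the number of malicious nodes, which is why the approach is only practical when there are few of them.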

Example (5)

A second example of the model is included:
- Nodes are not able to distinguish one another with the same precision
- Another breadth-first search is used to generate explanations
- Explanations are ordered by looking for few malicious nodes and a regular density, as opposed to sparse or dense patterns of nodes

Conclusion

Accurate and precise sensor data is important in identifying malicious nodes and data

Finding the most likely explanation in each case will be difficult:
- Manageable when there are only a few malicious nodes
- Could be accelerated by having nodes share candidate explanations with each other
