I1.2 A Quality-of-Information Theory for Sensor Data Collection and Fusion Abdelzaher (UIUC)

Abdelzaher (UIUC). Research Milestones DueDescription Q1 Estimation-theoretic QoI analysis. Formulation of analytic models for quantifying accuracy of


Research Milestones

Due   Description
Q1    Estimation-theoretic QoI analysis. Formulation of analytic models for quantifying accuracy of prediction/estimation results.
Q2    Extended analysis of semantic links in information networks. Formulation of information network abstractions that are amenable to analysis as new sensors in a data fusion framework.
Q3    Data pool quality metrics and impact of data fusion. Formulation of metrics for data selection when all data cannot be used/sent.
Q4    Validation of QoI theory. Documentation and publications.


Sensors, reports, and human sources

Signal data fusion
Methods:
• Bayesian analysis
• Maximum likelihood estimation
• etc.

Information Network Analysis
Methods:
• Ranking
• Clustering
• etc.

Trust, Social Networks
Methods:
• Fact-finding
• Influence analysis
• etc.

Machine Learning
Methods:
• Transfer knowledge
• CCM
• etc.

• Fusion of hard sources
• Fusion of soft sources
• Fusion of text and images
• Fusion from human sources

This Talk: Towards a QoI Theory for Data Fusion from Sensors + Information network links


Sensor Fusion Example: Target Classification

[Figure: an infrared motion sensor, vibration sensors, and acoustic sensors observing a target]

Different sensors (of known reliability, false alarm rates, etc.) are used to classify targets.

Well-developed theory exists to combine possibly conflicting sensor measurements to accurately estimate target attributes:
• Bayesian analysis
• Maximum likelihood
• Kalman filters
• etc.
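As a rough illustration of the Bayesian option, the sketch below fuses binary detections from sensors with known detection and false-alarm rates. It assumes the readings are conditionally independent given the target state. The function name `fuse_detections`, the prior, and all numeric rates are hypothetical values chosen for illustration; none of them come from the slides.

```python
from math import prod


def fuse_detections(prior, sensors, readings):
    """Posterior probability that a target is present.

    prior    -- P(target present) before any reading (assumed known)
    sensors  -- list of (detection_rate, false_alarm_rate), one per sensor
    readings -- list of booleans, True if the sensor fired
    Assumes sensor readings are independent given the target state.
    """
    # Likelihood of the readings if the target is present.
    like_present = prod(
        pd if fired else 1.0 - pd
        for (pd, _), fired in zip(sensors, readings)
    )
    # Likelihood of the same readings if the target is absent.
    like_absent = prod(
        pfa if fired else 1.0 - pfa
        for (_, pfa), fired in zip(sensors, readings)
    )
    evidence = prior * like_present + (1.0 - prior) * like_absent
    return prior * like_present / evidence


# Hypothetical infrared, vibration, and acoustic sensors.
sensors = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3)]
# Conflicting readings: two sensors fire, one does not.
posterior = fuse_detections(0.5, sensors, [True, True, False])
print(round(posterior, 3))
```

Even with one dissenting sensor, the two more reliable detections dominate the posterior in this made-up configuration, which is the sense in which conflicting measurements can still be combined.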


Information Network Mining Example: Fact-finding

Example 1: Consider a graph of who published where (but no prior knowledge of these individuals and conferences)
• Rank conferences and authors by importance in their field

[Figure: authors Han, Abdelzaher, and Roth linked to venues Sensys, KDD, WWW, and Fusion]

Example 2: Consider a graph of who said what (sources and assertions but no prior knowledge of their credibility)
• Rank sources and assertions by credibility

[Figure: sources John, Sally, and Mike linked to Claim1, Claim2, Claim3, and Claim4]


The Challenge

How to combine information from sensors and information network links to offer a rigorous quantification of QoI (e.g., correctness probability) with minimal prior knowledge?

[Figure: sensor view (infrared motion sensor, vibration sensors, and acoustic sensors observing a target) + information network view (sources John, Sally, and Mike linked to Claim1, Claim2, Claim3, and Claim4); P(armed convoy) = ?]


Applications

Understand Civil Unrest
• Remote situation assessment
• Use Twitter feeds, news, cameras, …

Expedite Disaster Recovery
• Damage assessment and first response
• Use sensor feeds, eyewitness reports, …

Reduce Traffic Congestion
• Mapping traffic congestion in a city
• Use crowd-sourcing (of cell-phone GPS measurements), speed sensor readings, eyewitness reports, …


Approach: Back to the Basics
• Interpret the simplest fact-finder as a classical (Bayesian) sensor fusion problem
• Identify the duality between information link analysis and Bayesian sensor fusion (links = sensor readings)
• Use that duality to quantify probability of correctness of fusion (i.e., information link analysis) results
• Incrementally extend analysis to more complex information network models and mining algorithms


An Interdisciplinary Team
• Abdelzaher (QoI, sensor fusion)
• Roth (fact-finders, machine learning)
• Aggarwal, Han (data mining, veracity analysis)

[Figure: QoI Task I1.2 shown alongside Fusion Task I1.1 and QoI Mining Task I3.1]


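The Bayesian Interpretation

The Simplest Fact-finder:

[Figure: sources John, Sally, and Mike linked to Claim1, Claim2, Claim3, and Claim4]

$$\mathrm{Rank}(\mathrm{Source}_i) = \frac{1}{\beta_i} \sum_{k \in \mathrm{Claims}_i} \mathrm{Rank}(\mathrm{Claim}_k)$$

$$\mathrm{Rank}(\mathrm{Claim}_j) = \frac{1}{\alpha_j} \sum_{k \in \mathrm{Sources}_j} \mathrm{Rank}(\mathrm{Source}_k)$$

where $\alpha_j$ and $\beta_i$ denote normalization factors.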
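The mutually recursive ranking rules of the simplest fact-finder can be sketched as a short iteration. This is a minimal illustration rather than the authors' implementation: the function name `basic_fact_finder`, the choice to normalize each round by the maximum rank, the iteration count, and the edges of the example graph are all assumptions made here for the sake of a runnable example.

```python
def basic_fact_finder(claims_by_source, iterations=20):
    """Iterate the two mutually recursive rank equations.

    claims_by_source -- dict mapping each source to the set of claims it makes
    Returns (source_rank, claim_rank) dicts, each scaled so the top rank is 1.
    """
    sources = list(claims_by_source)
    claims = sorted({c for cs in claims_by_source.values() for c in cs})
    source_rank = {s: 1.0 for s in sources}  # no prior knowledge: start equal
    claim_rank = {c: 1.0 for c in claims}

    for _ in range(iterations):
        # Rank(Claim_j): sum of the ranks of the sources asserting it.
        claim_rank = {
            c: sum(source_rank[s] for s in sources if c in claims_by_source[s])
            for c in claims
        }
        top = max(claim_rank.values())  # assumed normalization: divide by max
        claim_rank = {c: r / top for c, r in claim_rank.items()}

        # Rank(Source_i): sum of the ranks of the claims it asserts.
        source_rank = {
            s: sum(claim_rank[c] for c in claims_by_source[s])
            for s in sources
        }
        top = max(source_rank.values())
        source_rank = {s: r / top for s, r in source_rank.items()}

    return source_rank, claim_rank


# Hypothetical who-said-what graph (edges are made up for illustration).
graph = {
    "John": {"Claim1", "Claim2"},
    "Sally": {"Claim2", "Claim3"},
    "Mike": {"Claim2", "Claim4"},
}
source_rank, claim_rank = basic_fact_finder(graph)
print(max(claim_rank, key=claim_rank.get))
```

In this toy graph the claim asserted by all three sources ends up with the highest rank, while the three sources, which are symmetric, tie.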

The Simplest Bayesian Classifier (Naïve Bayesian):

$$P(\mathrm{Target}_j \mid \mathrm{Sensors}) = \frac{P(\mathrm{Target}_j) \prod_{k \in \mathrm{Sensors}_j} P(\mathrm{Sensor}_k \mid \mathrm{Target}_j)}{Z}$$


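The Equivalence Condition

$$P(\mathrm{Target}_j \mid \mathrm{Sensors}) = \frac{P(\mathrm{Target}_j) \prod_{k \in \mathrm{Sensors}_j} P(\mathrm{Sensor}_k \mid \mathrm{Target}_j)}{Z}$$

Consider individually unreliable sensors:

$$\frac{P(\mathrm{Sensor}_k \mid \mathrm{Target}_j)}{P(\mathrm{Sensor}_k)} = 1 + x_{jk}, \qquad |x_{jk}| \ll 1$$

We know that for sufficiently small $x_k$:

$$\prod_k (1 + x_k) \approx 1 + \sum_k x_k$$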
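Combining the statements on this slide suggests why the multiplicative Bayesian update starts to resemble an additive ranking rule. The following is a sketch of that substitution step, using only the symbols already defined on the slide; it is a first-order approximation that holds when every $x_{jk}$ is small, not an exact identity.

```latex
\begin{align*}
P(\mathrm{Target}_j \mid \mathrm{Sensors})
  &= \frac{P(\mathrm{Target}_j)}{Z}
     \prod_{k \in \mathrm{Sensors}_j} P(\mathrm{Sensor}_k \mid \mathrm{Target}_j) \\
  &= \frac{P(\mathrm{Target}_j)}{Z}
     \prod_{k \in \mathrm{Sensors}_j} P(\mathrm{Sensor}_k)\,(1 + x_{jk}) \\
  &\approx \frac{P(\mathrm{Target}_j)}{Z}
     \Bigl[\prod_{k \in \mathrm{Sensors}_j} P(\mathrm{Sensor}_k)\Bigr]
     \Bigl(1 + \sum_{k \in \mathrm{Sensors}_j} x_{jk}\Bigr)
\end{align*}
```

The last line depends on the individual sensors only through a sum over $k$, which has the same shape as the summation in the simplest fact-finder.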


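A Bayesian Fact-finder

By duality, if:

Claims ↔ Measured States
Sources ↔ Sensors

and:

$$\mathrm{Rank}(\mathrm{Source}_i) = \sum_{k \in \mathrm{Claims}_i} \mathrm{Rank}(\mathrm{Claim}_k)$$

$$\mathrm{Rank}(\mathrm{Claim}_j) = \sum_{k \in \mathrm{Sources}_j} \mathrm{Rank}(\mathrm{Source}_k)$$

Then, Bayes Theorem eventually leads to:

$$P(\mathrm{Source}_i \mid \mathrm{network}) \propto \mathrm{Rank}(\mathrm{Source}_i) + 1$$

$$P(\mathrm{Claim}_j \mid \mathrm{network}) \propto \mathrm{Rank}(\mathrm{Claim}_j) + 1$$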


Fusion of Sensors and Information Networks

Putting fusion of sensors and information network link analysis on a common analytic foundation:
• Can quantify probability of correctness of results
• Can leverage existing theory to derive accuracy bounds

[Figure: Sensor1, Sensor2, and Sensor3 on one side and an information network (Source1, Source2, and Source3 linked to Claim1, Claim2, Claim3, and Claim4) on the other, each supplying measurements to a single fusion result]


Simulation-based Evaluation
• Generate thousands of “assertions” (some true, some false – unknown to the fact-finder)
• Generate tens of sources (each source has a different probability of being correct – unknown to the fact-finder)
• Sources make true/false assertions consistently with their probability of correctness
• A link is created between each source and each assertion it makes
• Analyze the resulting network to determine:
  • The set of true and false assertions
  • The probability that a source is correct

No prior knowledge of individual sources and assertions is assumed
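The data-generation part of this setup can be sketched as follows. This is only an illustration of the described procedure under assumed parameters: the function name `simulate_network`, the 50/50 split of true and false assertions, the uniform range for source reliability, and the number of assertions per source are all hypothetical choices, since the slides do not state them.

```python
import random


def simulate_network(num_sources=30, num_assertions=2000,
                     claims_per_source=100, seed=0):
    """Generate a synthetic source-assertion network.

    Returns (truth, reliability, links) where
      truth[a]       -- hidden True/False label of assertion a
      reliability[s] -- hidden probability that source s is correct
      links          -- set of (source, assertion) pairs; this is the only
                        part a fact-finder would be allowed to see
    """
    rng = random.Random(seed)
    # Assumed: each assertion is true with probability 0.5.
    truth = [rng.random() < 0.5 for _ in range(num_assertions)]
    true_ids = [a for a, t in enumerate(truth) if t]
    false_ids = [a for a, t in enumerate(truth) if not t]
    # Assumed: reliabilities drawn uniformly from [0.5, 0.95].
    reliability = [rng.uniform(0.5, 0.95) for _ in range(num_sources)]

    links = set()
    for s, p_correct in enumerate(reliability):
        for _ in range(claims_per_source):
            # With probability p_correct the source asserts a true statement,
            # otherwise it asserts a false one.
            pool = true_ids if rng.random() < p_correct else false_ids
            links.add((s, rng.choice(pool)))
    return truth, reliability, links


truth, reliability, links = simulate_network()
print(len(links))
```

A fact-finder under test would receive only `links`; `truth` and `reliability` are held back to score its output afterwards.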


Evaluation Results: Comparison to 4 fact-finders from literature
• Significantly improved prediction accuracy of source correctness probability (from 20% error to 4% error)
• (Almost) no false positives for larger networks (> 30 sources)
• Below 1% false negatives for larger networks (> 30 sources)


Coming up: The Apollo FactFinder

Apollo Architecture

Apollo: Towards Factfinding in Participatory Sensing, H. Khac Le, J. Pasternack, H. Ahmadi, M. Gupta, Y. Sun, T. Abdelzaher, J. Han, D. Roth, B. Szymanski, and S. Adali, demo session at IPSN 2011, the 10th International Conference on Information Processing in Sensor Networks, April 2011, Chicago, IL, USA.

Abdelzaher, Adali, Han, Huang, Roth, Szymanski

Apollo: Improves fusion QoI from noisy human and sensor data.
• Demo at IPSN 2011 (in April)
• Collects data from cell-phones
• Interfaced to Twitter
• Can use sensors and human text
• Analysis on several data sets: what really happened?


Apollo Datasets
• Track data from cell-phones in a controlled experiment
• 2 million tweets from Egypt Unrest
• Tweets on Japan Earthquake, Tsunami and Nuclear Emergency


Immediate Extensions

Non-independent sources
• Sources that have a common bias, sources where one influences another, etc.
• Collaboration opportunities with SCNARC and Trust

Non-independent claims
• Claims that cannot be simultaneously true
• Claims that increase or decrease each other’s probability

Mixture of reliable and unreliable sources
• More reliable sources can help calibrate correctness of less reliable sources


Road Ahead

Develop a unifying QoI-assurance theory for fact-finding/fusion from hard and soft sources

Sources
• Use different media: signals, text, images, …
• Feature different authors: physical sensors, humans

Capabilities
• Computes accurate best estimates of probabilities of correctness
• Computes accurate confidence bounds in results
• Enhances QoI/cost trade-offs in data fusion systems
• Integrates sensor and information network link analysis into a unified analytic framework for QoI assessment
• Accounts for data dependencies, constraints, context and prior knowledge
• Accounts for the effect of social factors such as trust, influence, and homophily on opinion formation, propagation, and perception (in human sensing)

Impact: Enhanced warfighter ability to assess information


Collaborations

Tasks shown around QoI Task I1.2:
• Fusion Task I1.1
• QoI Mining Task I3.1
• OICC Task C1.2
• Community Modeling S2.2
• Sister QoI Task C1.1
• Decisions under Stress S3.1

Collaboration links:
• QoI/cost analysis (unified theory for estimation/prediction and information network link analysis)
• (w/Jiawei Han) Consider new link analysis algorithms
• (w/Dan Roth) Account for prior knowledge and constraints
• (w/Boleslaw Szymanski and Sibel Adali) Model humans in the loop
• (w/Ramesh Govindan) Improve communication resource efficiency
• (w/Aylin Yener) Increase OICC


Collaborations

Collaborative – Multi-institution:
• Q2 (UIUC+IBM): Tarek Abdelzaher, Dong Wang, Hossein Ahmadi, Jeff Pasternack, Dan Roth, Omid Fetemieh, Hieu Le, and Charu Aggarwal, “On Bayesian Interpretation of Fact-finding in Information Networks,” submitted to Fusion 2011

Collaborative – Inter-center:
• Q2 (I+SC): H. Khac Le, J. Pasternack, H. Ahmadi, M. Gupta, Y. Sun, T. Abdelzaher, J. Han, D. Roth, B. Szymanski, S. Adali, “Apollo: Towards Factfinding in Participatory Sensing,” IPSN Demo, April 2011
• Q2 (I+SC): Mani Srivastava, Tarek Abdelzaher, Boleslaw Szymanski, “Human-centric Sensing,” Philosophical Transactions of the Royal Society, special issue on Wireless Sensor Networks, expected in 2011 (invited).

Invited Session on QoI at Fusion 2011 (co-chaired with Ramesh Govindan, CNARC)


Military Relevance

Enhanced warfighter decision-making ability based on better quality assessment of fusion outputs

A unified QoI assurance theory for fusion systems that utilize both sensors and information networks
• Offers a quantitative understanding of the benefits of exploiting information network links in data fusion
• Enhances result accuracy and provides confidence bounds in result correctness