Fusion 2010: 13th International Conference on Information Fusion, EICC, Edinburgh, UK, Thursday, 29 July 2010

Current Approaches to Automated Information Evaluation and their Applicability to Priority Intelligence Requirement Answering


Abstract

Doctrinally, Priority Intelligence Requirements (PIRs) represent information that the commander needs to know in order to make a decision or achieve a desired effect. Networked warfare provides the intelligence officer with access to multitudes of sensor outputs and reports, often from unfamiliar sources. Counterinsurgency requires evaluating information across all PMESII-PT categories: Political, Military, Economic, Social, Infrastructure, Information, Physical Environment and Time. How should analysts evaluate this information? NATO's STANAG (Standardization Agreement) 2022 requires that every piece of information in intelligence reports used to answer PIRs should be evaluated along two independent dimensions: the reliability of its source and the credibility of the information. Recent developments in information retrieval technologies, including social search technologies, incorporate metrics of information evaluation, reliability and credibility, such as Google's PageRank. In this paper, we survey various current approaches to automatic information evaluation and explore their applicability to the information evaluation and PIR answering tasks. (Presented at Fusion 2010)


Page 1

Fusion 2010 13th International Conference on Information Fusion EICC, Edinburgh, UK Thursday, 29 July 2010

Current Approaches to Automated Information Evaluation and their Applicability to Priority Intelligence Requirement Answering

Page 2

Outline

• Overview
• Priority Intelligence Requirements
• Doctrine: Reliability/Credibility
• Question-Answering Technologies
• Conclusion/Research Gaps
• Disclaimer

VIStology | FUSION 2010 | Edinburgh | www.vistology.com

Page 3

Overview

Priority Intelligence Requirement (PIR) answering requires STANAG 2022 assessments of information reliability, credibility and independence. Each element of information used to answer a PIR should carry an assessment of the accuracy of the information provided, how credible it is, and how reliable its source is. STANAG 2022 is explicitly adopted in US and NATO doctrine.

There are currently no real tools for making these assessments, or for reasoning with STANAG-assessed data.

Contemporary commercial question-answering tools partially address some of the necessary reliability/independence/credibility issues.

Here we survey the state-of-the-art technologies and identify the research gaps.
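The two STANAG 2022 dimensions can be carried through a processing pipeline as a simple record. A minimal Python sketch follows; the class and field names are hypothetical, but the A-F reliability codes and 1-6 credibility ratings are the standard's own scales:

```python
from dataclasses import dataclass

# STANAG 2022 source-reliability codes (A-F) and information-credibility
# ratings (1-6); the two dimensions are assessed independently.
RELIABILITY = {"A": "Completely reliable", "B": "Usually reliable",
               "C": "Fairly reliable", "D": "Not usually reliable",
               "E": "Unreliable", "F": "Reliability cannot be judged"}
CREDIBILITY = {1: "Confirmed by other sources", 2: "Probably true",
               3: "Possibly true", 4: "Doubtful", 5: "Improbable",
               6: "Truth cannot be judged"}

@dataclass
class EvaluatedReport:
    """One element of information together with its STANAG 2022 evaluation."""
    text: str
    reliability: str   # A-F: rates the source
    credibility: int   # 1-6: rates the information itself

    def __post_init__(self):
        if self.reliability not in RELIABILITY:
            raise ValueError(f"unknown reliability code {self.reliability!r}")
        if self.credibility not in CREDIBILITY:
            raise ValueError(f"unknown credibility rating {self.credibility!r}")

    def label(self) -> str:
        """Combined rating in the familiar 'B2'-style notation."""
        return f"{self.reliability}{self.credibility}"

r = EvaluatedReport("Crowd of ~1000 at the rally", reliability="B", credibility=2)
print(r.label())  # B2: usually reliable source, probably true information
```

Tools that reason over STANAG-assessed data would consume records of roughly this shape, whatever their internal representation.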


Page 4

Priority Intelligence Requirements: Doctrine

Priority Intelligence Requirements (PIRs) are “those intelligence requirements for which a commander has an anticipated and stated priority in his task of planning and decision making” (FM 2-0 “Intelligence”, section 1-32)

• Ask a single question.
• Are ranked in importance.
• Are specific: focus on a specific event, fact or activity.
• Are tied to a single decision or planning task the commander has to make.
• Provide a last time by which information is of value (LTIOV).
• Are answerable using available assets and capabilities.

– McDonough, LTC W. G., Conway, LTC J. A., “Understanding Priority Intelligence Requirements”, Military Intelligence Professionals Bulletin, April-June 2009.


Page 5

NATO STANAG 2022

Page 6

Question-Answering Technologies by Source Data Format


Information Source Format | Familiar Application | Advanced Application
Tables (Relational DBs, Spreadsheets) | Structured Query Language (SQL) | Wolfram Alpha (Mathematica)
Text | Web Search Engines (Google, Yahoo!, Ask) | Systems from the AQUAINT (IC) competition; IBM Watson
Tagged Text | Google Patent Search | Metacarta; Palantir
Logic Statements | Prolog | Powerset (acquired by MS Bing); Cyc
Trusted Teammates | Personal Communication | Yahoo! Answers; Vark (acquired by Google); US Army Intelligence Knowledge Network Shoutbox

Page 7

Structured Data Q-A: Wolfram Alpha


The query "Where was Elvis born?" is automatically translated to the Mathematica query: Elvis Presley, place of birth. Wolfram Alpha identifies Tupelo as where Elvis was born (Elvis disambiguated as Elvis Presley) and provides a map overlay and additional info, like the current city population. Reference sources are listed by title on another screen, with no access to the source data.

Page 8

Text: Google


Google PageRank disambiguates the query: Elvis = Elvis Presley. Top-ranked snippets can easily be scanned for a consensus answer from independent sources: Tupelo, MS. PageRank is less useful in the MI context because intelligence reports are not hyperlinked.
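The "scan top-ranked snippets for a consensus answer" step can be sketched as a simple vote that counts each source domain at most once, approximating independent confirmation. Function and variable names here are hypothetical, and real answer extraction from snippets is far harder than this:

```python
from collections import Counter

def consensus_answer(snippet_answers):
    """Pick the answer asserted by the most distinct sources.

    snippet_answers: list of (source_domain, extracted_answer) pairs,
    e.g. harvested from top-ranked search snippets.  Counting each
    domain once per answer approximates 'independent' confirmation.
    """
    votes = Counter()
    seen = set()
    for source, answer in snippet_answers:
        key = (source, answer.lower())
        if key in seen:          # one vote per source per answer
            continue
        seen.add(key)
        votes[answer.lower()] += 1
    answer, count = votes.most_common(1)[0]
    return answer, count

snippets = [
    ("en.wikipedia.org", "Tupelo, MS"),
    ("biography.com", "Tupelo, MS"),
    ("graceland.com", "Tupelo, MS"),
    ("fan-blog.example", "Memphis, TN"),   # outlier
    ("en.wikipedia.org", "Tupelo, MS"),    # duplicate source, not recounted
]
print(consensus_answer(snippets))  # ('tupelo, ms', 3)
```

Note the limitation flagged on this slide: with unlinked MI reports there is no PageRank to rank the snippets in the first place, so the vote would have to be weighted by some other reliability metric.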

Page 9

Text-Based Q-A: IBM Watson


The query is posed in "Jeopardy!" format (including the category "Musical Pastiche"). IBM's text-based algorithms identified these phrases as the top potential "Jeopardy!" answers, with scores displayed. In "Jeopardy!", the answer is in the form of a question.

Page 10

Tagged Text: Metacarta


Query: "Where was Elvis born?" The query identifies documents that contain "elvis", "born" and a location. Answers are literally all over the map: a consensus answer is not obvious from the location clusters. The documents are recent news articles.

Page 11

Logic-Based Q-A: Powerset


Answers involve multiple "Elvises". Source data is Wikipedia only.

Page 12

Social Question-Answering: Vark


The query was routed to an unknown user in my "network", computed as likely to provide an answer; the answer was returned in less than a minute. Optimized for the mobile environment. Feedback on the answer is solicited. Vark queries need to be over a certain length, hence this phrasing.

Page 13

Comparison by Technology

Technologies compared against STANAG requirements: Tables (Wolfram Alpha); Text (Google, IBM Watson); Tagged Text (Metacarta, Palantir); Logic Statements (Powerset); Teammates (Vark, Y! Answers).

Source
• Tables: Wolfram: reference document title (no URL).
• Text: URL of the document in which the info appears (usually; not Watson). No further attempt to match info to source, i.e. not "1000 demonstrators according to police."
• Teammates: teammate known; may not say where the info originates.

Source Reliability
• Tables: curated data: reference works, government data.
• Text: centrality measures: Google PageRank (eigenvector centrality); Technorati Authority (inlink centrality); VIStology blogger authority (centrality + engagement).
• Logic Statements: curated data: Wikipedia. Wikipedia has PageRank 9 out of 10 (reliable).
• Teammates: track record, reputation; votes on answers; longevity; number of answers.

Source Independence
• Tables: no; one unified datastore.
• Text: duplicate document detection; explicit source tracking (href; bit.ly); Leskovec meme tracking; SNA metrics of independence.
• Logic Statements: no; single data source.
• Teammates: user authentication.

Information Credibility
• Tables: partial integrity constraints; can't easily verify info.
• Text: consensus answers; same answer identified in multiple distinct sources.
• Logic Statements: could check integrity constraints; URI co-reference a problem; contradictions halt inference.
• Teammates: demonstrated area of expertise.
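The eigenvector-centrality idea behind these text-source reliability metrics can be illustrated with a toy power-iteration PageRank. This is a sketch of the principle only, not Google's production algorithm, and the example link graph is invented:

```python
def pagerank(links, damping=0.85, iters=50):
    """Toy PageRank (damped eigenvector centrality) over a dict
    mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:                      # dangling node: spread evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:                             # share rank among outlinks
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

web = {"a": ["b"], "b": ["c"], "c": ["a", "b"], "d": ["b"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # "b": it attracts the most link mass
```

The slide's caveat applies directly: this only works where documents cite each other, which MI reports typically do not.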


Page 14

Research Gaps
(1) How best to map network-based reliability metrics to STANAG 2022 reliability codes?
(2) How to make reliability metrics derived from networks of different scales comparable with non-estimated reliability metrics?
(3) How to automatically reason with information that has been assigned STANAG 2022 evaluation codes?
(4) How to efficiently identify independent confirmation of reports in social media and other networked sources?
(5) How to tractably identify inconsistent new reports?
(6) How to adjudicate inconsistencies among reports automatically?
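Gap (1) can be made concrete with a deliberately naive sketch: thresholding a normalized network score into codes A-E, reserving F for sources with no track record. The thresholds below are invented placeholders, not doctrine; choosing defensible ones is precisely the open question:

```python
def centrality_to_stanag(score, history_len):
    """Hypothetical mapping from a normalized centrality score in [0, 1]
    to a STANAG 2022 source-reliability code.  Thresholds are
    illustrative placeholders only."""
    if history_len == 0:
        return "F"   # no basis for judgment: reliability cannot be judged
    for code, threshold in [("A", 0.9), ("B", 0.7), ("C", 0.5), ("D", 0.3)]:
        if score >= threshold:
            return code
    return "E"       # consistently low centrality: treat as unreliable

print(centrality_to_stanag(0.82, history_len=40))  # "B"
print(centrality_to_stanag(0.82, history_len=0))   # "F"
```

Note how the sketch already runs into gap (2): the same raw score means different things in networks of different scales, so the normalization step hides most of the difficulty.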


Page 15

Conclusions

In contemporary environments, direct evaluation of source reliability may be impossible given the proliferation of OSINT and other sources relevant to the COIN fight across all PMESII-PT categories.

Networked sources make judging independence of sources and identifying influence more difficult.

Analysts may have to rely on correlated network-based metrics of reliability, credibility and independence rather than evaluate many sources/reports as "Reliability cannot be judged"/"Truth cannot be judged".
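One rough sketch of flagging non-independent reports (copied or syndicated text masquerading as separate confirmation) is shingle overlap between report texts. The example reports are invented, and the similarity threshold an analyst should act on is itself an open question:

```python
def shingles(text, k=3):
    """Set of k-word shingles from a report's text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b) if a | b else 0.0

r1 = "about 1000 demonstrators gathered at the square according to police"
r2 = "about 1000 demonstrators gathered at the square police said"
r3 = "market prices for wheat rose sharply in the northern province"

print(round(jaccard(shingles(r1), shingles(r2)), 2))  # high: likely copied
print(round(jaccard(shingles(r1), shingles(r3)), 2))  # zero: unrelated
```

High overlap suggests one underlying source, so the second report should not raise credibility to "confirmed by other sources".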


Page 16

Thank You

Note: This paper does not represent an endorsement by the Army Research Laboratory of any of the commercial products discussed.
