Application of Confidence Intervals to Text-based Social Network Construction By CDT Julie Jorgensen, 06, G4 Advisors: MAJ Ian McCulloh, D/MATH LTC John Graham, D/BS&L


Page 1: Application of Confidence Intervals to Text-based Social Network Construction

Application of Confidence Intervals to Text-based Social Network Construction

By CDT Julie Jorgensen, 06, G4

Advisors: MAJ Ian McCulloh, D/MATH; LTC John Graham, D/BS&L

Page 2: Application of Confidence Intervals to Text-based Social Network Construction

Agenda

• The Real-World Problem
• Text Analysis/Social Network Analysis Solution
• Social Network Analysis
• Simple Text Analysis
• A Better Solution
• Themed Analysis
• Example Case – Jihadist Texts
• Theme Scores
• Network Construction Procedure
• Jihadist Network
• Results
• Importance and Conclusions

Page 3: Application of Confidence Intervals to Text-based Social Network Construction

The Real-World Problem

• Commanders need to understand the “Human Terrain”
• The majority of “HT” information is in text form
• The Combating Terrorism Center receives volumes of data every day.
• The Harmony Database is being rapidly declassified
• Need an efficient way to plow through large amounts of text data and see the linkages.

Solution: Text Analysis Displayed in Social Network Analysis

Page 4: Application of Confidence Intervals to Text-based Social Network Construction

Social Network Analysis

A mathematical method of quantifying connections between individuals or groups and drawing conclusions from those connections

• Assumes rational beings are interdependent
• Nodes: Key Actors
• Links: Relationships between Nodes

Page 5: Application of Confidence Intervals to Text-based Social Network Construction

“Human Terrain” Example: 9/11 Hijacker Network

Page 6: Application of Confidence Intervals to Text-based Social Network Construction

Iraq Elections

[Figure: network diagram; visible node labels include Barzani and Khamenei]

Page 7: Application of Confidence Intervals to Text-based Social Network Construction

Demonstration Data Set: Jihadist Texts

• Approx. 250 translated texts
    • MEMRI
    • FBIS
    • Other Sources
• 15 Authors
    • More than 1 text
    • Not well known

Page 8: Application of Confidence Intervals to Text-based Social Network Construction

Simple Text Analysis: The Plagiarism Check

Problem

• Word matching is overly simple.
• Ignores context
• Actors can be overly weighted by writing more
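The length bias named above can be illustrated with a minimal sketch. It assumes the "plagiarism check" simply counts word occurrences shared by two texts; the slides do not state the exact matching rule, and the function name `shared_word_count` and the sample sentences are hypothetical.

```python
from collections import Counter


def shared_word_count(text_a: str, text_b: str) -> int:
    """Count word occurrences common to both texts (multiset intersection)."""
    words_a = Counter(text_a.lower().split())
    words_b = Counter(text_b.lower().split())
    return sum((words_a & words_b).values())


# Hypothetical sample texts, not taken from the data set.
reference = "the raid was an attack on the government"
short_text = "an attack on the government"
long_text = short_text + " and the raid was the first attack"

short_score = shared_word_count(reference, short_text)
long_score = shared_word_count(reference, long_text)

# The longer text scores higher only because it contains more words.
print(short_score, long_score)
```

Under this assumed rule the raw count can only grow as an author writes more, which is one way an actor ends up overly weighted.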

Page 9: Application of Confidence Intervals to Text-based Social Network Construction

Alternative: Themed Analysis

• Traditional Network Analysis Methods
    • Citation Analysis
    • Physical Network
    • Communication or Financial Network
• Themed Analysis
    • Relates nodes across multiple fields
    • One similar theme versus many similar themes

Page 10: Application of Confidence Intervals to Text-based Social Network Construction

Demonstration: Text Analysis

Page 11: Application of Confidence Intervals to Text-based Social Network Construction

Theme Scores

THEMES

ISLAM:          allah, religion, islam, muslim, ummah, brother, book, messenger, prophet, mohammad
JIHAD:          al_jihad, mujahid, attack, raid, defense, plane, bombing, operation, clash, fight, conflict
SALAF:          salaf, sunnah, sallam
INFIDEL:        infidel, apostate, heretic, kuffr, taghoot, idol
FOREIGNERS:     united_states, government, al-Saud, Australia, Britain, Spain, Italy, France
SHEIKH:         shaykh
BATTLEGROUNDS:  Afghanistan, bosnia, two-rivers, iraq, palestine
JEWS:           jews, zionists, usury, israel

*Theme Score is the sum of each word’s score per text

Problem

• Commander needs information in representations he/she understands.
• Networks can compare authors across single themes
• But it is difficult to compare authors across multiple themes
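The slide defines a theme score as the sum of each theme word's score in a text but does not say how a word is scored. The sketch below assumes, purely for illustration, that a word's score is its relative frequency in the text; `THEMES` holds only a few of the slide's theme words, and `theme_scores` is a hypothetical helper.

```python
from collections import Counter

# A small subset of the slide's theme vocabulary, for illustration only.
THEMES = {
    "ISLAM": ["allah", "religion", "islam", "muslim"],
    "JIHAD": ["al_jihad", "mujahid", "attack", "raid"],
}


def theme_scores(text: str, themes: dict = THEMES) -> dict:
    """Sum the (assumed) relative frequency of every theme word in one text."""
    words = text.lower().split()
    total = len(words)
    if total == 0:
        return {theme: 0.0 for theme in themes}
    counts = Counter(words)
    return {
        theme: sum(counts[word] for word in vocabulary) / total
        for theme, vocabulary in themes.items()
    }


# Hypothetical seven-word text: two JIHAD words and one ISLAM word.
scores = theme_scores("the raid and the attack defend islam")
print(scores)
```

Dividing by text length is one possible guard against the length bias of raw word matching; the original scoring may differ.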

Page 12: Application of Confidence Intervals to Text-based Social Network Construction

Constructing a Network Across Multiple Themes

• Scrub Texts
• Construct Theme Scores
• Construct Confidence Intervals
• Discern Similarity between Nodes
    • Binary or Standardized Difference of Means
• Create Square Matrix
• Draw Network

*why not ANOVA?

Page 13: Application of Confidence Intervals to Text-based Social Network Construction

Confidence Intervals

95% Confidence Interval = $\bar{x} \pm t \, \frac{s}{\sqrt{n}}$

Computed for Each Author, Each Theme

Example: Author Mugrin, Theme Islam

Text      Score     Mean      Width      Low        High
ctc127    0.7234    0.50602   0.191819   0.314201   0.697839
ctc126    0.7328
ctc125    0.5387
ctc124    0.668
ctc123    0.2012
ctc122    0.6931
ctc121    0.3977
ctc120    0.227
ctc119    0.0553
ctc118    0.823
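The Mugrin example can be checked with a short sketch, assuming the interval is the usual Student's t interval, mean ± t·s/√n, with the sample standard deviation s. The critical value for 9 degrees of freedom is hard-coded here rather than taken from a statistics library, and `confidence_interval` is a hypothetical helper name.

```python
import math
import statistics

# Islam theme scores for the ten Mugrin texts listed on the slide (ctc127 to ctc118).
scores = [0.7234, 0.7328, 0.5387, 0.668, 0.2012,
          0.6931, 0.3977, 0.227, 0.0553, 0.823]

# Two-sided 95% critical value of Student's t for n - 1 = 9 degrees of freedom.
T_CRIT_DF9 = 2.262157


def confidence_interval(values, t_crit):
    """Return (mean, half-width, low, high) of a t-based confidence interval."""
    n = len(values)
    mean = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(n)
    return mean, half_width, mean - half_width, mean + half_width


mean, width, low, high = confidence_interval(scores, T_CRIT_DF9)
print(f"mean={mean:.5f} width={width:.6f} low={low:.6f} high={high:.6f}")
```

Under these assumptions the result should agree with the slide's Mean, Width, Low, and High up to rounding, which suggests that "Width" on the slide is the half-width of the interval.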

Page 14: Application of Confidence Intervals to Text-based Social Network Construction

Relationship Scores

A score is computed for each possible pair of authors per theme:

• Overlapping Confidence Intervals: $s_{i,j} = \frac{\text{MaxDiff} - \text{ActDiff}}{\text{MaxDiff}}$

• Disparate Confidence Intervals: $s_{i,j} = 0$
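A minimal sketch of the pairwise score follows. The slide names MaxDiff and ActDiff without defining them, so the definitions here are assumptions: ActDiff is taken as the absolute difference between the two authors' mean theme scores, and MaxDiff as the sum of their interval half-widths (the largest difference in means at which the intervals still overlap). The score is (MaxDiff − ActDiff)/MaxDiff for overlapping intervals and 0 otherwise.

```python
def relationship_score(mean_i: float, width_i: float,
                       mean_j: float, width_j: float) -> float:
    """Similarity of two authors on one theme, from their confidence intervals.

    ASSUMPTION: ActDiff = |mean_i - mean_j| and MaxDiff = width_i + width_j,
    where each width is the half-width of that author's interval.
    """
    act_diff = abs(mean_i - mean_j)
    max_diff = width_i + width_j
    if max_diff <= 0 or act_diff >= max_diff:
        return 0.0  # disparate (non-overlapping) intervals
    return (max_diff - act_diff) / max_diff


# Identical means give the maximum score; distant means give zero.
print(relationship_score(0.5, 0.2, 0.5, 0.1))
print(relationship_score(0.5, 0.2, 0.6, 0.2))
print(relationship_score(0.1, 0.05, 0.9, 0.05))
```

With these definitions the score falls linearly from 1 (equal means) to 0 (intervals just touching), so it stays in the range [0, 1].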

Page 15: Application of Confidence Intervals to Text-based Social Network Construction

Matrix Construction

• Multiplication of Scores for each author and each theme

• Resultant Square Matrix

            Mugrin      al-Iraqi    Alshareef   al Albanee  Ibn Baaz    Abdul Aziz  Azzam       At Tartusi  Maqdisi     Shuaibi     Al-Fahd     Madkhalee   Madhi       Al-Awdah    Qaradhawi
Mugrin      1.00000     0.76695     0.00000     0.00000     0.00000     0.00000     0.84938     0.00000     0.84852     0.80676     0.83939     0.00000     0.84403     0.00000     0.00000
al-Iraqi    0.76695     1.00000     0.51748     0.00000     0.00000     0.00000     0.84449     0.00000     0.69722     0.82516     0.81203     0.00000     0.72532     0.00000     0.00000
Alshareef   0.00000     0.51748     1.00000     0.75690     0.83688     0.00000     0.00000     0.00000     0.00000     0.00000     0.77599     0.00000     0.94616     0.00000     0.00000
al Albanee  0.00000     0.00000     0.75690     1.00000     0.00000     0.00000     0.00000     0.00000     0.00000     0.00000     0.00000     0.90076     0.00000     0.00000     0.00000
Ibn Baaz    0.00000     0.00000     0.83688     0.00000     1.00000     0.91174     0.82297     0.78024     0.80594     0.90168     0.91619     0.00000     0.86383     0.87589     0.69418
Abdul Aziz  0.00000     0.00000     0.00000     0.00000     0.91174     1.00000     0.00000     0.00000     0.73681     0.52157     0.85487     0.95733     0.88681     0.94896     0.00000
Azzam       0.84938     0.84449     0.00000     0.00000     0.82297     0.00000     1.00000     0.59977     0.93159     0.81534     0.89227     0.00000     0.79010     0.00000     0.63895
At Tartusi  0.00000     0.00000     0.00000     0.00000     0.78024     0.00000     0.59977     1.00000     0.52446     0.81876     0.82699     0.00000     0.00000     0.00000     0.00000
Maqdisi     0.84852     0.69722     0.00000     0.00000     0.80594     0.73681     0.93159     0.52446     1.00000     0.77203     0.86424     0.00000     0.82544     0.76400     0.77915
Shuaibi     0.80676     0.82516     0.00000     0.00000     0.90168     0.52157     0.81534     0.81876     0.77203     1.00000     0.92896     0.00000     0.57030     0.64583     0.00000
Al-Fahd     0.83939     0.81203     0.77599     0.00000     0.91619     0.85487     0.89227     0.82699     0.86424     0.92896     1.00000     0.00000     0.80821     0.86983     0.00000
Madkhalee   0.00000     0.00000     0.00000     0.90076     0.00000     0.95733     0.00000     0.00000     0.00000     0.00000     0.00000     1.00000     0.00000     0.00000     0.00000
Madhi       0.84403     0.72532     0.94616     0.00000     0.86383     0.88681     0.79010     0.00000     0.82544     0.57030     0.80821     0.00000     1.00000     0.00000     0.00000
Al-Awdah    0.00000     0.00000     0.00000     0.00000     0.87589     0.94896     0.00000     0.00000     0.76400     0.64583     0.86983     0.00000     0.00000     1.00000     0.00000
Qaradhawi   0.00000     0.00000     0.00000     0.00000     0.69418     0.00000     0.63895     0.00000     0.77915     0.00000     0.00000     0.00000     0.00000     0.00000     1.00000

Overall Theme Scores

Geometric Mean = $\left( \prod_{i=1}^{n} a_i \right)^{1/n}$
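The combination step can be sketched as a geometric mean of one author pair's per-theme scores. This assumes each a_i is that pair's relationship score on theme i and that n is the number of themes; `overall_score` is a hypothetical helper and the sample values are invented.

```python
import math


def overall_score(theme_scores):
    """Geometric mean of a pair's per-theme relationship scores."""
    if not theme_scores:
        raise ValueError("at least one theme score is required")
    # math.prod requires Python 3.8 or later.
    return math.prod(theme_scores) ** (1.0 / len(theme_scores))


# Hypothetical per-theme scores for two author pairs.
print(overall_score([0.9, 0.8, 0.85]))  # all themes overlap
print(overall_score([0.9, 0.0, 0.8]))   # one disparate theme
```

Because the scores are multiplied, a single theme score of zero makes the overall score zero. That would be consistent with the many 0.00000 entries in the matrix, though the slides do not say so explicitly.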

Page 16: Application of Confidence Intervals to Text-based Social Network Construction

Themed Network

Page 17: Application of Confidence Intervals to Text-based Social Network Construction

Theme Analysis: Confidence Interval vs Average

Themed network (confidence intervals)

Author       Texts   Degree   NrmDegree
Al-Fahd      8       9.389    70.053
Maqdisi      6       8.549    63.789
Ibn Baaz     10      8.41     62.745
Shuaibi      3       7.606    56.753
Madhi        4       7.26     54.17
Azzam        4       7.185    53.608
Abdul Aziz   4       5.818    43.41
al-Iraqi     7       5.189    38.714
Mugrin       10      4.955    36.971
Al-Awdah     16      4.105    30.625
Alshareef    2       3.833    28.602
At Tartusi   2       3.55     26.489
Qaradhawi    7       2.112    15.76
Madkhalee    7       1.858    13.864
al Albanee   2       1.658    12.368

• Themes are combined
• Can see connections between authors across a combination of themes.

Theme Ranks

Author       islam   jihad   salaf   infidel   foreigners   battlegrounds   sheikh   jew   Weighted Average Rank   Overall
al-Fahd      10      5       6       3         7            6               2        9     5.57                    1
Mugrin       6       4       12      6         1            4               11       9     6.29                    2
Shuaibi      11      9       11      1         6            5               3        9     6.57                    3
Azzam        12      2       10      7         8            3               7        8     7.00                    4
Maqdisi      9       1       8       8         5            9               10       4     7.14                    5
al-Iraqi     8       6       13      5         9            1               11       9     7.57                    6
At-Tartusi   14      8       4       2         15           10              1        5     7.71                    7
Abdul Aziz   5       10      9       4         10           12              5        6     7.86                    8
Madhi        2       7       13      12        3            7               11       2     7.86                    9
Qaradhawi    15      3       13      9         11           2               4        1     8.14                    10
Alshareef    3       14      3       14        2            11              11       3     8.29                    11
Madkhalee    4       13      2       11        12           12              6        9     8.57                    12
Al-Awdah     13      11      7       10        4            8               8        7     8.71                    13
al Albanee   1       14      1       14        14           12              11       9     9.57                    14
Ibn Baaz     7       12      5       13        13           12              9        9     10.14                   15

• Able to look at each theme individually.
• Average Rank does not account for connections, importance, weighting, or predictors
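The Degree and NrmDegree columns appear consistent with a weighted degree centrality computed from the square matrix. This is an inference from the numbers, not something the slides state. The sketch below assumes Degree is the sum of an author's off-diagonal matrix entries and NrmDegree is that sum divided by (number of authors − 1) times the largest off-diagonal weight, expressed as a percentage. It uses Al-Fahd's row of the matrix; the helper names are hypothetical.

```python
# Al-Fahd's row of the square matrix, in the matrix's column order
# (Mugrin, al-Iraqi, Alshareef, ..., Qaradhawi); index 10 is Al-Fahd itself.
AL_FAHD_ROW = [0.83939, 0.81203, 0.77599, 0.00000, 0.91619,
               0.85487, 0.89227, 0.82699, 0.86424, 0.92896,
               1.00000, 0.00000, 0.80821, 0.86983, 0.00000]
AL_FAHD_INDEX = 10

# Largest off-diagonal entry in the matrix (Abdul Aziz and Madkhalee).
MAX_WEIGHT = 0.95733


def weighted_degree(row, self_index):
    """Sum of link weights to all other nodes (diagonal excluded)."""
    return sum(w for k, w in enumerate(row) if k != self_index)


def normalized_degree(degree, n_nodes, max_weight):
    """ASSUMED normalization: percent of the largest possible weighted degree."""
    return 100.0 * degree / ((n_nodes - 1) * max_weight)


degree = weighted_degree(AL_FAHD_ROW, AL_FAHD_INDEX)
nrm_degree = normalized_degree(degree, len(AL_FAHD_ROW), MAX_WEIGHT)
print(round(degree, 3), round(nrm_degree, 3))
```

Under these assumptions the values for Al-Fahd should come out close to the table's 9.389 and 70.053.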

Page 18: Application of Confidence Intervals to Text-based Social Network Construction

Method Comparison

Themed Network Analysis   Plagiarism   Theme Ranks   Jihad Theme
Al-Fahd                   Al-Awdah     Al-Fahd       Maqdisi
Maqdisi                   Maqdisi      Mugrin        Azzam
Ibn Baaz                  Al-Albanee   Shuaibi       Qaradhawi
Shuaibi                   Al-Iraqi     Azzam         Mugrin
Madhi                     Azzam        Maqdisi       Al-Fahd

Legend: Top 5 Once; Top 5 Every Method
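The two legend categories can be recovered from the table with a short sketch, since the original color coding did not survive extraction. It assumes "Top 5 Once" means an author appears in exactly one method's top five and "Top 5 Every Method" means the author appears in all four; the names are spelled as in the comparison table.

```python
from collections import Counter

# Top-five lists per method, copied from the comparison table.
TOP5 = {
    "Themed Network Analysis": ["Al-Fahd", "Maqdisi", "Ibn Baaz", "Shuaibi", "Madhi"],
    "Plagiarism": ["Al-Awdah", "Maqdisi", "Al-Albanee", "Al-Iraqi", "Azzam"],
    "Theme Ranks": ["Al-Fahd", "Mugrin", "Shuaibi", "Azzam", "Maqdisi"],
    "Jihad Theme": ["Maqdisi", "Azzam", "Qaradhawi", "Mugrin", "Al-Fahd"],
}

# Number of methods in which each author reaches the top five.
appearances = Counter(name for names in TOP5.values() for name in names)

every_method = sorted(a for a, c in appearances.items() if c == len(TOP5))
only_once = sorted(a for a, c in appearances.items() if c == 1)

print("Top 5 every method:", every_method)
print("Top 5 once:", only_once)
```

Authors who surface under only one method are the ones whose apparent importance depends most on the analyst's choice of method.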

Page 19: Application of Confidence Intervals to Text-based Social Network Construction

Conclusions

Socially engineered algorithms involve extensive tradeoffs and decisions by the mathematician that can significantly impact the commander’s decision-making.

Multiple views of the same data are a critical requirement.

• Find linkages in large amounts of data
• Find connections across multiple fields
• Non-tangible relationships
• Real world: track / catch criminals / radical ideologues
• Representation of Human Terrain

Page 20: Application of Confidence Intervals to Text-based Social Network Construction

Future Work

Publish method in Journal of Computational and Mathematical Organization Theory

Integration into ORA (Organizational Risk Analyzer) statistical software, which is in use by intelligence analysts.

Analysis of change over time

Page 21: Application of Confidence Intervals to Text-based Social Network Construction

Questions?

Page 22: Application of Confidence Intervals to Text-based Social Network Construction

References

• Dr. Jaret Brachman. Combating Terrorism Center, USMA.
• Dr. Steven Corman. Hugh Downs School of Human Communication, Arizona State University.
• http://www.checkpoint-online.ch/CheckPoint/Images/N-HusseinCapture.jpg
• http://www.salmac.co.za/profile-writing-arabic.gif
• Wasserman, Stanley and Katherine Faust. Social Network Analysis: Methods and Applications. New York: Cambridge University Press, 1994, 4.