University of Southern California Department of Mathematics

Modeling and Detection of Sudden Spurts in Activity Profile of Terrorist Groups

QMDNS May 1, 2012

Vasanthan Raghavan
Joint work with Aram Galstyan and Alexander Tartakovsky

Broad Objectives

Terrorism is no longer a shocking word to most of us!
Ongoing radicalization of different interest groups
Rise of social media has made tracking terrorist activity a harder task

Objective 1: Development of data-driven probabilistic models of activity of adversarial networks based on interacting hierarchical hidden Markov models

Objective 2: Develop methods for model learning and inference: rapid detection and tracking of changes in adversarial behavioral patterns and objectives, based on optimal nonlinear filtering and quickest changepoint detection techniques

Features of Activity Profile

Collecting terrorism data is a painful exercise!

Open-source databases: ITERATE, RDWTI, GTD, etc.

Temporal ambiguity (time-scale of modeling = Days)
Attributional ambiguity (turf warfare)
Data sparsity (604 attacks over 9 years)
Missing data, mis-labeled data

[Figure: activity profile over time; horizontal axis spans 1998 to 2008]

Source: RAND Database on Worldwide Terrorism Incidents (RDWTI)

Model for Terrorist Activity

Observation Period:
Observations: No. of attacks = Mi
History of group:

Model for activity profile

Desired Qualities in an Ideal Model:
Describes data sufficiently accurately
Motivated by a small set of hypotheses
Robust to looseness in hypothesis statements
Described by few parameters
Robust to missing or mis-labeled data

Existing Models

Type 1: Classical time-series techniques
Fit trend, seasonality and stationary components to time-series [Enders & Sandler]
Quadratic or cubic trend = 4 parameters, seasonality = 3, stationary part = 1
Fit lagged value of endogenous variables, and other variables [Barros]
8 or more model parameters

Key Theme: Good-to-acceptable fit for time-series, at the cost of a large number of parameters in a model with complicated dependencies

Type 2: Group-based trajectory analysis
Identify cases with similar development trends [Nagin]
Cox proportional hazards model + logistic regression methods for model selection [LaFree, Dugan & co-workers]
Key Theme: Contagion theoretic viewpoint
Current activity of group is influenced by past history of group
Attacks are clustered

Self-exciting Hurdle Model

Type 3: Self-exciting hurdle model
Puts the contagion point-of-view on a theoretical footing
Motivated by similar model development in
  Earthquake models – Aftershocks are a function of the current shock
  Inter-gang violence – Action-reaction violence between gangs
  Epidemiology – immigrants + offspring in a cell colony
[Porter & White 2012, White, Porter & Mazerolle @ QMDNS 2010, Erik Lewis]

Hurdle probability component: Accounts for few attacks
Self-exciting component: Accounts for clustering of attacks

Key Theme:
Excellent model-fit
Explains clustering of attacks from a theoretical perspective
Self-exciting component can be complicated, which means more parameters

Is a self-exciting model necessary to explain clustering?

Motivating Hypotheses

Hypothesis 1: Current activity of the group depends on past history only through k dominant states (that remain hidden)

Hypothesis 2: Of these k states, the two most dominant are
  Its Capabilities – Manpower assets, special skills (bomb-making, IED), propaganda warfare skills, logistics skills, coordination with other groups, ability to raise finances, etc.
  Its Intentions – Guiding ideology/philosophy (e.g., Marxist-Leninist-Maoist thought, political Islam), designated enemy group, nature of high profile attacks, nature of propaganda warfare, etc.

[Cragin and Daly, “The dynamic terrorist threat: An assessment of group motivations and capabilities in a changing world”]

Motivating Hypotheses

Hypothesis 3:
For a mature group, Intentions are to attack (more or less)
Change in capabilities is primarily responsible for change in attack patterns

A d-state model for Capabilities
d = 2: Active state (high capability/strong), Inactive state (low capability/weak)

Observation density: Different possibilities (Poisson, shifted Zipf, geometric, etc.)
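
A minimal simulation sketch of the d = 2 case, assuming a geometric observation density of the form P(M = r) = (1 - gamma) * gamma^r for r = 0, 1, 2, .... The function name, the parameter names, and the reading of p0 and q0 as the Inactive-to-Active and Active-to-Inactive transition probabilities are assumptions made for illustration; they are not taken from the slides.

```python
import random


def simulate_two_state_hmm(n_days, p0, q0, gamma_inactive, gamma_active, seed=0):
    """Simulate daily attack counts from a hypothetical 2-state HMM.

    p0: assumed P(Inactive -> Active); q0: assumed P(Active -> Inactive).
    gamma_*: geometric parameter in (0, 1), P(M = r) = (1 - gamma) * gamma**r.
    Returns (states, counts) with state 0 = Inactive and 1 = Active.
    """
    rng = random.Random(seed)
    state = 0  # assumption: the group starts in the Inactive state
    states, counts = [], []
    for _ in range(n_days):
        gamma = gamma_active if state == 1 else gamma_inactive
        # Geometric draw: count "successes" (prob. gamma) before the first failure.
        attacks = 0
        while rng.random() < gamma:
            attacks += 1
        states.append(state)
        counts.append(attacks)
        # Hidden-state transition for the next day.
        u = rng.random()
        if state == 0 and u < p0:
            state = 1
        elif state == 1 and u < q0:
            state = 0
    return states, counts


if __name__ == "__main__":
    states, counts = simulate_two_state_hmm(365, 0.02, 0.05, 0.05, 0.5)
    print("days in Active state:", sum(states))
    print("total attacks:", sum(counts))
```

In this sketch, sparsity comes from long stays in the Inactive state and clustering comes from the higher rate while Active, which is the mechanism the hypotheses above describe.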

Hurdle component / State transitions: account for data rarity
Self-exciting component / Different rates: account for clustering

Consequences

Metrics to capture capabilities of group

Time-window:
No. of days of terrorist activity in a time-window of fixed length in days (Xn) – measures resilience of group [Santos]
No. of attacks in the same time-window (Yn) – measures level of coordination in group [Lindberg]

Consequence 1: where

Consequence 2: Time to next day of terrorist activity is approximately exponential
The days of activity form a Poisson process for

Consequence 3: Yn is compound Poisson whose density can be written in terms of density of Mi
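
The two metrics can be computed directly from a sequence of daily attack counts. The sketch below assumes consecutive, non-overlapping windows; the slides do not say whether the windows overlap, and the function and parameter names are hypothetical.

```python
def window_metrics(daily_counts, window_days):
    """Compute (X, Y) over consecutive non-overlapping windows.

    X[n]: number of days with at least one attack in window n.
    Y[n]: total number of attacks in window n.
    A trailing partial window is dropped (an assumption of this sketch).
    """
    x_values, y_values = [], []
    last_start = len(daily_counts) - window_days
    for start in range(0, last_start + 1, window_days):
        window = daily_counts[start:start + window_days]
        x_values.append(sum(1 for m in window if m > 0))
        y_values.append(sum(window))
    return x_values, y_values


if __name__ == "__main__":
    daily = [0, 2, 0, 1, 0, 0, 3, 1]
    print(window_metrics(daily, 4))  # ([2, 2], [3, 4])
```

Xn counts active days and so ignores how many attacks occur on each day, while Yn sums the per-day counts Mi, which is why its density depends on the density of Mi.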

Case-Study 1: FARC

Revolutionary Armed Forces of Colombia (FARC)
Oldest and largest terrorist group in the Americas
Marxist-Leninist ideology, anti-establishment
Uses guerrilla warfare
Actively involved in cocaine cultivation and trans-shipment to U.S. and W. Europe, kidnapping rings, …

Database used: RAND Database on Worldwide Terrorism Incidents (RDWTI)

Time-period of interest: 1998 – 2007

Why FARC? Ans:
Reliable dataset from RDWTI
Dominant in Colombia, so less ambiguity in terms of other groups’ attacks
Anti-establishment group, so strong signature in attack profile; easy to differentiate FARC from non-FARC attacks in case of ambiguity

Activity Profile

Why 1998 – 2007? Ans: Two key geo-political events

Spurt 1
1997: Colombia becomes leading cultivator of coca
1999–2000: Plan Colombia with U.S. aid
2001–2002: President Uribe’s election on anti-FARC plank

Spurt 2
2003–2004: Anti-FARC efforts bear fruit
2005–2006: President Uribe’s re-election bid and local elections

[Figure: FARC activity profile, horizontal axis 1998 to 2008. Annotations: becomes leading cultivator of coca; massive increase in Plan Colombia funding announced; elections, Uribe’s win, Plan Colombia, cocaine fields destroyed; local elections, Pres. Uribe’s successful bid]

Models for Mi

Model Verification

[Figure panels: Normal Activity Profile (large inter-arrival time, good fit; fewer attacks, good fit) and Spurt in Activity (reduced inter-arrival time, good first-order fit; more attacks, good first-order fit, heavy tails)]

Case-Study 2: Shining Path

Shining Path (Sendero Luminoso) of Peru
Marxist-Leninist ideology, anti-establishment, attacks perceived interventionist forces
Actively involved in cocaine cultivation and trans-shipment to U.S. and W. Europe
Database used: RDWTI
Time-period of interest: 1981 – 1996 (full cycle in Shining Path’s evolution)

Spurt 1
Early 1980s: Becomes violent
1981–1984: Tepid response from govt., seizes initiative, stabilizes

Downfall 1
1985–1986: Attack on Pres. elections in ’85, COIN ops. lead to a slight lull

Spurt 2
1987–1990: Reorganization in early ’87, stabilizes
1991: Excessively violent, Shining Path controls center and south of Peru and the outskirts of Lima

Downfall 2
1992: Personality cult around leader, disrespect for indigenous culture, loss of support base, institutional apathy, Guzman captured in Sept. 1992
1993–1996: Death of outfit

Activity Profile

Models for Mi

Model Verification

[Figure panels: Beginning of activity (good fit); Intense activity (good first-order fit); Stabilization (good first-order fit); Decay and death (good fit). One panel is annotated "Heavy tails".]

Quickest Changepoint Detection

Changepoint: Point of transition from normal behavior to abnormal behavior

Statistical model: Density changes from f to g at the changepoint
Goal: Detection procedure (stopping time) to detect the change quickly without too many false alarms
Numerous applications: anomaly detection, failure detection, process/quality control, intrusion detection, target detection, finance, etc.
Simplest setting: Observations are i.i.d. both pre- and post-change

Proposed Solution 1 (EWMA)

Changepoint detection procedures
Page’s Cumulative Sum (CUSUM) test
Shiryaev-Roberts (S-R) test
Disadvantage: Require pre- and post-change distributions
Advantage: Optimality properties can be proved under certain ideal assumptions

Exponentially weighted moving average (EWMA) test
Introduced by Roberts in 1959 and studied in statistics literature
Detects drift in mean of a sequence and tracks it (continuously)
How does it work? Smooths small changes and enhances big changes
Essentially a first-order auto-regressive process
Advantage:
Easy to implement in practice
Works well with no specific assumptions on underlying distributions
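
For comparison with EWMA, Page's CUSUM test listed above can be sketched in the simplest i.i.d. setting. The sketch assumes Poisson pre- and post-change densities with known rates, which is exactly the knowledge the slide names as CUSUM's disadvantage. The function name, rates and threshold are illustrative assumptions.

```python
import math


def cusum_poisson(observations, rate_pre, rate_post, threshold):
    """Page's CUSUM for a change from Poisson(rate_pre) to Poisson(rate_post).

    Recursion: W_n = max(0, W_{n-1} + log(g(y_n) / f(y_n))).
    Returns the 1-based index at which W_n first reaches the threshold,
    or None if no alarm is raised.
    """
    statistic = 0.0
    log_ratio = math.log(rate_post / rate_pre)
    for n, y in enumerate(observations, start=1):
        # Log-likelihood ratio of one Poisson observation (the y! terms cancel).
        llr = y * log_ratio - (rate_post - rate_pre)
        statistic = max(0.0, statistic + llr)
        if statistic >= threshold:
            return n
    return None


if __name__ == "__main__":
    data = [0] * 20 + [3] * 10  # change after observation 20
    print(cusum_poisson(data, rate_pre=0.5, rate_post=2.0, threshold=5.0))  # 22
```

Raising the threshold lowers the false alarm rate at the cost of a longer detection delay.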

EWMA Test

Time-window =
No. of days of terrorist activity ( Xn )
Total no. of attacks in the time-window ( Yn )

Update equations for statistic: First-order auto-regressive process

Smoothing parameters are designed experimentally; small values work best

Stopping time: thresholds chosen to optimize the trade-off between false alarm rate (FAR) and detection delay
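
The update equations did not survive in this copy of the slides, so the sketch below uses the textbook EWMA recursion S_n = (1 - lambda) * S_{n-1} + lambda * x_n with a single threshold. This is an assumption about the form of the statistic, not a reconstruction of the authors' equations. The same function could be applied to either Xn or Yn; all names and values are hypothetical.

```python
def ewma_detect(observations, smoothing, threshold, initial=0.0):
    """Textbook EWMA statistic with a threshold-crossing stopping rule.

    smoothing: lambda in (0, 1]; smaller values smooth more heavily.
    Returns (history, alarm), where history holds the statistic after each
    observation and alarm is the 1-based index of the first crossing
    (None if the threshold is never reached).
    """
    statistic = initial
    history = []
    alarm = None
    for n, x in enumerate(observations, start=1):
        # First-order auto-regressive update of the statistic.
        statistic = (1.0 - smoothing) * statistic + smoothing * x
        history.append(statistic)
        if alarm is None and statistic >= threshold:
            alarm = n
    return history, alarm


if __name__ == "__main__":
    y = [1] * 5 + [5] * 10  # persistent jump after the fifth window
    history, alarm = ewma_detect(y, smoothing=0.5, threshold=4.0, initial=1.0)
    print(alarm)  # 7
```

The example uses a large smoothing value only to keep the arithmetic easy to follow; the slide reports that small values work best in practice.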

Detecting Spurts: EWMA

Conclusions:
EWMA statistic detects persistent changes and tracks underlying process
But short moderate changes are not tracked

Proposed Solution 2 (State Estimation)

Methodology:
Train: Learn HMM parameters (p0, q0, Active rate and Inactive rate) – Baum-Welch/EM algorithm
Test: Verify inter-arrival duration density during Active/Inactive periods – Q-Q plots, scatter diagram
Classify States: As Active/Inactive – Viterbi algorithm, non-linear filtering (small performance hit over the Viterbi algorithm, VA)
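
The classification step can be sketched with a log-domain Viterbi decoder for a 2-state HMM. The sketch assumes the parameters have already been learned, a geometric observation density P(M = r) = (1 - gamma) * gamma^r, and the reading of p0 and q0 as the Inactive-to-Active and Active-to-Inactive transition probabilities. These choices and all names are illustrative assumptions rather than the authors' implementation.

```python
import math


def geometric_logpmf(r, gamma):
    """log P(M = r) for the assumed geometric density (1 - gamma) * gamma**r."""
    return math.log(1.0 - gamma) + r * math.log(gamma)


def viterbi_two_state(counts, p0, q0, gamma_inactive, gamma_active, prior_active=0.5):
    """Most likely state path (0 = Inactive, 1 = Active) for daily counts."""
    if not counts:
        return []
    log_trans = [[math.log(1.0 - p0), math.log(p0)],
                 [math.log(q0), math.log(1.0 - q0)]]
    gammas = [gamma_inactive, gamma_active]
    priors = [math.log(1.0 - prior_active), math.log(prior_active)]
    score = [priors[s] + geometric_logpmf(counts[0], gammas[s]) for s in (0, 1)]
    backpointers = []
    for m in counts[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            # Best predecessor state for landing in state s on this day.
            prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s]
                             + geometric_logpmf(m, gammas[s]))
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    counts = [0] * 10 + [3, 2, 4, 3, 2] + [0] * 10
    print(viterbi_two_state(counts, 0.05, 0.05, 0.05, 0.6))
```

Viterbi decoding needs the whole sequence before it commits to a path. The non-linear filtering alternative mentioned above updates the state estimate as each observation arrives.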

Detecting Spurts and Downfalls: Shining Path

Concluding Remarks

A competing model for activity profile

Desired Qualities in an Ideal Model:
Motivated by a small set of hypotheses: Yes, 3 hypotheses. Are these hypotheses justified?
Robust to looseness in hypothesis statements? Intentions do not matter?!
Described by few parameters: Yes, 4 parameters in the simplest setting
Describes data sufficiently accurately: Yes, for spurt/downfall detection and tracking activity. No, in general. Refine the model to incorporate heavy tails, non-geometric density, spatio-temporal effects, etc.
Robust to missing or mis-labeled data: Ongoing work

Concluding Remarks

Ripley’s K function
Indonesia/Timor-Leste attacks have abnormal tails (36 attacks on one day!) – cannot see how a 2-state HMM with geometric obs. density will work for this dataset

Is a self-exciting model necessary to explain clustering?