University of Southern California Department of Mathematics
Modeling and Detection of Sudden Spurts in Activity Profile of Terrorist Groups
QMDNS May 1, 2012
Vasanthan Raghavan
Joint work with Aram Galstyan and Alexander Tartakovsky
Broad Objectives
Terrorism is no longer a shocking word to most of us!
Ongoing radicalization of different interest groups
Rise of social media has made tracking terrorist activity a harder task
Objective 1: Development of data-driven probabilistic models of activity of adversarial networks based on interacting hierarchical hidden Markov models
Objective 2: Development of methods for model learning and inference (rapid detection and tracking of changes in adversarial behavioral patterns and objectives), based on optimal nonlinear filtering and quickest changepoint detection techniques
Features of Activity Profile
Collecting terrorism data is a painful exercise!
Open-source databases: ITERATE, RDWTI, GTD, etc.
Temporal ambiguity (time-scale of modeling = days)
Attributional ambiguity (turf warfare)
Data sparsity (604 attacks over 9 years)
Missing data, mislabeled data
[Figure: activity profile, time axis 1998 to 2008]
Source: RAND Database on Worldwide Terrorism Incidents (RDWTI)
Model for Terrorist Activity
Observation period:
Observations: No. of attacks on day i = Mi
History of group:
Model for activity profile
Desired Qualities in an Ideal Model:
  Describes data sufficiently accurately
  Motivated by a small set of hypotheses
  Robust to looseness in hypothesis statements
  Described by few parameters
  Robust to missing or mislabeled data
Existing Models
Type 1: Classical time-series techniques
  Fit trend, seasonality and stationary components to time-series [Enders & Sandler]
    Quadratic or cubic trend = 4 parameters, seasonality = 3, stationary part = 1
  Fit lagged values of endogenous variables, and other variables [Barros]
    8 or more model parameters
  Key Theme: Good-to-acceptable fit for time-series at the cost of a large number of parameters in a model with complicated dependencies
Type 2: Group-based trajectory analysis
  Identify cases with similar development trends [Nagin]
  Cox proportional hazards model + logistic regression methods for model selection [LaFree, Dugan & co-workers]
  Key Theme: Contagion-theoretic viewpoint
    Current activity of group is influenced by past history of group
    Attacks are clustered
Self-exciting Hurdle Model
Type 3: Self-exciting hurdle model
  Puts the contagion point of view on a theoretical footing
  Motivated by similar model development in
    Earthquake models: aftershocks are a function of the current shock
    Inter-gang violence: action-reaction violence between gangs
    Epidemiology: immigrants + offspring in a cell colony
  [Porter & White 2012, White, Porter & Mazerolle @ QMDNS 2010, Erik Lewis]
  Hurdle probability component: accounts for few attacks
  Self-exciting component: accounts for clustering of attacks
  Key Theme:
    Excellent model fit
    Explains clustering of attacks from a theoretical perspective
    Self-exciting component can be complicated, requiring more parameters
Is a self-exciting model necessary to explain clustering?
Motivating Hypotheses
Hypothesis 1: Current activity of the group depends on past history only through k dominant states (that remain hidden)
Hypothesis 2: Of these k states, the two most dominant are
  Its Capabilities: Manpower assets, special skills (bomb-making, IED), propaganda warfare skills, logistics skills, coordination with other groups, ability to raise finances, etc.
  Its Intentions: Guiding ideology/philosophy (e.g., Marxist-Leninist-Maoist thought, political Islam), designated enemy group, nature of high-profile attacks, nature of propaganda warfare, etc.
  [Cragin and Daly, “The dynamic terrorist threat: An assessment of group motivations and capabilities in a changing world”]
Motivating Hypotheses
Hypothesis 3:
  Mature group: Intentions are to attack (more or less)
  Change in capabilities is primarily responsible for change in attack patterns
A d-state model for Capabilities
  d = 2: Active state (high capability/strong), Inactive state (low capability/weak)
Observation density: Different possibilities (Poisson, shifted Zipf, geometric, etc.)
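The two-state capability model can be illustrated with a short simulation. This is only a sketch under stated assumptions: the Poisson observation density is one of the possibilities listed above, the function name `simulate_activity` is hypothetical, and the reading of p0 and q0 (named later among the HMM parameters) as the Inactive-to-Active and Active-to-Inactive switching probabilities is an assumption, not something the slides confirm.

```python
import numpy as np

def simulate_activity(n_days, p0, q0, rate_inactive, rate_active, seed=0):
    """Simulate daily attack counts from a 2-state hidden Markov model.

    States: 0 = Inactive (low capability), 1 = Active (high capability).
    ASSUMPTION: p0 = P(Inactive -> Active), q0 = P(Active -> Inactive).
    ASSUMPTION: daily attack counts are Poisson with a state-dependent rate.
    """
    rng = np.random.default_rng(seed)
    rates = (rate_inactive, rate_active)
    states = np.empty(n_days, dtype=int)
    attacks = np.empty(n_days, dtype=int)
    state = 0  # ASSUMPTION: the group starts in the Inactive state
    for i in range(n_days):
        states[i] = state
        attacks[i] = rng.poisson(rates[state])
        # Switch state with the probability attached to the current state.
        switch_prob = p0 if state == 0 else q0
        if rng.random() < switch_prob:
            state = 1 - state
    return states, attacks

if __name__ == "__main__":
    states, attacks = simulate_activity(365, p0=0.02, q0=0.05,
                                        rate_inactive=0.1, rate_active=1.5)
    print("days spent Active:", int(states.sum()))
    print("total attacks:", int(attacks.sum()))
```

With small switching probabilities the simulated counts show long quiet stretches interrupted by clusters of attacks, which is the qualitative behavior the hypotheses aim to capture without a self-exciting component.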
Hurdle component / state transitions: account for data rarity
Self-exciting component / different rates: account for clustering
Consequences
Metrics to capture capabilities of group
Time-window:
  No. of days of terrorist activity in the time-window (Xn): measures resilience of group [Santos]
  No. of attacks in the time-window (Yn): measures level of coordination in group [Lindberg]
Consequence 1: where
Consequence 2: Time to next day of terrorist activity is approximately exponential
  The days of activity form a Poisson process for
Consequence 3: Yn is compound Poisson, whose density can be written in terms of the density of Mi
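The two metrics and the inter-arrival times behind Consequence 2 can be computed directly from a daily attack series. The sketch below assumes non-overlapping windows of `delta` days (the slides do not say whether windows overlap); the function names and the sample series are hypothetical.

```python
import numpy as np

def window_metrics(attacks, delta):
    """Return (X, Y) per window of `delta` days.

    X[n] = number of days with at least one attack in window n.
    Y[n] = total number of attacks in window n.
    ASSUMPTION: windows are non-overlapping; leftover days at the end are dropped.
    """
    attacks = np.asarray(attacks)
    n_windows = len(attacks) // delta
    blocks = attacks[: n_windows * delta].reshape(n_windows, delta)
    x = (blocks > 0).sum(axis=1)
    y = blocks.sum(axis=1)
    return x, y

def days_between_activity(attacks):
    """Gaps (in days) between consecutive days of terrorist activity."""
    active_days = np.flatnonzero(np.asarray(attacks) > 0)
    return np.diff(active_days)

if __name__ == "__main__":
    daily = [0, 2, 0, 1, 0, 0, 3, 1]  # made-up counts, for illustration only
    x, y = window_metrics(daily, delta=4)
    print("X:", x, "Y:", y)
    print("gaps:", days_between_activity(daily))
```

A histogram of the gaps within an Active or Inactive period is the quantity one would compare against an exponential (or, on a daily grid, geometric) density.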
Case-Study 1: FARC
Revolutionary Armed Forces of Colombia (FARC)
  Oldest and largest terrorist group in the Americas
  Marxist-Leninist ideology, anti-establishment
  Uses guerrilla warfare
  Actively involved in cocaine cultivation and trans-shipment to U.S. and W. Europe, kidnapping rings, …
Database used: RAND Database on Worldwide Terrorism Incidents (RDWTI)
Time-period of interest: 1998 – 2007
Why FARC?
  Reliable dataset from RDWTI
  Dominant in Colombia
  Less ambiguity in terms of other groups’ attacks
  Anti-establishment group
  Strong signature in attack profile
  Easy to differentiate FARC from non-FARC attacks in case of ambiguity
Activity Profile
Why 1998 – 2007? Two key geo-political events
Spurt 1
  1997: Colombia becomes leading cultivator of coca
  1999–2000: Plan Colombia with U.S. aid
  2001–2002: President Uribe’s election on anti-FARC plank
Spurt 2
  2003–2004: Anti-FARC efforts bear fruit
  2005–2006: President Uribe’s re-election bid and local elections
[Figure: FARC activity profile, 1998 to 2008, annotated with: becomes leading cultivator of coca; massive increase in Plan Colombia funding announced; elections, Uribe’s win, Plan Colombia, cocaine fields destroyed; local elections, Pres. Uribe’s successful bid]
Models for Mi
Model Verification
[Figure: model fit for FARC
  Normal Activity Profile: large inter-arrival time, good fit; fewer attacks, good fit
  Spurt in Activity: reduced inter-arrival time, good first-order fit; more attacks, good first-order fit, heavy tails]
Case-Study 2: Shining Path
Shining Path (Sendero Luminoso) of Peru
  Marxist-Leninist ideology, anti-establishment, attacks perceived interventionist forces
  Actively involved in cocaine cultivation and trans-shipment to U.S. and W. Europe
Database used: RDWTI
Time-period of interest: 1981 – 1996 (full cycle in Shining Path’s evolution)
Spurt 1
  Early 1980s: Becomes violent
  1981–1984: Tepid response from govt., seizes initiative, stabilizes
Downfall 1
  1985–1986: Attack on presidential elections in ’85, COIN ops., slight lull
Spurt 2
  1987–1990: Reorganization in early ’87, stabilizes
  1991: Excessively violent, Shining Path controls center and south of Peru and is present in outskirts of Lima
Downfall 2
  1992: Personality cult around leader, disrespect for indigenous culture, loss of support base, institutional apathy, Guzman captured in Sept. 1992
  1993–1996: Death of outfit
Activity Profile
Models for Mi
Model Verification
[Figure: model fit for Shining Path, by phase
  Beginning of activity: good fit
  Intense activity: good first-order fit
  Stabilization: good first-order fit
  Decay and death: good fit
  Annotation: heavy tails]
Quickest Changepoint Detection
Changepoint: Point of transition from normal behavior to abnormal behavior
Statistical model: Density changes from f to g at the changepoint
Goal: Detection procedure (stopping time) to detect the change quickly without too many false alarms
Numerous applications: anomaly detection, failure detection, process/quality control, intrusion detection, target detection, finance, etc.
Simplest setting: Observations are i.i.d. both pre- and post-change
Proposed Solution 1 (EWMA)
Changepoint detection procedures
  Page’s Cumulative Sum (CUSUM) test
  Shiryaev-Roberts (S-R) test
  Disadvantage: Require pre- and post-change distributions
  Advantage: Optimality properties can be proved under certain ideal assumptions
Exponentially weighted moving average (EWMA) test
  Introduced by Roberts in 1959 and studied in statistics literature
  Detects drift in mean of a sequence and tracks it (continuously)
  How does it work? Smooths small changes and enhances big changes
  Essentially a first-order auto-regressive process
  Advantage: Easy to implement in practice; works well with no specific assumptions on underlying distributions
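For contrast with EWMA, the two likelihood-ratio procedures named above can be sketched in their standard recursive forms. The sketch assumes the simplest i.i.d. setting with Poisson counts whose rate changes from `lam0` to `lam1`; both rates must be supplied, which is exactly the disadvantage noted above. Function names, rates and thresholds are illustrative assumptions, not values from the talk.

```python
import math

def poisson_llr(m, lam0, lam1):
    """Log-likelihood ratio log(g(m)/f(m)) for Poisson rates lam0 (pre) and lam1 (post)."""
    return m * math.log(lam1 / lam0) - (lam1 - lam0)

def cusum_alarm(counts, lam0, lam1, threshold):
    """Page's CUSUM: W_n = max(0, W_{n-1} + llr_n); stop when W_n >= threshold.

    Returns the 1-based index of the first alarm, or None if no alarm is raised.
    """
    w = 0.0
    for n, m in enumerate(counts, start=1):
        w = max(0.0, w + poisson_llr(m, lam0, lam1))
        if w >= threshold:
            return n
    return None

def shiryaev_roberts_alarm(counts, lam0, lam1, threshold):
    """Shiryaev-Roberts: R_n = (1 + R_{n-1}) * exp(llr_n); stop when R_n >= threshold."""
    r = 0.0
    for n, m in enumerate(counts, start=1):
        r = (1.0 + r) * math.exp(poisson_llr(m, lam0, lam1))
        if r >= threshold:
            return n
    return None

if __name__ == "__main__":
    counts = [0] * 20 + [3] * 10  # made-up series with a change after day 20
    print(cusum_alarm(counts, lam0=0.2, lam1=2.0, threshold=5.0))
    print(shiryaev_roberts_alarm(counts, lam0=0.2, lam1=2.0, threshold=100.0))
```

Raising either threshold lowers the false alarm rate at the cost of a longer detection delay.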
EWMA Test
Time-window =
  No. of days of terrorist activity (Xn)
  Total no. of attacks in the time-window (Yn)
Update equations for statistic: First-order auto-regressive process
  The smoothing parameters are designed experimentally; small values work best
Stopping time: threshold chosen to optimize trade-off between false alarm rate (FAR) and detection delay
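A minimal sketch of an EWMA statistic and its stopping rule follows. The update equations of the talk are not reproduced in this text, so the code uses the textbook first-order recursion Z_n = (1 - lam) * Z_{n-1} + lam * X_n; the names `ewma` and `first_crossing`, the initial value `z0`, and the sample numbers are assumptions made for illustration.

```python
def ewma(values, lam, z0=0.0):
    """Textbook EWMA recursion: Z_n = (1 - lam) * Z_{n-1} + lam * values[n].

    `lam` in (0, 1] is the smoothing parameter; small values smooth more.
    ASSUMPTION: the statistic starts at z0 (not specified in the slides).
    """
    z = z0
    out = []
    for v in values:
        z = (1.0 - lam) * z + lam * v
        out.append(z)
    return out

def first_crossing(stat, threshold):
    """Stopping time: 1-based index of the first n with stat[n] >= threshold, else None."""
    for n, z in enumerate(stat, start=1):
        if z >= threshold:
            return n
    return None

if __name__ == "__main__":
    # Made-up window metric: a quiet stretch followed by a persistent increase.
    x = [1, 0, 1, 1, 0, 4, 5, 4, 6, 5]
    z = ewma(x, lam=0.3)
    print([round(v, 2) for v in z])
    print("alarm at window:", first_crossing(z, threshold=2.5))
```

The same recursion would be run separately on Xn and Yn, each with its own smoothing parameter and threshold.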
Detecting Spurts: EWMA
Conclusions:
  EWMA statistic detects persistent changes and tracks the underlying process
  But short, moderate changes are not tracked
Proposed Solution 2 (State Estimation)
Methodology:
  Train: Learn HMM parameters (p0, q0, Active rate and Inactive rate) using the Baum-Welch/EM algorithm
  Test: Verify inter-arrival duration density during Active/Inactive periods using Q-Q plots and scatter diagrams
  Classify States: As Active/Inactive using the Viterbi algorithm or non-linear filtering (small performance hit relative to the Viterbi algorithm)
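The classification step can be sketched with a log-domain Viterbi decoder for a two-state HMM. This is an illustration under assumptions, not the authors' implementation: the parameters are taken as already learned, p0 and q0 are read as the Inactive-to-Active and Active-to-Inactive switching probabilities, the observation density is taken to be Poisson, and the `prior_active` argument and all function names are hypothetical.

```python
import numpy as np
from math import lgamma

def poisson_logpmf(m, lam):
    """log P(M = m) for a Poisson(lam) count."""
    return m * np.log(lam) - lam - lgamma(m + 1)

def viterbi_two_state(attacks, p0, q0, rate_inactive, rate_active, prior_active=0.5):
    """Most likely state sequence (0 = Inactive, 1 = Active) for daily attack counts.

    ASSUMPTION: p0 = P(Inactive -> Active), q0 = P(Active -> Inactive), both in (0, 1).
    ASSUMPTION: Poisson observation density with state-dependent rates.
    """
    attacks = np.asarray(attacks)
    n = len(attacks)
    log_trans = np.log(np.array([[1 - p0, p0], [q0, 1 - q0]]))
    rates = (rate_inactive, rate_active)
    log_emit = np.array([[poisson_logpmf(int(m), lam) for lam in rates]
                         for m in attacks])
    score = np.log(np.array([1 - prior_active, prior_active])) + log_emit[0]
    back = np.zeros((n, 2), dtype=int)
    for i in range(1, n):
        cand = score[:, None] + log_trans      # cand[previous state, next state]
        back[i] = cand.argmax(axis=0)          # best previous state for each next state
        score = cand.max(axis=0) + log_emit[i]
    path = np.empty(n, dtype=int)
    path[-1] = score.argmax()
    for i in range(n - 1, 0, -1):              # backtrack from the last day
        path[i - 1] = back[i, path[i]]
    return path

if __name__ == "__main__":
    daily = [0, 0, 0, 0, 0, 3, 4, 2, 3, 0, 0, 0, 0, 0]  # made-up counts
    print(viterbi_two_state(daily, p0=0.1, q0=0.1,
                            rate_inactive=0.1, rate_active=3.0))
```

Runs of consecutive 1s in the decoded path mark candidate spurts, and the switch back to 0 marks a downfall.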
Detecting Spurts and Downfalls: Shining Path
Concluding Remarks
A competing model for activity profile
Desired Qualities in an Ideal Model:
  Motivated by a small set of hypotheses: Yes, 3 hypotheses. Are these hypotheses justified?
  Robust to looseness in hypothesis statements? Intentions do not matter?!
  Described by few parameters: Yes, 4 parameters in the simplest setting
  Describes data sufficiently accurately: Yes, for spurt/downfall detection and tracking activity. No, in general. Refine the model to incorporate heavy tails, non-geometric density, spatio-temporal effects, etc.
  Robust to missing or mislabeled data: Ongoing work
Concluding Remarks
Ripley’s K function
Indonesia/Timor-Leste attacks have abnormal tails (36 attacks on one day!): can’t see how a 2-state HMM with geometric obs. density will work for this dataset
Is a self-exciting model necessary to explain clustering?