
Using Predictive Analytics for Anticipatory Investigation and Intervention


Version 1

PREDICTIVE ANALYTICS FOR ANTICIPATORY CRIME INVESTIGATION AND INTERVENTION

2014 CASE STUDY

CRIME

Predictive Crime Fighting with D8A

CONTACT
D8A Group
http://d8a.com
Phone: (520) 301-7906
Email: [email protected]

TABLE OF CONTENTS

Public Safety in the 21st Century
Case Studies
    1 - Risk Patterns
    2 - Remote Investigation
    3 - Network Analysis
Contextual News Discovery
Momentum
Real-Time Zeitgeist
Filtering by Keyword Exclusion
Keyword and Phrase Networks
Identifying Influencers
Sentiment Analysis
Geography Trends and Locations of Interest
Predictive Analytics
Risk Mitigation and the Timing of Information


Public Safety in the 21st Century

The proliferation and adoption of data, sensors, mobile phones and social media technology present new ways of capturing conversations surrounding events in real time. There is high demand for products that allow law enforcement, criminal investigators and others to explore events by monitoring many transmedia sources (social media such as Facebook and Twitter, photos, and news sources) and relating that activity to historic data sets like neighborhood maps, crime databases and other digital records.

Using a combination of the data-analysis products available from D8A Group, we’ve been monitoring unfolding events in real time to illustrate the ways our technology platforms can be used by public safety officials to make data-informed decisions in real-time crisis scenarios. The solutions used for this analysis include:

• SiftDeck: a product that connects online conversations to the people, places, and things being referenced offline. This helps users manage real-world risk by anticipating and avoiding threats to their offline assets (think staff, office locations, or property).

• Themes: a product that allows users to visually sort through large amounts of text data or streaming data to surface patterns and trends in the content. It allows for the visual navigation of real-time data using search, word trees, keyword and phrase network analysis, and various filters.

• Muxboard: a remixable analytic dashboard that allows researchers to apply various algorithms and third-party APIs to real-time, ever-evolving data sets with drag-and-drop ease. Muxboard makes it easy to quickly create dashboards tracking different people or brands, each with intricate customizable analytics.


The primary purpose of using technologies like the D8A suite of analytic products is to monitor and capture real-time data for analysis and research that serves as a baseline so that users can look for “spikes” of irregular activity. These solutions are also predictive, helping to surface trends, patterns, and happenings before one might find out about them otherwise. D8A’s products work across multiple communication channels and with multiple types of public safety data.
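To make the baseline-and-spike idea concrete, here is a minimal sketch in Python. It is not D8A's implementation: the function name `find_spikes`, the three-standard-deviation threshold, and the hourly counts are all assumptions chosen for illustration. It flags any live count that sits well above what the baseline period would suggest.

```python
from statistics import mean, stdev

def find_spikes(baseline_counts, live_counts, threshold=3.0):
    """Return (index, count) pairs from live_counts that exceed the baseline
    mean by more than `threshold` baseline standard deviations."""
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)
    cutoff = mu + threshold * sigma
    return [(i, c) for i, c in enumerate(live_counts) if c > cutoff]

# Hypothetical hourly message counts (illustrative data only).
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
live = [101, 99, 180, 102]

spikes = find_spikes(baseline, live)
print(spikes)
```

A fixed threshold is the simplest possible choice; a real deployment would likely need to account for daily and weekly rhythms in the baseline.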

When it comes to social data, most users are primarily interested in analyzing Twitter’s real-time global data stream. Our products also work with mobile data streams (text messages), news articles and headlines, blogs, RSS feeds, JSON feeds, and email, and can hook into virtually any available API. This makes our products flexible for any type of online activity monitoring.

The added advantage of D8A’s particular set of products is the ability to research, sift through, and sort data streams in real time, allowing companies to make data-driven decisions while events are still unfolding.


Case Study 1: Risk Patterns

This example illustrates how our suite of products can be used to quickly build risk models using non-traditional data. For instance, the user might upload various datasets to our system containing historic crime records of a given area, similar to what is publicly available through CrimeReports.com.

This is what we call our ‘baseline’: historic records that show what has occurred in the past and where. This gives the user data to begin building a sophisticated model that triages real-time data against additional historic data.


What type of model? Well, first let’s say we want to overlay other types of data on the map. This might include demography, patrol car routes, information on surrounding buildings (e.g. vacant lots) or events.

To do so, we can simply choose from data sets containing this information, or upload our own.

Drag ‘Upload Your Data’ onto the canvas.


Choose your file and upload it. The system will automatically begin number crunching to figure out the best way to display the file.

You can then drag and drop algorithms to augment the dataset, for instance using our ‘Counts’ module to find things like standard deviation patterns or means in your data.


The combined data set can then be used to create other maps or charts. This is an alternative way of doing what some would call ‘data science’: merging, combining and ultimately displaying disparate information types, all made drag-and-drop simple so that public sector staff need little to no retraining to do it.

Rather than only looking historically, users can combine their historic crime data with real-time happenings to create new visualizations that ordinarily would take a contractor or consultant a lot of time (and usually a lot of money) to produce.
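For readers who want to see what such a merge amounts to underneath the drag-and-drop interface, the following is a rough Python sketch. The record layout, the `block` key, and the vacant-lot overlay are hypothetical; the products' internal data model is not described here.

```python
from collections import Counter

# Hypothetical historic crime records, each tagged with the block it occurred on.
crime_records = [
    {"block": "A", "type": "burglary"},
    {"block": "A", "type": "theft"},
    {"block": "B", "type": "theft"},
]

# Hypothetical overlay data (vacant lots per block), keyed by the same identifier.
vacant_lots = {"A": 3, "B": 0, "C": 1}

def merge_by_block(records, overlay):
    """Count incidents per block and join them with the overlay values."""
    counts = Counter(r["block"] for r in records)
    blocks = sorted(set(counts) | set(overlay))
    return [
        {"block": b, "incidents": counts.get(b, 0), "vacant_lots": overlay.get(b, 0)}
        for b in blocks
    ]

merged = merge_by_block(crime_records, vacant_lots)
for row in merged:
    print(row)
```

Each merged row can then feed a map layer or chart; the join key (here a block identifier) is the part that has to be agreed on across datasets.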


Case Study 2: Remote Investigation and Online Forensics

Another popular use case for our products is remote investigation: using real-time data to create dashboards that can be used to explore an unfolding situation without necessarily being on the scene.

The above map might represent the last known locations of a suspect, or group of suspects, based on data mined from their social media presence. If we know their location, what types of crimes they’ve committed in the past, and the types of locations or people in the area, we can triangulate potential targets for new criminal activity.


These are called ‘probability zones’: the areas where we have concluded things might happen because of some early warning signal. They assist in designing intervention strategies (e.g. “We should send more patrol officers to this area.”).


Case Study 3: Network Mapping

It’s a common tactic of predictive analytics to use the peer networks of criminals to identify others likely to commit criminal activity. Tracing these networks of friends and ‘friends of friends’ is called network analysis. Such network maps have shown up in investigations for years in the form of evidence maps and can be used to connect people to people, but also people to places and things.

During an investigation, these network maps can become unwieldy.


With D8A Group’s products, these evidence walls are dynamic and interactive, linking to the gigabytes of data they represent.
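The ‘friends of friends’ tracing described in this case study can be sketched as a breadth-first search over a graph of people, places and things. This is only an illustration: the graph contents and the `within_hops` helper are hypothetical, and a two-hop limit is an assumed reading of ‘friends of friends’.

```python
from collections import deque

def within_hops(graph, start, max_hops=2):
    """Breadth-first search: return {node: hops} for every node reachable
    from `start` in at most `max_hops` steps (excluding `start` itself)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]
    return seen

# Hypothetical network linking people to people and people to places.
graph = {
    "person_a": ["person_b", "place_x"],
    "person_b": ["person_a", "person_c"],
    "person_c": ["person_b", "person_d"],
    "person_d": ["person_c"],
    "place_x": ["person_a"],
}

reach = within_hops(graph, "person_a")
print(reach)
```

Limiting the hop count is one simple way to keep such a map from becoming unwieldy as the investigation grows.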


Contextual News Discovery

SiftDeck learns to aggregate news headlines based on keywords parsed from aggregated content. This is different from only aggregating content based on the keywords users enter, because it provides a contextual stream of headlines based on the real-time conversation. In other words, SiftDeck recommends potentially related news headlines that a user may not even be aware of, so it serves as a real-time discovery and recommendation engine.

This feature tries to answer the question: “What if I don’t know what I’m looking for?” Rather than the user programming every single detail into our products, the products learn from both the user and the content creators to suggest which news items might be relevant to the investigation underway.


Momentum

Momentum is the term we use to refer to the qualities of a conversation. Does the conversation activity seem to be building or slowing? Are new people joining or are they leaving? Are the people involved from the beginning conversing more or less than they were at the start? Which keywords, influencers, and communication channels are leading the conversation?

The image above chronicles the drop and eventual rebound of momentum surrounding the various keywords being tracked.


Likewise, looking back over the previous days or weeks shows that there are lulls and bumps in the flow of the conversation over time. This correlates with events occurring in the real world and with the virality of news spreading online.
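Two of the momentum questions above (is activity building or slowing, and are people joining or leaving?) can be approximated by comparing consecutive time windows. The sketch below is one assumed way to do that in Python; the message format, the `momentum` function, and the window sizes are hypothetical and are not taken from the products themselves.

```python
def momentum(messages, window_start, window_len):
    """Compare the window starting at `window_start` with the one before it."""
    def stats(lo, hi):
        window = [m for m in messages if lo <= m["t"] < hi]
        return len(window), {m["author"] for m in window}

    prev_n, prev_authors = stats(window_start - window_len, window_start)
    cur_n, cur_authors = stats(window_start, window_start + window_len)
    return {
        "volume_change": cur_n - prev_n,                           # building (+) or slowing (-)
        "new_participants": len(cur_authors - prev_authors),       # people joining
        "departed_participants": len(prev_authors - cur_authors),  # people leaving
    }

# Hypothetical messages with a timestamp (in minutes) and an author.
messages = [
    {"t": 1, "author": "u1"},
    {"t": 5, "author": "u2"},
    {"t": 11, "author": "u1"},
    {"t": 12, "author": "u3"},
    {"t": 15, "author": "u3"},
    {"t": 18, "author": "u4"},
]

result = momentum(messages, window_start=10, window_len=10)
print(result)
```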


Real-Time Zeitgeist

What are the recurring themes and phrases in a real-time conversation? The words, phrases, names, and locations that repeat may allow analysts to draw correlations between seemingly unrelated conversations.

Even if one were not looking for a scandal involving Chris Christie, if a word cloud suddenly started surfacing words like ‘scandal’, ‘bridge’, ‘taxes’ and ‘campaign’ (as in the above image), one could easily determine that a big story might be breaking and take action. For someone working in public safety, a dashboard might surface warnings of an unfolding event or situation.


This ability to actively monitor the ‘zeitgeist’, or thematic relationships between conversations happening across disparate communication channels, often proves powerful for organizations that have to plan interventions or activities in real time.
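A word cloud of recurring terms is, at its simplest, a frequency count over the incoming messages. The following Python sketch shows that idea under stated assumptions: the tiny stopword list, the tokenizing regular expression, and the sample messages are illustrative only, and the products may weight or group terms quite differently.

```python
import re
from collections import Counter

# A deliberately tiny, illustrative stopword list.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on", "over"}

def top_terms(messages, n=3):
    """Return the n most frequent non-stopword terms across all messages."""
    words = []
    for text in messages:
        tokens = re.findall(r"[a-z']+", text.lower())
        words += [w for w in tokens if w not in STOPWORDS]
    return Counter(words).most_common(n)

# Hypothetical message stream.
messages = [
    "Bridge scandal grows",
    "The scandal over the bridge",
    "Campaign responds to bridge scandal",
]

terms = top_terms(messages, n=2)
print(terms)
```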

Filtering By Keyword Exclusion

More importantly, these word clouds make it possible to conditionally filter out conversations that are actually unrelated.

In this case, the recurrence of the word ‘Munich’ in data streams monitoring conversations about Sudan was because of a football match between Sudanese and German teams. After identifying messages that are skewing the research, the user of our product Themes can simply click on the word (in this case ‘Munich’) and opt to exclude all data where the non-relevant words appear together in the same sentence, while keeping all other data intact.


Organizations using other products for social media analytics often forget that many such tools don’t allow for the selective ‘cleansing’ of datasets to remove misleading or non-relevant content.
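The sentence-level exclusion described for the ‘Munich’ example can be sketched in a few lines of Python. This is an assumed interpretation rather than the Themes implementation: the `exclude_cooccurring` helper, its naive sentence splitting, and the sample messages are all hypothetical.

```python
import re

def exclude_cooccurring(messages, terms):
    """Drop messages in which any single sentence contains every term in
    `terms`; keep all other messages untouched."""
    wanted = {t.lower() for t in terms}
    kept = []
    for text in messages:
        sentences = re.split(r"[.!?]+", text.lower())
        hit = any(wanted <= set(re.findall(r"[a-z']+", s)) for s in sentences)
        if not hit:
            kept.append(text)
    return kept

# Hypothetical stream about Sudan, polluted by a football match.
messages = [
    "Sudan plays Munich tonight.",
    "Aid convoy reaches Sudan. Munich talks are unrelated.",
    "Clashes reported in Sudan.",
]

kept = exclude_cooccurring(messages, ["Sudan", "Munich"])
print(kept)
```

Note that the second message survives because the two words appear in different sentences, which matches the "same sentence" condition described above.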

Keyword and Phrase Networks

Themes’ network graphs of words and phrases can provide a powerful means for visually controlling the underlying dataset. In this case, clicking on any word in the above graph gives you the option to focus only on content that contains that word, or only on content that doesn’t contain it.

In the above example, a very large dataset was used to show how select keywords appear in high frequency in the same datasets. By clicking on each word and choosing to focus on or exclude certain ones, the dataset is refined as needed.
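One plausible way to think about a keyword network is as a set of weighted edges, where the weight counts how often two tracked words appear in the same message. The Python sketch below builds such edges; treating co-occurrence within a message as the link is an assumption, as are the function name and the sample data.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_edges(messages, vocabulary):
    """Count, for each pair of tracked words, how many messages contain both."""
    vocab = {w.lower() for w in vocabulary}
    edges = Counter()
    for text in messages:
        present = sorted(vocab & set(re.findall(r"[a-z']+", text.lower())))
        edges.update(combinations(present, 2))
    return edges

# Hypothetical messages and tracked keywords.
messages = [
    "Taxes scandal hits campaign",
    "Campaign denies scandal",
    "Taxes are due",
]

edges = cooccurrence_edges(messages, ["taxes", "scandal", "campaign"])
for pair, weight in edges.items():
    print(pair, weight)
```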


A researcher might want to only view content where the terms ‘taxes’, ‘liberals’, ‘scandal’ and ‘work’ appear together as it relates to a criminal political scandal. If so, it’s simply a matter of point and click (the collection of keywords in the bottom right), and the data is re-organized to fit those criteria. Terms can just as easily be excluded from the dataset.

Identifying Influencers

Monitoring digital conversations allows organizations to identify potential ‘thought leaders’, friends of suspects or other people who may be influential in a given scenario. While it’s usually impossible to verify exactly who these actors are and what their motives are, it’s useful to identify them in order to plan strategies for engagement and outreach.


Having this information allows analysts to follow the public conversations of specific individuals. For instance, some of these (or other) individuals may be influential bloggers, employees of other companies, investors, or journalists.
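A very simple proxy for influence is how much of the conversation each participant contributes. The sketch below ranks authors by message count and share of total volume. Using volume as the measure is an assumption made for illustration; the products may well combine it with other signals, and the `rank_contributors` name and sample data are hypothetical.

```python
from collections import Counter

def rank_contributors(messages, n=3):
    """Return (author, message_count, share_of_total) for the top n authors."""
    counts = Counter(m["author"] for m in messages)
    total = sum(counts.values())
    return [(author, c, c / total) for author, c in counts.most_common(n)]

# Hypothetical messages, reduced to their author field.
messages = [
    {"author": "u1"},
    {"author": "u2"},
    {"author": "u1"},
    {"author": "u3"},
    {"author": "u1"},
]

ranked = rank_contributors(messages, n=1)
print(ranked)
```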


Sentiment Analysis

Sentiment analysis is a method of measuring the emotional tone of written text using computer programs. It attempts to weight different words in a body of text against one another, to ultimately provide a ‘score’ to the whole body of text that is either positive, negative, or neutral.

Why is this useful? Because it allows users to algorithmically determine whether an online conversation is skewing positive or negative in tone.

In the image above, it’s easy to quickly see that of the more than 6,972 messages analyzed in the first column, 1,679 (25%) have been marked as being negative in tone, while 700 (10%) are positive. If the analyst wants to focus on the dataset that’s been marked negative, they simply click on that area of the graph.
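As a rough illustration of word-weighted scoring and the positive/negative/neutral breakdown, here is a minimal lexicon-based sketch in Python. It is not the method used by the products: the two word lists, the equal weighting of every word, and the sample messages are assumptions made only to show the mechanics.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"good", "safe", "calm", "great"}
NEGATIVE = {"bad", "angry", "riot", "fear"}

def label(text):
    """Score a message as positive words minus negative words, then label it."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def tally(messages):
    """Return {label: (count, rounded percentage)} across all messages."""
    counts = Counter(label(m) for m in messages)
    total = len(messages)
    return {
        k: (counts[k], round(100 * counts[k] / total))
        for k in ("positive", "negative", "neutral")
    }

messages = [
    "Streets are calm and safe",
    "Angry crowd, fear of a riot",
    "Meeting at noon",
    "Bad news",
]

result = tally(messages)
print(result)
```

Filtering to the messages labelled negative is then a one-line list comprehension, which corresponds to clicking on the negative area of the graph.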

The content and related analysis are then sorted to focus on the ‘negative’ content. To give a use case scenario, this would allow a researcher to view a list of influencers leading the negative tone of a conversation. In the past, this has allowed our users to identify individuals whom they would qualify as the ‘antagonists’ or ‘instigators’ who might be inciting violence or other unwanted activities. Being able to sort data in this way provides a powerful lens of context and discovery. More importantly, it allows analysts to constantly ask questions of the data itself through our simple drag-and-drop interface.

The above screenshot looks only at the analysis of content that is negative in tone, from a different data set than the previous image. You can see that 379 messages represent the negative content, of which 376 come from Twitter and 3 come from Google News. We also have a list of potential conversation influencers, as well as how much content they’ve contributed to the overall conversation. Analysts can now reach out to them directly, or begin monitoring these new sources of interest. Again, all of this is being done in real time.

Geography Trends and Locations of Interest

Connecting this type of online research to offline activities and actions is a big part of why people use data products like the ones provided by D8A. We use the social graph and natural language processing to algorithmically map various locations of interest to researchers.


The power of this information is that even with the most minimal knowledge of a situation, the maps and graphs generated tell a story. While knowing the broader context and having professional expertise in the given subject matter is absolutely necessary, coupling such knowledge with these kinds of visual data exploration tools makes the job of experts faster, more nuanced and more efficient.

D8A’s products (SiftDeck, Muxboard/MetaLayer, and Themes) are not meant to replace professional analysts and researchers, but to save them incredible amounts of time.


Predictive Analytics

When all of our products are combined, it’s possible to anticipate events, demands, or activities that have not happened yet. This type of anticipatory response to data is based on an area of research called predictive analytics.

By combining all of our insights into an informed narrative, researchers might be able to determine the correct actions to take well before they are obvious. As with all systems, it’s possible these predictions can be wrong, so rather than give researchers objectives, our products serve to provide the appropriate information for informed conversation and action.


When an analyst is viewing multiple dashboards in an unfolding scenario, it’s possible to piece each of these different insights together to suggest action and give reasons for that action.

In one use of our products in South Sudan, well before stories played out in the media, our team identified several influencers in-country and around the world.

We knew that the situation was no longer contained to just South Sudan, but was now affecting the whole of the East Africa region; we knew that there appeared to be a rapid build of momentum in the conversations on the evening of January 9th leading into the 10th; and we knew that the thematic tone of conversation was trending towards some sort of conflict. We also had the related breaking news stories confirming as much.


Risk Mitigation and the Timing of Information

While it’s possible to come to the same conclusions in a number of other ways, the timing of information often dictates its value, as does the time it takes to aggregate all data sources to make predictions.

For a Wall Street broker, receiving information that the CEO of a major company is about to be fired might indicate he needs to sell his position in that company’s stock. However, receiving the information after the fact (e.g. “the CEO was fired yesterday”) is an entirely different scenario. The first scenario allows him to mitigate risk in anticipation of a potential disaster. The other scenario allows him to make the same decisions, but the information is less valuable because he has less control over how the news affects things: a portion of the risk is already realized.

For the Wall Street broker, the value of information could be measured in the millions or billions of dollars. For humanitarian organizations and journalists, the type of risk we try to help them mitigate might be measured in loss of life and property, or at the very least, quality of life for the people affected by these events. For brands or public figures, their reputation is directly correlated with their value and ability to derive revenue from customers.

D8A’s products are designed to shift critical analysis of any situation, event, or phenomenon from a retroactive exploration to a real-time one. In the above scenario, the case was made that the value of information is very much related to its timing.


Thus, even if our products only slightly move the needle with regard to the timing of information, there is a direct correlation to the amount of value that analysis provides. Knowing how to potentially affect a situation in real time can be far more valuable than waiting for everything to play out, only to deal with the aftermath.

While such actions need to be tempered with consideration for culture, context, privacy and law, there is great value in time-shifting the analytics so that companies can react to events more readily because they were able to anticipate potential risk scenarios.