Integrating Unstructured Data Analysis into Defense and Intelligence Workflows
James Jones, Jeff Wilson, Tim Murphy
Every 2 days we create as much information as we did up to 2003
Eric Schmidt, 2010
What does that look like?
Every minute…
Twitter sees 350,000 new tweets
Facebook has 510,000 comments posted, 293,000 statuses updated
600 Wikipedia pages are edited
3.6 million Google searches are conducted
15.2 million Text Messages are sent
954,000 new Microsoft Office documents are created
144 million e-mails are sent
What is Unstructured Data
• Does not have a recognizable structure or is loosely structured
• Can be in a variety of formats and storage mechanisms
- Word Documents
- Social Media Posts
- PowerPoint
- Share drive
Problems in Integrating Unstructured Data
• Tone can vary wildly
• Not in traditional spatial format
• May or may not contain explicit locational information
• Locational information may take many forms
- Coordinates
- Place-names
- Address
How to Integrate Unstructured Data into ArcGIS
• Native Esri Capability (ArcGIS Pro 2.3)
- Coordinates
- Custom Locations
- User defined keywords
• Third Party Integration (Natural Language Processing)
- Locations
- People/Organizations
- Events
- Dates
- Relationships
What are you looking for?
What is the best tool?
How is it best used?
Native Esri Capability
• Data is at least somewhat understood
• Data benefits from identifiable and repeating patterns
• Little to no programming experience available/needed
Third Party Integration
• Data is not well understood
• Data does not contain identifiable and/or repeating patterns
• Integration needed
Extracting Locations with ArcGIS
• LocateXT Extension for ArcGIS Desktop and Enterprise
• Available in ArcGIS Pro 2.3
• Also available for ArcMap
• Uses pattern matching (regular expressions, REGEX) to search for coordinates in a variety of formats
• Uses custom location list to match/extract other patterns (place names, codes, other terms)
• Also extracts from GPS-tagged photos (EXIF)
• Multiple ways to initiate location extraction
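The pattern-matching idea above can be illustrated with a minimal Python sketch. It is not LocateXT's implementation: the `DMS_PAIR` pattern and the helper names are hypothetical, and the pattern covers only one format, space-separated degrees/minutes/seconds pairs such as `34 10 9.51N 73 14 32.78E`.

```python
import re

# Hypothetical pattern for ONE coordinate format: space-separated
# degrees/minutes/seconds with hemisphere letters. A production tool
# would carry many such patterns (decimal degrees, MGRS, UTM, ...).
DMS_PAIR = re.compile(
    r"(?<!\d)"
    r"(?P<lat_d>\d{1,2})\s+(?P<lat_m>\d{1,2})\s+(?P<lat_s>\d{1,2}(?:\.\d+)?)\s*(?P<lat_h>[NS])"
    r"[\s,]+"
    r"(?P<lon_d>\d{1,3})\s+(?P<lon_m>\d{1,2})\s+(?P<lon_s>\d{1,2}(?:\.\d+)?)\s*(?P<lon_h>[EW])"
)


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds strings to signed decimal degrees."""
    value = float(degrees) + float(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in ("S", "W") else value


def find_coordinates(text):
    """Return (matched_text, lat, lon) for every in-range DMS pair in text."""
    results = []
    for m in DMS_PAIR.finditer(text):
        lat = dms_to_decimal(m["lat_d"], m["lat_m"], m["lat_s"], m["lat_h"])
        lon = dms_to_decimal(m["lon_d"], m["lon_m"], m["lon_s"], m["lon_h"])
        # Discard matches that cannot be valid latitude/longitude values.
        if abs(lat) <= 90 and abs(lon) <= 180:
            results.append((m.group(0), round(lat, 6), round(lon, 6)))
    return results
```

Calling `find_coordinates` on a sentence returns one tuple per coordinate pair found, which could then be written out as point features.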
Extracting Locations in ArcGIS Pro
• New option added to the “Add Data” button
• Allows a user to drag and drop documents or copied text into a window
• Can create a new feature class or append to an existing one
Extracting Locations in ArcGIS Pro
• Two Geoprocessing Tools added
• Located in the Conversion Tools – To Geodatabase toolset
- Extract Locations from Document
- Extract Locations from Text
James Jones
Extracting Locations from Text in ArcGIS Pro
Extracting Custom Attributes
• Ability to create custom attributes based on content within document or near a location
- Triggered by location extraction
• Based on keywords
- Tag locations based on keywords
- Scrape/harvest portions of document based on keywords
• Ability to extract based on:
- Number of characters/words
- Number of lines/blank line
- Stop string
• Built in separate LocateXT desktop application (until Pro 2.4)
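To make the keyword-based capture rules above concrete, here is a minimal Python sketch. It is an illustration only, not how LocateXT is configured: the function name `capture_after_keyword` and its parameters are hypothetical, and it covers just two of the listed limits (word count and stop string), not line-based limits.

```python
import re


def capture_after_keyword(text, keyword, max_words=None, stop_string=None):
    """Return the text that follows `keyword`, or None if it is absent.

    The capture ends at `stop_string` (if given and present) and is then
    truncated to `max_words` words (if given).
    """
    match = re.search(re.escape(keyword), text, flags=re.IGNORECASE)
    if match is None:
        return None
    captured = text[match.end():]
    if stop_string is not None:
        # Keep only the part before the first occurrence of the stop string.
        captured = captured.split(stop_string, 1)[0]
    words = captured.split()
    if max_words is not None:
        words = words[:max_words]
    return " ".join(words)
```

For a report line such as `Subject: Vehicle checkpoint. Remarks: none`, capturing after `Subject:` with a stop string of `.` yields the subject text, which could be stored as an attribute on the extracted location.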
Extracting Custom Attributes
• Tag extracted locations based on keyword found in source document
• Tag extracted locations based on keyword found in proximity to location
• Custom capture text based on keywords found in proximity to location (location trigger)
Building Custom Attributes and ETL data
What is Natural Language Processing?
• Field of computer science and artificial intelligence since the 1950s
• Machine learning algorithms for NLP introduced in the 1980s
• Early focus was primarily on machine translation
• Focused on four key areas:
• Syntax
• Semantics
• Discourse
• Speech
Main Fields of NLP
Syntax
• Part of Speech Tagging*
• Parsing
• Word Segmentation
• Terminology Extraction
Discourse
• Automatic Summarization
• Coreference Resolution
• Discourse Analysis
Semantics
• Machine Translation*
• Named Entity Recognition*
• Optical Character Recognition*
• Relationship Extraction*
• Sentiment Analysis*
• Topic Segmentation
• Text Similarity
Speech
• Speech Recognition
• Speech Segmentation
• Text-to-speech
NLP Integration
• Numerous 3rd Party tools exist
- Open Source
- Proprietary / As A Service
• Identify and extract named entities
• Link entities and create semantic relationships
• Organizes data into an ontology
• Classify sentiment, topic identification, noun-phrase/verb extraction
APIs
Apps
Desktop
ArcGIS
NLTK
NLP Tools
Entities and Relationships
• Entities (spatial)
- Saudi Arabia
- 285 Fulton St, New York, NY 10007
- 34 10 9.51N 73 14 32.78E
- Hadhramaut, Yemen
- approximately 5 miles northwest of Baqubah
• Entities (non-spatial)
- Osama bin Laden
- Terrorist
- US Embassy
- US Special Forces
- August 20, 1998
- 66 cruise missiles
• Links
- Osama bin Laden -- Saudi Arabia (birthplace)
- US Embassy -- Kenya
• Events
- Osama bin Laden attacked World Trade Center
- Abu Musab al-Zarqawi was killed June 7, 2006
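One way to hold extracted entities and links so that non-spatial entities can be tied back to mappable places is sketched below in Python. The `Entity` and `Link` classes and the `places_linked_to` helper are hypothetical names chosen for illustration; they do not correspond to the output schema of any particular NLP tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    kind: str              # e.g. "person", "place", "organization"
    spatial: bool = False  # True when the entity can be placed on a map


@dataclass(frozen=True)
class Link:
    source: Entity
    target: Entity
    relation: str


def places_linked_to(entity, links):
    """Return the spatial entities that `entity` is linked to."""
    return [ln.target for ln in links if ln.source == entity and ln.target.spatial]


# Examples taken from the slide; the second relation is not named there,
# so it is recorded as "unspecified" rather than guessed.
bin_laden = Entity("Osama bin Laden", "person")
saudi_arabia = Entity("Saudi Arabia", "place", spatial=True)
embassy = Entity("US Embassy", "organization")
kenya = Entity("Kenya", "place", spatial=True)

links = [
    Link(bin_laden, saudi_arabia, "birthplace"),
    Link(embassy, kenya, "unspecified"),
]
```

With this structure, `places_linked_to(bin_laden, links)` gives the places a person is connected to, which is the step that lets a non-spatial entity be mapped.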
Possible Use Cases of Unstructured Data
• Deriving locations from text
• Analyzing and enhancing existing spatial data containing attributes with free-text narrative
NLP Integration with ArcGIS
How to Integrate Unstructured Data into ArcGIS Enterprise
• New message comes in to folder
• Custom Python script leveraging LocateXT processes message
• Script outputs JSON file to a network-accessible folder
• GeoEvent monitors folder
• GeoEvent updates features in ArcGIS Enterprise
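The script stage of this workflow might look roughly like the Python sketch below. It rests on several assumptions: the presenters' script uses LocateXT for extraction, whereas this sketch substitutes a simple decimal-degree regular expression as a placeholder; the JSON field names (`source`, `lat`, `lon`) and the function names are hypothetical and would need to match whatever definition the GeoEvent input expects.

```python
import json
import re
from pathlib import Path

# Placeholder extractor (assumption): matches only "lat, lon" pairs in
# decimal degrees. The workflow described in the talk uses LocateXT here.
DECIMAL_PAIR = re.compile(r"(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")


def extract_locations(text):
    """Return a list of (lat, lon) floats found in text."""
    pairs = []
    for lat_s, lon_s in DECIMAL_PAIR.findall(text):
        lat, lon = float(lat_s), float(lon_s)
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            pairs.append((lat, lon))
    return pairs


def process_message(message_path, out_dir):
    """Extract locations from one message file and write them as JSON.

    Returns the path of the JSON file, or None if nothing was extracted.
    """
    message_path = Path(message_path)
    text = message_path.read_text(encoding="utf-8")
    records = [
        {"source": message_path.name, "lat": lat, "lon": lon}
        for lat, lon in extract_locations(text)
    ]
    if not records:
        return None
    out_path = Path(out_dir) / (message_path.stem + ".json")
    tmp_path = Path(out_dir) / (message_path.stem + ".json.tmp")
    # Write to a temporary name first, then rename, so a process watching
    # for *.json files does not read a half-written file.
    tmp_path.write_text(json.dumps(records), encoding="utf-8")
    tmp_path.replace(out_path)
    return out_path
```

A scheduler or a simple polling loop could call `process_message` for each new file in the incoming folder, with `out_dir` pointing at the network-accessible folder that GeoEvent monitors.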
Print Your Certificate of Attendance
Print Stations Located at L Street Bridge
Tuesday
12:30 pm – 6:30 pm GIS Solutions Expo, Hall D
5:15 pm – 6:30 pm GIS Solutions Expo Social, Hall D
Wednesday
10:45 am – 5:15 pm GIS Solutions Expo, Hall D
6:30 pm – 9:00 pm Networking Reception, National Museum of Natural History
Please Take Our Survey on the App
Download the Esri Events app and find your event
Select the session you attended
Scroll down to find the feedback section
Complete answers and select “Submit”