Open Source Intelligence: Access All Intelligence, in All Languages, All the Time
Presented by Abe Lederman, President and CTO
Deep Web Technologies, LLC
IOP ’06, Sheraton Premier, Tysons Corner, Virginia, January 16-20
About Deep Web Technologies (DWT)
• Deployed first “federated search” portal in the Federal Government, 1999
• Major clients include:
  – DOE Office of Scientific & Technical Information
  – Defense Technical Information Center
  – Science.gov Alliance
  – DOE Office of Science
  – National Agricultural Library
DWT is a New Mexico-based company focused on providing state-of-the-art software solutions that search, retrieve, aggregate, and analyze content.
Open Source Intelligence
The Problem:
• Collecting and analyzing enormous quantities of information in any language, in myriad formats, located anywhere, accessible through a large variety of means, with a majority not accessible through the Internet
Shared Challenge: OSINT and Knowledge Discovery/Diffusion
OSINT Challenges
Knowledge Discovery/Diffusion Challenges
For the past six years, DWT has been the lead technical organization addressing these challenges, in collaboration with the DOE Office of Scientific & Technical Information (OSTI).
The DWT Proposition
To apply DWT’s technology, expertise and ongoing innovations* to address the challenges of OSINT
*Developed in partnership with DOE/OSTI
Challenges in Working with Thousands of Data Sources
Locate Reliable Sources
Categorize Sources by Content
Configure Sources for Searching
Maintain Sources
Challenges in Searching Thousands of Sources
Automatically Select Sources to Search
Perform Many Searches in Parallel
Translate, Analyze and Organize Results
Relevance Rank
Cluster/Visualize
Extract Key Information
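As an illustration of "Perform Many Searches in Parallel", the following is a minimal sketch, not DWT's implementation. It assumes each source is wrapped in a Python callable that takes a query and returns a list of result dictionaries; `search_all`, `source_a`, and `source_b` are hypothetical names introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def search_all(sources, query, max_workers=8):
    """Query every source concurrently.

    sources: dict mapping a source name to a callable(query) -> list of dicts.
    Returns (results, failed), where failed lists sources that raised.
    """
    results, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn, query): name for name, fn in sources.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                for hit in future.result():
                    # Tag each hit with the source it came from.
                    results.append({"source": name, **hit})
            except Exception:
                # One slow or broken source should not sink the whole search.
                failed.append(name)
    return results, failed


# Hypothetical stand-ins for real source connectors.
def source_a(query):
    return [{"title": f"{query} report", "snippet": "..."}]


def source_b(query):
    raise TimeoutError("source unavailable")


hits, failed = search_all({"a": source_a, "b": source_b}, "fusion")
```

Threads suit this workload because each search mostly waits on a remote source. A production system would also need per-source timeouts, which this sketch leaves out.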
DWT’s State-of-the-art Federated Search Engine
• Scalable, grid-computing based federated search engine
• Sophisticated Search Conductor
• Supports custom connectors
• Multi-tier relevance ranking
• Framework accepts integration of advanced linguistic, analysis, and visualization modules
ResearchAssistant™
Grid Computing: Distributing the Workload
Search Conductor (flowchart)
Steps:
• Select sources to search
• Perform search
• Deliver results to user
Decision points (YES/NO):
• Enough good results?
• Can I get more results from “good” sources?
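The Search Conductor slide names three steps and two yes/no checks, but the arrows between them do not survive in this text. The sketch below shows one plausible reading, offered as an assumption rather than as DWT's actual control flow: search, stop once there are enough good results, otherwise go back only to the sources that produced good results, and give up when none did. The names `conduct_search`, `is_good`, `enough`, and `max_rounds` are hypothetical.

```python
def conduct_search(query, sources, is_good, enough=3, max_rounds=5):
    """One assumed reading of the Search Conductor loop.

    sources: dict mapping a source name to fetch(query, page) -> list of results.
    is_good: predicate deciding whether a single result counts as "good".
    """
    good_results = []
    active = dict(sources)              # "Select sources to search"
    page = 0
    for _ in range(max_rounds):
        productive = {}
        for name, fetch in active.items():      # "Perform search"
            good = [r for r in fetch(query, page) if is_good(r)]
            good_results.extend(good)
            if good:
                productive[name] = fetch
        if len(good_results) >= enough:  # "Enough good results?" -> YES
            break
        if not productive:               # "More from good sources?" -> NO
            break
        active = productive              # YES: ask only the good sources again
        page += 1
    return good_results                  # "Deliver results to user"


# Hypothetical sources: one keeps yielding good results, one never does.
calls = {"rich": 0, "poor": 0}


def rich(query, page):
    calls["rich"] += 1
    return [f"good-{page}", "bad"]


def poor(query, page):
    calls["poor"] += 1
    return ["bad"]


found = conduct_search("osint", {"rich": rich, "poor": poor},
                       is_good=lambda r: r.startswith("good"))
```

In this run the unproductive source is queried once and then dropped, while the productive one is asked for further pages until three good results are collected.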
Multi-tier Relevance Ranking
• QuickRank™ – Ranks results based on occurrence of search terms in title and snippet
• MetaRank™ – Ranks results utilizing custom algorithms applied to metadata
• DeepRank™ – Downloads and indexes full-text documents
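The first tier is described only as ranking by occurrence of search terms in the title and snippet. A minimal sketch of that idea follows. It is not the QuickRank™ algorithm itself: the tokenization, the heavier weight on title matches (`title_weight`), and the function names are assumptions made for illustration.

```python
import re


def quick_score(query, title, snippet, title_weight=2.0):
    """Count query-term occurrences; title matches weigh more (an assumption)."""
    terms = set(re.findall(r"\w+", query.lower()))
    title_hits = sum(1 for w in re.findall(r"\w+", title.lower()) if w in terms)
    snippet_hits = sum(1 for w in re.findall(r"\w+", snippet.lower()) if w in terms)
    return title_weight * title_hits + snippet_hits


def quick_rank(query, results):
    """Sort result dicts (with 'title' and 'snippet' keys) by descending score."""
    return sorted(
        results,
        key=lambda r: quick_score(query, r["title"], r["snippet"]),
        reverse=True,
    )


# Hypothetical results, used only to exercise the scoring.
docs = [
    {"title": "Wind energy", "snippet": "Turbines."},
    {"title": "Solar energy storage", "snippet": "Storage of solar energy in salts."},
]
ranked = quick_rank("solar energy", docs)
```

Because this tier needs only the title and snippet already returned by each source, it can run without fetching any documents, which is consistent with it being the first and cheapest of the three tiers listed.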
Science.gov Alliance: Consortium of 12 Federal Government Agencies
Dept of Agriculture
Dept of Commerce
Dept of Defense
Dept of Education
Dept of Energy
Dept of Health/Human Services
Dept of Interior
Environmental Protection Agency
NASA
National Science Foundation
US Government Printing Office
National Archives & Records Administration
Sponsoring the Science.gov Portal (Access to most of Federal Government R&D)
Science.gov Advanced Search Page
Science.gov Results Page
A Science.gov Document
Next Steps
Identify sponsors and development partners who can collaborate on the development of a pilot that integrates best-of-breed technologies of value to OSINT.
This pilot will result in a portal that aggregates content of different types, generating actionable intelligence.
Contact Us
Abe Lederman
122 Longview Drive
Los Alamos, NM 87544
www.deepwebtech.com
http://www.deepwebtech.com/talks/IOP.ppt