KAIROS: Knowledge-directed Artificial Intelligence Reasoning Over Schemas
Boyan Onyshkevych
January 9, 2019
Approved for Public Release. Distribution Unlimited.
Agenda
Start   End     Session
9:00    10:00   Registration
10:00   10:15   Security Briefing
10:15   11:00   Contracts Management Office Briefing (Mark Jones, DARPA Contracting Management Office)
11:00   12:00   KAIROS Presentation (Dr. Boyan Onyshkevych, Program Manager, DARPA I2O)
12:00   1:15    Break (Each attendee may speak for 2 minutes for teaming purposes. No slides or handouts. DARPA representatives will not be present)
1:15    2:30    Question Answering Session
Proposers’ Day Information
BAA Location
• Posted on FedBizOpps website (http://www.fedbizopps.gov) and Grants.gov website (http://www.grants.gov)
Questions Today
• Questions can be submitted until 12:00 to [email protected] or on 3x5 cards
• Questions will be answered during Q&A session in the afternoon
Proposers Day Website
• Proposers’ Day presentations will be posted
• Frequently Asked Questions (FAQ) will be updated with Q/A from [email protected]
Precedence
• If anything said or addressed during this presentation or in the FAQ conflicts with the published solicitation, the BAA takes precedence. The Government may issue amendments to the BAA to effect any changes deemed necessary in response to the FAQ. Such amendments would be posted to FBO and Grants.gov prior to the solicitation closing date and would supersede previous versions of the solicitation.
KAIROS Program Goal
Create schema-based artificial intelligence capability to enable contextual and temporal reasoning about complex real-world events in order to generate actionable understanding of these events and predict how they will unfold
KAIROS Contribution
Static Events
Truck Acquisition; Soccer Game; Laptop Acquisition; Demonstration; Truck Modification; Missile Launch Test; Marathon; Laptop Delivery; Software Loading
No way to connect these without temporal information and event patterns (schemas)

Temporally and Contextually Ordered Events
[Figure: events placed along a Time axis] Truck Acquisition; Fertilizer Acquisition; Laptop Acquisition; Demonstration; Truck Modification; Marathon; Laptop Delivery; Software Loading
Schema: 1. Acquisition 2. Delivery 3. Reuse 4. Testing
Temporal information and event patterns (schemas) enable connection of seemingly disjoint events
Further event labels in the figure: Fertilizer Acquisition; Tire Purchase; Delivery; Truck Repair; Meeting; Software Loading; Truck Acquisition; Missile Launch Test
What Schemas Are
• A schema is an organized unit of knowledge for an event or series of events, first posited by the cognitive theorist Jean Piaget in 1923
• Schemas are based on past experience and are accessed to guide current understanding or action and predict future events and participants
• Schemas are dynamic – they develop and change based on new information and experiences and thereby support contextual adaptation
• Schemas have deeper levels of organization that tie them together with other schemas that share attributes
• Schemas have been applied to AI problems using First Wave and Second Wave methods
Example Schema - Changing a Tire
1. Take out jack
2. Position jack
3. Take off hubcap
4. Unscrew lug nuts
5. Remove bad tire
6. Put on new tire
7. Put away jack
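A schema of this kind can be held as a set of steps plus ordering constraints. The sketch below is illustrative only: the step names come from the tire-changing example, while the precedence pairs and the helper names (`linearize`, `is_consistent`) are assumptions made for this example, not a KAIROS format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Step names come from the tire-changing example; the (earlier, later)
# precedence pairs are an assumed common-sense ordering for illustration.
PRECEDES = [
    ("Take out jack", "Position jack"),
    ("Position jack", "Take off hubcap"),
    ("Take off hubcap", "Unscrew lug nuts"),
    ("Unscrew lug nuts", "Remove bad tire"),
    ("Remove bad tire", "Put on new tire"),
    ("Put on new tire", "Put away jack"),
]


def linearize(precedes):
    """Return one step order consistent with every (earlier, later) pair."""
    sorter = TopologicalSorter()
    for earlier, later in precedes:
        sorter.add(later, earlier)  # 'later' depends on 'earlier'
    return list(sorter.static_order())


def is_consistent(observed, precedes):
    """True if no observed step appears before a step that must precede it."""
    position = {step: i for i, step in enumerate(observed)}
    return all(
        position[a] < position[b]
        for a, b in precedes
        if a in position and b in position
    )


steps = linearize(PRECEDES)
print(steps)
```

Storing pairwise constraints rather than a fixed list lets a partial observation (for example, only two of the seven steps) still be checked against the schema.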
Automated Schema Generation
Observed Reuse Instances

Instance 1: Accessible Truck (source: Startracks)
1. Get metal cutters, etc. (2d)
2. Remove door (2h)
3. Buy lift, brackets, door, and frame (1mo)
4. Attach lift and door (5h)

Instance 2: Camper (source: DoItYourselfRV)
1. Get carpentry tools
2. Buy beds, kitchen fixtures (1mo)
3. Find lumber (30min)
4. Build structure (5d)
5. Install structure and fixtures (2d)

Instance 3: Self-Driving Car (source: Ubuntu Forums)
1. Buy cameras, lidar, servers (3mo)
2. Make sensor frame (4h)
3. Get metalwork tools (10min)
4. Configure servers (5d)
5. Connect frame, lidar, camera (2d)

Generalized Reuse Schema
1. Acquisition (10m-3mo)
2. Get tools (10min-2d)
3. Remove old structure (1h-2h)
4. Install new structure (2h-2d)
…

Acquisition Schema
1. Purchase (1hr-3mo)
2. Find (30min)
3. Make (2h-5d)

Purchase Schema
1. Identify (1min-3mo)
2. Det. Price (30min)
…

Diagram labels: "In any order"; "or"

Process of Schema Generation
• Intake of massive amounts of open-source event data
• Detection, classification, and clustering of events from input
• Automatic learning of common schemas like Purchase Schema by generalizing from instances in big data
• Composition of complex less common schemas like Reuse Schema using common schemas as building blocks
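One small piece of the generalization step can be sketched as merging per-instance durations into a range for each step type. This is a minimal illustration under stated assumptions: the durations are taken from the three observed instances, but the assignment of each step to a type (`acquire`, `get_tools`, and so on) is done by hand here, standing in for automated event classification and clustering, and the unit conversions are simplifications. The resulting ranges are not meant to reproduce the figures on the slide.

```python
from collections import defaultdict

# Durations in hours. Treating 1d as 24h and 1mo as 30d is an assumption.
HOUR, DAY, MONTH = 1.0, 24.0, 24.0 * 30

# (step type, duration) pairs; the step-type labels are hypothetical.
INSTANCES = [
    # Instance 1 (accessible truck)
    [("get_tools", 2 * DAY), ("remove_old", 2 * HOUR),
     ("acquire", 1 * MONTH), ("install_new", 5 * HOUR)],
    # Instance 2 (camper)
    [("acquire", 1 * MONTH), ("acquire", 0.5 * HOUR),
     ("install_new", 2 * DAY)],
    # Instance 3 (self-driving car)
    [("acquire", 3 * MONTH), ("acquire", 4 * HOUR),
     ("get_tools", 10 / 60 * HOUR), ("install_new", 2 * DAY)],
]


def generalize(instances):
    """Collapse per-instance durations into one (min, max) range per step type."""
    observed = defaultdict(list)
    for instance in instances:
        for step_type, hours in instance:
            observed[step_type].append(hours)
    return {step: (min(v), max(v)) for step, v in observed.items()}


RANGES = generalize(INSTANCES)
for step, (low, high) in RANGES.items():
    print(f"{step}: {low:g}h to {high:g}h")
```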
Example
• Can we determine technological capabilities from available data, such as commercial transactions?
• Without KAIROS, we may have enough information about potentially-relevant entities and events to populate a knowledge base, but we don’t know which are important or how they fit together
• KAIROS will help us link the relevant entities, locations, events, and sub-events, and make inferences and predictions
Example – Disjoint vs. Connected Events
Discerning Event Structures
• These two seemingly unconnected events may actually be the first two steps in a more complex event
• These two individuals are shown by their travel plans possibly to be the same person
[Figure: event graphs]
Events: Truck Purchase; Travel; Truck Delivery
Entities: Special Vehicle Factory; Missile Factory; John Smith; Adam; Dr. Scott
Role and relation labels: Seller; Buyer; Location; Participant; Origin; Destination; Recipient; Employee
Example - Status of a Complex Event
New Capability Development Schema: 1. Acquisition 2. Delivery 3. Reuse 4. Testing 5. Deployment
Reuse Schema: 1. Acquisition 2. Get tools 3. Remove old structure 4. Install new structure

Previously Extracted Generalized Hierarchical Schemas
Acquisition Schema
1. Purchase
   i. Negotiate
   ii. Pay
Delivery Schema
1. Shipping
2. Pick-up
   i. Travel to
   ii. Take possession
   iii. Travel from

[Figure: New Capability Development along a Time axis]
1st Step: Truck Acquisition (Nov 2011)
2nd Step: Truck Delivery (Jan 2012)
3rd Step: Truck Reuse Modification (Mar-Dec 2012)
Sub-events: Negotiation (May-Nov 2011); Travel (3 Jan 2012)
Entities: Matthew Taylor; John Smith; Dr. Adam Scott; Special Vehicle Factory; Missile Factory
Role and relation labels: Seller; Buyer; Recipient; Participant; Employee; Origin; Destination; Location
Example – Prediction
• Discover relevant generalized event schema
• Transfer learning for new scenario adaptation

New Capability Development Schema: 1. Acquisition 2. Delivery 3. Reuse 4. Testing 5. Deployment
Testing Schema: 1. Travel 2. Load equipment 3. Execute 4. Evaluate …

[Figure: Missile Launch Capability Development along a Time axis, built on Previously Extracted Generalized Hierarchical Schemas]
1st Step: Truck Acquisition (Nov 2011)
2nd Step: Truck Delivery (Jan 2012)
3rd Step: Truck Reuse Modification (Mar-Dec 2012)
4th Step: Launcher Testing (?)
Sub-events: Negotiation (May-Nov 2011); Travel (3 Jan 2012); Travel
Predicted location: New Missile Test Site???
Analyst schema specialization and search for potential test site
Entities: Matthew Taylor; John Smith; Dr. Adam Scott; Special Vehicle Factory; Missile Factory
Role and relation labels: Seller; Buyer; Recipient; Participant; Employee; Origin; Destination; Location
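The status and prediction examples can be read as matching observed events against the steps of a schema and treating the unmatched remainder as predictions. The sketch below assumes the observed events have already been matched to schema steps (the hard part, which KAIROS is meant to automate); the `Observation` type and `status` function are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    step: str   # schema step this event was matched to (assumed given)
    label: str  # event label from the example
    when: str   # date text as shown on the slide


# Steps of the New Capability Development Schema from the example.
SCHEMA = ["Acquisition", "Delivery", "Reuse", "Testing", "Deployment"]

OBSERVED = [
    Observation("Acquisition", "Truck Acquisition", "Nov 2011"),
    Observation("Delivery", "Truck Delivery", "Jan 2012"),
    Observation("Reuse", "Truck Reuse Modification", "Mar-Dec 2012"),
]


def status(schema, observed):
    """Return (steps matched so far, steps predicted to follow)."""
    seen = {o.step for o in observed}
    matched = []
    for step in schema:
        if step not in seen:
            break  # stop at the first step with no supporting observation
        matched.append(step)
    return matched, schema[len(matched):]


matched, predicted = status(SCHEMA, OBSERVED)
print("matched:", matched)
print("predicted next:", predicted)
```

Here the next predicted step is Testing, which corresponds to the "Launcher Testing (?)" node in the example; a fuller system would also rank competing schemas and attach likelihoods to each prediction.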
New Methods for Complex Event Analysis
[Figure: two-stage pipeline]
Learning Time (Archived Data): Bottom-up Schema Learning Using 1st and 2nd Wave Methods
Run Time (New Data): Top-Down Contextual Inference & Temporal Reasoning Using New Methods
Components: Instance Extraction; Generalization; Composition; Specialization; Schema Library; Schema Application; Knowledge Base
KAIROS Architecture
[Figure: KAIROS architecture]
Input: Multi-media Multilingual Information in Big Data (Text, Speech, Images, Video)
Learning Process (TA1): Schema Generalization; One-Time Triage of Generalized Schemas; Schema Composition; Domain-Specific Schema Curation & Specialization; Curated Schema Library
Run-time System Flow (TA2, TA3): Input Analysis with Temporal Annotation; Schema Matching & Temporal Reasoning; Temporal Knowledge Base; Predictive Analysis; User Interaction
TA1 Generation of Schemas for Events
Focus
• Analyzing complex events in terms of subsidiary elements, arguments, and temporal and sequential information to compose and generalize event schemas
Inputs and Outputs
• Input: multi-media, multilingual batch data produced by TA4
• Output: Schemas representing the structure of events and their subsidiary elements, how the events evolve, and what the typical durations and orderings of subsidiary elements are
Additional Requirements
• Proposers to TA1 should have available previously-developed entity, relationship, and event extraction (detection, classification, and representation) technology
• TA1 algorithms must be able to communicate their results to a user in a human-readable form and accept changes from users
• TA1 performers will be responsible for a user interface to the TA1 platform and user staffing
• TA1 and TA2 performers are expected to provide their KAIROS software to TA3 in a Docker container or similar form
[Figure: Learning Process (TA1): Multi-media Multilingual Information in Big Data; Schema Generalization; One-Time Triage of Generalized Schemas; Schema Composition; Domain-Specific Schema Curation & Specialization; Curated Schema Library]
TA2 Representation and Use of Temporal Knowledge and Schemas
Focus
• Design and implementation of a temporal knowledge base for use during run-time using the schemas developed in TA1
Inputs and Outputs
• Input: multi-media, multilingual streaming or batch data produced by TA4
• Output: Knowledge base containing information about instantiated schemas from the run-time data, all the events and participants that relate to the schemas, and the temporal relations of the event elements and participants; predictions of possible subsequent events on the basis of the schemas
Additional Requirements
• Proposers to TA2 should have available previously-developed entity, relationship, and event extraction (detection, classification, and representation) technology
• TA2 performers are expected to provide their KAIROS software to TA3 in a Docker container or similar form
[Figure: Run-time System Flow (TA2, TA3): Text, Speech, Images, Video; Input Analysis with Temporal Annotation; Schema Matching & Temporal Reasoning; Temporal Knowledge Base; Curated Schema Library; Predictive Analysis; User Interaction]
TA3 System Integration and User Interface
Focus
• Design a platform that enables multimedia input in streaming mode or from a corpus and allows all system components to communicate with an interface that enables users to query the system and control a visualization module
Inputs and Outputs
• Input: Algorithms from TA1 and TA2 in Docker container or similar form; data in streaming mode or from a corpus
• Output: Platform able to pass data to TA1 and TA2, enable the schema library built by TA1 to be accessible to TA2, and allow interface between the users and the TA2 algorithms, enabling the users to read, edit, and/or visualize the TA2 instantiated schemas
Additional Requirements
• The TA3 interface must also present relative or absolute temporal information
• Proposers to TA3 must have the capability to handle classified data
• Developing APIs (TA1&TA2), schema format (TA1), & knowledge base format (TA2)
[Figure: Run-time System Flow (TA2, TA3): Text, Speech, Images, Video; Input Analysis with Temporal Annotation; Schema Matching & Temporal Reasoning; Temporal Knowledge Base; Curated Schema Library; Predictive Analysis; User Interaction]
TA4 Data Creation for Training and Evaluation
Focus
• Development of novel techniques for effective creation, collection, and annotation of the data necessary for KAIROS research, development, and evaluation.
Inputs and Outputs
• Input: Direction from DARPA on appropriate data selection
• Output: two types of data - one for the learning of schemas and the other for the run-time development and evaluation
Additional Requirements
• Learning Data: large volumes of open source material, including news and public posts, videos, etc.
  • English and one additional language for which NLP training data sets are available
  • Roughly 1,000,000 documents containing on the order of 100 different types of events complex enough to give rise to multi-level schemas
  • At least five different instances of each type of event, with multiple sources for each event instance when possible
  • Proposers to TA4 should propose annotation schemes
• Run-Time Data: resources selected to contain events and schema to support training and evaluation
  • English for every evaluation and one other language per evaluation
  • Five different scenarios: one scenario for development and four for evaluation
  • The annotation for the evaluation scenarios should consist of the labeling of: complex events and schemas of interest, the participants in each event, temporal information, and all properties essential for comprehension of each event and its relation to the schema(s)
Evaluation and Metrics
• NIST will conduct evaluations of TA1 and TA2
• Evaluation Schedule
  • Pilot – 9 months after kickoff
  • Baseline – end of phase 1
  • Full evaluation – mid phase 2, end of phase 2, and end of phase 3
• The TA1 schema-generation module will be evaluated for accuracy, consistency, and completeness
• The TA2 representation and use of temporal knowledge and schemas module will be evaluated to determine which schema in the schema library is related to the event being analyzed and the accuracy of all sub-events and their actors
  • During Phase 1, TA2 will use manually-constructed schemas
  • Starting in Phase 2, TA2 will use automated schemas, with reduced improvement expected
• Accuracy targets for the different evaluations of KAIROS relative to the baselines
  • Percentages represent F-value, the harmonic mean of precision and recall
• Transition partner will conduct evaluation using their own data and performance metrics

                            Phase 1   Phase 2 Mid   Phase 2 Final   Phase 3 Final
TA1 – Schema Generation     7%        23%           46%             73%
TA2 – Predictive Analysis   12%       33%           54%             79%
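For reference, the F-value named above is the harmonic mean of precision and recall. The sketch below shows the standard computation; the function names and the example counts are illustrative and are not taken from the NIST evaluation plan.

```python
def f_value(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f_from_counts(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F-value computed from raw match counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return f_value(precision, recall)


# Made-up example: 6 correct items, 2 spurious, 4 missed.
print(round(f_from_counts(6, 2, 4), 3))
```

Because the harmonic mean is dominated by the smaller of the two values, a system cannot reach a high F-value by maximizing precision or recall alone.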
Automatic Determination of Validity Periods
• Rate of validity decay is learned from data like Wikipedia, news, birth announcements, etc.
• Examples include:
  • Date of Birth (time axis: Birth to Death)
  • Married to First Spouse (time axis: Marriage to Death)
  • Employed as President (time axis: Term 1 to Term 3)
  • On Business Travel (time axis: Mon to Sun)
[Figure: for each example, a curve of the likelihood of current validity over its time axis]
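One way to picture a validity period is as a likelihood that decays with the time elapsed since a fact was observed. The sketch below uses exponential decay with a per-attribute half-life; this functional form and every half-life value are assumptions for illustration only. The slide states only that the decay rate is learned from data, and real curves need not be exponential.

```python
import math

# Hypothetical half-lives in days. In KAIROS these rates would be learned
# from data; the numbers here are placeholders, not program values.
HALF_LIFE_DAYS = {
    "date_of_birth": math.inf,            # a birth date never stops being valid
    "married_to_first_spouse": 365 * 20,
    "employed_as_president": 365 * 4,
    "on_business_travel": 2,
}


def validity(attribute: str, days_since_observation: float) -> float:
    """Likelihood that a fact observed some days ago still holds now."""
    half_life = HALF_LIFE_DAYS[attribute]
    if math.isinf(half_life):
        return 1.0
    return 0.5 ** (days_since_observation / half_life)


for name in HALF_LIFE_DAYS:
    print(name, round(validity(name, 7), 3))
```

With these placeholder values, a week-old observation of business travel is already unlikely to be current, while a week-old observation of a marriage is almost certainly still valid.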
Validity Periods and Pattern of Life Analysis
[Figure: profile attributes plotted on a timeline marked 2005, 2011, Now, and Future; labels: Point of observation, Likelihood of current validity]
Name: Dr. Adam Scott
DOB: 1 Jan. 1942
Education: Nuclear Physics
Job: Chief Engineer & Nuclear Scientist; Workers' Party Leader
Research: Manufacturing; Nuclear Physics
Location: Missile Test Site; Hometown; Research Center; ?
Schedule
[Figure: program timeline]
Month: 0 6 12 18 24 30 36 42 48 54
Phases: Phase 1; Phase 2; Phase 3
TA1: Schema Generation
TA2: Temporal KB for Predictive Analysis
TA3: System Integration & User Interface
TA4: Data Creation for Training & Evaluation
Evaluation: Pilot; Baseline; Eval 1; Eval 2; Eval 3
Deadlines
Abstracts Due (optional but helpful): January 23, 2019, 12:00 noon (ET)
Proposal Deadline: February 27, 2019, 12:00 noon (ET)
Note: the primary filter to be applied in the abstract review process is for relevance and scope. A rough assessment of technical plausibility may be additionally applied.
Miscellaneous Proposal Information
Government-Furnished Information
• For every non-English language chosen for any scenario, the Government will provide linguistic resources and tools of a quality and composition to be determined, but consisting at least of the type and size found in a LORELEI Related Language Pack
Intellectual Property
• The program will emphasize creating and leveraging open source technology and architecture. Intellectual property rights asserted by proposers are strongly encouraged to be aligned with open source regimes.
Security Clearance Requirements
• At the time of proposal submission, all proposers to TA3 must have personnel with Top Secret clearances who are eligible for SCI, access to facilities to store and process SCI material and hold SCI discussions, and the ability to conduct experiments on classified data in government facilities
Teaming
• Proposers are welcome to team up, and teaming agreements should be specified in the proposals
Non-US Entities
• Non-U.S. organizations and/or individuals may participate as a prime or a sub-contractor to the extent that such participants comply with any necessary nondisclosure agreements, security regulations, export control laws, and other governing statutes applicable under the circumstances
Miscellaneous Proposal Information (cont.)
Travel
• All proposers should expect to send appropriately-sized teams to PI meetings throughout the continental U.S. for the kick off and then every six months
• Proposers to TA3 must be prepared to travel to both CONUS and OCONUS transition partner sites
Number of Awards
• DARPA anticipates multiple awards for Technical Areas 1 and 2 and single awards for Technical Areas 3 and 4. No awards are anticipated for evaluation
Awards for Multiple TAs
• Proposals for TA1 and TA2 may be combined into a single proposal, and proposals for TA3 and TA4 may be combined into a single proposal, but no other combinations are allowed. The decision as to which proposal to consider for award is at the discretion of the Government
• While a proposer may submit proposals for all four technical areas, a particular proposer (as identified by CAGE Code), if selected for TA3 or TA4, will be dispreferred for selection for any portion of TA1 and/or TA2. This preference is intended to avoid OCI situations between the research TAs and the integration and evaluation activities, to ensure objective evaluation
How to Ask Questions
Questions Today
• Questions can be submitted until 12:00 to [email protected] or on 3x5 cards
• Questions will be answered during Q&A session in the afternoon
• Answers will be posted on the AIDA Proposers Day website
Questions in the Future
• Email questions to [email protected]
• The Frequently Asked Questions section on the KAIROS Proposers Day website will be updated as new questions come in
www.darpa.mil
Approaches to Schema Creation
Approaches (columns): No Temporal Information or Schemas; Manually-Created Schemas; Supervised Machine Learning of Schemas; KAIROS Multi-Layer Learning of Schemas
Attributes (rows): Extracts Basic Event Elements; Creates Schemas without SME Bias; Discovers Specific Schemas from Data; Deals with Uncommon Events; Adapts Dynamically
[Table: X marks indicate which attributes each approach provides]
Generalization, Composition, & Specialization
• Generalization. This will entail identifying, from a large corpus of openly-available news and other public data, schemas describing primitive and complex events. For example, we observe multiple instances of people buying sandwiches. From those instances, we generalize to produce a schema for buying a sandwich. Similarly, we could build schemas for buying books, tires, lumber, etc. for other event instances observed in the data set. From these schemas, we can generalize to produce a generic purchase transaction schema (which has a buyer, a seller, and an exchange of payment for goods or services). This purchase transaction schema will then be available for further hierarchical generalization, for example, to create a general financial transaction schema. It is expected that generalization will be primarily automated, possibly with some manual curation in which a user can triage putative generalized schemas.
• Composition. The purchase transaction schema will also be available for composition to form complex event schemas like flip house, which would be composed of a sequence of multiple schemas, including a purchase transaction schema with a house as the goods purchased, multiple purchase transactions of building materials by the buyer of the house, and a purchase transaction in which the buyer of the house in the first purchase is the seller. Composition therefore requires the use of role constraints (e.g., the buyer of the house is likely to be identical to the buyer of the building materials and the seller of the house) and temporal constraints (the buying of the house will likely take place before the buying of the building materials, which in turn must precede the selling of the house).
• Specialization. The purchase transaction schema can be specialized to include useful domain-specific knowledge about domains not observed in the original training data set. For example, if we have never observed real estate transactions in the training data set, a user can add information about the domain-specific ways in which a real estate transaction differs from other types of purchase transactions, such as adding the role of an agent, title insurance, or escrow. It is expected that specialization will be primarily accomplished through user curation, possibly with some assistance from automated suggestions for added constraints.
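The flip-house composition and the real-estate specialization can be sketched in a few lines. This is a minimal illustration, not a KAIROS schema format: the `Purchase` and `RealEstatePurchase` types, their field names, and `matches_flip_house` are hypothetical, and the role and temporal constraints are checked as hard rules here even though the text describes them as likely rather than certain.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Purchase:
    """Generic purchase transaction: buyer, seller, goods, and a date."""
    buyer: str
    seller: str
    goods: str
    when: date


@dataclass(frozen=True)
class RealEstatePurchase(Purchase):
    """Specialization: adds domain-specific roles such as agent and escrow."""
    agent: str = ""
    escrow: bool = False


def matches_flip_house(buy_house, material_buys, sell_house):
    """Check role and temporal constraints of a hypothetical flip-house schema."""
    flipper = buy_house.buyer
    # Role constraints: the same party buys the house, buys the materials,
    # and later sells the same house.
    role_ok = (
        sell_house.seller == flipper
        and sell_house.goods == buy_house.goods
        and all(p.buyer == flipper for p in material_buys)
    )
    # Temporal constraints: house purchase, then materials, then house sale.
    temporal_ok = buy_house.when < sell_house.when and all(
        buy_house.when <= p.when <= sell_house.when for p in material_buys
    )
    return bool(material_buys) and role_ok and temporal_ok


buy = RealEstatePurchase("Pat", "Owner A", "12 Elm St", date(2018, 1, 10), agent="Lee", escrow=True)
lumber = Purchase("Pat", "Lumber Yard", "lumber", date(2018, 2, 1))
sell = RealEstatePurchase("Buyer B", "Pat", "12 Elm St", date(2018, 6, 30), agent="Lee")
print(matches_flip_house(buy, [lumber], sell))
```

A probabilistic version would score each constraint instead of requiring all of them, which is closer to the "likely" wording in the composition bullet.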