Large-Scale Case-Based Reasoning: Opportunity and Questions David Leake School of Informatics and Computing Indiana University

Large-Scale Case-Based Reasoning: Opportunity and Questions

David Leake

School of Informatics and Computing

Indiana University

Overview

• Intro to case-based reasoning
• Appeal of CBR for large-scale data
• Some challenges
• Questions for the audience

What is CBR?

• Reasoning by remembering (and analogizing and adapting…)

• Common in human planning, programming, problem-solving, diagnosis, decision-making

The CBR Cycle

From Leake, Maguitman, and Reichherzer, 2005

Motivations for Using CBR (Kolodner, 1993; Aamodt & Plaza, 1994; Leake, 1996)

• Easing knowledge acquisition, especially when cases are already available

• Reasoning when causal connections are complex or poorly understood

• Speedup from reuse
• Explainability

CBR as AI Technology

• Classic applications include force deployment planning, diagnosis, design support, help desks,…

• IU eScience example: The Phala system (Leake & Kendall-Morwick, 2008, 2009) supports workflow construction through case-based reuse of lessons from provenance traces collected by the Karma provenance collection tool (http://d2i.indiana.edu/provenance_karma; project directed by Beth Plale).


Large-Scale Challenge for Phala

• Phala’s case retrieval depends on fast structure mapping

• Structure mapping toolkit has been developed and publicly released (Structure Access Interface, Kendall-Morwick & Leake, 2011)

• Fast structure mapping remains a key issue, especially for process-oriented case-based reasoning

• Taking a step back, how does CBR fit domains with large collections of data?

The Core of CBR: Reasoning Directly from the Data

(First approximation)

• Cases are specific episodes
• Lazy learning: Learning is storage
• Don’t extract rules: Reason from similar cases
• Don’t generalize cases
• Each problem-solving episode adds a case
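The first-approximation view above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from any system discussed here: the `Case` and `CaseBase` names, the numeric feature vectors, the Euclidean distance measure, and the averaging of neighbor solutions are all assumptions made for the example.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Case:
    problem: Tuple[float, ...]  # hypothetical numeric description of the problem
    solution: float             # outcome recorded for that specific episode


@dataclass
class CaseBase:
    cases: List[Case] = field(default_factory=list)

    def retain(self, case: Case) -> None:
        # Lazy learning: "learning" is just storing the episode as-is.
        self.cases.append(case)

    def retrieve(self, query: Tuple[float, ...], k: int = 1) -> List[Case]:
        # No rules are extracted; stored cases are ranked by distance to the query.
        return sorted(self.cases, key=lambda c: math.dist(c.problem, query))[:k]

    def solve(self, query: Tuple[float, ...], k: int = 1) -> float:
        # Reuse the k most similar cases directly (assumes a non-empty case base).
        nearest = self.retrieve(query, k)
        return sum(c.solution for c in nearest) / len(nearest)


cb = CaseBase()
cb.retain(Case((1.0, 1.0), 10.0))
cb.retain(Case((2.0, 2.0), 20.0))
cb.retain(Case((8.0, 8.0), 80.0))

query = (1.5, 1.4)
answer = cb.solve(query, k=1)   # reuses the case at (1.0, 1.0)
cb.retain(Case(query, answer))  # each problem-solving episode adds a case
```

Note that `retrieve` scans and sorts the whole case base on every query, which is exactly the kind of cost that becomes an issue at large scale.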

Large-Scale CBR

• Most CBR systems are comparatively small scale

• Questions for today:
  – What are the large-scale applications that might most benefit from CBR?
  – What issues would need to be addressed to apply it?

Reasoning Directly from the Data (Second Approximation, fleshing out core issues)

• Cases are specific episodes (not necessarily pre-delineated; could be very large)

• Lazy learning: Learning is storage (+ indexing)

• Don’t extract rules: Reason from similar cases (how to find them? How to extract indices/similarity criteria? How to integrate reasoning?)

• Don’t generalize cases (adaptation)

• Each problem-solving episode adds a case (scale issues, maintenance, and case base sharing may be needed)

Scale-Up as Opportunity: Example of Potential for Big Data to Ease Case Adaptation

(Jalali & Leake, 2013)

• Problem: How to gather/generate the knowledge to adapt prior cases to new needs

• For numerical prediction, adaptations can be generated by comparing case differences

Case Difference Heuristic [Hanney & Keane, 1997]

• A knowledge-light method for adaptation acquisition
• Adaptations are generated by pairwise case comparison
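As a rough illustration of pairwise case comparison for numerical prediction, the sketch below turns each ordered pair of cases into an adaptation rule of the form (difference in problem features, difference in solutions), then adapts a source case by applying the rule whose problem difference best matches the gap between the query and the source. The function names, the tuple encoding of cases, the Euclidean matching of differences, and the toy data are assumptions for this example, not details taken from Hanney & Keane (1997).

```python
import math
from itertools import permutations


def generate_rules(cases):
    """Build one adaptation rule per ordered pair of cases.

    Each rule is (problem_difference, solution_difference).
    """
    rules = []
    for (pa, sa), (pb, sb) in permutations(cases, 2):
        problem_diff = tuple(x - y for x, y in zip(pa, pb))
        rules.append((problem_diff, sa - sb))
    return rules


def adapt(query, source, rules):
    """Adjust the source case's solution to fit the query."""
    source_problem, source_solution = source
    # The difference the adaptation has to account for.
    needed = tuple(q - s for q, s in zip(query, source_problem))
    # Pick the rule whose problem difference is closest to the needed one.
    _, best_change = min(rules, key=lambda r: math.dist(r[0], needed))
    return source_solution + best_change


# Toy case base (hypothetical data): one feature, one numeric solution.
cases = [((1.0,), 10.0), ((2.0,), 20.0), ((4.0,), 40.0)]
rules = generate_rules(cases)

# Adapt the case at 2.0 to answer a query at 3.0.
prediction = adapt((3.0,), cases[1], rules)
```

Comparing all ordered pairs produces a number of rules that grows quadratically with the case base, so which cases to compare becomes an important choice as the case base grows.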

Extending Case Adaptation with Automatically-Generated Ensembles of Adaptation Rules Vahid Jalali and David Leake

Approaches to Instance-Based Adaptation Generation and Application

• Generation: Selecting cases from which to generate adaptations

• Application: Selecting source cases to adapt
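The two selection decisions above can be made concrete with a small sketch. It is one possible arrangement under stated assumptions, not the method of Jalali & Leake: rules are generated only from the cases nearest the query, the nearest cases also serve as source cases, and each source case is adapted by averaging the few best-matching rules before the estimates are combined. The function names, the parameters `k_sources`, `k_generation`, and `n_rules`, and the toy data are all hypothetical.

```python
import math
from itertools import permutations
from statistics import mean


def nearest(cases, query, k):
    # Rank cases by Euclidean distance between their problem part and the query.
    return sorted(cases, key=lambda c: math.dist(c[0], query))[:k]


def rules_from(cases):
    # One (problem_difference, solution_difference) rule per ordered pair.
    return [
        (tuple(x - y for x, y in zip(pa, pb)), sa - sb)
        for (pa, sa), (pb, sb) in permutations(cases, 2)
    ]


def predict(cases, query, k_sources=2, k_generation=3, n_rules=2):
    # Generation: build rules only from cases in the query's neighborhood.
    rules = rules_from(nearest(cases, query, k_generation))
    estimates = []
    # Application: adapt each of the nearest source cases.
    for problem, solution in nearest(cases, query, k_sources):
        needed = tuple(q - p for q, p in zip(query, problem))
        top = sorted(rules, key=lambda r: math.dist(r[0], needed))[:n_rules]
        # Ensemble step: average the changes proposed by the best-matching rules.
        estimates.append(solution + mean(change for _, change in top))
    return mean(estimates)


# Toy case base (hypothetical data): one feature, one numeric solution.
cases = [((1.0,), 10.0), ((2.0,), 20.0), ((4.0,), 40.0), ((5.0,), 50.0)]
prediction = predict(cases, (3.0,))
```

Restricting generation to a neighborhood keeps the number of rules small, at the cost of possibly missing a useful adaptation that only distant case pairs would supply.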


Questions to Discuss

• For what large-scale tasks could CBR provide an edge?

• What are opportunities for facilitating computations underlying large-scale CBR?
