MSIM 111 Session 5 (IBM Watson by Armen Pischdotchian)


  • 2015 IBM Corporation 1

    IBM Watson and Old Dominion University

    Watson from DeepQA to Deep Learning

    By: Armen Pischdotchian

  • 2015 IBM Corporation 2

    Agenda

    About cognitive systems
    The statistics behind DeepQA
    The DeepQA pipeline in detail
    From DeepQA to Deep Learning

  • 2015 IBM Corporation 3

    About Cognitive Systems

  • 2015 IBM Corporation 4

    What is common amongst cognitive systems

    The three L's:
    Language: are you leveraging an NLP stack?
    Levels: do you score or rank returned responses?
    Learning: do you employ machine learning technologies?

    Coming soon to the three L's is the fourth L:
    Limbs: robotics

  • 2015 IBM Corporation 5

    Natural Language Processing Challenges

  • 2015 IBM Corporation 6

    Deterministic vs. Probabilistic Systems

  • 2015 IBM Corporation 7

    Linear Regression
    Logistic Regression

  • 2015 IBM Corporation 8

    NLP terminology

  • 2015 IBM Corporation 9

    When recall is more important than precision

    5 relevant documents (red fish)

    5 irrelevant documents (blue fish)

    The search has retrieved 3 of the 5 relevant documents in the corpus, plus 1 irrelevant document.

    Recall = 3 / 5 = 0.6

    Precision = 3 / 4 = 0.75 (the blue fish left in the sea, i.e. the irrelevant documents that were not retrieved, are not part of either equation).

    These images are from www.lucidata.inc
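The fish arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not part of the original deck: the function name `precision_recall` and the document ids (`red1`, `blue1`, ...) are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)  # red fish in the net
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ids: 5 relevant (red) documents exist in the corpus;
# the search returned 3 of them plus 1 irrelevant (blue) document.
relevant = {"red1", "red2", "red3", "red4", "red5"}
retrieved = {"red1", "red2", "red3", "blue1"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.75 recall=0.60
```

The four blue documents that were never retrieved do not appear in either formula, which is why they are absent from the code.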

  • 2015 IBM Corporation 10

    The case of 100% recall and low precision

    5 relevant documents (red fish)

    5 irrelevant documents (blue fish)

    In Watson Discovery Advisor, this is the preferred scenario, even though there may be some irrelevant documents with a high score.

    The algorithm team will then work on increasing the precision of this system.

    What would be the preferred outcome for the Watson Engagement Advisor?

  • 2015 IBM Corporation 11

    The case of 100% precision and low recall

    5 relevant documents (red fish)

    5 irrelevant documents (blue fish)

    Zero false positives, 100% precision: no blue fish in the net

    But there are many false negatives: many red fish in the sea

    There are potentially many relevant documents that we will never consider. Perfect precision with poor recall is of no value to a DeepQA system.

    These images are from www.lucidata.inc
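The two extreme cases (100% recall with low precision, and 100% precision with low recall) can be reproduced by moving a score threshold over one ranked result list. The sketch below is illustrative only: the scores, the document ids and the helper `precision_recall_at` are assumptions, not values from Watson.

```python
# Hypothetical scored results: (document id, is_relevant, score).
scored = [
    ("red1", True, 0.95), ("red2", True, 0.90), ("blue1", False, 0.85),
    ("red3", True, 0.70), ("blue2", False, 0.60), ("red4", True, 0.40),
    ("blue3", False, 0.35), ("red5", True, 0.20), ("blue4", False, 0.10),
    ("blue5", False, 0.05),
]


def precision_recall_at(threshold, scored_docs):
    """Precision and recall when only documents scoring >= threshold are kept."""
    retrieved = [doc for doc in scored_docs if doc[2] >= threshold]
    total_relevant = sum(1 for doc in scored_docs if doc[1])
    hits = sum(1 for doc in retrieved if doc[1])
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


# Strict threshold: no blue fish in the net, but most red fish stay in the sea.
print(precision_recall_at(0.9, scored))
# Loose threshold: every red fish is caught, along with some blue fish.
print(precision_recall_at(0.2, scored))
```

With these made-up scores the strict threshold keeps only two relevant documents, while the loose threshold keeps all five relevant documents together with three irrelevant ones.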

  • 2015 IBM Corporation 12

    Precision and accuracy in Jeopardy!

  • 2015 IBM Corporation 13

    Stage 2: Hypothesis Generation
    Precision vs. percentage attempted

    Copyright 2010, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

  • 2015 IBM Corporation 14

    Search Engine vs. Question Answering System

    A QA system demands more processing from the system and less analysis from the user compared to a search engine.

  • 2015 IBM Corporation 15

    The DeepQA Pipeline

  • 2015 IBM Corporation 16

    An example Jeopardy! question

    IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE

    Question Analysis
    Keywords: 1698, comet, paramour, pink
    AnswerType(comet discoverer)
    Date(1698)
    Took(discoverer, ship)
    Called(ship, Paramour Pink)

    Related Content (Structured & Unstructured)

    Primary Search

    Candidate Answer Generation
    Wilhelm Tempel
    HMS Paramour
    Isaac Newton
    Halley's Comet
    Pink Panther
    Christiaan Huygens
    Peter Sellers
    Edmond Halley

    Evidence Retrieval

    Evidence Scoring
    [0.58 0 -1.3 0.97]
    [0.71 1 13.4 0.72]
    [0.12 0 2.0 0.40]
    [0.84 1 10.6 0.21]
    [0.33 0 6.3 0.83]
    [0.21 1 11.1 0.92]
    [0.91 0 -8.2 0.61]
    [0.91 0 -1.7 0.60]

    Models: Spatial, Temporal, Lexical, Taxonomic

    Merging & Ranking
    1) Edmond Halley (0.85)
    2) Christiaan Huygens (0.20)
    3) Peter Sellers (0.05)

  • 2015 IBM Corporation 17

    How Watson responds to a Question

    [Pipeline diagram components: Question, Question Analysis, Primary Search (Wikipedia etc.), Candidate Answer Generation, Evidence Retrieval, Answer Scoring and Contextual Answer Scoring (repeated in parallel), Final Merging and Ranking, Trained Models; output: Answer, Confidence, Evidence]

  • 2015 IBM Corporation 18

    Question Analysis (QA) Overview

    What is Question Analysis?

    Question Analysis is the first stage in the Watson pipeline
    Ultimate goal: understand what is being asked

    Various algorithms and technologies identify as much as possible about the input question:
    Named Entity Detection
    Natural Language Processing (NLP)
    Shallow and Deep Semantic Relation Detection

    All downstream components rely on the annotations produced by QA

  • 2015 IBM Corporation 19

    Stage 1: Question Analysis

    Question analysis technologies include:
    Part-of-speech parsing technology
    Named Entity Detection
    Relation Extraction
    Inverse Document Frequency (IDF)

  • 2015 IBM Corporation 20

    Stage 2: Hypothesis Generation

    [Pipeline diagram components: Question, Question Analysis, Primary Search]

  • 2015 IBM Corporation 21

    Stage 2: Hypothesis Generation, Primary Search

    Who is the 44th President of the United States?

    Keywords: 44th President United States

    [Pipeline diagram components: Question, Question Analysis, Primary Search]

  • 2015 IBM Corporation 22

    Stage 2: Hypothesis Generation, Candidate Answer Generation

    Who is the 44th President of the United States?

    Candidate answers:
    Barack Obama
    George W. Bush
    Harvard Law School
    Illinois

    [Pipeline diagram components: Question, Question Analysis, Primary Search, Candidate Answer Generation]

  • 2015 IBM Corporation 23

    Stage 3: Hypothesis Scoring

    What is Hypothesis Scoring?

    Enumeration of annotators responsible for scoring previously generated candidate answers

    The results produced by these scorers are ranked by the Merging and Ranking components to produce a ranked list of answers.

    Outcome: a confidence level for each generated hypothesis
    Scorers can produce results in any (reasonable) range
    In the final merging step, scorers are normalized according to how well their scoring heuristic correlates to the correct answer
    Normalized to [0..1] in final merging
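One simple way to bring scorers with different ranges onto [0..1] is min-max rescaling across the candidates. The sketch below shows only that rescaling step and is an assumption for illustration: the slide says Watson's normalization also depends on how well each scorer correlates with correct answers, which is not modeled here. The raw scores and the function name `min_max_normalize` are hypothetical.

```python
def min_max_normalize(raw_scores):
    """Rescale one scorer's raw outputs to [0, 1] across all candidates."""
    lowest, highest = min(raw_scores.values()), max(raw_scores.values())
    if highest == lowest:
        # Every candidate got the same score, so this scorer does not discriminate.
        return {name: 0.5 for name in raw_scores}
    span = highest - lowest
    return {name: (score - lowest) / span for name, score in raw_scores.items()}


# Hypothetical raw output of a single scorer with an arbitrary range.
raw = {"Edmond Halley": 13.4, "Christiaan Huygens": 2.0, "Peter Sellers": -1.3}

for name, value in min_max_normalize(raw).items():
    print(f"{name}: {value:.2f}")
```

After rescaling, the best candidate for this scorer maps to 1 and the worst to 0, so scorers that originally used different units become comparable before merging.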

  • 2015 IBM Corporation 24

    Hypothesis Scoring - components

    [Pipeline diagram components: Question, Question/Topic Analysis, Hypothesis Generation, Hypothesis & Evidence Scoring, Final Merging & Ranking, Trained Models; output: Answer, Confidence, Evidence]

    Hypothesis & Evidence Scoring (Hypotheses, Evidence, Features):
    Textual Alignment
    Term and nGram Matching
    Logical Form Analysis
    . . .

  • 2015 IBM Corporation 25

    AnswerIdf scorer

    Context-independent scorer

    Uses a concept referred to as Inverse Document Frequency (IDF)

    Ratio of total documents to documents containing the target text
    Target text = candidate answer text
    Large corpus (e.g., Wikipedia)
    Lucene formula
    Log scale

    Scores in range (0..inf)

    Higher score indicates more informativeness (answer text appears in few documents)

    Example
    10,000 documents
    Answer text appears in only 10 documents
    log10(10,000 / 10) = log10(1,000) = 3
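The worked example above can be written as a one-line function. This is a minimal sketch of the plain base-10 ratio used in the slide's example; it is not Lucene's exact formula, and the name `answer_idf` is made up for illustration.

```python
import math


def answer_idf(total_docs, docs_with_answer):
    """Base-10 log of (total documents / documents containing the answer text)."""
    if docs_with_answer <= 0 or total_docs < docs_with_answer:
        raise ValueError("need 0 < docs_with_answer <= total_docs")
    return math.log10(total_docs / docs_with_answer)


# The slide's example: 10,000 documents, answer text found in only 10 of them.
print(f"{answer_idf(10_000, 10):.1f}")
# prints: 3.0
```

An answer text that appears in every document scores 0, and the score grows as the text becomes rarer in the corpus.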

  • 2015 IBM Corporation 26

    Textual Alignment Answer Scorer

    Surface similarity measurement between:
    Question
    Supporting passage

    Dynamic programming for subsequence alignment

    Consider the following example:
    Who led the Allied forces on the European front during World War 2?
    Dwight D. Eisenhower was supreme commander of Allied forces during the D-Day invasion and European front during World War 2.
    --Overlap is significant

    Now, consider the example:
    In 1698, what comet discoverer took a ship called the Paramour Pink on the first purely scientific sea voyage?
    Edmund Halley made probably the first primarily scientific voyage to study the variation of the magnetic compass
    --Fewer textual overlaps, likely with lower IDF scores
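A common dynamic-programming technique for subsequence alignment is the longest common subsequence (LCS) over word tokens. The sketch below uses LCS as a stand-in for the scorer described above; the deck does not say which alignment algorithm Watson uses, so the choice of LCS, the tokenizer and the function names are all assumptions.

```python
import re


def tokens(text):
    """Lowercase word and number tokens, punctuation dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def alignment_score(question, passage):
    """Fraction of question tokens that align, in order, with the passage."""
    q, p = tokens(question), tokens(passage)
    return lcs_length(q, p) / len(q) if q else 0.0


q1 = "Who led the Allied forces on the European front during World War 2?"
p1 = ("Dwight D. Eisenhower was supreme commander of Allied forces during "
      "the D-Day invasion and European front during World War 2.")
q2 = ("In 1698, what comet discoverer took a ship called the Paramour Pink "
      "on the first purely scientific sea voyage?")
p2 = ("Edmund Halley made probably the first primarily scientific voyage "
      "to study the variation of the magnetic compass")

print(alignment_score(q1, p1) > alignment_score(q2, p2))  # prints: True
```

The Eisenhower pair shares a long ordered run of tokens, while the Halley pair shares only a handful, so the first pair scores higher under this measure.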

  • 2015 IBM Corporation 27

    Who is the 44th President of the United States?

    [Pipeline diagram components: Question, Question Analysis, Search, Scoring, Contextual Answer Scoring]

    Hypotheses:
    Barack Obama is the 44th President of the United States
    George W. Bush is the 44th President of the United States
    Harvard Law School is the 44th President of the United States
    Illinois is the 44th President of the United States

    Evidence passages:
    Barack Hussein Obama II (born August 4, 1961) is the 44th and current President of the United States.
    George Walker Bush (born July 6, 1946) is an American politician who served as the 43rd President of the United States from 2001 to 2009 and the 46th Governor of Texas from 1995 to 2000.

    Scores:
    Barack Obama .95
    George W. Bush .80
    Harvard Law School .05
    Illinois .10
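The idea on this slide, turning each candidate into a statement and checking it against retrieved passages, can be sketched as follows. This is a toy illustration under stated assumptions: the `support` measure (candidate-token coverage times keyword coverage) is invented here and is not Watson's scorer, so it does not reproduce the scores shown on the slide.

```python
import re


def tokens(text):
    """Set of lowercase word and number tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def make_hypothesis(question, candidate, focus="Who"):
    """Replace the question focus with the candidate to form a statement."""
    return question.replace(focus, candidate, 1).rstrip("?")


def support(candidate, keywords, passage):
    """Toy measure: share of candidate tokens times share of keywords in the passage."""
    p, cand, keys = tokens(passage), tokens(candidate), tokens(keywords)
    if not cand or not keys:
        return 0.0
    return (len(cand & p) / len(cand)) * (len(keys & p) / len(keys))


question = "Who is the 44th President of the United States?"
keywords = "44th President United States"
passages = [
    "Barack Hussein Obama II (born August 4, 1961) is the 44th and current "
    "President of the United States.",
    "George Walker Bush (born July 6, 1946) is an American politician who "
    "served as the 43rd President of the United States from 2001 to 2009 "
    "and the 46th Governor of Texas from 1995 to 2000.",
]
candidates = ["Barack Obama", "George W. Bush", "Harvard Law School", "Illinois"]

scores = {c: max(support(c, keywords, p) for p in passages) for c in candidates}
for c in candidates:
    print(make_hypothesis(question, c), "->", round(scores[c], 2))
```

Even this crude measure separates the candidates: only the Obama passage contains both the candidate name and "44th", while the two non-person candidates find no support in either passage.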

  • 2015 IBM Corporation 28

    Stage 4: Final Merger and Ranking

    [Pipeline diagram components: Question, Question Analysis, Primary Search (Wikipedia etc.), Candidate Answer Generation, Answer Scoring and Contextual Answer Scoring (repeated in parallel), Final Merging and Ranking, Trained Models; output: Answer, Confidence, Evidence]

  • 2015 IBM Corporation 29

    Challenge: Heterogeneous feature types and values

  • 2015 IBM Corporation 30

    Stage 4: Final Merger and Ranking, confidence scoring

    Who is the 44th President of the United States?

    [Diagram label: Evidence Retrieval]

    Candidate Answer      Answer Scoring   Contextual Answer Scoring   Confidence
    Barack Obama          0.90             0.90                        0.95
    George W. Bush        0.90             0.80                        0.65
    Harvard Law School    0.10             0.05                        0.05
    Illinois              0.15             0.10                        0.10
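Since the deck introduces logistic regression earlier and refers to trained models in the final merging stage, one plausible way to turn the two scorer columns into a single confidence is a logistic function over a weighted sum. The sketch below is an assumption for illustration: the weights and bias are invented, not trained values, so the resulting confidences differ from the slide's Confidence column even though the ranking agrees.

```python
import math


def confidence(features, weights, bias):
    """Logistic function of a weighted sum of scorer outputs."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# (Answer Scoring, Contextual Answer Scoring) per candidate, from the table above.
candidates = {
    "Barack Obama": (0.90, 0.90),
    "George W. Bush": (0.90, 0.80),
    "Harvard Law School": (0.10, 0.05),
    "Illinois": (0.15, 0.10),
}

# Hypothetical parameters; a real system would learn these from training data.
WEIGHTS = (2.0, 6.0)
BIAS = -4.0

ranked = sorted(
    candidates,
    key=lambda name: confidence(candidates[name], WEIGHTS, BIAS),
    reverse=True,
)
print(ranked)
# prints: ['Barack Obama', 'George W. Bush', 'Illinois', 'Harvard Law School']
```

Giving the contextual scorer a larger weight than the context-independent one is what separates Obama from Bush here, since both have the same Answer Scoring value.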

  • 2015 IBM Corporation 31

    Watson is Deep Learning

  • 2015 IBM Corporation 32

    University of Texas Watson university competition demo

  • 2015 IBM Corporation 33

    Watson is going Deep Learning
