Automating Understanding Strata Spring 2012 Final

  • 8/2/2019 Automating Understanding Strata Spring 2012 Final

    1/22

    Automated Understanding: The Next Evolution of Big Data Analytics

    Fo

    2/22

    What do Alice & the USPTO have in common?

    3/22

    What is Understanding?

    Awareness

    Reading

    Relating

    Comprehending

    Inference

    Interpretation

    Prediction

    Creation

    4/22

    UNDERSTANDING = WHAT I DO WITH MY DATA

    Big Data? What about My data?

    60-250 messages: 300Kb daily average size, 0.521 Kb/s average rate

    30-50 articles: 50Kb daily average size, 0.521 Kb/s average rate

    N/A: 0.521 Kb/s average rate

    20-60 minutes: 744Kb daily average size, 0.521 Kb/s average rate

    10-30 min: 372Kb daily average size, 0.521 Kb/s average rate

    3 - 8 c: 3350Kb, 0.521 Kb

    10 text messages: 10Kb daily average size, 0.521 Kb/s average rate

    Daily: 5 hours, 4826 Kb, 0.268 Kb/s

    It takes me 144,000 hours or 16.42 years of m

    to just keep up with the data that I'm consuming
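The daily totals on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the six per-channel sizes listed on the slide share one unit (Kb) and that "5 Hours" is the total daily consumption time. The variable names are illustrative and do not come from the talk.

```python
# Per-channel daily sizes in Kb, as listed on the slide
# (assumption: all six values use the same unit).
daily_sizes_kb = [300, 50, 744, 372, 3350, 10]

# Total daily consumption time from the slide: 5 hours, in seconds.
daily_seconds = 5 * 60 * 60

total_kb = sum(daily_sizes_kb)
average_rate = total_kb / daily_seconds

print(f"total: {total_kb} Kb per day")
print(f"average rate: {average_rate:.3f} Kb/s")
```

Under these assumptions the sum matches the slide's 4826 Kb daily total and its 0.268 Kb/s daily average rate.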

    5/22

    What do I do with all of this data?

    People Place Time

    6/22

    Our Data versus All Data

    All Data = 1,887,438,800,000 Gigabytes

    Digital Text Created = 877,189,636.24 Gigabytes

    My Data = 1.68 Gigabytes

    If your data equals approximately 2 seconds,

    all data would equal almost 22 million days, or just over 59,000 years
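The comparison on this slide appears to scale one gigabyte to one second. Here is a minimal sketch of that arithmetic, assuming exactly that mapping and a 365-day year; neither assumption is stated in the slides.

```python
ALL_DATA_GB = 1_887_438_800_000   # "All Data" figure from the slide
MY_DATA_GB = 1.68                 # "My Data" figure from the slide

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365               # assumption: calendar year, no leap days

# Assumed mapping: 1 gigabyte corresponds to 1 second.
my_seconds = MY_DATA_GB
all_days = ALL_DATA_GB / SECONDS_PER_DAY
all_years = all_days / DAYS_PER_YEAR

print(f"my data: {my_seconds} s")
print(f"all data: {all_days:,.0f} days, about {all_years:,.0f} years")
```

With this mapping, 1.68 gigabytes becomes roughly 2 seconds, and the "All Data" figure lands just under 22 million days, or a little over 59,000 years.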

    7/22

    A change must come

    8/22

    80%

    Automated Understanding: it's about the

    Awareness

    Reading

    Relating

    Comprehending

    Inference

    Interpretation

    Prediction

    Creation

    80%

    9/22

    How do you automate understanding?

    Inputs

    Unstructured

    Structured

    Social

    Multiple Languages

    Integrated Functions

    Doc Summarization

    Associative Net

    Co-Reference

    Disambiguation

    Link Analysis

    Geo Reasoning

    Temporal Reasoning

    Fact Extraction

    NLP

    Outp

    People unde space and tim

    Connections and their con

    Data fusion sources & typ

    Links back t document if n

    10/22

    Deep Dive on Automated Understanding

    Content Ingest

    Metadata Indexing

    80%

    Awareness

    Reading

    Relating

    Comprehending

    80%

    NLP (NER)

    Doc Summarization

    Fact Extraction

    Geo Reasoning

    Temporal Reasoning

    Link Analysis

    Associative Net

    Co-Reference

    Disambiguation

    11/22

    Synthesys Enterprise Synthesys

    Synthesys UI/Gadgets

    Business Partners

    Application Developers

    New Market Solutions

    REST API

    REST API

    Unstructured

    Structured

    Social

    Multiple Languages

    How do you scale automated understanding?

    12/22

    13/22

    Two Proof Cases (That we can talk about and show in public)

    14/22

    Understanding Alice: what we bu

    15/22

    Understanding Alice: what we bu

    Prediction

    Peruna

    Training

    Analyst

    Models

    Pierogi Synthesys

    Predicted Data

    Gold-standard

    Human Annotation

    Request models for live deployment

    Every tagging round creates a better model

    Creates better predictions

    Speeds up tagging

    Focuses effort on key classes of error
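One way to picture "focuses effort on key classes of error" is to compare a model's predicted tags against the gold-standard human annotations and count the mismatches per class. The sketch below is purely illustrative: the function name, the tag labels, and the sample data are hypothetical, and nothing here is taken from Synthesys or from the talk.

```python
from collections import Counter


def errors_by_class(predicted, gold):
    """Count mismatches between predicted and gold tags, keyed by the gold class."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return Counter(g for p, g in zip(predicted, gold) if p != g)


# Hypothetical tagging round: one tag per token.
gold = ["PERSON", "PLACE", "TIME", "PERSON", "PLACE", "PERSON"]
predicted = ["PERSON", "PERSON", "TIME", "PLACE", "PERSON", "PERSON"]

counts = errors_by_class(predicted, gold)

# Classes with the most errors come first, suggesting where to tag next.
for tag, n in counts.most_common():
    print(tag, n)
```

In this toy round the PLACE class has the most errors, so an annotator would review PLACE examples first in the next tagging round.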

    16/22

    Making Sense of Patents: Big Data, BIGGER D

    17/22

    Diverse domain of knowledge: all types of patents, from electrical and process to biomedical and mechanical.

    Analyzed the full text and claims of all patents after 1976:

    10,945,560 patents

    424,698 assignees

    2,522,474 inventors

    Making Sense of Patents: Big Data, BIGGER D

    18/22

    Integrated key structured data into the system (inventors, assignees, companies, etc.)

    Created and trained categories for extracting domain terms with no rules, only tagging feedback and example words (lexicons).

    Making Sense of Patents: The case for Entity Or

    similarity of

    Every Patent

    Every Word

    Every Entity

    Every Inventor

    Every Company

    Every Technology (since 1976)

    Just the Beginning

    19/22

    Just the Beginning

    https://www.synthesyscloud.com/

    a change is here now

    20/22

    Automated Understanding is the next wave of

    Analytics. It deals with your data (vs. your ma data) and how you make decisions.

    It's here now, but we've only scratched the surface of the value it can create

    It gives us hope to reclaim our lives from the a attention and the constant worry of uncertainty

    a change is here now

    21/22

    Understanding empowers

    Understanding secures

    Understanding creates H

    a change is here now

    22/22

    Questions?

    http://www.digitalreasoning.com/