Accelerating Machine Learning with Cognitive Calibration - Kalpesh Balar, Coseer

ACCELERATING MACHINE LEARNING WITH COGNITIVE CALIBRATION

[email protected]

www.coseer.com

COSEER

Web/XML Pages

Social Media

Third Party Research

Proprietary Databases

Internal Documents

Automated Workflows (Tasks/Processes/Decisions)

EXAMPLES

• Building healthcare product database from 35m documents
• Compiling 3m documents every day into actionable stock insights
• Reading through complex legal/technical documents to answer questions

TACTICAL COGNITIVE COMPUTING

About Cognitive Calibration

Implementation

Case Study

COGNITIVE CALIBRATION

Qualify data based on its value + Train differentially
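One way to read "train differentially" is to let each training sample count in proportion to the reliability of its source. The sketch below is an illustration under that assumption, not the deck's implementation: it fits a single slope by weighted least squares, and the tier names and weights in `TIER_WEIGHT` are hypothetical.

```python
# Hypothetical weights per reliability tier (assumed values, not from the deck).
TIER_WEIGHT = {"high": 1.0, "medium": 0.5, "low": 0.1}

def fit_slope(samples, weights=TIER_WEIGHT):
    """Weighted least-squares fit of y = w * x.

    samples: iterable of (x, y, tier) tuples.
    """
    num = sum(weights[t] * x * y for x, y, t in samples)
    den = sum(weights[t] * x * x for x, y, t in samples)
    return num / den

samples = [
    (1.0, 2.0, "high"),
    (2.0, 4.0, "high"),
    (1.0, 10.0, "low"),   # an outlier from an unreliable source
]

calibrated = fit_slope(samples)
uniform = fit_slope(samples, {"high": 1.0, "medium": 1.0, "low": 1.0})
print(calibrated, uniform)
```

With these numbers the calibrated fit stays close to the slope of 2 implied by the high-reliability samples, while the uniform fit is pulled toward the low-reliability outlier.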

INCREASED ACCURACY

LOWER COMPUTE INTENSITY

ABILITY TO MANAGE OUTLIERS

Sky Color (1200 PST, San Francisco)

Sky Color ?? (1200 LTST, Mars Pathfinder)

TIERED TRAINING

[Diagram labels: Train, Constrain Solution Space, Filter]

Reliability: High

Medium

Low
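The slide's labels (Train, Constrain Solution Space, Filter) suggest processing data from the most reliable tier downward. The sketch below assumes one possible flow: values accepted from more reliable tiers define what is allowed, and less reliable records that contradict them are filtered out. The function name, the record layout, and the flow itself are assumptions made for illustration.

```python
TIERS = ["high", "medium", "low"]  # processed from most to least reliable

def tiered_training_set(records):
    """records: list of (attribute, value, tier). Returns the accepted records."""
    allowed = {}   # attribute -> values accepted so far (the constrained space)
    accepted = []
    for tier in TIERS:
        batch = [(a, v) for a, v, t in records if t == tier]
        # Filter: keep a record unless a more reliable tier contradicts it.
        kept = [(a, v) for a, v in batch
                if tier == "high" or a not in allowed or v in allowed[a]]
        # Constrain: accepted values narrow what later tiers may contribute.
        for a, v in kept:
            allowed.setdefault(a, set()).add(v)
        accepted.extend((a, v, tier) for a, v in kept)
    return accepted

records = [
    ("material", "CoCr", "high"),
    ("material", "Steel", "low"),      # contradicts the high tier, filtered out
    ("coating", "Poly", "medium"),     # new attribute, accepted
    ("material", "CoCr", "medium"),    # agrees with the high tier, accepted
]
print(tiered_training_set(records))
```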

CALIBRATION OF INPUTS

Cognitive Calibration Classifier

Reliability: High

Medium

Low

METADATA CLASSIFIER

Map: m → Reliability Tier

• Examples – Source, Author, Format (image/text)
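A metadata classifier of this kind can be as small as a lookup on the document's metadata. In the sketch below the `Metadata` fields follow the slide's examples, while the source names, the tier assignments, and the rule that demotes image-only documents are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metadata:
    source: str
    author: str
    fmt: str  # "text" or "image"

# Hypothetical mapping from source type to reliability tier.
SOURCE_TIER = {"product_brochure": "high", "white_paper": "medium"}

def metadata_tier(m: Metadata) -> str:
    """Map metadata m to a reliability tier; unknown sources default to low."""
    tier = SOURCE_TIER.get(m.source, "low")
    # Assumed rule: scanned images are demoted because text extraction may err.
    if m.fmt == "image" and tier == "high":
        return "medium"
    return tier

print(metadata_tier(Metadata("product_brochure", "vendor", "text")))
```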

DETERMINISTIC CLASSIFIER

Map: f(x) → Reliability Tier

• Source order as per the context
• Completeness
• Grammatical accuracy
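As an illustration of a deterministic f(x), the sketch below scores a record only on completeness and maps the score to a tier with fixed thresholds. The required fields and the thresholds are hypothetical; source order and grammatical accuracy would be further inputs to f(x) and are not modelled here.

```python
# Hypothetical fields a complete record is expected to carry.
REQUIRED = ("category", "diameter", "length")

def completeness(record: dict) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for k in REQUIRED if record.get(k) not in (None, ""))
    return filled / len(REQUIRED)

def deterministic_tier(record: dict) -> str:
    """Fixed rule f(x) -> tier; thresholds are assumed for illustration."""
    score = completeness(record)
    if score == 1.0:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"

print(deterministic_tier({"category": "Drill Bit", "diameter": "4.5 mm"}))
```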

PROBABILISTIC CLASSIFIER

f(x) → p(R)

• Probability distribution over the tier
• f(x) can be machine-learnt independently, e.g. political correctness of statements
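The difference from the deterministic case is that the output is a distribution p(R) over tiers rather than a single tier. A minimal sketch, assuming some separately trained model already produces one score per tier: a softmax turns those scores into probabilities. The score values shown are made up.

```python
import math

TIERS = ("high", "medium", "low")

def tier_distribution(scores):
    """Softmax over per-tier scores, returning p(R) as a dict."""
    m = max(scores[t] for t in TIERS)          # subtract max for stability
    exps = {t: math.exp(scores[t] - m) for t in TIERS}
    total = sum(exps.values())
    return {t: exps[t] / total for t in TIERS}

# Hypothetical scores produced by a learnt f(x) for one document.
p = tier_distribution({"high": 2.0, "medium": 1.0, "low": 0.0})
print(p)
```

A downstream trainer could then weight the document by this distribution instead of committing to one tier.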

PROBABILISTIC CLASSIFIER

[Diagram labels: Train, Sculpt Solution Space, Probability Distribution]

Reliability: High

Medium

Low

CLOSED LOOP CLASSIFIER

F(RO) → F( f(x) → p(R) )

• Probability distribution of a Probabilistic Classifier is improved by results of the main machine
• Closed Loop Classifier Deterministic Classifier
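The deck does not say how the main machine's results improve the distribution. One plausible mechanism, used here purely as a stand-in, is a Bayesian update: treat "the main model agreed with this source" as evidence and reweight the tier probabilities. The agreement rates in `AGREE_RATE` are assumed values.

```python
# Assumed probability that a source of each tier agrees with the main model.
AGREE_RATE = {"high": 0.9, "medium": 0.6, "low": 0.3}

def closed_loop_update(prior, agreed):
    """Bayes update of p(R) after observing agreement or disagreement."""
    like = {t: (AGREE_RATE[t] if agreed else 1.0 - AGREE_RATE[t]) for t in prior}
    unnorm = {t: prior[t] * like[t] for t in prior}
    z = sum(unnorm.values())
    return {t: unnorm[t] / z for t in unnorm}

prior = {"high": 1 / 3, "medium": 1 / 3, "low": 1 / 3}
after_agree = closed_loop_update(prior, agreed=True)
after_disagree = closed_loop_update(prior, agreed=False)
print(after_agree, after_disagree)
```

Repeating the update as results accumulate is one way to picture the continuous learning loop between the classifier and the primary trainer.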

CLOSED LOOP CLASSIFIER

Probabilistic Classifier

Primary Trainer

Reliability: High

Medium

Low

Continuous Learning

PRODUCT DATABASE

Client
• One of the largest healthcare companies
• 10m+ SKUs

Problem
• No standardized database comparing attributes of products
• Previous human attempts to build the database were unsuccessful

INPUT

• Product Brochures
• White Papers
• Surgical Protocols
• Sporadic human entries from previous attempts

Drill Bit 4.5 mm Cannulated Jacobs Chuck

With 135 mm Stop 165 mm Humeral Nail-

EX Instrument 03.010.089 03.010.089

DRILLBIT 2.0 MM GUIDE WIRE 4.5MM

CANNULATED DRILL BIT JC/WITH 135MM

STOP/165MM 45mm cann BIT BIT DRILL

03.010.089* BIT DRILL CANN 4.5MM X

165MM BIT DRL CANN 4.5X135MM BIT

DRL CNLD 4.5X165MM DRILL Drill Bit 4.5

mm Cnltd Jacobs Chuck 135mm…
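The sample above writes the same dimensions in several ways ("4.5 mm", "4.5MM", "4.5X165MM"). As a rough illustration of the normalisation problem, and not the system's actual extractor, the sketch below pulls millimetre values out of such strings with two regular expressions. The function name is hypothetical.

```python
import re

# A number followed by optional whitespace and "mm", in any letter case.
MM = re.compile(r"(\d+(?:\.\d+)?)\s*mm\b", re.IGNORECASE)
# "4.5X165MM" style: two numbers joined by X that share one unit.
PAIR = re.compile(r"(\d+(?:\.\d+)?)\s*x\s*(\d+(?:\.\d+)?)\s*mm\b", re.IGNORECASE)

def millimetre_values(text):
    """Return the sorted, de-duplicated millimetre values found in text."""
    values = set()
    for a, b in PAIR.findall(text):
        values.update((float(a), float(b)))
    values.update(float(v) for v in MM.findall(text))
    return sorted(values)

print(millimetre_values("Drill Bit 4.5 mm Cannulated Jacobs Chuck With 135 mm Stop 165 mm"))
print(millimetre_values("BIT DRL CNLD 4.5X165MM DRILL"))
```

An entry such as "45mm cann" would still be read literally as 45 mm, which is the kind of erroneous human entry that extraction alone cannot resolve.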

CHALLENGES

• Erroneous data, e.g. human entries
• Different levels of detail, e.g. Metal, CoCr, CoCr46
• Extensive use of context-specific shorthand, e.g. OD = outer diameter for acetabular shells, OD = optical density for intraocular lenses
• Incomplete or missing data

MANAGING CONFLICTING DATA

• Sources tiered based on accuracy and specificity
– Product Brochures
– White Papers/Surgical Protocols
– Authenticated Web Data
– Human entries in the system
– Non-authenticated web and other data
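A tiering like this gives a simple rule for conflicts: when sources disagree on a value, prefer the best-ranked source. The sketch assumes the slide lists sources from most to least reliable; the source keys and rank numbers are hypothetical labels for those tiers.

```python
# Assumed ranking, lower is more reliable, following the order on the slide.
SOURCE_RANK = {
    "product_brochure": 0,
    "white_paper": 1,
    "surgical_protocol": 1,
    "authenticated_web": 2,
    "human_entry": 3,
    "other": 4,
}

def resolve(claims):
    """claims: list of (source, value). Return the value from the best source."""
    best = min(claims, key=lambda c: SOURCE_RANK.get(c[0], SOURCE_RANK["other"]))
    return best[1]

claims = [("human_entry", "45 mm"), ("product_brochure", "4.5 mm")]
print(resolve(claims))
```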

IDENTIFYING MATERIALS

• Confusion between the materials of the part itself and those of corresponding parts
– e.g. “Coated Tube Poly Silicone”
– Unclear if coating is Poly or Tube is Poly
• Constraining solution spaces using Deterministic Classifiers helps
– e.g. “parts use plastic tubes” → tubes cannot be silicone
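The "Coated Tube Poly Silicone" example can be pictured as a small search: enumerate every way to assign the materials to the components, then drop assignments that break a known rule. This is a sketch of that idea with a hypothetical function name and rule encoding, not the deck's classifier.

```python
from itertools import permutations

def assignments(components, materials, forbidden):
    """Return component -> material mappings that violate no forbidden pair."""
    result = []
    for perm in permutations(materials, len(components)):
        mapping = dict(zip(components, perm))
        if not any((c, m) in forbidden for c, m in mapping.items()):
            result.append(mapping)
    return result

components = ["tube", "coating"]
materials = ["Poly", "Silicone"]

# Without constraints both readings remain possible.
print(assignments(components, materials, forbidden=set()))
# Rule from reliable data: tubes cannot be silicone.
print(assignments(components, materials, forbidden={("tube", "Silicone")}))
```

With the rule applied only one reading survives, so the ambiguity is resolved before any statistical training is needed.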

FIGURING OUT SHORTHANDS

Probabilistic Classifier
• Classifies docs if other info identifies part category
• Trainer uses high reliability data to identify more classifying features

[Diagram labels: Primary Trainer, Continuous Learning; All corpus, Reliability: High, Reliability: Low, each split into “outer diameter” and “optical density”]
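To make the shorthand example concrete, the sketch below learns which words co-occur with each expansion of "OD" in high-reliability documents and then scores a new document by word overlap. The scoring rule, the +1 floor, and the toy documents are assumptions chosen for brevity, not the trainer the slide describes.

```python
from collections import Counter, defaultdict

def learn_context(high_reliability_docs):
    """docs: list of (tokens, expansion). Count words seen with each expansion."""
    counts = defaultdict(Counter)
    for tokens, expansion in high_reliability_docs:
        counts[expansion].update(tokens)
    return counts

def expand(tokens, counts):
    """Return p(expansion | tokens) from overlap scores with a +1 floor."""
    scores = {e: 1 + sum(c[t] for t in tokens) for e, c in counts.items()}
    total = sum(scores.values())
    return {e: s / total for e, s in scores.items()}

docs = [
    (["acetabular", "shell", "od"], "outer diameter"),
    (["intraocular", "lens", "od"], "optical density"),
]
counts = learn_context(docs)
p = expand(["acetabular", "shell", "od", "54mm"], counts)
print(p)
```

Documents classified with high confidence could then be fed back as further training data, which is one reading of the continuous learning arrow on the slide.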

OUTPUT

• Learnt attributes and their corresponding values for all parts
• Impossible without multiple cognitive calibration frameworks deployed in models
• Difficult to implement without other strengths in tactical cognitive computing

Category: Drill Bit
Sub-Category: Humeral Nail
Diameter: 4.5 mm
Stop: 135.0 mm
Length: 165.0 mm
Guide Wire Dia: 2.0 mm
Cannulated: Yes
Radiolucent: No
Coupling: Jacobs Chuck
Platform: Synthes (Estd.)
Reusable: Yes

THANKS

[email protected]

www.coseer.com