Transcript
Page 1: Best Practice Reference Architecture for Data Curation

Best practice reference architecture for data standardization and curation

Dr. Michael Engels, OSTHUS

BioIT World, Boston – April 21st 2015

Slide 2

Agenda

OSTHUS – Who we are

Painpoints

Reference architecture

Use cases

Benefits

Slide 3

Who we are

Slide 4

Cutting edge in R&D

Global partner

Independent

Digital Lab Informatics

Innovation

Active network

Open collaboration

Customer orientation

Trust

Who we are

Slide 5

Who we are

Focus on value

Concepts and methodology

Approach & commitment

Slide 6

Agenda

OSTHUS – Who we are

Painpoints

Reference architecture

Use cases

Benefits

Slide 7

Life science data

Scientific data are

Valuable assets to NGOs, academia, and industry

Domain/context specific

Interpretable only by experts

Scientific data are subject to continuous change:

Growth

Formats, standards, and technology

Concept extensions

Context changes

Slide 8

Change of concepts

From a phenomenology-based concept to a gene-based concept

Pharmacology example: Ion channels taxonomy

Slide 9

Painpoints

Data standardization, data curation, master data management, data migration, …

Are complex endeavors

Are labor- and alignment-intensive

Need expert input (technical and scientific)

Are highly iterative

Are difficult to frame in time-lines or costs

How to address this challenge?

Slide 10

Agenda

OSTHUS – Who we are

Painpoints

Reference architecture

Use cases

Benefits

Slide 11

Reference architecture

[Architecture diagram; labels recovered from the slide, in extraction order]

Data migration; Manage Curation runs; Manage Results; Analysis

I; II; III; IV; …

Manage Dictionary

Data Source; Sources; Copy; Copy of target; Working area

Transformation; Glossary and Vocabulary; Property Mapping; Extraction & Loading

Data Concept; Target Data

Source Glossary; Vocabulary; Annotation Rules; Mapping Rules; Transformation Rules

Run Configuration; Data partitioning; Data Processing; Filtering

Monitoring & Audit; Logs & Observ.; Exceptions; Comments; Dashboard

Calculate Properties; Data Comparison; Visual Analytics; Tag Data; List Management

CDC; SQL to Load; Audit Trails
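
The diagram names mapping rules, transformation rules, curation runs, and exceptions, but the slide does not show how they interact. The following is a minimal Python sketch of one possible reading, not the OSTHUS implementation: a run applies an ordered list of rules to each source record and collects records that fail into an exception list for expert review. All names here (`VOCABULARY`, `CurationError`, `map_term`, `normalize_unit`, `curation_run`) and the sample records are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

# Hypothetical dictionary content: source term -> gene-based vocabulary term.
VOCABULARY = {"hERG": "KCNH2", "Kv11.1": "KCNH2", "Nav1.5": "SCN5A"}


class CurationError(ValueError):
    """Raised by a rule when a record cannot be curated automatically."""


@dataclass
class RunResult:
    target: list[dict] = field(default_factory=list)  # curated records
    exceptions: list[tuple[dict, str]] = field(default_factory=list)  # for expert review


def map_term(record: dict) -> dict:
    """Mapping rule (assumed): replace a source term with its vocabulary term."""
    term = record["channel"].strip()
    if term not in VOCABULARY:
        raise CurationError(f"unmapped term: {term}")
    return {**record, "channel": VOCABULARY[term]}


def normalize_unit(record: dict) -> dict:
    """Transformation rule (assumed): express the IC50 value in nM."""
    factors = {"nM": 1.0, "uM": 1000.0}
    if record["unit"] not in factors:
        raise CurationError(f"unknown unit: {record['unit']}")
    return {**record, "ic50": record["ic50"] * factors[record["unit"]], "unit": "nM"}


def curation_run(source: list[dict], rules: list[Callable[[dict], dict]]) -> RunResult:
    """Apply the rules in order; failing records go to the exception list."""
    result = RunResult()
    for record in source:
        try:
            curated = record
            for rule in rules:
                curated = rule(curated)
            result.target.append(curated)
        except CurationError as exc:
            result.exceptions.append((record, str(exc)))
    return result


source = [
    {"channel": "hERG", "ic50": 1.5, "unit": "uM"},
    {"channel": "Kv11.1", "ic50": 200.0, "unit": "nM"},
    {"channel": "unknown channel", "ic50": 1.0, "unit": "nM"},
]
result = curation_run(source, [map_term, normalize_unit])
print(len(result.target), "curated,", len(result.exceptions), "exceptions")
```

Keeping the rules as plain functions outside the run loop is one way to let scientific experts maintain the dictionary while technical experts maintain the run configuration. Source records are never modified, so source and target can still be compared afterwards.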

Slide 12

Agenda

OSTHUS – Who we are

Painpoints

Reference architecture

Use cases

Benefits

Slide 13

Use case 1

Chemical cartridge/structure migration

Accord Mol2000

#1: racemic

#1

Big Bang

Slide 14

Use case 2

Data integration – DWH

Continuous Growth

Slide 15

Agenda

OSTHUS – Who we are

Painpoints

Reference architecture

Use cases

Benefits

Slide 16

Benefits

Benefits are

Modular setup

All functions available within one integrated framework

Separate components for technical and scientific experts alike

Data curation as part of a process, not as individual data editing

Easy-to-use

Configurable toolbox tailored to any program

Integrated visual / comparative analysis between source and target data

Reduction of technical issues

Error propagation contained, rollbacks possible

Focus on data, not on technology
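
The point about contained error propagation and rollbacks can be illustrated with a small Python sketch. It assumes, as one possible reading of the "Copy of target" and "Working area" labels in the architecture, that a run edits a working copy and is committed only if it stays within an exception budget. The function names (`normalize_id`, `run_on_working_copy`), the `max_exceptions` parameter, and the sample records are hypothetical.

```python
import copy


def normalize_id(record: dict) -> dict:
    """Hypothetical curation step: trim and upper-case the compound ID."""
    compound_id = record["compound_id"].strip().upper()
    if not compound_id:
        raise ValueError("empty compound_id")
    return {**record, "compound_id": compound_id}


def run_on_working_copy(target, curate, max_exceptions=0):
    """Curate a working copy; return it only if the run stays within budget."""
    working = copy.deepcopy(target)  # the committed target is never edited in place
    exceptions = []
    for index, record in enumerate(working):
        try:
            working[index] = curate(record)
        except ValueError as exc:
            # The failing record stays unchanged in the working copy.
            exceptions.append((index, str(exc)))
    if len(exceptions) > max_exceptions:
        return target, exceptions  # roll back: keep the previous state
    return working, exceptions


target = [{"compound_id": " abc-1 "}, {"compound_id": "abc-2"}]
committed, problems = run_on_working_copy(target, normalize_id)
print(committed, problems)
```

Because the original list is returned untouched when a run exceeds its budget, a faulty rule cannot leak partial edits into the target.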

Slide 17

Questions?

For more information:

Visit us at Booth # 451

or at Poster # 47
