26
Ariadne: Prima Facie Illya Shapoval CERN, KIPT, UNIFE, INFN-FE 2 nd LHCb Computing Workshop 4th-8th November 2013 CERN

Ariadne: Prima Facie

  • Upload
    minna

  • View
    81

  • Download
    2

Embed Size (px)

DESCRIPTION

Ariadne: Prima Facie. Illya Shapoval CERN, KIPT, UNIFE, INFN-FE. 2 nd LHCb Computing Workshop 4th-8th November 2013 CERN. Content. Objectives Approach Choice of DBMS Ariadne Use cases. Introduction. LHCb data processing implies handling of heterogeneous metadata entities - PowerPoint PPT Presentation

Citation preview

Page 1: Ariadne: Prima Facie

Ariadne:Prima Facie

Illya ShapovalCERN, KIPT, UNIFE, INFN-FE

2nd LHCb Computing Workshop4th-8th November 2013

CERN

Page 2: Ariadne: Prima Facie

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 2

Content

● Objectives● Approach● Choice of DBMS● Ariadne● Use cases

Page 3: Ariadne: Prima Facie

Introduction

● LHCb data processing implies handling of heterogeneous metadata entities– Versions of applications (in 2 dimensions)– Conditions Database states (in 2x3 dimensions)– Real data reconstruction types– MC data simulation types– Many others: trigger configurations, arch. specificators, etc.

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 3

Page 4: Ariadne: Prima Facie

Objectives: how did it start

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 4

How are entities related?

What entities are compatible?

What entities are compatible

with an application?

What entities are compatible with a

CondDB state?

Is a CondDB state consistent?

What entities are compatible with

a data processing type?

Page 5: Ariadne: Prima Facie

Requirements

● An operational space with generic way of– Expressing relationship constraints– Tracking relationships – Extracting solutions

● Ease of data management● Flexibility (to extend the area of application)

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 5

Page 6: Ariadne: Prima Facie

Approach: property graphs

● Modeling structured metadata– as nodes with attributes

(key+value)

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 6

A:3

A:1

A:2

B: “W”

R:”T”S:99

Page 7: Ariadne: Prima Facie

Approach: property graphs

● Modeling structured metadata– as nodes with attributes

(key+value)– with typed connections

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 7

A:3

A:1

A:2

B: ”W”

R:”T”S:99

…co

mpa

tible

compatiblexyz

Page 8: Ariadne: Prima Facie

Approach: property graphs

● Modeling structured metadata– as nodes with attributes

(key+value)– with typed connections

● Tracking?● Extracting?

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 8

A:3

A:1

A:2

B: ”W”

R:”T”S:99

…co

mpa

tible

compatiblexyz

Page 9: Ariadne: Prima Facie

Choice of DBMS model

Relevant characteristics Relational solution Graph solutionObject-relational impedance

mismatch problem[1]

[2] Suffers of [2] Free of

Flexibility of schema No Yes

Performance of structural queries

Poor[3] Better[3]

Scaling to data complexity Poor[4] Better[4]

Ease of data management Doable Better

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 9

[1] C. Ireland et al, A Classification of Object-Relational Impedance Mismatch, DBKDA ’09.[2] Correlated with the “Performance of structural queries” row of the table[3] C. Vicknair et al, A Comparison of a Graph Database and a Relational Database, ACM SE ’10, NY.[4] See the next slide

Page 10: Ariadne: Prima Facie

Other NoSQL DBMS models

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 10

Data complexity

Data

size Key-value dbs

Column dbs

Document dbs

Graph dbsRelational dbs

Still billions of nodes

Page 11: Ariadne: Prima Facie

Choice of graph DBMS

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 11

Taken from the “Knowledge Base of Relational and NoSQL DBMS” at: http://db-engines.com/en/ranking/graph+dbms

Neo4j

Page 12: Ariadne: Prima Facie

Neo4j customers

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 12

Page 13: Ariadne: Prima Facie

Are we alone?

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 13

Page 14: Ariadne: Prima Facie

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 14

Ariadne: system designA tracking system for relationships in LHCb metadata

(Leveled data flow diagram, shown in the Yourdon-DeMarco DFD notation)

AriadneMetadataprimitives

Topological solutions

Publish

(CL tools, web FE)

Query

(Ariadne Python API, CL tools, web FE)

Neo4j database

Context diagram

Level 0

Page 15: Ariadne: Prima Facie

Ariadne: previous security model

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 15

Neo4j Jetty

server

Admin CL tools

Neo4j admin

web interface

Public Ariadne XMLRPC server

RO

RWRW

Users’ CL tools LHCb job

?

Problem:Administration access was secured by ip-based rules, and thus was very limited and incovenient.

Page 16: Ariadne: Prima Facie

Ariadne: new security model

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 16

Neo4j Jetty

server

Admin CL tools

Neo4j admin

web interface

RO

RW

?

Apache server

with SSO IAA

Public Ariadne XMLRPC server

Users’ CL tools LHCb job

Evolution:All administration components of Ariadne integrated into CERN SSO IAA infrastructure (no LDAP!)

Page 17: Ariadne: Prima Facie

● Metadata entities that current graph contains (>500)– Applications– CondDB tags (or , and to specify partitions)– DetectorTypes(DataTypes) , RecoTypes , SimTypes– Platforms , GRID sites

● Relationships between those entities (~50k)– , , , – , ,

Ariadne: current knowledge graph

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 17

A

T TD TC TQ

R SD

P G

A T A D A R A S

D T R T S T

Page 18: Ariadne: Prima Facie

Tracking Relationships: Matching Patterns (1)● What is the full compatible set of entities for concrete real

data processing type?

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 18

A

TDTC

TQ

D

R

A

TDTC

TQ

D

R

QuerySolution

TC

TC

D

S

TD

A

Ariadne

Page 19: Ariadne: Prima Facie

Tracking Relationships: Matching Patterns (2)● What is the full compatible set of entities for concrete

application and MC data processing type?

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 19

A

TDTC

TQ

D

S

A

TDTC

TQ

D

S

QuerySolution

TC

TC

D

S

TD

A

Ariadne

Page 20: Ariadne: Prima Facie

Tracking Relationships: filtering multiple solutions (3)● What is the full compatible set of entities for concrete

application and MC data processing type?

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 20

A

TDTC

TQ

D

S

A

TDTC

TQ

D

S

Query

Set of solutions

AriadneA

TDTC

TQ

D

S

A

TDTC

TQ

D

S

Latest

…Ariadne filters multiple results according to the criterion provided

by a user.

Page 21: Ariadne: Prima Facie

Other application domains

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 21

How are entities related?

What entities are compatible?

Wha

t enti

ties

are

com

patib

le

with

an

appl

icati

on?

What entities are compatible with a

CondDB state?

Is a CondDB state consistent? W

hat e

ntitie

s ar

e co

mpa

tible

w

ith a

dat

a pr

oces

sing

type

?

…?

Page 22: Ariadne: Prima Facie

Summary

● Ariadne – a generic tracking system for relationships in LHCb metadata– Provides generic UI layer for heterogeneous metadata;– Based on the novel Neo4j graph database;– Provides powerful expressiveness when dealing with

complex data (lots of relationships);– Scalable and high performant solution for complex data.

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 22

Page 23: Ariadne: Prima Facie

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 23

Ariadne developpers:Illya Shapoval

Marco ClemencicMarco Cattaneo

Many thanks to Joel Closier :for the admin support of the

Ariadne hosting machine

Special thanks to: Regina Hunyadi

for the amazing painting “Ariadne” and authorization

to use its copy

The system was called after ancient Greek character of Ariadne, who, according to the legend, was in

charge of the Cnossian Labyrinth and assisted Theseus, with a clue of thread, to find a way back

from the labyrinth after killing the Minotaur.

Page 24: Ariadne: Prima Facie

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 24

BACKUP

Page 25: Ariadne: Prima Facie

Ariadne: system requirements(narrows down to the Neo4j’s requirements)

Minimum Recommended ActualCPU Intel Core i3 Intel Core i7 Intel(R) Xeon(R)

CPU L5640 @ 2.27GHz

Memory 2GB 16—32GB or more

48GB

Disk SATA SSD w/ SATA III SATA II

Filesystem ext4 (or similar) ext4, ZFS ext4

Software Oracle Java 7, OpenJDK 7

Oracle Java 7 Oracle Java 7

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 25

Page 26: Ariadne: Prima Facie

Compatibility of Entities: Definition

● Two entities are declared compatible if a job, that uses them simultaneously:

A. never fails because of the combination

B. is configured by the combination to work exactly in the way a developer anticipated it.

Implication:● A set of entities is declared compatible if and only if each

pair of the entities (the compatibility is tracked between) out of the set is compatible.

8-10-2013 LHCb Computing Workshop , 4th-8th November, CERN 26