Multi-cloud and multi-data stores: The challenges behind heterogeneous data models. Marcos Almeida, Andrey Sadovykh, SOFTEAM | ModelioSoft, CLOSER’14

Multi datastores - CLOSER'14


As part of the MODAClouds and JUNIPER FP7 EU projects, we discuss our ongoing work on modelling big data stores.


Page 1: Multi datastores - CLOSER'14

www.modeliosoft.com

Multi-cloud and multi-data stores:

The challenges behind heterogeneous data models

Marcos Almeida, Andrey Sadovykh, SOFTEAM | ModelioSoft

CLOSER’14

Page 2: Multi datastores - CLOSER'14

SOFTEAM – We are a French IT services / Software vendor

•SOFTEAM, a growing company
o 20 years’ experience
o 700 experts
o Regular growth

•Specialist in OO technologies, new architectures, methodologies

•Banking, Defense, Telecom, …

[Figure: revenue by year: 17,5 ME (2005), 20 ME (2006), 23 ME (2008), 60 ME (2012). Offices: Paris, Rennes, Nantes, Sophia]

Page 3: Multi datastores - CLOSER'14

Modelio is a modelling tool for Software and Systems Engineering

•UML editor with 20 years’ history
o CloudML
o SysML
o MARTE
o Code generation
o Documentation
o Teamwork

•Available under open source at Modelio.org

Page 4: Multi datastores - CLOSER'14

Our problem? Heterogeneity

•Multi-cloud applications

•Different providers = different data stores

•Heterogeneous data models!

•Practical example
o Modelio SaaS = Modelling as a Service
o Traditional relational data
  • Users, roles, projects, services, billing…
o Challenge: How to store models?
  • Current version is SVN based
    – IaaS: easy to adapt
    – PaaS: each provider supports different “datastores”

Page 5: Multi datastores - CLOSER'14

Context: Two research projects on (multi-)clouds and big data

MODAClouds: http://www.modaclouds.eu/ - 318484

JUNIPER: http://www.juniper-project.org/ - 318763

Page 6: Multi datastores - CLOSER'14

MODAClouds: MDE to avoid vendor lock-in

•Problem:
o The main keyword: Multi clouds
o Multiplication of cloud providers
o Threats: Multiplication of Platforms, Vendor Lock-in
o Opportunities: MODACloudsML to reduce vendor lock-in

•Our role
o Case study provider: Modelio as a Service
o Technology provider: Modelling applications independently from the cloud

[Figure: three modelling levels leading to deployable artifacts.
 Service Oriented Architecture based model: Service A, Service B, Interface I (<<required>> / <<provided>>).
 Cloud specific concepts model: Service A (Deployment: PaaS), NoSQL Store, Task Queue.
 Cloud provider specific parameters model: Service A (Deployment: Google App Engine), BigTable Store, Google Task Queue.
 Deployable source code: .WAR file, scripts, configuration.]

Page 7: Multi datastores - CLOSER'14

JUNIPER: MDE for real-time applications

•Problem
o The main keyword: Big Data
  • Multiple streams of data + Multiple data types + Real-time constraints
o Current state of the art: NoSQL
  • Pros
    – Optimized for non-relational data
    – Optimized for answering simple queries as fast as possible!
  • Cons
    – The code is “the model”
    – Multiplication of NoSQL databases, paradigms and approaches

•Our role
o Technology provider: Modelling real-time big data applications

[Figure: Business Objects (UML); Big Data Structure Models (e.g. Document based Data Model); Code (Deployment scripts, Data Access code)]

Page 8: Multi datastores - CLOSER'14

The main problem is FRAGMENTATION!

•Many different database management systems
o Ex:
  • MySQL (www.mysql.com/)
  • Big Table (http://research.google.com/archive/bigtable.html)
  • SimpleDB (http://aws.amazon.com/simpledb/)
  • Memcached (http://memcached.org/)
  • …

•Many underlying data representation paradigms
o Ex:
  • Relational Databases
  • Key-value Stores
  • Object-oriented Databases
  • Big Tables
  • …

Page 9: Multi datastores - CLOSER'14

The basis of our solution is MDE… Why?

•Separating the problem from the solution
o In MODAClouds we model the problem
o In JUNIPER we model the solution

•Fostering automation
o Analysis
o Code generation

[Figure: Abstract Models (Business Objects) are mapped by transformations to Specific Models / code (HDFS, MySQL, MongoDB)]
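The abstract-to-specific flow on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `BusinessObject` class, the abstract type names (`string`, `integer`), the type mapping and both output formats are hypothetical and are not taken from MODAClouds, JUNIPER or Modelio.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class TransformationSketch {

    /** Hypothetical abstract model: one business object with typed attributes. */
    static final class BusinessObject {
        final String name;
        // attribute name -> abstract type, kept in declaration order
        final Map<String, String> attributes = new LinkedHashMap<>();

        BusinessObject(String name) { this.name = name; }

        BusinessObject attribute(String attrName, String abstractType) {
            attributes.put(attrName, abstractType);
            return this;
        }
    }

    // Assumed mapping from abstract types to SQL column types (illustrative only).
    static String sqlType(String abstractType) {
        switch (abstractType) {
            case "string":  return "VARCHAR(255)";
            case "integer": return "INTEGER";
            default: throw new IllegalArgumentException("unmapped type: " + abstractType);
        }
    }

    /** Relational target: one table per business object. */
    static String toSqlDdl(BusinessObject bo) {
        StringJoiner cols = new StringJoiner(", ", "CREATE TABLE " + bo.name + " (", ");");
        for (Map.Entry<String, String> e : bo.attributes.entrySet()) {
            cols.add(e.getKey() + " " + sqlType(e.getValue()));
        }
        return cols.toString();
    }

    /** Document target: a JSON-like template describing one document. */
    static String toDocumentTemplate(BusinessObject bo) {
        StringJoiner fields = new StringJoiner(", ", "{ \"_type\": \"" + bo.name + "\", ", " }");
        for (Map.Entry<String, String> e : bo.attributes.entrySet()) {
            fields.add("\"" + e.getKey() + "\": <" + e.getValue() + ">");
        }
        return fields.toString();
    }

    public static void main(String[] args) {
        BusinessObject user = new BusinessObject("User")
                .attribute("login", "string")
                .attribute("age", "integer");
        System.out.println(toSqlDdl(user));
        System.out.println(toDocumentTemplate(user));
    }
}
```

The model is written once and each target gets its own transformation, so adding a new data store means adding one method rather than redesigning the data.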

Page 10: Multi datastores - CLOSER'14

What do we get from MDE?

Pros
•Design data once, store everywhere!
•Write your transformation once, transform anything!

Cons
•Transformations are hard to write…
•How to make sure they are CORRECT? i.e.
– Is there any data/semantic loss?

Page 11: Multi datastores - CLOSER'14

Understanding the problem… Why is it so HARD? (1/2)

•Target Technologies based on different paradigms

•Example:

[Figure: class A associated with class B]

JPA
@Entity
public class A {
  @Basic
  public B getB(){ … }
  …
}

SQL
create table A (…)
create table B (…)
create table A_B (…)

Page 12: Multi datastores - CLOSER'14

Understanding the problem… Why is it so HARD? (2/2)

•Target structure is variable

•Example:

[Figure: the same entities A and B in two targets. ER: here A and B are independent entities. NoSQL: here, for performance reasons, B is embedded in A.]
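The difference between the two layouts can be made concrete with a small sketch. It assumes a hypothetical in-memory `CountingStore` that stands in for any key-based data store, and the key names (`A:1`, `B:1`, `b_ref`) are invented for the illustration. It only counts lookups and says nothing about a real database.

```java
import java.util.HashMap;
import java.util.Map;

final class EmbeddingSketch {

    /** Hypothetical key-based store that counts lookups so layouts can be compared. */
    static final class CountingStore {
        private final Map<String, Map<String, Object>> records = new HashMap<>();
        int lookups = 0;

        void put(String key, Map<String, Object> value) { records.put(key, value); }

        Map<String, Object> get(String key) {
            lookups++;
            return records.get(key);
        }
    }

    /** Builds a one-field record. */
    static Map<String, Object> record(String field, Object value) {
        Map<String, Object> m = new HashMap<>();
        m.put(field, value);
        return m;
    }

    /** ER-style layout: A and B are independent records; A holds B's key. */
    static Object readBNameNormalized(CountingStore store, String aKey) {
        Map<String, Object> a = store.get(aKey);
        Map<String, Object> b = store.get((String) a.get("b_ref"));
        return b.get("name");
    }

    /** Document-style layout: B is embedded inside A. */
    @SuppressWarnings("unchecked")
    static Object readBNameEmbedded(CountingStore store, String aKey) {
        Map<String, Object> a = store.get(aKey);
        Map<String, Object> b = (Map<String, Object>) a.get("b");
        return b.get("name");
    }

    public static void main(String[] args) {
        CountingStore normalized = new CountingStore();
        normalized.put("B:1", record("name", "b1"));
        normalized.put("A:1", record("b_ref", "B:1"));

        CountingStore embedded = new CountingStore();
        embedded.put("A:1", record("b", record("name", "b1")));

        System.out.println(readBNameNormalized(normalized, "A:1") + " after " + normalized.lookups + " lookups");
        System.out.println(readBNameEmbedded(embedded, "A:1") + " after " + embedded.lookups + " lookups");
    }
}
```

The embedded layout answers the query with a single lookup, but B can no longer be read or updated on its own, which is why a transformation cannot pick the target structure without knowing the expected queries.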

Page 13: Multi datastores - CLOSER'14

Before modelling we need to understand what to model!

•That’s the objective of this work!
o Existing databases
o Supported concepts
o What are the trade-offs?

Page 14: Multi datastores - CLOSER'14

How? identifying the main concepts, and related expressiveness trade offs

•Concepts

•Trade-offs
– Expressiveness:
  • What can one say, or not say, in each database?
– Performance
  • What kinds of query are usually cheaper in each database?
– What’s the cost of going from a database that supports concept A to one that supports concept B?
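One way to read the last question is as a set difference between the concepts each store supports. The sketch below assumes that reading. The two store profiles and the concept names are invented for the example and do not describe any real database or any result of this work.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class ExpressivenessSketch {

    /** Builds a sorted set of concept names. */
    static Set<String> concepts(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    /** Concepts that the source store supports but the target store does not. */
    static Set<String> lostConcepts(Set<String> source, Set<String> target) {
        Set<String> lost = new TreeSet<>(source);
        lost.removeAll(target);
        return lost;
    }

    public static void main(String[] args) {
        // Hypothetical store profiles, for illustration only.
        Set<String> storeA = concepts("entity", "attribute", "relationship", "join query");
        Set<String> storeB = concepts("entity", "attribute", "embedded entity");

        System.out.println("Lost going A -> B: " + lostConcepts(storeA, storeB));
        System.out.println("Lost going B -> A: " + lostConcepts(storeB, storeA));
    }
}
```

Each lost concept is something a transformation has to emulate in application code or give up, which is where the expressiveness and performance costs come from.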


Page 15: Multi datastores - CLOSER'14

What? identify the differences in data-models supported by different data stores


Page 16: Multi datastores - CLOSER'14

Why? to propose a cloud independent model of the application data


Example from Modelio SaaS

Page 17: Multi datastores - CLOSER'14

Why? to support mapping cloud independent data types into specific ones

•Current situation
o EXML
o HTTP
o RAMC

•Future
o NoSQL database

Page 18: Multi datastores - CLOSER'14

Conclusion

•Context: Multiplication of …
o cloud providers, cloud data stores, data representation paradigms

•If you are a developer:
o How to design my application in a cloud provider independent way?
  • Ok, this doesn’t exist…
o What do I lose or gain when going from provider A to provider B?
  • Expressiveness
  • Performance

Page 19: Multi datastores - CLOSER'14

Future Works

•MODAClouds:
o Cloud independent Data Model

•JUNIPER:
o Business Object Model
  • Targets Java 8
o Persistence Management

Page 20: Multi datastores - CLOSER'14

Thank you for your attention!

Marcos Almeida

SOFTEAM | ModelioSoft

[email protected]

SOFTEAM R&D Web Site:

http://rd.softeam.com

ModelioSoft Web Site:

http://www.modeliosoft.com


Page 21: Multi datastores - CLOSER'14

Modeling solutions.