Multi datastores - CLOSER'14


DESCRIPTION

As part of the MODAClouds and JUNIPER FP7 EU projects, we discuss our ongoing work on modelling big data stores.

www.modeliosoft.com

Multi-cloud and multi-data stores:

The challenges behind heterogeneous data models

Marcos Almeida, Andrey Sadovykh, SOFTEAM | ModelioSoft

CLOSER’14

SOFTEAM – We are a French IT services / Software vendor

•SOFTEAM, a growing company
  o 20 years’ experience
  o 700 experts
  o Regular growth

•Specialist in OO technologies, new architectures, methodologies

•Banking, Defense, Telecom, …

[Figure: growth chart with data points 17,5 ME (2005), 20 ME (2006), 23 ME (2008), 60 ME (2012); map labels Paris, Rennes, Nantes, Sophia]

Modelio is a modelling tool for Software and Systems Engineering

•UML editor with 20 years’ history
  o CloudML
  o SysML
  o MARTE
  o Code generation
  o Documentation
  o Teamwork

•Available under open source at Modelio.org

Our problem? Heterogeneity

•Multi-cloud applications

•Different providers = different data stores

•Heterogeneous data models!

•Practical example
  o Modelio SaaS = Modelling as a Service
  o Traditional relational data
    • Users, roles, projects, services, billing…
  o Challenge: How to store models?
    • Current version is SVN based
      – IaaS: easy to adapt
      – PaaS: each provider supports different “datastores”

Context: Two projects researching (multi-)clouds and big data

MODAClouds: http://www.modaclouds.eu/ - 318484

JUNIPER: http://www.juniper-project.org/ - 318763

MODAClouds: MDE to avoid vendor lock-in

•Problem:
  o The main keyword: Multi-cloud
  o Multiplication of cloud providers
  o Threats: multiplication of platforms, vendor lock-in
  o Opportunities: MODACloudsML to reduce vendor lock-in

•Our role
  o Case study provider: Modelio as a Service
  o Technology provider: Modelling applications independently from the cloud

[Figure: three modelling levels and their output. (1) Service Oriented Architecture based model: Service A and Service B connected through Interface I (<<required>> / <<provided>>). (2) Cloud specific concepts model: Service A (Deployment: PaaS) with a NoSQL Store and a Task Queue. (3) Cloud provider specific parameters model: Service A (Deployment: Google App Engine) with a BigTable Store and a Google Task Queue. Output: deployable source code (.WAR file, scripts, configuration).]

JUNIPER: MDE for real-time applications

•Problem
  o The main keyword: Big Data
    • Multiple streams of data + Multiple data types + Real-time constraints
  o Current state of the art: NoSQL
    • Pros
      – Optimized for non-relational data
      – Optimized for answering simple queries as fast as possible!
    • Cons
      – The code is “the model”
      – Multiplication of NoSQL databases, paradigms and approaches

•Our role
  o Technology provider: Modelling real-time big data applications

[Figure: three boxes: Business Objects (UML); Big Data Structure Models (e.g. Document based Data Model); Code (Deployment scripts, Data Access code)]

The main problem is FRAGMENTATION!

•Many different database management systems
  o Ex:
    • MySQL (www.mysql.com/)
    • Big Table (http://research.google.com/archive/bigtable.html)
    • SimpleDB (http://aws.amazon.com/simpledb/)
    • Memcached (http://memcached.org/)
    • …

•Many underlying data representation paradigms
  o Ex:
    • Relational Databases
    • Key-value Stores
    • Object-oriented Databases
    • Big Tables
    • …

The basis of our solution is MDE… Why?

•Separating the problem from the solution
  o In MODAClouds we model the problem
  o In JUNIPER we model the solution

•Fostering automation
  o Analysis
  o Code generation

[Figure: Abstract Models (Business Objects) are mapped by Transformations to Specific Models / code (HDFS, MySQL, MongoDB)]
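The idea of one abstract model feeding several transformations can be sketched in plain Java. This is a minimal illustration, not the MODAClouds or JUNIPER tooling: `EntityModel`, `Transformation`, both implementations, the abstract type names `string` / `integer` and the `_type` field are all hypothetical names chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Abstract, store-independent description of one business object (hypothetical). */
final class EntityModel {
    final String name;
    final Map<String, String> attributes; // attribute name -> abstract type name

    EntityModel(String name, Map<String, String> attributes) {
        this.name = name;
        this.attributes = attributes;
    }
}

/** A transformation turns the abstract model into one store-specific artefact. */
interface Transformation {
    String transform(EntityModel entity);
}

/** Relational target: emits a CREATE TABLE statement. */
final class RelationalTransformation implements Transformation {
    @Override
    public String transform(EntityModel entity) {
        String columns = entity.attributes.entrySet().stream()
                .map(e -> e.getKey() + " " + sqlType(e.getValue()))
                .collect(Collectors.joining(", "));
        return "CREATE TABLE " + entity.name + " (" + columns + ")";
    }

    // Assumed mapping of the two illustrative abstract types to SQL types.
    private static String sqlType(String abstractType) {
        switch (abstractType) {
            case "string":  return "VARCHAR(255)";
            case "integer": return "INT";
            default: throw new IllegalArgumentException("No mapping for " + abstractType);
        }
    }
}

/** Document target: emits a JSON-like template describing one document. */
final class DocumentTransformation implements Transformation {
    @Override
    public String transform(EntityModel entity) {
        String fields = entity.attributes.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\": <" + e.getValue() + ">")
                .collect(Collectors.joining(", "));
        return "{ \"_type\": \"" + entity.name + "\", " + fields + " }";
    }
}

final class TransformationDemo {
    /** One abstract model, designed once. */
    static EntityModel project() {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("id", "integer");
        attrs.put("name", "string");
        return new EntityModel("Project", attrs);
    }

    public static void main(String[] args) {
        // The same model is handed to every transformation.
        Transformation[] targets = { new RelationalTransformation(), new DocumentTransformation() };
        for (Transformation t : targets) {
            System.out.println(t.transform(project()));
        }
    }
}
```

Adding a new data store then means adding one more `Transformation` implementation, while the abstract model stays untouched.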

What do we get from MDE?

Pros

•Design data once, store everywhere!

•Write your transformation once, transform anything!

Cons

•Transformations are hard to write…

•How to make sure they are CORRECT? i.e.
  – Is there any data/semantic loss?

Understanding the problem… Why is it so HARD? (1/2)

•Target Technologies based on different paradigms

•Example:

[Figure: classes A and B]

JPA

@Entity
public class A {
    @Basic
    public B getB() { … }
    …
}

SQL

create table A (…)
create table B (…)
create table A_B (…)

Understanding the problem… Why is it so HARD? (2/2)

•Target structure is variable

•Example:

[Figure: entities A and B, with the labels ER and NoSQL, shown in two layouts]

Here A and B are independent entities

Here, for performance reasons, B is embedded in A
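The two layouts, A and B as independent entities versus B embedded in A, can be sketched as two ways of storing the same pair of objects. This is only an illustration under stated assumptions: `EntityA`, `EntityB`, the `b_ref` field and the `A/<id>` / `B/<id>` key scheme are hypothetical, and plain maps stand in for documents in a store.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EntityB {
    final String id;
    final String label;

    EntityB(String id, String label) {
        this.id = id;
        this.label = label;
    }
}

final class EntityA {
    final String id;
    final EntityB b;

    EntityA(String id, EntityB b) {
        this.id = id;
        this.b = b;
    }
}

final class StructureVariants {
    /** Independent entities: A keeps only B's key, and B is stored under its own key. */
    static Map<String, Map<String, Object>> asIndependent(EntityA a) {
        Map<String, Object> docA = new LinkedHashMap<>();
        docA.put("id", a.id);
        docA.put("b_ref", a.b.id);

        Map<String, Object> docB = new LinkedHashMap<>();
        docB.put("id", a.b.id);
        docB.put("label", a.b.label);

        Map<String, Map<String, Object>> stored = new LinkedHashMap<>();
        stored.put("A/" + a.id, docA);
        stored.put("B/" + a.b.id, docB);
        return stored;
    }

    /** Embedded: B is nested inside A, so a single lookup returns both. */
    static Map<String, Map<String, Object>> asEmbedded(EntityA a) {
        Map<String, Object> docB = new LinkedHashMap<>();
        docB.put("id", a.b.id);
        docB.put("label", a.b.label);

        Map<String, Object> docA = new LinkedHashMap<>();
        docA.put("id", a.id);
        docA.put("b", docB);

        Map<String, Map<String, Object>> stored = new LinkedHashMap<>();
        stored.put("A/" + a.id, docA);
        return stored;
    }

    public static void main(String[] args) {
        EntityA a = new EntityA("a1", new EntityB("b1", "first"));
        System.out.println("independent entries: " + asIndependent(a).size());
        System.out.println("embedded entries: " + asEmbedded(a).size());
    }
}
```

The independent layout needs a second lookup to reach B from A but lets B be shared and updated in one place. The embedded layout reads both at once, at the price of duplicating B wherever it is embedded.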

Before modelling we need to understand what to model!

•That’s the objective of this work!
  o Existing databases
  o Supported concepts
  o What are the trade-offs?

How? Identifying the main concepts and the related expressiveness trade-offs

•Concepts

•Trade-offs
  – Expressiveness:
    • What can one “say”, or not, in each database?
  – Performance:
    • What kinds of query are usually cheaper in each database?
  – What’s the cost of going from a database that supports concept A to one that supports concept B?
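One way to make the expressiveness question concrete is to record, per data store, the set of concepts it supports and compare the sets. The sketch below assumes a hypothetical `Concept` list and two invented profiles, `StoreA` and `StoreB`; it does not describe any real product and is not the classification produced by this work.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Set;

/** Illustrative data-model concepts; the real list is what the study aims to identify. */
enum Concept { TYPED_ATTRIBUTES, REFERENCES, EMBEDDED_OBJECTS, JOINS, TRANSACTIONS }

/** The set of concepts one (hypothetical) data store can express. */
final class StoreProfile {
    final String name;
    final Set<Concept> supported;

    StoreProfile(String name, Concept... concepts) {
        this.name = name;
        this.supported = EnumSet.noneOf(Concept.class);
        this.supported.addAll(Arrays.asList(concepts));
    }

    /** Concepts this store can express but the target cannot. */
    Set<Concept> lostWhenMovingTo(StoreProfile target) {
        Set<Concept> lost = EnumSet.noneOf(Concept.class);
        lost.addAll(supported);
        lost.removeAll(target.supported);
        return lost;
    }

    /** Concepts the target can express but this store cannot. */
    Set<Concept> gainedWhenMovingTo(StoreProfile target) {
        return target.lostWhenMovingTo(this);
    }
}

final class ExpressivenessDemo {
    public static void main(String[] args) {
        // Both profiles are made up for the example.
        StoreProfile a = new StoreProfile("StoreA",
                Concept.TYPED_ATTRIBUTES, Concept.REFERENCES, Concept.JOINS);
        StoreProfile b = new StoreProfile("StoreB",
                Concept.TYPED_ATTRIBUTES, Concept.EMBEDDED_OBJECTS);

        System.out.println("lost:   " + a.lostWhenMovingTo(b));
        System.out.println("gained: " + a.gainedWhenMovingTo(b));
    }
}
```

A set difference only captures expressiveness. The performance trade-off would need extra information, for example a per-concept cost estimate for each store.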


What? identify the differences in data-models supported by different data stores


Why? to propose a cloud independent model of the application data


Example from Modelio SaaS

Why? to support mapping cloud independent data types into specific ones

•Current situation
  o EXML
  o HTTP
  o RAMC

•Future
  o NoSQL database
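Mapping cloud-independent data types into store-specific ones can be pictured as a lookup table per target, with unmapped types reported instead of silently dropped. The sketch below is an assumption-laden illustration: `AbstractType`, `TypeMapping` and the chosen SQL type names are picked for the example and are not the type system defined in MODAClouds.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical cloud-independent data types. */
enum AbstractType { STRING, INTEGER, BOOLEAN, DATE, BINARY }

/** Lookup table from abstract types to the type names of one target store. */
final class TypeMapping {
    final String target;
    private final Map<AbstractType, String> mapping = new EnumMap<>(AbstractType.class);

    TypeMapping(String target) {
        this.target = target;
    }

    /** Registers one mapping and returns this object so calls can be chained. */
    TypeMapping map(AbstractType from, String to) {
        mapping.put(from, to);
        return this;
    }

    /** Returns the target type, or an empty Optional when the target has no equivalent. */
    Optional<String> resolve(AbstractType type) {
        return Optional.ofNullable(mapping.get(type));
    }

    /** Example mapping to standard SQL type names; BINARY is left unmapped on purpose. */
    static TypeMapping relationalExample() {
        return new TypeMapping("relational (SQL)")
                .map(AbstractType.STRING, "VARCHAR(255)")
                .map(AbstractType.INTEGER, "INTEGER")
                .map(AbstractType.BOOLEAN, "BOOLEAN")
                .map(AbstractType.DATE, "DATE");
    }

    public static void main(String[] args) {
        TypeMapping sql = relationalExample();
        for (AbstractType t : AbstractType.values()) {
            // An unmapped type is made visible rather than skipped.
            System.out.println(t + " -> " + sql.resolve(t).orElse("(no equivalent in " + sql.target + ")"));
        }
    }
}
```

Returning `Optional` forces the caller to decide what to do with a type the target cannot represent, which is exactly where data or semantic loss would otherwise go unnoticed.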

Conclusion

•Context: Multiplication of …
  o cloud providers, cloud data stores, data representation paradigms

•If you are a developer:
  o How to design my application in a cloud provider independent way?
    • Ok, this doesn’t exist…
  o What do I lose or gain when going from provider A to provider B?
    • Expressiveness
    • Performance

Future Works

•MODAClouds:
  o Cloud independent Data Model

•JUNIPER:
  o Business Object Model
    • Targets Java 8
  o Persistence Management

Thank you for your attention!

Marcos Almeida

SOFTEAM | ModelioSoft

marcos.almeida@softeam.fr

SOFTEAM R&D Web Site:

http://rd.softeam.com

ModelioSoft Web Site:

http://www.modeliosoft.com
