Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

DESCRIPTION

In Memory Data Grids in Action with Oracle Coherence, presented to NoSQL users. The "transactions" chapter is missing because it has been rescheduled to another session.

In Memory Data Grid in Action with Oracle Coherence for Paris NoSQL User Group

Cyrille Le Clerc

Transactions chapter will be presented during another session

Wednesday, May 25, 2011

Speaker

Cyrille Le Clerc

@cyrilleleclerc

blog.xebia.fr

Open Source (Apache CXF, ...)

In Memory Data Grid

Large Scale

“you build it, you run it”

Once upon a time...

On the financial side

Needs within the financial market:

• Very low latency

• Rich queries & transactions

• Scalability

• Data consistency

- Coherence: released in 2001, started as a distributed cache

- Gigaspaces XAP: released in 2001, started as a data grid

Let’s define an In Memory Data Grid ...

eXtreme Scale

This is an In Memory Data Grid

This is Network Attached Memory

Similarities with document-oriented NoSQL: partitioned, distributed hashtable, schema-less, value is not opaque, scale-out scalability

Very fast: in memory (persistence coming), business logic inside the data

Consistent and available: transactional, redundant

Written in Java, with data as POJOs: not necessarily

Clients in Java, Microsoft, etc.

Use cases for this presentation

Train Booking System

Trains, stations, seats, bookings and passengers

eCommerce Web Site

Warehouse stocks, for example:

{ "name": "Barbie Computer", "stock": 637, "weight": 200 }

Customers' shopping carts, for example:

canon-eos: 1, ipod: 1, headphone: 1, iphone: 1, ...
ipad: 1, iphone: 1
barbie: 1, iphone: 1, cabbage-doll: 1

In Memory Data Grids Key Principles

Store Everything in a Mainframe!

IBM z11 (http://ibm.com/): 3 TB of RAM, 80 x 5.2 GHz cores, much more than $1,000,000

Spread on Inexpensive Servers

Mainframe (http://ibm.com/) -> cheap servers! (http://1userverrack.net/)

Partition Data

Mainframe -> small servers

Partition alpha, Partition beta, Partition gamma

Partition for scalability
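
As a rough illustration of partitioning for scalability, here is a minimal sketch in which each partition is a plain in-process map standing in for a small server. The class name `PartitionedStore` and the hash-modulo routing are assumptions made for the example; the slides do not say how Coherence, Gigaspaces or eXtreme Scale actually assign keys to partitions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy "grid" (hypothetical): each partition is a plain map standing in for a server. */
class PartitionedStore {
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    /** Maps a key to a partition index; floorMod keeps the result non-negative. */
    int partitionOf(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    void put(String key, String value) {
        partitions.get(partitionOf(key)).put(key, value);
    }

    String get(String key) {
        return partitions.get(partitionOf(key)).get(key);
    }

    int sizeOfPartition(int index) {
        return partitions.get(index).size();
    }

    public static void main(String[] args) {
        PartitionedStore store = new PartitionedStore(3); // alpha, beta, gamma
        store.put("tgv-3071-20110512", "Paris -> Marseille");
        System.out.println("partition: " + store.partitionOf("tgv-3071-20110512"));
        System.out.println("value: " + store.get("tgv-3071-20110512"));
    }
}
```

Adding servers means adding partitions; a real grid also has to move data when the partition count or placement changes, which this sketch ignores.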

Duplicate Data

Duplicate data for high availability

Partition alpha: Master and Standby Backup, kept in sync by synchronous replication

Data Access Patterns

Stored Procedures Style: can apply very complex business logic inside the data

This is not traditional Java EE coding style: change management challenge!

Pattern : Targeted Operation

Book Train Tickets: the operation targets a single entry, located by its key

{ "train-id": "tgv-3071-20110512", "time": "2011/05/12 12:15", "departure": "Paris", "arrival": "Marseille", "seats": 3 }

Partitions alpha, beta and gamma each host “Search Trains”

“train-id” is indexed
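
A targeted operation can be sketched as shipping a function to the one partition that owns the key, so the booking logic runs next to the data. The code below is a toy in-process model, not the Coherence or eXtreme Scale API: `TargetedOperationSketch`, `invoke` and `book` are hypothetical names, and the hash-based routing is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy grid (hypothetical): each map stands in for one partition. */
class TargetedOperationSketch {
    static class Train {
        final String id;
        int freeSeats;

        Train(String id, int freeSeats) {
            this.id = id;
            this.freeSeats = freeSeats;
        }
    }

    private final List<Map<String, Train>> partitions = new ArrayList<>();

    TargetedOperationSketch(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    private Map<String, Train> ownerOf(String trainId) {
        return partitions.get(Math.floorMod(trainId.hashCode(), partitions.size()));
    }

    void add(Train train) {
        ownerOf(train.id).put(train.id, train);
    }

    /** Ships the operation to the single partition owning the key and runs it there. */
    <R> R invoke(String trainId, Function<Train, R> operation) {
        Map<String, Train> owner = ownerOf(trainId);
        synchronized (owner) { // one operation at a time per partition
            return operation.apply(owner.get(trainId));
        }
    }

    /** "Book Train Tickets": succeeds only if enough seats are still free. */
    boolean book(String trainId, int seats) {
        return this.<Boolean>invoke(trainId, train -> {
            if (train == null || train.freeSeats < seats) {
                return false;
            }
            train.freeSeats -= seats;
            return true;
        });
    }

    public static void main(String[] args) {
        TargetedOperationSketch grid = new TargetedOperationSketch(3);
        grid.add(new Train("tgv-3071-20110512", 3));
        System.out.println(grid.book("tgv-3071-20110512", 3)); // true
        System.out.println(grid.book("tgv-3071-20110512", 1)); // false: no seat left
    }
}
```

Only one partition is touched per booking, which is why this pattern scales with the number of partitions.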

Pattern : Map Reduce Style Operation

Distributed “Search Train Ticket”

1. The query is sent to every partition (alpha, beta, gamma), each running “Search Trains”:

{ "departure": "Paris", "arrival": "Marseille", "time": "2011/05/12 12:00", "seats": 3 }

2. Each partition returns its partial result:

{ "Paris -> Marseille : 12:15", "Paris -> Marseille : 13:15" }

{ #NONE# }

{ "Paris -> Lyon -> Marseille : 12:40" }

3. The partial results are merged into the final answer:

{ "Paris -> Marseille : 12:15", "Paris -> Lyon -> Marseille : 12:40", "Paris -> Marseille : 13:15" }
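
The scatter and gather steps of the distributed “Search Train Ticket” can be sketched as follows. This is a toy model in plain Java: the partitions are in-memory lists, and the `Itinerary` fields and the assignment of itineraries to partitions are assumptions for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class MapReduceSearchSketch {
    static class Itinerary {
        final String route;     // for example "Paris -> Lyon -> Marseille"
        final String departure;
        final String arrival;
        final String time;      // "HH:mm", so string order is time order

        Itinerary(String route, String departure, String arrival, String time) {
            this.route = route;
            this.departure = departure;
            this.arrival = arrival;
            this.time = time;
        }

        @Override
        public String toString() {
            return route + " : " + time;
        }
    }

    /** Map step: runs inside one partition and scans only its local data. */
    static List<Itinerary> searchLocally(List<Itinerary> partition,
                                         String from, String to, String notBefore) {
        return partition.stream()
                .filter(i -> i.departure.equals(from) && i.arrival.equals(to))
                .filter(i -> i.time.compareTo(notBefore) >= 0)
                .collect(Collectors.toList());
    }

    /** Reduce step: merges the partial results and orders them by departure time. */
    static List<Itinerary> search(List<List<Itinerary>> partitions,
                                  String from, String to, String notBefore) {
        return partitions.parallelStream()
                .flatMap(p -> searchLocally(p, from, to, notBefore).stream())
                .sorted(Comparator.comparing((Itinerary i) -> i.time))
                .collect(Collectors.toList());
    }

    /** Three partitions; which itinerary lives where is an assumption. */
    static List<List<Itinerary>> samplePartitions() {
        return Arrays.asList(
                Arrays.asList(
                        new Itinerary("Paris -> Marseille", "Paris", "Marseille", "12:15"),
                        new Itinerary("Paris -> Marseille", "Paris", "Marseille", "13:15")),
                Collections.<Itinerary>emptyList(),
                Arrays.asList(
                        new Itinerary("Paris -> Lyon -> Marseille", "Paris", "Marseille", "12:40")));
    }

    public static void main(String[] args) {
        System.out.println(search(samplePartitions(), "Paris", "Marseille", "12:00"));
    }
}
```

Every partition is visited for every query, so the cost grows with the total data size unless an index narrows the local search.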

Data Access Patterns

This is not traditional Java EE coding style: change management

Don’t forget “Map Reduce” = “Distributed Table Scan”: use indexes

CAP Theorem & In Memory Data Grids

Consistency, Availability, Partition Tolerance: only 2 of these 3 properties can be achieved at any given moment in time (Brewer’s Conjecture)

http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf

Data Grids are then placed on the Consistency / Availability / Partition Tolerance triangle.

Cross Data Center Data Consistency

Worldwide replication for the financial market: Tokyo, New York, London

Warehouse stocks replicated between West Coast and East Coast:

1. Both data centers hold the same entry:

{ "name": "Barbie Computer", "stock": 147, "weight": 200 }

2. One data center sets the stock to 146; the other still holds 147 until the update arrives: propagation delay!

3. Meanwhile the other data center sets the weight to 175 on its own copy: a reconciliation API is needed!

4. Network partitioning: the two data centers diverge for as long as they cannot reach each other.
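
One possible shape for such a reconciliation step is a field-level three-way merge: compare each data center's copy with the last common version, keep whichever side changed a field, and flag the case where both sides changed the same field differently. This is only a sketch of one strategy; the slides do not describe the reconciliation API of any product, and `ReconciliationSketch`, `merge` and `ConflictException` are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class ReconciliationSketch {
    /** Thrown when both sites changed the same field to different values. */
    static class ConflictException extends RuntimeException {
        ConflictException(String field) {
            super("Conflicting updates on field: " + field);
        }
    }

    /** Three-way merge of two copies against their last common version. */
    static Map<String, Object> merge(Map<String, Object> base,
                                     Map<String, Object> west,
                                     Map<String, Object> east) {
        Set<String> fields = new HashSet<>(base.keySet());
        fields.addAll(west.keySet());
        fields.addAll(east.keySet());

        Map<String, Object> merged = new HashMap<>();
        for (String field : fields) {
            Object b = base.get(field);
            Object w = west.get(field);
            Object e = east.get(field);
            boolean westChanged = !Objects.equals(b, w);
            boolean eastChanged = !Objects.equals(b, e);
            if (westChanged && eastChanged && !Objects.equals(w, e)) {
                throw new ConflictException(field);
            }
            Object value = westChanged ? w : e; // if neither changed, e equals the base value
            if (value != null) {
                merged.put(field, value);
            }
        }
        return merged;
    }

    /** Helper building the example entry with the given stock and weight. */
    static Map<String, Object> barbie(int stock, int weight) {
        Map<String, Object> entry = new HashMap<>();
        entry.put("name", "Barbie Computer");
        entry.put("stock", stock);
        entry.put("weight", weight);
        return entry;
    }

    public static void main(String[] args) {
        Map<String, Object> merged = merge(barbie(147, 200), barbie(146, 200), barbie(147, 175));
        System.out.println(merged.get("stock") + " / " + merged.get("weight")); // 146 / 175
    }
}
```

Independent changes (stock on one side, weight on the other) merge cleanly; two different stock values would raise a conflict that business rules have to settle.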

Data Modeling

Dominant Question Driven Design: opposite to relational, which is Domain Driven Design

Constrained Tree Schema (CTS): because RPC matters

Denormalized: due to dominant questions and CTS

Typical relational data model:

Train (code, type)
TrainStop (date)
TrainStation (code, name)
Seat (number, price)
Booking (reduction)
Passenger (name)

Find the root entity and denormalize

Root entity: Train (code, type)

Partitioning-ready entities tree: TrainStop (date), Seat (number, price), Booking (reduction), Passenger (name)

Reference data, duplicated in each grid node: TrainStation (code, name)

Remove unused data

Partitioned: Train (code, type), TrainStop (date), Seat (number, price), with Booking (reduction) and Passenger (name) reduced to a “booked” flag

Replicated: TrainStation (code, name)

Data Grid Ready data structure

Partitioned: Train (code, type), TrainStop (date), Seat (number, price, booked)

Replicated: TrainStation (code, name)
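
In Java, the data grid ready structure could look roughly like the sketch below: `Train` is the root entity that carries its stops and seats with it, while `TrainStation` is separate reference data. Linking a stop to its station by code, and the field types, are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class DataGridReadyModel {
    /** Reference data: replicated on every grid node. */
    static class TrainStation {
        final String code;
        final String name;

        TrainStation(String code, String name) {
            this.code = code;
            this.name = name;
        }
    }

    /** Child of Train; points at the replicated station by its code (assumption). */
    static class TrainStop {
        final String stationCode;
        final String date;

        TrainStop(String stationCode, String date) {
            this.stationCode = stationCode;
            this.date = date;
        }
    }

    /** Booking and Passenger are reduced to the "booked" flag. */
    static class Seat {
        final int number;
        final double price;
        boolean booked;

        Seat(int number, double price) {
            this.number = number;
            this.price = price;
        }
    }

    /** Root entity: partitioned by code, children always live in the same partition. */
    static class Train {
        final String code;
        final String type;
        final List<TrainStop> stops = new ArrayList<>();
        final List<Seat> seats = new ArrayList<>();

        Train(String code, String type) {
            this.code = code;
            this.type = type;
        }

        long freeSeats() {
            return seats.stream().filter(seat -> !seat.booked).count();
        }
    }

    public static void main(String[] args) {
        Train train = new Train("tgv-3071-20110512", "HIGH_SPEED");
        train.stops.add(new TrainStop("PAR", "2011/05/12 12:15"));
        train.seats.add(new Seat(1, 45.0));
        train.seats.add(new Seat(2, 45.0));
        train.seats.get(0).booked = true;
        System.out.println(train.freeSeats()); // 1
    }
}
```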

Data Modeling is Hard !

Two root entities for the same MoneyTransfer!

Account (number) -> CashWithdrawal (date, amount)

Account (number) -> CashWithdrawal (date, amount)

MoneyTransfer (id, date, amount): from one Account, to the other Account

Split MoneyTransfer

Account (number) -> CashWithdrawal (date, amount), MoneyTransferOut (id, date, amount)

Account (number) -> CashWithdrawal (date, amount), MoneyTransferIn (id, date, amount)

Data Grid Ready data structure

Account (number) -> CashWithdrawal (date, amount), MoneyTransferOut (id, date, amount), MoneyTransferIn (id, date, amount)

Grid Internals

Data Serialization

Used for data transfer and byte-oriented storage

Hot topic, like Apache Thrift, Apache Avro, Google Protocol Buffers

Must support evolvable data structures
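
To make "evolvable" concrete, the sketch below shows one common approach: a node that only knows version 1 of the format reads the fields it understands and carries the remaining bytes along untouched, so data written by a newer node survives a round trip. It uses plain `java.io` streams; the wire layout and the names `EvolvableSerializationSketch`, `TrainV1` and `futureData` are assumptions, not the actual format of any of the libraries listed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class EvolvableSerializationSketch {
    /** What a version 1 node understands, plus the bytes it must preserve. */
    static class TrainV1 {
        int dataVersion;
        String code;
        String name;
        byte[] futureData = new byte[0];
    }

    /** Version 2 writer: appends "type" after the version 1 fields. */
    static byte[] writeV2(String code, String name, String type) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(2);     // data version
            out.writeUTF(code);  // known since version 1
            out.writeUTF(name);  // known since version 1
            out.writeUTF(type);  // added in version 2
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Version 1 reader: keeps whatever follows the fields it knows. */
    static TrainV1 readAsV1(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            TrainV1 train = new TrainV1();
            train.dataVersion = in.readInt();
            train.code = in.readUTF();
            train.name = in.readUTF();
            train.futureData = new byte[in.available()];
            in.readFully(train.futureData);
            return train;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Version 1 writer: re-emits the unknown bytes so newer fields are not lost. */
    static byte[] write(TrainV1 train) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(train.dataVersion);
            out.writeUTF(train.code);
            out.writeUTF(train.name);
            out.write(train.futureData);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] v2 = writeV2("tgv-3071", "Paris - Marseille", "HIGH_SPEED");
        TrainV1 seenByOldNode = readAsV1(v2);
        System.out.println(Arrays.equals(v2, write(seenByOldNode))); // true
    }
}
```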

Data Storage

Store Java Beans in the grid:
- No need to unmarshall for in-process operations
- Beware of the garbage collector!

Store byte arrays in the grid:
- Pay for unmarshalling at each read and write
- Slightly more garbage collector friendly
- Low-level / byte-oriented APIs to read data

Communication Protocols

UDP multicast (Coherence, Gigaspaces)

TCP/IP (Websphere eXtreme Scale)

Topology

Partitions are made of shards (1 primary + 0..* backups)

Dynamic shard location (changes at runtime and at restart)

Can use dedicated “directory servers” or embed the directory in the “data nodes”

JVM and Memory

Many vendors recommend tiny 1.4 GB JVMs!

More than ten JVMs per server

Garbage collector hell

Management hell

More and more IMDGs support large heaps

APIs

Raw Java Mapping with Oracle Coherence

Data structure: Train (code, type), TrainStop (date), Seat (number, price, booked)

Hand-coded serialization: JUnit is your friend!

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import com.tangosol.io.AbstractEvolvable;
    import com.tangosol.io.pof.PofReader;
    import com.tangosol.io.pof.PofWriter;
    import com.tangosol.io.pof.PortableObject;

    public class Train extends AbstractEvolvable implements PortableObject {
        enum Type { HIGH_SPEED, NORMAL }

        /** Key of the Cache */
        String code;

        /** Indexed */
        String name;

        Type type;

        List<Seat> seats = new ArrayList<Seat>();

        int version;

        List<TrainStop> trainStops = new ArrayList<TrainStop>();

        @Override
        public int getImplVersion() {
            return 1;
        }

        // Property indexes (0..5) must match between readExternal and writeExternal
        @Override
        public void readExternal(PofReader pofReader) throws IOException {
            this.code = pofReader.readString(0);
            this.name = pofReader.readString(1);
            this.type = (Type) pofReader.readObject(2);
            pofReader.readCollection(3, this.seats);
            pofReader.readCollection(4, this.trainStops);
            this.version = pofReader.readInt(5);
        }

        @Override
        public void writeExternal(PofWriter pofWriter) throws IOException {
            pofWriter.writeString(0, this.code);
            pofWriter.writeString(1, this.name);
            pofWriter.writeObject(2, this.type);
            pofWriter.writeCollection(3, this.seats, Seat.class);
            pofWriter.writeCollection(4, this.trainStops, TrainStop.class);
            pofWriter.writeInt(5, this.version);
        }
    }

JPA Style Mapping with Websphere eXtreme Scale

Data structure: Train (code, type), TrainStop (date), Seat (number, price, booked)

Sub-entities can have cross relations

    @Entity(schemaRoot=true)
    public class Train {
        @Id
        String code;

        @Index @Basic
        String name;

        @OneToMany(cascade=CascadeType.ALL)
        List<Seat> seats = new ArrayList<Seat>();

        @Version
        int version;

        ...
    }

Map API with Oracle Coherence

    NamedCache trainCache = CacheFactory.getCache("train-cache");

    /** Save */
    void persist(Train train) {
        trainCache.put(train.getCode(), train);
    }

    /** Find by key */
    Train findByCode(String code) {
        return (Train) trainCache.get(code);
    }

    /** Find by Query Language */
    Train findByTrainName(String name) {
        Filter filter = QueryHelper.createFilter("name = :name",
                Collections.singletonMap("name", name));
        Set<Map.Entry<String, Train>> trainEntrySet = trainCache.entrySet(filter);
        if (trainEntrySet.isEmpty()) {
            return null;
        } else {
            return trainEntrySet.iterator().next().getValue();
        }
    }

JPA Style with Websphere eXtreme Scale

JPA Style Entity Manager

    /** Save */
    void persist(Train train) {
        entityManager.persist(train);
    }

    /** Find by key */
    Train findByCode(String code) {
        return (Train) entityManager.find(Train.class, code);
    }

    /** Query Language */
    Train findByTrainName(String name) {
        Query q = entityManager.createQuery("select t from Train t where t.name=:name");
        q.setParameter("name", name);
        return (Train) q.getSingleResult();
    }

Creating Indexes

Map reduce (without index) = Distributed Table Scan !
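
The difference between a scan and an index lookup inside one partition can be sketched with two maps: the primary storage and a reverse map maintained on every write. This is a toy model in plain Java with hypothetical names (`IndexedPartitionSketch`, `findByNameWithScan`, `findByNameWithIndex`); it is not how Coherence or eXtreme Scale implement their indexes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class IndexedPartitionSketch {
    /** Primary storage: train code -> train name. */
    private final Map<String, String> nameByCode = new HashMap<>();
    /** Index: train name -> codes of the trains carrying that name. */
    private final Map<String, Set<String>> codesByName = new HashMap<>();

    void put(String code, String name) {
        String previous = nameByCode.put(code, name);
        if (previous != null) {
            codesByName.get(previous).remove(code); // keep the index in sync on update
        }
        codesByName.computeIfAbsent(name, n -> new HashSet<>()).add(code);
    }

    /** Without index: every entry of the partition is visited (table scan). */
    Set<String> findByNameWithScan(String name) {
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, String> entry : nameByCode.entrySet()) {
            if (entry.getValue().equals(name)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** With index: a single hash lookup, independent of the partition size. */
    Set<String> findByNameWithIndex(String name) {
        return new HashSet<>(codesByName.getOrDefault(name, Collections.<String>emptySet()));
    }

    public static void main(String[] args) {
        IndexedPartitionSketch partition = new IndexedPartitionSketch();
        partition.put("tgv-3071", "Paris - Marseille");
        partition.put("tgv-6201", "Paris - Lyon");
        System.out.println(partition.findByNameWithScan("Paris - Lyon"));
        System.out.println(partition.findByNameWithIndex("Paris - Lyon"));
    }
}
```

The index costs memory and extra work on each write, which is the usual trade-off for avoiding the scan on each read.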

Indexes with Oracle Coherence

    class Train {
        String name;

        Collection<String> getTrainStationsCodes() {
            return Collections2.transform(trainStops, ...);
        }

        ...
    }

    {
        NamedCache trainCache = CacheFactory.getCache("train-cache");

        // Unordered indexes on the values returned by getName() and getTrainStationsCodes()
        trainCache.addIndex(new ReflectionExtractor("getName"), false, null);
        trainCache.addIndex(new ReflectionExtractor("getTrainStationsCodes"), false, null);
    }

Indexes with Websphere eXtreme Scale

    @Entity(schemaRoot=true)
    class Train {
        @Index @Basic
        String name;

        @Index
        Collection<String> getTrainStationsCodes() {
            return Collections2.transform(trainStops, ...);
        }

        ...
    }

    Query query = em.createQuery("select t from Train t where t.name=:name");
    query.getPlan();

This is an execution plan (eXtreme Scale):

    for q2 in Train ObjectMap using INDEX on name = ( ?name)
    filter ( q2.c[0] = ?name )
    returning new Tuple( q2 )

More APIs

Another Java EE versus Spring battle? JSR 347 Data Grids vs. Spring Data

Unified API on top of NoSQL stores?

Serialization / Object to Tuple Mapping API?

Data Grid <-> Relational Database Interactions

Data Grids are “In Memory” -> we need to persist data on disk !

Data Grid -> Database: update / insert / delete

Database -> Data Grid: select the data modified directly in the DB

Data Grid -> Relational Database

Highly available write-behind queues + batched SQL statements to the backend DB
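
A write-behind queue can be sketched as follows: writes hit the in-memory cache immediately, are queued (coalesced per key), and are later handed to the database as one batch. The sketch is single-node plain Java, so the "highly available" part (replicating the queue to a backup) is deliberately left out; `WriteBehindSketch` and the `batchWriter` callback, which stands in for the batched SQL statements, are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class WriteBehindSketch {
    private final Map<String, String> cache = new HashMap<>();
    /** Pending writes; a later write to the same key replaces the queued value. */
    private final Map<String, String> pending = new LinkedHashMap<>();
    /** Stand-in for the code issuing batched SQL statements. */
    private final Consumer<List<Map.Entry<String, String>>> batchWriter;

    WriteBehindSketch(Consumer<List<Map.Entry<String, String>>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    /** Returns as soon as memory is updated; the database is written later. */
    synchronized void put(String key, String value) {
        cache.put(key, value);
        pending.put(key, value);
    }

    synchronized String get(String key) {
        return cache.get(key);
    }

    /** Would be called by a timer in a real system; sends one batch per call. */
    void flush() {
        List<Map.Entry<String, String>> batch = new ArrayList<>();
        synchronized (this) {
            for (Map.Entry<String, String> entry : pending.entrySet()) {
                batch.add(new AbstractMap.SimpleImmutableEntry<>(entry));
            }
            pending.clear();
        }
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
    }

    public static void main(String[] args) {
        WriteBehindSketch grid = new WriteBehindSketch(
                batch -> System.out.println("batched write: " + batch));
        grid.put("tgv-3071", "seats=3");
        grid.put("tgv-3071", "seats=2"); // coalesced with the previous write
        grid.put("tgv-6201", "seats=10");
        grid.flush();
    }
}
```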

Data Grid -> Relational Database: Constrained Tree Schema <-> Relational impedance mismatch

Train (code, type), TrainStop (date), Seat (number, price, booked), TrainStation (code, name)

DB writes MUST succeed!

Align the database on the Data Grid model!
- Denormalize the database
- Remove the foreign keys, use the same PKs in the DB and the data grid
- Support unordered SQL statements

Prefer raw SQL rather than reused business logic

Relational Database -> Data Grid

Data Grid originated scheduled refresh from the backend DB (Oracle System Change Number, etc.)

select * from train where last_modif > ?
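
The scheduled refresh boils down to remembering the highest `last_modif` value seen so far and asking the database only for newer rows. In the sketch below the query is abstracted as a function so the example runs without a database; `ScheduledRefreshSketch`, `Row` and `modifiedSince` are hypothetical names, and a real setup would call `refresh()` from a scheduler.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongFunction;
import java.util.stream.Collectors;

class ScheduledRefreshSketch {
    static class Row {
        final String code;
        final String name;
        final long lastModif;

        Row(String code, String name, long lastModif) {
            this.code = code;
            this.name = name;
            this.lastModif = lastModif;
        }
    }

    private final Map<String, String> cache = new HashMap<>();
    /** Stand-in for: select * from train where last_modif > ? */
    private final LongFunction<List<Row>> modifiedSince;
    private long watermark = Long.MIN_VALUE;

    ScheduledRefreshSketch(LongFunction<List<Row>> modifiedSince) {
        this.modifiedSince = modifiedSince;
    }

    /** Loads the rows changed since the last call; returns how many were applied. */
    int refresh() {
        List<Row> rows = modifiedSince.apply(watermark);
        for (Row row : rows) {
            cache.put(row.code, row.name);
            watermark = Math.max(watermark, row.lastModif);
        }
        return rows.size();
    }

    String get(String code) {
        return cache.get(code);
    }

    public static void main(String[] args) {
        List<Row> table = new ArrayList<>(); // plays the role of the train table
        ScheduledRefreshSketch grid = new ScheduledRefreshSketch(since ->
                table.stream().filter(r -> r.lastModif > since).collect(Collectors.toList()));
        table.add(new Row("tgv-3071", "Paris - Marseille", 10));
        System.out.println(grid.refresh()); // 1
        System.out.println(grid.refresh()); // 0: nothing changed since
    }
}
```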

Database originated push from the backend DB (Oracle Database Change Notification, etc.)

JMS = durable subscription

In Memory -> prepare for reloading after maintenance operations!

Prepare consistency checkers

Need for “graceful shutdown with disk persistence”

Transactions

We didn’t have time to talk about transactions.

Another session is planned at the Paris NoSQL User Group for this.

Let’s go live !

Data Grids and Operations

Standard packaging? Do It Yourself (layout, scripts, etc.)

Limited management: Do It Yourself (stop/start, detecting data loss, etc.)

Limited debugging tools: Do It Yourself (debugging consoles, troubleshooting agents)

JVM pandemic: dozens of JVMs to manage!

Dev / Ops collaboration is required

Experts only!

The right tool for the right job

Incredibly fast! Even with transactions!

Scalable

Good at data replication (when it implements it)

Very geeky on both dev and ops side

“Quite” expensive

Not an enterprise grade data store

Reconciliation API, etc.

Requires very skilled people + change management

If you solve the data loading issue

?

Questions / Answers
