OpenBlend Ljubljana September 15th, 2011 Sanne Grinovero Software Engineer at Red Hat

Introducing Hibernate OGM: porting JPA applications to NoSQL, Sanne Grinovero (JBoss by RedHat)


OpenBlend Ljubljana, September 15th, 2011

Sanne Grinovero, Software Engineer at Red Hat

About me

• Hibernate

• Hibernate Search

• Hibernate OGM

• Infinispan

• Lucene Directory

• Infinispan Query

in.relation.to/Bloggers/Sanne

Twitter: @SanneGrinovero

What is Hibernate OGM?

JPA for NoSQL

• initially Key/Value store
• in particular Infinispan

Relational Databases

• Transactions
• Referential integrity
• Simple types
• Well understood
  - tuning, backup, resilience


Relational Databases

But scaling is hard!
- Replication
- Multiple instances with shared disk
- Sharding

Relational Databases on a cloud

Master/replicas: which master?

A single master? I was promised elasticity

Less reliable “disks”

IP in configuration files? DNS update times?

Who coordinates this? How does that failover?

¬SQL

being a not-only-that one

basically makes it a definition of “everything else too”

“no-category”

No-SQL goals

Very different:
• Large datasets
• High availability
• Low latency / higher throughput
• Specific data access pattern
• Specific data structures
• ...

NotOnlySQL

• Document based stores
• Column based
• Graph oriented databases
• Key / value stores
• Full-Text Search

Flexibility at a cost

• Programming model
  • one per product :-(
• no schema => app driven schema
• query (Map Reduce, specific DSL, ...)
• data structure transpires
• Transaction
• durability / consistency

Quick Infinispan introduction

Distributed Key/Value store
• (or Replicated, local only efficient cache, invalidating cache)

Each node is equal
• Just start more nodes, or kill some

No bottlenecks
• by design

Cloud-network friendly
• JGroups
• And “cloud storage” friendly too!

Infinispan ABC

map.put( "user-34", userInstance );

map.get( "user-34" );

map.remove( "user-34" );

It's a ConcurrentMap!

map.put( "user-34", userInstance );

map.get( "user-34" );

map.remove( "user-34" );

map.putIfAbsent( "user-38", another );
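A minimal sketch of what the ConcurrentMap contract gives you, assuming a plain `ConcurrentHashMap` as a local stand-in for an Infinispan cache (the class name `ConcurrentMapSketch` and the stored values are made up for illustration):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentMapSketch {

    // Two competing writers target the same key. putIfAbsent only stores
    // the value when the key is missing, so the first writer wins.
    // Returns what the second writer gets back: the value already stored.
    static String firstWriterWins(ConcurrentMap<String, String> map) {
        map.putIfAbsent("user-38", "first");
        return map.putIfAbsent("user-38", "second");
    }

    public static void main(String[] args) {
        // Assumption: a local map stands in for the distributed cache.
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

        map.put("user-34", "Alice");
        System.out.println(map.get("user-34"));

        System.out.println(firstWriterWins(map));
        System.out.println(map.get("user-38"));

        map.remove("user-34");
        System.out.println(map.containsKey("user-34"));
    }
}
```

The atomic check-and-store of `putIfAbsent` is what lets several writers race on one key without an explicit lock.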


Something more about Infinispan

● Support for Transactions (XA)
● CacheLoaders
  ● Cassandra, JDBC, Amazon S3 (jclouds), ...
● Tree API for JBossCache compatibility
● Lucene integration
  ● Two-fold
● Some Hibernate integrations
  ● Second level cache
  ● Hibernate Search indexing backend

Cloud-hack experiments

Let's abuse Hibernate's second level cache design, using Infinispan's implementation:
- it is usually configured in clustering mode INVALIDATION; let's use DIST instead
- disable expiry/timeouts

What's the effect on your cloud-deployed database?

Cloud-hack experiments

Now introduce Hibernate Search:
- full-text queries should be handled by Lucene, NOT by the database.

Hibernate Search identifies hits from the Lucene index, but by default loads them by PK.

Cloud-hack experiments

Load by PK -> second level cache -> Key/Value store

FullText query -> Hibernate Search -> Lucene Indexes


So what if you shut down the database?


• No relational/SQL queries
• You won't be able to write!

Goals

• Encourage new data usage patterns
• Familiar environment
• Ease of use
  • easy to jump in
  • easy to jump out
• Push NoSQL exploration in enterprises
• “PaaS for existing API” initiative

What it does

• JPA front end to key/value stores
  • Object CRUD (incl. polymorphism and associations)
  • OO queries (JP-QL)
• Reuses
  • Hibernate Core
  • Hibernate Search (and Lucene)
  • Infinispan
• Is not a silver bullet
  • not for all NoSQL use cases

Concepts


Schema or no schema?

• Schema-less
  • moving to a new schema is very easy
  • the app deals with both old and new structure, or migrates all data
  • needs strict development guidelines
• Schema
  • reduces the likelihood of rogue developer corruption
  • can be shared with other apps
  • fewer “didn’t think about that” bugs

Entities as serialized blobs?

• Serialize objects into the (key) value
  • store the whole graph?
• maintain consistency with duplicated objects
• guaranteed identity a == b
• concurrency / latency
• structure change and (de)serialization, class definition changes
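The identity and duplication concerns can be seen with plain Java serialization. This is only an illustration under assumptions: `User` and `Address` are hypothetical entities, and each user is stored as its own blob containing its whole reachable graph.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class BlobIdentitySketch {

    // Hypothetical entities, used only for this illustration.
    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        String city;
        Address(String city) { this.city = city; }
    }

    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Address address;
        User(String name, Address address) { this.name = name; this.address = address; }
    }

    // Serializes the whole graph reachable from the user into one value.
    static byte[] toBlob(User user) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static User fromBlob(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return (User) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Two users share one Address in memory, but each blob carries its own copy.
    // Returns true only if the loaded users still share a single Address instance.
    static boolean sharedAddressSurvivesRoundTrip() {
        Address shared = new Address("Paris");
        User a = fromBlob(toBlob(new User("a", shared)));
        User b = fromBlob(toBlob(new User("b", shared)));
        return a.address == b.address;
    }

    // Updating the copy loaded through one user does not reach the other copy.
    static String cityOfSecondCopyAfterUpdatingFirst() {
        Address shared = new Address("Paris");
        User a = fromBlob(toBlob(new User("a", shared)));
        User b = fromBlob(toBlob(new User("b", shared)));
        a.address.city = "Ljubljana";
        return b.address.city;
    }

    public static void main(String[] args) {
        System.out.println("same instance after load: " + sharedAddressSurvivesRoundTrip());
        System.out.println("second copy still says: " + cityOfSecondCopyAfterUpdatingFirst());
    }
}
```

Once the shared object is duplicated into two blobs, `a == b` no longer holds after loading and the application has to keep the copies consistent by itself.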


OGM’s approach to schema

• Keep what’s best from relational model
  • as much as possible
  • tables / columns / pks
• Decorrelate object structure from data structure
• Data stored as (self-described) tuples
• Core types limited
  • portability

OGM’s approach to schema

• Store metadata for queries
  • Lucene index
• CRUD operations are key lookups

How does it work?

• Entities are stored as tuples (Map<String,Object>)
  • The key is composed of
    • table name
    • entity id
• Collections are represented as a list of tuples
  - The key is composed of:
    • table name hosting the collection information
    • column names representing the FK
    • column values representing the FK
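The key layout can be sketched as follows. This is a hedged illustration, not OGM's actual classes: `TupleStoreSketch`, its methods, and the table and column names are hypothetical, two in-memory maps stand in for the key/value store, and keys are plain lists so that every component takes part in equality.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class TupleStoreSketch {

    // Assumption: local maps stand in for the distributed key/value store.
    private final ConcurrentMap<List<Object>, Map<String, Object>> entities =
            new ConcurrentHashMap<>();
    private final ConcurrentMap<List<Object>, List<Map<String, Object>>> collections =
            new ConcurrentHashMap<>();

    // Entity key: table name + entity id.
    static List<Object> entityKey(String table, Object id) {
        return Arrays.<Object>asList(table, id);
    }

    // Collection key: table hosting the collection + FK column names + FK column values.
    static List<Object> collectionKey(String table, List<String> fkColumns, List<?> fkValues) {
        return Arrays.<Object>asList(table, new ArrayList<>(fkColumns), new ArrayList<>(fkValues));
    }

    // An entity is written as one tuple of column name -> value.
    void putEntity(String table, Object id, Map<String, Object> tuple) {
        entities.put(entityKey(table, id), new HashMap<>(tuple));
    }

    // Loading by primary key is a single key lookup.
    Map<String, Object> getEntity(String table, Object id) {
        return entities.get(entityKey(table, id));
    }

    // A collection is written as a list of tuples under its FK-based key.
    void putCollection(String table, List<String> fkColumns, List<?> fkValues,
                       List<Map<String, Object>> rows) {
        collections.put(collectionKey(table, fkColumns, fkValues), new ArrayList<>(rows));
    }

    List<Map<String, Object>> getCollection(String table, List<String> fkColumns, List<?> fkValues) {
        return collections.get(collectionKey(table, fkColumns, fkValues));
    }

    public static void main(String[] args) {
        TupleStoreSketch store = new TupleStoreSketch();

        Map<String, Object> user = new HashMap<>();
        user.put("id", 34);
        user.put("name", "Alice");
        store.putEntity("User", 34, user);

        Map<String, Object> row = new HashMap<>();
        row.put("user_id", 34);
        row.put("address_id", 7);
        store.putCollection("User_Address", Arrays.asList("user_id"), Arrays.asList(34),
                Arrays.asList(row));

        System.out.println(store.getEntity("User", 34));
        System.out.println(store.getCollection("User_Address",
                Arrays.asList("user_id"), Arrays.asList(34)));
    }
}
```

Because the table name is part of every key, two entities with the same id in different tables never collide.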


Queries

• Hibernate Search indexes entities
• Store Lucene indexes in Infinispan
• JP-QL to Lucene query transformation
• Works for simple queries
  • Lucene is not a relational SQL engine

select a from Animal a where a.size > 20

> animalQueryBuilder.range().onField("size").above(20).excludeLimit().createQuery();

select u from Order o join o.user u where o.price > 100 and u.city = 'Paris'

> orderQB.bool()
    .must( orderQB.range().onField("price").above(100).excludeLimit().createQuery() )
    .must( orderQB.keyword().onField("user.city").matching("Paris").createQuery() )
    .createQuery();

Demo


Why Infinispan?

• We know it well
• Supports transactions (!)
  • Research is going on to provide “cloud transactions” on more platforms
• It supports Lucene indexes distribution
• Easy to manage in clouds
• It's a key/value store with support for Map/Reduce
  • Simple
  • Likely a common point for many other “databases”

Why Infinispan?

• Map/Reduce as an alternative to indexed queries
  • Might be chosen by a clever JP-QL engine
• Supports – experimentally – distributed Lucene queries
  • Since ISPN-200, merged last week
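To picture the index-free alternative, here is a rough single-node analogue using Java streams. It does not use Infinispan's Map/Reduce API; `ScanQuerySketch`, the `size` column and the sample animals are assumptions made for the example. Every stored tuple is tested (the "map" step) and the matching ids are gathered into one result (the "reduce" step).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ScanQuerySketch {

    // Evaluates "select a from Animal a where a.size > minSize" without an index:
    // scans every tuple, keeps the ones whose size matches, and returns their ids sorted.
    static List<String> animalIdsLargerThan(Map<String, Map<String, Object>> animals, int minSize) {
        return animals.entrySet().stream()
                .filter(e -> {
                    Object size = e.getValue().get("size");
                    // Tuples without an integer size never match.
                    return size instanceof Integer && (Integer) size > minSize;
                })
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    // Builds a hypothetical Animal tuple holding only a size column.
    static Map<String, Object> animal(int size) {
        Map<String, Object> tuple = new HashMap<>();
        tuple.put("size", size);
        return tuple;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> animals = new HashMap<>();
        animals.put("cat", animal(5));
        animals.put("dog", animal(40));
        animals.put("horse", animal(160));

        System.out.println(animalIdsLargerThan(animals, 20));
    }
}
```

A scan like this touches every entry, which is why it is an alternative a query engine might pick for some queries and not a replacement for the Lucene index.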


Why all this?

Developers will only need to think about

• JPA models

• JP-QL queries

Everything else is performance tuning, including:

•Move to/from different NoSQL implementations

•Move to/from a SQL implementation

•Move to/from clouds/laptops

•JPA is a well known standard: move to/from Hibernate :-)

Summary

• JPA for NoSQL
• Reusing mature projects
• Keep the good of the relational model
• Query via Hibernate Search
• JP-QL support on its way
• Still early in the project
• Only Infinispan is integrated: contributions welcome!

Summary

• Performance / scalability is different
• Isolation is different

http://www.hibernate.org/subprojects/ogm.html

http://www.jboss.org/jbw2011keynote.html

https://github.com/Sanne/tweets-ogm

Q + A