
Introduction to Spring Data
Dr. Mark Pollack

• The current data landscape
• Project Goals
• Project Tour

Agenda

2

Enterprise Data Trends

3

Enterprise Data Trends

4

Unstructured Data
• No predefined data model
• Often doesn't fit well in an RDBMS

Pre-Aggregated Data
• Computed during data collection
• Counters
• Running averages

• Value from data exceeds hardware & software costs

• Value in connecting data sets
  – Grouping e-commerce users by user agent

The Value of Data

5

Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/418.9 (KHTML, like Gecko) Safari/419.3

• Extremely difficult/impossible to scale writes in an RDBMS
  – Vertical scaling is limited/expensive
  – Horizontal scaling is limited or requires $$

• Shift from ACID to BASE
  – Basically Available, Scalable, Eventually Consistent

• NoSQL datastores emerge as "point solutions"
  – Amazon/Google papers
  – Facebook, LinkedIn, …

The Data Revolution

6

NoSQL

7

"Not Only SQL"

NoSQL \no-seek-wool\ n. Describes an ongoing trend where developers increasingly opt for non-relational databases to help solve their problems, in an effort to use the right tool for the right job.

Query Mechanisms:

Key lookup, map-reduce, query-by-example, query language, traversals

• "Big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.

• A subjective and moving target.

• Big data in many sectors today ranges from tens of TB to multiple PB.

Big Data

8

Reality Check

9

Reality Check

10

Project Goals

11

• The data access landscape has changed considerably
• RDBMS are still important and predominant
  – but no longer considered a "one size fits all" solution
• But they have limitations
  – Hard to scale
• New data access technologies are solving problems RDBMS can't
  – Higher performance and scalability, different data models
  – Often a limited transactional model and relaxed consistency
• Polyglot persistence is becoming more prevalent
  – Combine RDBMS + other DBs in a solution

Spring Data - Background and Motivation

12

• Spring has always provided excellent data access support
  – Transaction management
  – Portable data access exception hierarchy
  – JDBC – JdbcTemplate
  – ORM – Hibernate, JPA, JDO, iBATIS support
  – Cache support (Spring 3.1)

• Spring Data project started in 2010
• Goal is to "refresh" Spring's data access support
  – In light of the new data access landscape

Spring and Data Access

Spring Data Mission Statement

14

"Provides a familiar and consistent Spring-based programming model for Big Data, NoSQL, and relational stores while retaining store-specific features and capabilities."

• Relational
  – JPA
  – JDBC Extensions

• NoSQL
  – Redis
  – HBase
  – Mongo
  – Neo4j
  – Lucene
  – Gemfire

• Big Data
  – Hadoop
    • HDFS and M/R
    • Hive
    • Pig
    • Cascading
  – Splunk

• Access
  – Repositories
  – QueryDSL
  – REST

Spring Data – Supported Technologies

17

• Database-specific features are accessed through the familiar Spring template pattern
  – RedisTemplate
  – HBaseTemplate
  – MongoTemplate
  – Neo4jTemplate
  – GemfireTemplate

• Shared programming models and data access mechanisms
  – Repository model
    • Common CRUD across data stores
  – Integration with QueryDSL
    • Typesafe query language
  – REST exporter
    • Expose repositories over HTTP in a RESTful manner

Spring Data – Have it your way

Project Tour

19

JDBC and JPA

• Fast Connection Failover

• Simplified configuration for Advanced Queuing JMS support and DataSource

• Single local transaction for messaging and database access

• Easy Access to native XML, Struct, Array data types

• API for customizing the connection environment

Spring Data JDBC Extensions – Oracle Support

QueryDSL

22

"Enables the construction of type-safe SQL-like queries for multiple backends including JPA, JDO, MongoDB, Lucene, SQL and plain collections in Java."

http://www.querydsl.com/ - Open Source, Apache 2.0

• Using strings is error-prone
• Must remember query syntax, domain classes, properties, and relationships
• Verbose parameter binding by name or position
• Each back-end has its own query language and API
• Note: .NET has LINQ

Problems using Strings for a query language

• Code completion in the IDE
• Almost no syntactically invalid queries allowed
• Domain types and properties can be referenced safely (no Strings)
• Helper classes generated via a Java annotation processor
• Much less verbose than the JPA2 Criteria API

QueryDSL Features

24

QCustomer customer = QCustomer.customer;
JPAQuery query = new JPAQuery(entityManager);
Customer bob = query.from(customer)
    .where(customer.firstName.eq("Bob"))
    .uniqueResult(customer);

• Incorporate code generation into your build process
  – To create a query meta-model of domain classes or tables (JDBC)

• For SQL:

Using QueryDSL for JDBC

QAddress qAddress = QAddress.address;

SQLTemplates dialect = new HSQLDBTemplates();

SQLQuery query = new SQLQueryImpl(connection, dialect)
    .from(qAddress)
    .where(qAddress.city.eq("London"));   // Querydsl Predicate

List<Address> results = query.list(
    new QBean<Address>(Address.class, qAddress.street, qAddress.city, qAddress.country));

• Wrapper around JdbcTemplate that supports
  – Using Querydsl SQLQuery classes to execute queries
  – Integration with Spring's transaction management
  – Automatic detection of the DB type to set the SQLTemplates dialect
  – Spring RowMapper and ResultSetExtractor for mapping to POJOs
  – Executing inserts, updates, and deletes with Querydsl's SQLInsertClause, SQLUpdateClause, and SQLDeleteClause

Spring JDBC Extension – QueryDslJdbcTemplate

Spring JDBC Extension – QueryDslJdbcTemplate

// Query with join
QCustomer qCustomer = QCustomer.customer;
SQLQuery findByIdQuery = qdslTemplate.newSqlQuery()
    .from(qCustomer)
    .leftJoin(qCustomer._addressCustomerRef, qAddress)
    .where(qCustomer.id.eq(id));
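Executing the query built above is not shown on the slide; a minimal sketch, assuming the qdslTemplate, qCustomer, and qAddress objects from the surrounding examples and hypothetical column names, might look like this:

// Illustrative only: map the joined rows with a Spring RowMapper.
// QueryDslJdbcTemplate query methods take the SQLQuery plus the projection expressions.
List<Customer> customers = qdslTemplate.query(findByIdQuery,
    new RowMapper<Customer>() {
        public Customer mapRow(ResultSet rs, int rowNum) throws SQLException {
            Customer c = new Customer();
            c.setId(rs.getLong("id"));                 // column names are assumptions
            c.setFirstName(rs.getString("first_name"));
            return c;
        }
    },
    qCustomer.id, qCustomer.firstName);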

JPA and Repositories

28

Repositories

29

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.
http://martinfowler.com/eaaCatalog/repository.html

• We remove the busy work of developing a repository

Spring Data Repositories

30

For example…

public interface CustomerRepository {

    Customer findOne(Long id);

    Customer save(Customer customer);

    Customer findByEmailAddress(EmailAddress emailAddress);
}

@Entity
public class Customer {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    @Column(unique = true)
    private EmailAddress emailAddress;

    @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
    @JoinColumn(name = "customer_id")
    private Set<Address> addresses = new HashSet<Address>();

    // constructor, properties, equals, hashCode omitted for brevity
}

Traditional JPA Implementation

@Repository
public class JpaCustomerRepository implements CustomerRepository {

    @PersistenceContext
    private EntityManager em;

    @Override
    public Customer findOne(Long id) {
        return em.find(Customer.class, id);
    }

    public Customer save(Customer customer) {
        if (customer.getId() == null) {
            em.persist(customer);
            return customer;
        } else {
            return em.merge(customer);
        }
    }

    ...

Traditional JPA Implementation

    . . .

    @Override
    public Customer findByEmailAddress(EmailAddress emailAddress) {
        TypedQuery<Customer> query = em.createQuery(
            "select c from Customer c where c.emailAddress = :email", Customer.class);
        query.setParameter("email", emailAddress);
        return query.getSingleResult();
    }
}

• A simple recipe
  1. Map your POJO using JPA

2. Extend a repository (marker) interface or use an annotation

3. Add finder methods

4. Configure Spring to scan for repository interfaces and create implementations

• Inject implementations into your services and use as normal…

Spring Data Repositories

Spring Data Repository Example

public interface CustomerRepository extends Repository<Customer, Long> { // Marker interface

    Customer findOne(Long id);

    Customer save(Customer customer);

    Customer findByEmailAddress(EmailAddress emailAddress);
}

or

@RepositoryDefinition(domainClass = Customer.class, idClass = Long.class)
public interface CustomerRepository { . . . }

• Bootstrap with JavaConfig

• Or XML

• And Spring will create an implementation of the interface

Spring Data Repository Example

@Configuration
@EnableJpaRepositories
@Import(InfrastructureConfig.class)
public class ApplicationConfig {
}

<jpa:repositories base-package="com.oreilly.springdata.jpa" />

• Wire into your transactional service layer as normal

Spring Data JPA - Usage

• How does findByEmailAddress work…

Query Method Keywords
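The keyword table on this slide did not survive extraction. As an illustration of the naming convention (the extra finder methods below are hypothetical, not from the slide), the method name is parsed into property references and keywords such as And, Like, and OrderBy:

// Hypothetical derived query methods on the Customer domain class
public interface CustomerRepository extends Repository<Customer, Long> {

    // "findBy" + property  ->  where c.emailAddress = ?1
    Customer findByEmailAddress(EmailAddress emailAddress);

    // "And" and "Like" are keywords combining property expressions
    List<Customer> findByLastNameAndFirstNameLike(String lastName, String firstName);

    // "OrderBy...Desc" adds a sort clause
    List<Customer> findByLastNameOrderByFirstNameDesc(String lastName);
}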

Spring Data Repositories - CRUD

39

public interface CrudRepository<T, ID extends Serializable> extends Repository<T, ID> {

T save(T entity);

Iterable<T> save(Iterable<? extends T> entities);

T findOne(ID id);

boolean exists(ID id);

Iterable<T> findAll();

long count();

void delete(ID id);

void delete(T entity);

void delete(Iterable<? extends T> entities);

void deleteAll();
}
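Extending CrudRepository instead of the plain Repository marker inherits all of these methods, so only store-specific finders need to be declared; a minimal sketch:

public interface CustomerRepository extends CrudRepository<Customer, Long> {

    // save, findOne, findAll, delete, … are inherited from CrudRepository
    Customer findByEmailAddress(EmailAddress emailAddress);
}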

Paging, Sorting, and custom finders

40

public interface PagingAndSortingRepository<T, ID extends Serializable>
        extends CrudRepository<T, ID> {

    Iterable<T> findAll(Sort sort);

    Page<T> findAll(Pageable pageable);
}
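A short usage sketch, assuming CustomerRepository extends PagingAndSortingRepository (the PageRequest and Sort constructors shown are those of the Spring Data releases of this era):

// Request page 0 with 20 customers per page, sorted ascending by last name
Page<Customer> page = customerRepository.findAll(
    new PageRequest(0, 20, new Sort(Sort.Direction.ASC, "lastName")));

List<Customer> content = page.getContent();
int totalPages = page.getTotalPages();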

• Query methods use method naming conventions
  – Can override with the @Query annotation
  – Or the method name references a JPA named query

Spring Data JPA – Customize Query Methods
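For example, a hedged sketch of both customization options (the JPQL statement and named-query name are assumptions for illustration):

public interface CustomerRepository extends Repository<Customer, Long> {

    // Override the derived query with an explicit JPQL statement
    @Query("select c from Customer c where c.emailAddress = ?1")
    Customer findByEmailAddress(EmailAddress emailAddress);

    // Or let the method name resolve against a JPA named query "Customer.findByLastName"
    List<Customer> findByLastName(String lastName);
}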

• Specifications using the JPA Criteria API
• LockMode, overriding Transactional metadata, QueryHints
• Auditing, CDI integration
• QueryDSL support

Spring Data JPA – Other features

42

• Easier and less verbose than the JPA2 Criteria API
  – "equals property value" vs. "property equals value"
  – Operations via a builder object

Querydsl and JPA

CriteriaBuilder builder = entityManagerFactory.getCriteriaBuilder();
CriteriaQuery<Person> query = builder.createQuery(Person.class);
Root<Person> men = query.from(Person.class);
Root<Person> women = query.from(Person.class);
Predicate menRestriction = builder.and(
    builder.equal(men.get(Person_.gender), Gender.MALE),
    builder.equal(men.get(Person_.relationshipStatus), RelationshipStatus.SINGLE));
Predicate womenRestriction = builder.and(
    builder.equal(women.get(Person_.gender), Gender.FEMALE),
    builder.equal(women.get(Person_.relationshipStatus), RelationshipStatus.SINGLE));
query.where(builder.and(menRestriction, womenRestriction));

versus…

Querydsl and JPA

JPAQuery query = new JPAQuery(entityManager);
QPerson men = new QPerson("men");
QPerson women = new QPerson("women");

query.from(men, women).where(
    men.gender.eq(Gender.MALE),
    men.relationshipStatus.eq(RelationshipStatus.SINGLE),
    women.gender.eq(Gender.FEMALE),
    women.relationshipStatus.eq(RelationshipStatus.SINGLE));

Querydsl Predicates

QueryDSL - Repositories

45

public interface ProductRepository extends Repository<Product, Long>,
        QueryDslPredicateExecutor<Product> { … }

Product iPad = productRepository.findOne(product.name.eq("iPad"));

Predicate tablets = product.description.contains("tablet");

Iterable<Product> result = productRepository.findAll(tablets);

public interface QueryDslPredicateExecutor<T> {

    long count(com.mysema.query.types.Predicate predicate);

    T findOne(Predicate predicate);

    List<T> findAll(Predicate predicate);

    List<T> findAll(Predicate predicate, OrderSpecifier<?>... orders);

    Page<T> findAll(Predicate predicate, Pageable pageable);
}

Tooling Support

46

Code Tour - JPA

47

NoSQL Data Models

48

• Familiar, much like a hash table
• Redis, Riak, Voldemort, …
• Amazon Dynamo inspired

Key/Value

49

• Extended key/value model
  – Values can also be key/value pairs
• HBase, Cassandra
• Google Bigtable inspired

Column Family

• Collections that contain semi-structured data: XML/JSON
• CouchDB, MongoDB

Document

51

{ id: "4b2b9f67a1f631733d917a7b",
  author: "joe",
  tags: [ "example", "db" ],
  comments: [ { author: "jim", comment: "OK" },
              { author: "ida", comment: "Bad" } ]
}

{ id: "4b2b9f67a1f631733d917a7c", author: "ida", ... }

{ id: "4b2b9f67a1f631733d917a7d", author: "jim", ... }

• Nodes and edges, each of which may have properties
• Neo4j, Sones, InfiniteGraph

Graph

52

• Advanced key-value store
• Values can be
  – Strings (like in a plain key-value store)
  – Lists of strings, with O(1) pop and push operations
  – Sets of strings, with O(1) element add, remove, and existence test
  – Sorted sets, like sets but with a score to retrieve elements in order
  – Hashes, composed of string fields set to string values

Redis

53

• Operations
  – Unique to each data type: appending to a list/set, retrieving a slice of a list, …
  – Many operations performed in O(1) time – 100k ops/sec on entry-level hardware
  – Intersection, union, difference of sets
  – Redis is single-threaded; operations are atomic

• Optional persistence
• Master-slave replication
• HA support coming soon

Redis

54

• Provides a 'de facto' API on top of multiple drivers
• RedisTemplate
  – Connection resource management
  – Descriptive method names, grouped into data type categories (ListOps, ZSetOps, HashOps, …)
  – No need to deal with byte arrays: support for Java JDK, String, JSON, XML, and custom serialization
  – Translation to Spring's DataAccessException hierarchy

• Redis-backed Set, List, Map, capped collections, atomic counters

• Redis messaging

• Spring 3.1 @Cacheable support

Spring Data Redis

55

• List Operations

RedisTemplate

56

@Autowired
RedisTemplate<String, Person> redisTemplate;

Person p = new Person("George", "Carlin");
redisTemplate.opsForList().leftPush("hosts", p);
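Reading the data back uses the same ListOperations API; a brief hedged sketch (key and types match the example above):

// Pop the head of the "hosts" list; values are converted by the template's configured serializer
Person first = redisTemplate.opsForList().leftPop("hosts");
Long remaining = redisTemplate.opsForList().size("hosts");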

• JDK collections (java.util & java.util.concurrent)
  – List / Set / (Blocking)Queue / (Blocking)Deque
• Atomic counters
  – AtomicLong & AtomicInteger backed by Redis

Redis Support Classes

57

Set<Post> timeline = new DefaultRedisSet<Post>("timeline", template);
timeline.add(new Post("john", "Hello World"));

RedisSet<String> fJ = new DefaultRedisSet<String>("john:following", template);
RedisSet<String> fB = new DefaultRedisSet<String>("bob:following", template);

// accounts both john and bob follow
Set<String> s3 = fJ.intersect(fB);

Code Tour - Redis

58

• Column-oriented database
  – A row points to "columns", which are actually key-value pairs
  – Columns can be grouped together into "column families"
    • Optimized storage and I/O
• Data stored in HDFS; modeled after Google Bigtable
• Need to define a schema for column families up front
  – Key-value pairs inside a column family are not defined up front

HBase

59

Using HBase

60

$ ./bin/hbase shell
> create 'users', { NAME => 'cfInfo' }, { NAME => 'cfStatus' }
> put 'users', 'row-1', 'cfInfo:qUser', 'user1'
> put 'users', 'row-1', 'cfInfo:qEmail', 'user1@yahoo.com'
> put 'users', 'row-1', 'cfInfo:qPassword', 'user1pwd'
> put 'users', 'row-1', 'cfStatus:qEmailValidated', 'true'
> scan 'users'
ROW     COLUMN+CELL
row-1   column=cfInfo:qEmail, timestamp=1346326115599, value=user1@yahoo.com
row-1   column=cfInfo:qPassword, timestamp=1346326128125, value=user1pwd
row-1   column=cfInfo:qUser, timestamp=1346326078830, value=user1
row-1   column=cfStatus:

Configuration configuration = new Configuration(); // Hadoop configuration object
HTable table = new HTable(configuration, "users");
Put p = new Put(Bytes.toBytes("user1"));
p.add(Bytes.toBytes("cfInfo"), Bytes.toBytes("qUser"), Bytes.toBytes("user1"));
table.put(p);

• The HTable class is not thread-safe
• Throws HBase-specific exceptions

HBase API

61

Configuration configuration = new Configuration(); // Hadoop configuration
HTable table = new HTable(configuration, "users");
Put p = new Put(Bytes.toBytes("user1"));
p.add(Bytes.toBytes("cfInfo"), Bytes.toBytes("qUser"), Bytes.toBytes("user1"));
p.add(Bytes.toBytes("cfInfo"), Bytes.toBytes("qEmail"), Bytes.toBytes("user1@yahoo.com"));
p.add(Bytes.toBytes("cfInfo"), Bytes.toBytes("qPassword"), Bytes.toBytes("user1pwd"));
table.put(p);

• Configuration support
• HBaseTemplate
  – Resource management
  – Translation to Spring's DataAccessException hierarchy
  – Lightweight object mapping similar to JdbcTemplate
    • RowMapper, ResultsExtractor
  – Access to the underlying resource
    • TableCallback

Spring Hadoop - HBase

62

HBaseTemplate - Configuration

63

<configuration id="hadoopConfiguration"> fs.default.name=hdfs://localhost:9000</configuration>

<hbase-configuration id="hbaseConfiguration" configuration-ref="hadoopConfiguration" />

<beans:bean id="hbaseTemplate" class="org.springframework.data.hadoop.hbase.HbaseTemplate"> <beans:property name="configuration" ref="hbaseConfiguration" /></beans:bean>

HBaseTemplate - Save

64

public User save(final String userName, final String email, final String password) {
    return hbaseTemplate.execute(tableName, new TableCallback<User>() {
        public User doInTable(HTable table) throws Throwable {
            User user = new User(userName, email, password);
            Put p = new Put(Bytes.toBytes(user.getName()));
            p.add(CF_INFO, qUser, Bytes.toBytes(user.getName()));
            p.add(CF_INFO, qEmail, Bytes.toBytes(user.getEmail()));
            p.add(CF_INFO, qPassword, Bytes.toBytes(user.getPassword()));
            table.put(p);
            return user;
        }
    });
}

HBaseTemplate – POJO Mapping

65

private byte[] qUser = Bytes.toBytes("user");
private byte[] qEmail = Bytes.toBytes("email");
private byte[] qPassword = Bytes.toBytes("password");

public List<User> findAll() {
    return hbaseTemplate.find(tableName, "cfInfo", new RowMapper<User>() {
        @Override
        public User mapRow(Result result, int rowNum) throws Exception {
            return new User(Bytes.toString(result.getValue(CF_INFO, qUser)),
                            Bytes.toString(result.getValue(CF_INFO, qEmail)),
                            Bytes.toString(result.getValue(CF_INFO, qPassword)));
        }
    });
}

Code Tour - HBase

66

• Document database
  – JSON-style documents
  – Schema-less

• Documents organized in collections
• Full or partial document updates
• Index support – secondary and compound
• Rich query language for dynamic queries
• GridFS for efficiently storing large files
• Geospatial features
• Map/Reduce for aggregation queries
  – New Aggregation Framework in 2.2

• Replication and auto-sharding

MongoDB

67

• MongoTemplate
  – Fluent Query, Criteria, and Update APIs
  – Translation to Spring's DataAccessException hierarchy
• GridFsTemplate
• Repositories
• QueryDSL
• Cross-store persistence
• JMX
• Log4J logging adapter

Spring Data - MongoDB

68

MongoOperations Interface

69

MongoTemplate - Usage

70
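The usage code on this slide did not survive extraction. As a hedged placeholder, a minimal MongoTemplate sketch (assuming a mapped Person class with name and age fields and a configured mongoTemplate bean, none of which come from the slide) could look like:

// Save a document and query it back with the fluent Query/Criteria API
mongoTemplate.save(new Person("Joe", 34));

Person joe = mongoTemplate.findOne(
    new Query(Criteria.where("name").is("Joe")), Person.class);

List<Person> adults = mongoTemplate.find(
    new Query(Criteria.where("age").gte(18)), Person.class);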

• Sample document

• Map function – count the occurrence of each letter in the array

MongoTemplate - MapReduce

71

{ "_id" : ObjectId("4e5ff893c0277826074ec533"), "x" : [ "a", "b" ] }{ "_id" : ObjectId("4e5ff893c0277826074ec534"), "x" : [ "b", "c" ] }{ "_id" : ObjectId("4e5ff893c0277826074ec535"), "x" : [ "c", "d" ] }

function () {
    for (var i = 0; i < this.x.length; i++) {
        emit(this.x[i], 1);
    }
}

• Reduce Function – sum up the occurrence of each letter across all docs

• Execute MapReduce

MongoTemplate - MapReduce

72

function (key, values) {
    var sum = 0;
    for (var i = 0; i < values.length; i++)
        sum += values[i];
    return sum;
}

MapReduceResults<ValueObject> results = mongoOperations.mapReduce(
    "collection", "classpath:map.js", "classpath:reduce.js", ValueObject.class);

• @Document
  – Marks an entity to be mapped to a document (optional)
  – Allows definition of the collection the entity shall be persisted to
  – Collection name defaults to the simple class name

• @Id
  – Demarcates id properties
  – Properties named id or _id are auto-detected

Mapping Annotations

73

• @Indexed / @CompoundIndex
  – Creates indexes for one or more properties

• @Field
  – Allows customizing the key to be used inside the document
  – Defines field order

• @DBRef
  – Creates references to entities in a separate collection
  – The opposite of embedding entities inside the document (the default)

Mapping Annotations

74

• Same as before with JPA
• Added functionality that is MongoDB-specific
  – Geolocation, @Query

Mongo Repositories

75
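A hedged sketch of those MongoDB-specific additions (the Store entity, its properties, and the JSON query are illustrative assumptions, not from the slide):

public interface StoreRepository extends CrudRepository<Store, String> {

    // Manually declared MongoDB JSON query
    @Query("{ 'name' : ?0 }")
    List<Store> findStoresByName(String name);

    // Derived geospatial query (assumes a geo index on the 'location' property)
    List<Store> findByLocationNear(Point location, Distance distance);
}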

Code Tour - Mongo

76

• Graph database – focus on connected data
  – The social graph…

• Schema-free property graph
• ACID transactions
• Indexing
• Scalable: ~34 billion nodes and relationships, ~1M traversals/sec
• REST API or embeddable on the JVM
• High availability
• Declarative query language – Cypher

Neo4j

77

• Use annotations to define graph entities
• Entity state backed by the graph database
• JSR-303 bean validation
• Query and Traversal API support
• Cross-store persistence
  – Part of an object lives in the RDBMS, the rest in Neo4j
• Exception translation
• Declarative transaction management
• Repositories
• QueryDSL
• Spring XML namespace
• Neo4j Server support

Spring Data Neo4j

78

Classic Neo4j Domain class

79

Spring Data Neo4j Domain Class

80

@NodeEntity
public class Tag {

    @GraphId
    private Long id;

    @Indexed(unique = true)
    private String name;
}

• @NodeEntity
  – Represents a node in the graph
  – Fields saved as properties on the node
  – Instantiated using the Java 'new' keyword, like any POJO
  – Also returned by lookup mechanisms
  – Type information stored in the graph

Spring Data Neo4j Domain class

81

@NodeEntity
public class Tag {

    @GraphId
    private Long id;

    @Indexed(unique = true)
    private String name;
}

Spring Data Neo4j Domain Class

82

@NodeEntity
public class Customer {

    @GraphId
    private Long id;

    private String firstName, lastName;

    @Indexed(unique = true)
    private String emailAddress;

    @RelatedTo(type = "ADDRESS")
    private Set<Address> addresses = new HashSet<Address>();
}
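A repository over this entity follows the same pattern as the JPA and MongoDB examples; a minimal sketch assuming the Spring Data Neo4j GraphRepository base interface:

public interface CustomerRepository extends GraphRepository<Customer> {

    // Derived finder resolved against the indexed emailAddress property
    Customer findByEmailAddress(String emailAddress);
}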

• Resource management
• Convenience methods
• Declarative transaction management
• Exception translation to the DataAccessException hierarchy
• Works also via REST with Neo4j Server
• Multiple query languages
  – Cypher, Gremlin
• Fluent query result handling

Neo4jTemplate

83

• Implicitly creates a Neo4jTemplate instance in the app

Neo4jTemplate - Usage

84

Customer dave = neo4jTemplate.save(new Customer("Dave", "Matthews", "dave@dmband.com"));

Product iPad = neo4jTemplate.save(new Product("iPad", "Apple tablet device").withPrice(499));

Product mbp = neo4jTemplate.save(new Product("MacBook Pro", "Apple notebook").withPrice(1299));

neo4jTemplate.save(new Order(dave).withItem(iPad,2).withItem(mbp,1));

<bean id="graphDatabaseService" class="org.springframework.data.neo4j.rest.SpringRestGraphDatabase"> <constructor-arg value="http://localhost:7474/db/data" /></bean>

<neo4j:config graphDatabaseService="graphDatabaseService" />

• Export CrudRepository methods via REST semantics
  – PUT, POST = save()
  – GET = find*()
  – DELETE = delete*()

• Supports JSON as the first-class data format
• JSONP and JSONP+E support
• Implemented as a Spring MVC application

Spring Data REST

85
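A hedged configuration sketch: importing the Spring Data REST MVC configuration alongside the repository scanning shown earlier is typically what exposes the repositories over HTTP (class and config names reflect the spring-data-rest-webmvc releases of this period and are assumptions here):

@Configuration
@EnableJpaRepositories
@Import(RepositoryRestMvcConfiguration.class)  // exports detected repositories via Spring MVC
public class RestExporterConfig {
}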

• Discoverability
  – "GET /" results in a list of resources available from this level

• Resources are related to one another by "links"
  – Links have a specific meaning in different contexts
  – HTML and the Atom syndication format have <link rel="" href="" />

• Uses Spring HATEOAS as the basis for creating representations
  – https://github.com/SpringSource/spring-hateoas

Spring Data REST

86

Spring Data REST - Example

87

curl -v http://localhost:8080/spring-data-rest-webmvc/

{ "links" : [ { "rel" : "person",
                "href" : "http://localhost:8080/spring-data-rest-webmvc/person" } ] }

curl -v http://localhost:8080/spring-data-rest-webmvc/person

{ "content": [ ], "links" : [ { "rel" : "person.search", "href" : "http://localhost:8080/spring-data-rest-webmvc/person/search" } ]}

Spring Data REST - Example

88

curl -v http://localhost:8080/spring-data-rest-webmvc/person/search

{ "links" : [ { "rel" : "person.findByName", "href" : "http://localhost:8080/spring-data-rest-webmvc/person/search/findByName" } ]}

curl -v http://localhost:8080/spring-data-rest-webmvc/person/search/findByName?name=John+Doe

[ { "rel" : "person.Person", "href" : "http://localhost:8080/spring-data-rest-webmvc/person/1"} ]

Spring Data REST - Example

89

curl -v http://localhost:8080/spring-data-rest-webmvc/person/1

{ "name" : "John Doe", "links" : [ { "rel" : "profiles", "href" : "http://localhost:8080/spring-data-rest-webmvc/person/1/profiles" }, { "rel" : "addresses", "href" : "http://localhost:8080/spring-data-rest-webmvc/person/1/addresses" }, { "rel" : "self", "href" : "http://localhost:8080/spring-data-rest-webmvc/person/1" } ], "version" : 1}

• Hadoop has a poor out of the box programming model

• Applications are generally a collection of scripts calling command line apps

• Spring simplifies developing Hadoop applications

• By providing a familiar and consistent programming and configuration model

• Across a wide range of use cases
  – HDFS usage
  – Data analysis (MR/Pig/Hive/Cascading)
    • PigTemplate
    • HiveTemplate
  – Workflow (Spring Batch)
  – Event streams (Spring Integration)

• Allowing you to start small and grow

Spring for Hadoop - Goals

90

Relationship with other Spring Projects

91

Books

92

Free Spring Data JPA Chapter – http://bit.ly/sd-book-chapter

O’Reilly Spring Data Book - http://bit.ly/sd-book

• Spring Data
  – http://www.springsource.org/spring-data
  – http://www.springsource.org/spring-hadoop

• Querydsl
  – http://www.querydsl.com

• Example Code
  – https://github.com/SpringSource/spring-data-book
  – https://github.com/SpringSource/spring-data-kickstart
  – Many more listed on individual project pages

Resources

93

Thank You!
