Hibernate Search: Googling your persistence domain model
Emmanuel Bernard, Doer, JBoss, a division of Red Hat
Monday, April 20, 2009
Copyright Red Hat © 2007-2009

Hibernate Search Seam 1.5


DESCRIPTION

How many times has a customer told you they want to search in their application “like Google”? How many times was the search engine in your application too slow? Hibernate Search brings full-text search capabilities to a persistent domain model, providing Google-like search while avoiding the traditional cost and difficulty of setting up such solutions. In this session, you will learn what problems Hibernate Search can solve and follow the steps of adding it to a Hibernate-based application. You will build your own application-specific full-text search engine. We will also explore advanced subjects such as clustering and the underpinnings of phonetic approximation.

About the speaker - Emmanuel Bernard

Emmanuel is a lead developer at JBoss, a division of Red Hat. After graduating from Supelec (a French “Grande Ecole”), Emmanuel spent a few years in the retail industry, where he became involved in the ORM space. He joined the Hibernate team 4 years ago. Emmanuel is the lead developer of Hibernate Annotations and Hibernate EntityManager, two key projects on top of Hibernate Core implementing the Java Persistence(tm) specification, as well as Hibernate Search and Validator. Emmanuel is a member of the EJB 3.0 expert group and the spec lead of JSR 303: Bean Validation. He is a regular speaker at various conferences and JUGs, including JavaOne, JBoss World and JavaPolis, and the co-author of Hibernate Search in Action from Manning.


Page 1: Hibernate Search Seam 1.5

Hibernate Search
Googling your persistence domain model

Emmanuel Bernard
Doer
JBoss, a division of Red Hat

Page 2: Hibernate Search Seam 1.5

Search: the leftover of today’s applications

Add a search dimension to the domain model

“Frankly, search sucks on this project” -- Anonymous’ boss

Why? How to fix that? For cheap?

Page 3: Hibernate Search Seam 1.5

Integrate full-text search and persistent domain

SQL Search vs Full-text Search
Object model / Full-text Search mismatches
Demo
Hibernate Search architecture
Configuration and Mapping
Full-text based object queries
Some more features

Page 4: Hibernate Search Seam 1.5

SQL search limits

Wildcard / word search
  ‘%hibernate%’
Approximation (or synonym)
  ‘hybernat’
Proximity
  ‘Java’ close to ‘Persistence’
Relevance (or result scoring)
  multi-“column” search
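
To make the approximation limit concrete: a SQL `LIKE '%hibernate%'` pattern cannot match the misspelled ‘hybernat’, while a full-text engine can compare terms by edit distance. The sketch below is a plain Levenshtein matcher written for illustration only; the class and method names are hypothetical, and it is not how Lucene implements fuzzy matching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a plain edit-distance matcher, not Lucene's fuzzy-query code. */
final class FuzzyMatch {

    /** Classic Levenshtein distance computed with two rolling rows. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    /** Returns the indexed terms within maxEdits of the query term, ignoring case. */
    static List<String> closeTerms(String query, List<String> terms, int maxEdits) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (distance(query.toLowerCase(), term.toLowerCase()) <= maxEdits) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("hibernate", "persistence", "lucene");
        // 'hybernat' is two edits away from 'hibernate' (y -> i, plus a trailing e)
        System.out.println(closeTerms("hybernat", terms, 2));
    }
}
```

The threshold of two edits is an arbitrary choice for the example; a real engine would typically scale the tolerance with term length.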


Page 5: Hibernate Search Seam 1.5

Full Text Search

Search information
  by word
  inverted indices (word frequency, position)
In RDBMS engines
  portability (proprietary add-on on top of SQL)
  flexibility
Standalone engine
  Apache Lucene(tm) http://lucene.apache.org
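
The inverted index mentioned above can be sketched in a few lines: each word maps to the documents that contain it, together with the positions where it occurs. Word frequency and proximity checks then fall out of the stored positions. This is a toy in-memory structure with hypothetical names, far simpler than a real Lucene index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: a toy in-memory inverted index, not Lucene's index format. */
final class ToyInvertedIndex {

    // word -> (document id -> positions of the word in that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    /** Splits the text on non-letter, non-digit characters and records each word's position. */
    void add(int docId, String text) {
        String[] words = text.toLowerCase().split("[^\\p{L}\\p{Nd}]+");
        int position = 0;
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // a leading separator yields an empty first token
            }
            postings.computeIfAbsent(word, w -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position++);
        }
    }

    /** Positions of a word per document; an empty map if the word was never indexed. */
    Map<Integer, List<Integer>> lookup(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptyMap());
    }

    /** Word frequency in one document, derived from the stored positions. */
    int frequency(String word, int docId) {
        return lookup(word).getOrDefault(docId, Collections.emptyList()).size();
    }

    /** True if the two words occur within maxGap positions of each other in the document. */
    boolean near(String first, String second, int docId, int maxGap) {
        List<Integer> a = lookup(first).getOrDefault(docId, Collections.emptyList());
        List<Integer> b = lookup(second).getOrDefault(docId, Collections.emptyList());
        for (int p : a) {
            for (int q : b) {
                if (Math.abs(p - q) <= maxGap) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "Java Persistence with Hibernate");
        index.add(2, "Hibernate Search in Action: Hibernate meets Lucene");
        System.out.println(index.lookup("hibernate"));
        System.out.println(index.near("java", "persistence", 1, 1));
    }
}
```

Looking up a word is a single map access regardless of how much text was indexed, which is the property a `LIKE '%...%'` scan lacks.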


Page 6: Hibernate Search Seam 1.5

Mismatches with a domain model

Structural mismatch
  full-text indexes are text only
  no reference/association between documents
Synchronization mismatch
  keeping index and database up to date
Retrieval mismatch
  the index does not store objects
  certainly not managed objects

[Diagram labels: Appl Fwk, Persistence, Search, Domain Model]

Page 7: Hibernate Search Seam 1.5

Demo

Let’s google our application!

Page 8: Hibernate Search Seam 1.5

Hibernate Search

Under the Hibernate platform
  LGPL
Built on top of Hibernate Core
Uses Apache Lucene(tm) under the hood
  among the top 10 downloads at Apache
  very powerful but low level
  easy to use it the “wrong” way
Solves the mismatches

[Diagram labels: Appl Fwk, Persistence, Domain Model, Search, JBoss Seam]

Page 9: Hibernate Search Seam 1.5

Architecture

Transparent indexing through event system (JPA)
  PERSIST / UPDATE / DELETE
Convert the object structure into Index structure
Operation batching per transaction
  better Lucene performance
  “ACID”-ity
  (pluggable scope)
Backend
  synchronous / asynchronous mode
  Lucene, JMS
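
The per-transaction batching described above can be pictured as a queue: persistence events only record pending index work, the whole batch goes to the backend at commit, and a rollback discards it. The sketch below uses hypothetical class and method names and a plain `Consumer` as a stand-in backend; it illustrates the idea and is not Hibernate Search's internal design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative only: hypothetical names, not Hibernate Search's internal classes. */
final class TransactionalIndexQueue {

    enum Operation { ADD, UPDATE, DELETE }

    /** One pending index change for one entity. */
    static final class Work {
        final Operation operation;
        final String entityId;

        Work(Operation operation, String entityId) {
            this.operation = operation;
            this.entityId = entityId;
        }

        @Override
        public String toString() {
            return operation + ":" + entityId;
        }
    }

    private final List<Work> pending = new ArrayList<>();
    private final Consumer<List<Work>> backend;

    /** The backend receives each committed batch, for example to write it to the index. */
    TransactionalIndexQueue(Consumer<List<Work>> backend) {
        this.backend = backend;
    }

    /** Called from a persistence event (persist / update / delete); nothing is indexed yet. */
    void onEvent(Operation operation, String entityId) {
        pending.add(new Work(operation, entityId));
    }

    /** On commit, hand the whole batch to the backend in one call. */
    void afterCommit() {
        if (!pending.isEmpty()) {
            backend.accept(new ArrayList<>(pending));
            pending.clear();
        }
    }

    /** On rollback, drop the queued work so the index never sees uncommitted changes. */
    void afterRollback() {
        pending.clear();
    }

    public static void main(String[] args) {
        TransactionalIndexQueue queue =
                new TransactionalIndexQueue(batch -> System.out.println("indexing " + batch));
        queue.onEvent(Operation.ADD, "Book#1");
        queue.onEvent(Operation.UPDATE, "Author#7");
        queue.afterCommit();   // one batch carrying both changes
        queue.onEvent(Operation.DELETE, "Book#2");
        queue.afterRollback(); // nothing reaches the backend
    }
}
```

Swapping the `Consumer` for one that writes directly to an index or one that posts to a message queue is the kind of choice the backend bullet refers to.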


Page 10: Hibernate Search Seam 1.5

Architecture (Backend)

Lucene Directory
  standalone or symmetric cluster (limited scalability)
  immediate visibility
  can affect front end runtime

[Diagram labels: Database; Lucene Directory (Index); two "Hibernate + Hibernate Search" nodes, each with "Search request" and "Index update"]

Page 11: Hibernate Search Seam 1.5

Architecture (Backend)

JMS (Cluster)
  Search processed locally / Change sent to master
  asynchronous indexing (delay)
  No front end extra cost / good scalability

[Diagram labels: Database; Slave: Hibernate + Hibernate Search, Lucene Directory (Index) Copy, Search request, Index update order; JMS queue; Master: Hibernate + Hibernate Search, Process, Index update, Lucene Directory (Index) Master; Copy]

Page 12: Hibernate Search Seam 1.5

Configuration and Mapping

Configuration
  event listener wiring
  transparent in Hibernate Annotations
Backend configuration
Mapping: annotation based
  @Indexed
  @Field(store, index)
  @IndexedEmbedded
  @FieldBridge

Page 13: Hibernate Search Seam 1.5

Query

Retrieve objects, not documents
  no boilerplate conversion code!
Objects from the Persistence Context
  same semantics as a JPA-QL or Criteria query
Use org.hibernate.Query/javax.persistence.Query
  common API for all your queries
Query on correlated objects
  “JOIN”-like query "author.address.city:Atlanta"
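
One way to see why a “JOIN”-like query such as "author.address.city:Atlanta" works against a text-only index: associated objects can be flattened into a single document whose field names are dotted paths. The sketch below does this by hand over nested maps and evaluates one `field:value` term. The names are hypothetical and the matching is deliberately naive; it illustrates the structural conversion rather than reproducing Hibernate Search's @IndexedEmbedded handling or Lucene's query parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: hand-rolled flattening, not Hibernate Search's @IndexedEmbedded code. */
final class EmbeddedFieldSketch {

    /** Copies nested maps into one flat document, joining nested keys with dots. */
    static Map<String, String> flatten(Map<String, ?> object) {
        Map<String, String> document = new LinkedHashMap<>();
        flattenInto(document, "", object);
        return document;
    }

    private static void flattenInto(Map<String, String> document, String prefix, Map<?, ?> object) {
        for (Map.Entry<?, ?> entry : object.entrySet()) {
            String field = prefix + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                flattenInto(document, field + ".", (Map<?, ?>) value); // embedded object
            } else {
                document.put(field, String.valueOf(value)); // the index stores text only
            }
        }
    }

    /** Evaluates a single "field:value" term against a flat document, ignoring case. */
    static boolean matches(Map<String, String> document, String query) {
        int colon = query.indexOf(':');
        if (colon < 0) {
            return false; // this sketch only understands field:value terms
        }
        String actual = document.get(query.substring(0, colon));
        return actual != null && actual.equalsIgnoreCase(query.substring(colon + 1));
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Atlanta");
        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "Emmanuel");
        author.put("address", address);
        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "Hibernate Search in Action");
        book.put("author", author);

        Map<String, String> document = flatten(book);
        System.out.println(document);
        System.out.println(matches(document, "author.address.city:Atlanta"));
    }
}
```

Because the association is resolved at indexing time, the query itself needs no join; the trade-off is that a change to the embedded object requires reindexing the documents that contain it.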


Page 14: Hibernate Search Seam 1.5

Query possibilities

bring the “best” document first
recover from typos
recover from faulty orthography
find words with the same meaning
find words from the same family
find an exact phrase
find similar documents

Page 15: Hibernate Search Seam 1.5

More query features

Fine grained fetching strategy
Sort by property rather than relevance
Projection (both field and metadata)
  metadata: SCORE, ID, BOOST etc.
  no DB / Persistence Context access
Filters
  security / temporal data / category
Total number of results

Page 16: Hibernate Search Seam 1.5

Some more features

Automatic index optimization
Index sharding
Manual indexing and purging
  non event-based system
Shared Lucene resources
Native Lucene access

Page 17: Hibernate Search Seam 1.5

Full-Text search without the hassle

Transparent index synchronization
Automatic structural conversion through Mapping
No paradigm shift when retrieving data
Clustering capability out of the box
Easier / transparent optimized Lucene use

Page 18: Hibernate Search Seam 1.5

Road Map

Programmatic Mapping API
Perf
  Multi threaded mass index
  Clustering
Query
  Dictionary and spellchecker
  Easier query building
  MoreLikeThis?
Statistics

Page 19: Hibernate Search Seam 1.5

Q&A

Page 20: Hibernate Search Seam 1.5

For More Information

Hibernate Search
  http://search.hibernate.org
  Hibernate Search in Action
Apache Lucene
  http://lucene.apache.org
  Lucene In Action
http://in.relation.to
http://blog.emmanuelbernard.com