How many times has a customer told you they want to search in their application “like Google”? How many times was the search engine in your application too slow? Hibernate Search brings full-text search capabilities to a persistent domain model, providing Google-like search while avoiding the traditional cost and difficulty of setting up such solutions. In this session, you will learn what problems Hibernate Search can solve and follow the steps of adding it to a Hibernate-based application. You will build your own application-specific full-text search engine. We will also explore advanced subjects such as clustering and the underpinnings of phonetic approximation.

About the speaker: Emmanuel Bernard
Emmanuel is a lead developer at JBoss, a division of Red Hat. After graduating from Supelec (French “Grande Ecole”), Emmanuel spent a few years in the retail industry, where he became involved in the ORM space. He joined the Hibernate team 4 years ago. Emmanuel is the lead developer of Hibernate Annotations and Hibernate EntityManager, two key projects on top of Hibernate Core implementing the Java Persistence(tm) specification, as well as Hibernate Search and Validator. Emmanuel is a member of the EJB 3.0 expert group and the spec lead of JSR 303: Bean Validation. He is a regular speaker at various conferences and JUGs, including JavaOne, JBoss World and JavaPolis, and the co-author of Hibernate Search in Action from Manning.
Copyright Red Hat © 2007-2009
Hibernate Search
Googling your persistence domain model
Emmanuel Bernard
Doer
JBoss, a division of Red Hat
Monday, April 20, 2009
Search: the leftover of today’s applications
Add search dimension to the domain model
“Frankly, search sucks on this project” -- Anonymous’ boss
Why? How to fix that? For cheap?
Integrate full-text search and persistent domain
SQL Search vs Full-text Search
Object model / Full-text Search mismatches
Demo
Hibernate Search architecture
Configuration and Mapping
Full-text based object queries
Some more features
SQL search limits
Wildcard / word search
  ‘%hibernate%’
Approximation (or synonym)
  ‘hybernat’
Proximity
  ‘Java’ close to ‘Persistence’
Relevance (or result scoring)
Multi-“column” search
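
To make the approximation limit concrete, here is a minimal sketch in plain Java, assuming nothing beyond the standard library. It contrasts a substring test in the spirit of LIKE ‘%term%’ with an edit-distance match that tolerates the ‘hybernat’ typo. The class and method names are hypothetical, and this is not how Lucene or Hibernate Search implement fuzzy matching internally.

```java
import java.util.ArrayList;
import java.util.List;

class FuzzyMatchSketch {

    // Mimics SQL LIKE '%term%': a plain case-insensitive substring test.
    static boolean likeContains(String column, String term) {
        return column.toLowerCase().contains(term.toLowerCase());
    }

    // Classic Levenshtein edit distance (insert, delete, substitute), two-row version.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the words of the text that are within maxEdits of the (possibly misspelled) term.
    static List<String> fuzzyWords(String text, String term, int maxEdits) {
        List<String> hits = new ArrayList<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty() && editDistance(word, term.toLowerCase()) <= maxEdits) {
                hits.add(word);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String title = "Hibernate Search in Action"; // sample value for illustration
        System.out.println("LIKE-style match:  " + likeContains(title, "hybernat"));
        System.out.println("Fuzzy word match:  " + fuzzyWords(title, "hybernat", 2));
    }
}
```

The substring test has no notion of "close enough", which is the gap a full-text engine is meant to fill.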
Full Text Search
Search information
  by word
  inverted indices (word frequency, position)
In RDBMS engines
  portability (proprietary add-on on top of SQL)
  flexibility
Standalone engine
  Apache Lucene(tm) http://lucene.apache.org
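
As a rough illustration of what “inverted indices (word frequency, position)” means, the sketch below maps each word to the documents containing it and to the positions where it occurs. It is a toy in plain Java with hypothetical names and a naive tokenizer. Lucene's actual index structures are far more elaborate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class InvertedIndexSketch {

    // word -> (document id -> positions of the word inside that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    // Tokenizes on non-word characters and records each word's position.
    void add(int docId, String text) {
        int position = 0;
        for (String word : text.toLowerCase().split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(word, w -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position);
            position++;
        }
    }

    // Ids of the documents that contain the word.
    Set<Integer> docsContaining(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptyMap()).keySet();
    }

    // Positions of the word inside one document (empty if absent).
    List<Integer> positions(String word, int docId) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptyMap())
                       .getOrDefault(docId, Collections.emptyList());
    }

    // Word frequency inside one document.
    int frequency(String word, int docId) {
        return positions(word, docId).size();
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Java Persistence with Hibernate");              // sample documents
        index.add(2, "Hibernate Search in Action; Hibernate rocks");
        System.out.println(index.docsContaining("hibernate"));
        System.out.println(index.positions("hibernate", 2));
    }
}
```

Because lookups start from the word rather than from the row, word search, frequency-based scoring and position-based proximity all become cheap.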
Mismatches with a domain model
Structural mismatch
  full-text indexes are text only
  no reference/association between documents
Synchronization mismatch
  keeping index and database up to date
Retrieval mismatch
  the index does not store objects
  certainly not managed objects
[Diagram: Appl Fwk, Persistence, Search, Domain Model]
Demo
Let’s google our application!
Hibernate Search
Under the Hibernate platform
  LGPL
Built on top of Hibernate Core
Uses Apache Lucene(tm) under the hood
  In top 10 downloaded at Apache
  Very powerful but low level
  easy to use the “wrong” way
Solves the mismatches
[Diagram: Appl Fwk, Persistence, Domain Model, Search, JBoss Seam]
Architecture
Transparent indexing through event system (JPA)
  PERSIST / UPDATE / DELETE
Convert the object structure into index structure
Operation batching per transaction
  better Lucene performance
  “ACID”-ity
  (pluggable scope)
Backend
  synchronous / asynchronous mode
  Lucene, JMS
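
The idea of batching index operations per transaction can be sketched as follows. This is a simplified model in plain Java, with hypothetical class and method names, not Hibernate Search's internal code: change events are queued while the transaction runs, applied to the index in one go on commit, and dropped on rollback.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TransactionBatchSketch {

    enum Op { PERSIST, UPDATE, DELETE }

    // One queued unit of index work.
    private static final class Work {
        final Op op;
        final int id;
        final String text;

        Work(Op op, int id, String text) {
            this.op = op;
            this.id = id;
            this.text = text;
        }
    }

    // Stand-in for the index: entity id -> indexed text.
    final Map<Integer, String> index = new LinkedHashMap<>();

    // Work collected during the current transaction; nothing touches the index yet.
    private final List<Work> pending = new ArrayList<>();

    // Called for each entity change event during the transaction.
    void onEvent(Op op, int id, String text) {
        pending.add(new Work(op, id, text));
    }

    // Applies the whole batch at once and returns how many operations were applied.
    int commit() {
        int applied = pending.size();
        for (Work w : pending) {
            if (w.op == Op.DELETE) {
                index.remove(w.id);
            } else {
                index.put(w.id, w.text);
            }
        }
        pending.clear();
        return applied;
    }

    // A rolled-back transaction never reaches the index.
    void rollback() {
        pending.clear();
    }

    public static void main(String[] args) {
        TransactionBatchSketch tx = new TransactionBatchSketch();
        tx.onEvent(Op.PERSIST, 1, "Hibernate Search");
        tx.onEvent(Op.UPDATE, 1, "Hibernate Search in Action");
        System.out.println("Before commit: " + tx.index);
        tx.commit();
        System.out.println("After commit:  " + tx.index);
    }
}
```

Applying the queue only at commit is what gives the index its “ACID”-like behavior in this model, and a single batch is friendlier to the index writer than many small writes.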
Architecture (Backend)
Lucene Directory
  standalone or symmetric cluster (limited scalability)
  immediate visibility
  can affect front end runtime
[Diagram: Database; Lucene Directory (Index); two Hibernate + Hibernate Search nodes, each labeled “Search request” and “Index update”]
Architecture (Backend)
JMS (Cluster)
  Search processed locally / Change sent to master
  asynchronous indexing (delay)
  No front end extra cost / good scalability
[Diagram: Database; Slave: Hibernate + Hibernate Search, “Search request” against Lucene Directory (Index) Copy, “Index update order” to JMS queue; Master: Hibernate + Hibernate Search, “Process”, “Index update” to Lucene Directory (Index) Master; “Copy” from Master to Slave]
Configuration and Mapping
Configuration
  event listener wiring
    transparent in Hibernate Annotations
  Backend configuration
Mapping: annotation based
  @Indexed
  @Field(store, index)
  @IndexedEmbedded
  @FieldBridge
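
To give a feel for annotation-based mapping without pulling in any library, here is a minimal sketch in plain Java. The annotations SketchIndexed, SketchField and SketchEmbedded are hypothetical stand-ins defined in the example itself. They are not the Hibernate Search annotations and carry none of their options (store, index, bridges). The sketch only shows the general idea of turning an object graph into a flat, text-only document, with embedded associations becoming dotted field names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class MappingSketch {

    // Hypothetical stand-ins for illustration only, NOT the Hibernate Search annotations.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface SketchIndexed {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface SketchField {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface SketchEmbedded {}

    static class Address {
        @SketchField String city;
        Address(String city) { this.city = city; }
    }

    static class Author {
        @SketchField String name;
        @SketchEmbedded Address address;
        Author(String name, Address address) { this.name = name; this.address = address; }
    }

    @SketchIndexed
    static class Book {
        @SketchField String title;
        @SketchEmbedded Author author;
        int pageCount; // not annotated, so it never reaches the document
        Book(String title, Author author, int pageCount) {
            this.title = title; this.author = author; this.pageCount = pageCount;
        }
    }

    // Converts an entity marked as indexed into a flat map of field name -> text.
    static Map<String, String> toDocument(Object entity) {
        if (!entity.getClass().isAnnotationPresent(SketchIndexed.class)) {
            throw new IllegalArgumentException("Not an indexed type: " + entity.getClass());
        }
        Map<String, String> doc = new LinkedHashMap<>();
        flatten(entity, "", doc);
        return doc;
    }

    // Walks annotated fields; embedded objects contribute fields under a dotted prefix.
    private static void flatten(Object source, String prefix, Map<String, String> doc) {
        if (source == null) {
            return;
        }
        for (Field f : source.getClass().getDeclaredFields()) {
            try {
                f.setAccessible(true);
                Object value = f.get(source);
                if (f.isAnnotationPresent(SketchField.class) && value != null) {
                    doc.put(prefix + f.getName(), value.toString());
                } else if (f.isAnnotationPresent(SketchEmbedded.class)) {
                    flatten(value, prefix + f.getName() + ".", doc);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        Book book = new Book("Hibernate Search in Action",
                new Author("Emmanuel", new Address("Atlanta")), 400);
        System.out.println(toDocument(book));
    }
}
```

In this toy model the flattened key author.address.city is what lets a text-only index answer a query on a correlated object.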
Query
Retrieve objects, not documents
  no boilerplate conversion code!
Objects from the Persistence Context
  same semantics as a JPA-QL or Criteria query
Use org.hibernate.Query/javax.persistence.Query
  common API for all your queries
Query on correlated objects
  “JOIN”-like query "author.address.city:Atlanta"
Query possibilities
bring the “best” document first
recover from typos
recover from faulty orthography
find words with the same meaning
find words from the same family
find an exact phrase
find similar documents
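
One way to recover from faulty orthography is phonetic approximation: words are reduced to a code that reflects how they sound, and the codes are compared instead of the spellings. The sketch below uses the classic American Soundex scheme purely as an example. The slides do not say which phonetic algorithm is used, and the class name is hypothetical.

```java
class PhoneticSketch {

    // Soundex digit for each letter A..Z; '0' marks vowels (and Y), '-' marks H and W.
    private static final String CODES = "0123012-02245501262301-202";

    // Returns the four-character Soundex code of a word, or "" if it has no letters.
    static String soundex(String word) {
        String s = word.toUpperCase();
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // skip anything that is not a basic Latin letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                last = code;
                continue;
            }
            if (code == '-') {
                continue; // H and W are ignored and do not break a run of equal codes
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // a vowel resets the run, so a repeated consonant is coded again
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    // Two words "sound alike" in this sketch when their codes are equal.
    static boolean soundsLike(String a, String b) {
        return !soundex(a).isEmpty() && soundex(a).equals(soundex(b));
    }

    public static void main(String[] args) {
        System.out.println(soundex("hibernate") + " / " + soundex("hybernat"));
        System.out.println(soundsLike("hibernate", "hybernat"));
    }
}
```

A phonetic code is deliberately coarse, so it trades precision for recall. It works best combined with relevance scoring so that exact matches still come first.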
More query features
Fine grained fetching strategy
Sort by property rather than relevance
Projection (both field and metadata)
  metadata: SCORE, ID, BOOST etc
  no DB / Persistence Context access
Filters
  security / temporal data / category
Total number of results
Some more features
Automatic index optimization
Index sharding
Manual indexing and purging
  non event-based system
Shared Lucene resources
Native Lucene access
Full-Text search without the hassle
Transparent index synchronization
Automatic structural conversion through mapping
No paradigm shift when retrieving data
Clustering capability out of the box
Easier / transparent optimized Lucene use
Road Map
Programmatic Mapping API
Perf
Multi threaded mass index
Clustering
Query
Dictionary and spellchecker
Easier query building
MoreLikeThis?
Statistics
Q&A
For More Information
Hibernate Search
  http://search.hibernate.org
  Hibernate Search in Action
Apache Lucene
  http://lucene.apache.org
  Lucene In Action
http://in.relation.to
http://blog.emmanuelbernard.com