Apache Solr vs Oracle Endeca 07-01-2015 Pedro Melo Pereira

Summary

1 Key concepts
– Enterprise search platform
– Faceted search
2 Projects overview
3 Apache Solr
4 Oracle Endeca
5 Feature comparison
6 Conclusion

1. Key Concepts

• Enterprise search platform

The practice of identifying and enabling specific content across the enterprise to be indexed, searched and displayed to authorized users.

How do they work

Phases shown in the diagram: Content Indexing, Query Processing

1. Collection: Crawls directories and websites, and extracts content from databases and other repositories. Arranges for content to be transferred to it on a regular basis so it can notify the search engine that new information is available.
2. Indexing: Creates a searchable index from all the content, often with some value-added processing such as metadata extraction and auto-summarization (groups information into logical categories).
3. Query Parser: Accepts searcher queries and encodes them for optimal use.
4. Query Engine: Passes the query over the index and finds documents matching the search criteria.
5. Post Processor: Sorts documents and applies logic to the results, such as categorization, clustering and recommendations.
6. Formatter: Streams out and formats the results.
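The six components above can be sketched as a toy in-memory pipeline. This is only an illustration of the flow, not how Solr or Endeca implement it. The function names (`collect`, `build_index`, `parse_query`, `run_query`, `post_process`, `format_results`) and the sample documents are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample content standing in for crawled files or database rows.
SOURCES = {
    "doc1": "Digital SLR camera with zoom lens",
    "doc2": "Compact digital camera",
    "doc3": "Camera bag for SLR bodies",
}


def collect(sources):
    """1. Collection: gather (id, text) pairs from the content sources."""
    return list(sources.items())


def build_index(documents):
    """2. Indexing: map each lowercased term to the ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents:
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def parse_query(raw_query):
    """3. Query Parser: normalize the user's text into search terms."""
    return raw_query.lower().split()


def run_query(index, terms):
    """4. Query Engine: return ids of documents containing every term."""
    if not terms:
        return set()
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return matches


def post_process(doc_ids):
    """5. Post Processor: apply result logic (here, just a sort by id)."""
    return sorted(doc_ids)


def format_results(doc_ids, sources):
    """6. Formatter: render the results for display."""
    return [f"{doc_id}: {sources[doc_id]}" for doc_id in doc_ids]


index = build_index(collect(SOURCES))
hits = post_process(run_query(index, parse_query("SLR camera")))
print(format_results(hits, SOURCES))
```

A real platform would replace each stage with something far richer (crawlers, analyzers, ranking), but the hand-off between stages follows the same order.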

• Faceted search

– It is the dynamic clustering of items or search results into categories that let users drill into search results by any value in any field. Each facet displayed shows the number of hits that match that category. Users can “drill down” by applying specific constraints to the search results. Also called faceted browsing, faceted navigation, guided navigation and parametric search.

The example started out with all digital cameras, then the user selected the constraints “$400-$500” and “SLR” from the Price and Digital camera type facets.
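The digital camera example can be sketched in a few lines: count the hits per facet value, then drill down by applying constraints. This is a minimal sketch with a made-up catalogue. The field names (`type`, `price_range`) and the helpers `apply_constraints` and `facet_counts` are assumptions for illustration, not part of either product.

```python
from collections import Counter

# Hypothetical catalogue; fields and values are made up for illustration.
CAMERAS = [
    {"name": "A", "type": "SLR", "price_range": "$400-$500"},
    {"name": "B", "type": "SLR", "price_range": "$500-$600"},
    {"name": "C", "type": "Compact", "price_range": "$400-$500"},
    {"name": "D", "type": "SLR", "price_range": "$400-$500"},
]


def apply_constraints(items, constraints):
    """Keep only the items whose fields match every selected constraint."""
    return [
        item for item in items
        if all(item[field] == value for field, value in constraints.items())
    ]


def facet_counts(items, field):
    """Count how many of the given items fall under each value of a field."""
    return Counter(item[field] for item in items)


# Start with all cameras, then drill down with two constraints.
all_types = facet_counts(CAMERAS, "type")
selected = apply_constraints(
    CAMERAS, {"price_range": "$400-$500", "type": "SLR"}
)
print(dict(all_types))
print([camera["name"] for camera in selected])
```

The counts are always computed over the current result set, which is what lets each facet show the number of hits the user would get by selecting it.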

• Faceted search benefits

– Superior feedback: Users can see at a glance a summary of the search results and how those results break down by different criteria.
– No surprises or dead ends: Users know how many results match before they click. Values with zero counts are normally removed to reduce visual noise and eliminate the possibility of a user accidentally selecting a constraint that would lead to no results.
– No selection hierarchy is imposed: Users are generally free to add or remove constraints in any order.
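Two of these benefits are easy to show in code: hiding facet values with zero hits, and reaching the same result set whichever constraint is applied first. The sketch below assumes a tiny made-up result set. `narrow`, `visible_facet_values` and `KNOWN_TYPES` are hypothetical names.

```python
# Hypothetical result set and facet values, for illustration only.
ITEMS = [
    {"brand": "Acme", "type": "SLR"},
    {"brand": "Acme", "type": "Compact"},
    {"brand": "Zenit", "type": "SLR"},
]
KNOWN_TYPES = ["SLR", "Compact", "Mirrorless"]


def narrow(items, field, value):
    """Apply a single constraint to the current result set."""
    return [item for item in items if item[field] == value]


def visible_facet_values(items, field, known_values):
    """Return {value: count} for a facet, dropping values with zero hits."""
    counts = {value: 0 for value in known_values}
    for item in items:
        counts[item[field]] = counts.get(item[field], 0) + 1
    return {value: count for value, count in counts.items() if count > 0}


# "Mirrorless" matches nothing, so it is never offered to the user.
print(visible_facet_values(ITEMS, "type", KNOWN_TYPES))

# Constraints can be applied in either order with the same outcome.
brand_first = narrow(narrow(ITEMS, "brand", "Acme"), "type", "SLR")
type_first = narrow(narrow(ITEMS, "type", "SLR"), "brand", "Acme")
print(brand_first == type_first)
```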

2. Projects overview

Apache Solr

An open source, community-supported tool that allows IT to implement a faceted search capability based on text queries to an index of your data model (e.g. products).

• More extensible
• Faceted search – text search based
• Limited tools

Oracle Endeca

A mature product that provides all the GUI-based tools needed to allow IT and business to quickly deploy search and navigation built on queries to a text and object based data model.

• Faster time to market
• Guided navigation – data model based
• Robust integrated tool set

3. Apache Solr

Solr is a highly popular open source enterprise search platform from Apache. It uses the Lucene Java search library at its core for full-text indexing and search, and it exposes REST-like HTTP/XML and JSON APIs that make it usable from most programming languages.

The Apache Lucene and Apache Solr projects were merged in 2010.

Strengths

• Free
• More powerful and extensible (e.g. freedom to build custom ranking algorithms)
• Larger adoption by the industry
• Larger community / modules / documentation
• Based on industry-proven modules

Weaknesses

• No out-of-the-box GUI for business users. It has to be implemented by IT
• No reporting
• It’s considered a framework, not a product
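To make the REST-like HTTP/JSON API concrete, the sketch below builds a request URL for Solr's `/select` handler with faceting turned on. `q`, `wt`, `rows`, `facet`, `facet.field` and `facet.mincount` are standard Solr query parameters. The host, the core name `products` and the field names are assumptions for illustration. No request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed values: a local Solr instance with a core named "products".
SOLR_BASE_URL = "http://localhost:8983/solr/products"


def build_facet_query_url(text, facet_fields, rows=10):
    """Build a URL for Solr's /select handler that requests facet counts."""
    params = [
        ("q", text),              # the user's search text
        ("wt", "json"),           # ask for a JSON response
        ("rows", str(rows)),      # number of documents to return
        ("facet", "true"),        # turn faceting on
        ("facet.mincount", "1"),  # omit facet values with zero hits
    ]
    # facet.field may be repeated, once per field to facet on.
    params.extend(("facet.field", field) for field in facet_fields)
    return f"{SOLR_BASE_URL}/select?{urlencode(params)}"


url = build_facet_query_url("digital camera", ["type", "price_range"])
print(url)

# Parse the URL back to confirm both facet fields were encoded.
print(parse_qs(urlsplit(url).query)["facet.field"])
```

Because the interface is plain HTTP, the same URL can be requested from any language with an HTTP client, which is the portability point made above.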

4. Oracle Endeca

“Oracle had struggled to develop a strategy for enterprise search that would define it as a Leader. To do this, it has repurposed Oracle Secure Enterprise Search as a tool that informs all its applications.

The acquisition of Endeca catapults Oracle forward in terms of search facility, though, at Oracle, Endeca is more prominent as a means of improving business intelligence than as a search product.”

Strengths – Gartner report 2013/05

• Oracle offers strong flexibility for the design of conversational search capabilities to reduce the ambiguity of results
• Oracle has very strong experience in e-commerce use cases
• Oracle has invested particularly strongly in the searching and analysing of structured data for hybrid structured / unstructured use cases

Weaknesses

• Oracle has changed the pricing model from a price per data record to a price per processor (Oracle’s long-standing model). Clients indicate that they are often dissatisfied with this new model.
• Oracle is positioning Endeca as a search technology in the e-commerce arena, which might weaken its development as a stand-alone enterprise search engine.

5. Feature comparison

Feature | Apache Solr | Oracle Endeca
Data modeling | XML editing | GUI tool set that supports configuration and joining data from multiple sources
Index inspection | Velocity based application that supports search | Robust reference application to inspect data and explore features
Business users | n/a | GUI based business suite to manage configurations
Merchandising | n/a | GUI to manage merchandising rules
Reporting | n/a | Out of the box reports for search, navigation and merchandising
Relevance ranking | Extend a class to create what you want | Limited to adjusting modules
XQuery | n/a | XQuery based ad-hoc querying with XML support

Feature | Apache Solr | Oracle Endeca
Aggregating records | n/a | Rollup records based on a property to support variants
Hierarchical dimensions | n/a | Possible to define hierarchies for ranges
Internationalization | Out of the box only supports English. Has to use external modules to support other languages | Licensed support for multiple languages
Clustering | Manually configured by IT by using external modules | Automatic organization of search results into sets that share attributes
Scalability | Based on Apache Zookeeper. Easy to scale up. More powerful | Linear scalability out of the box. Easier to manage
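The "aggregating records" row describes rolling up records that share a property so that product variants appear as one result. The sketch below shows the general idea only. It is not Endeca's implementation, and the record fields and the `rollup` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical product variants; "product_id" is the assumed rollup property.
RECORDS = [
    {"sku": "TS-RED-M", "product_id": "tshirt", "color": "red"},
    {"sku": "TS-BLUE-M", "product_id": "tshirt", "color": "blue"},
    {"sku": "CAP-BLACK", "product_id": "cap", "color": "black"},
]


def rollup(records, key):
    """Group records sharing the same value of `key` into one aggregate."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return [
        {key: value, "variant_count": len(members), "variants": members}
        for value, members in groups.items()
    ]


for aggregate in rollup(RECORDS, "product_id"):
    print(aggregate["product_id"], aggregate["variant_count"])
```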

6. Conclusion

Apache Solr

Strengths

• Fully integrated with Lucene (same project, different modules).
• More freedom to customize and adapt to business needs.
• More powerful API.
• Larger adoption / community.

Weaknesses

• No out-of-the-box features for business users.
• Longer time to market, since IT has to implement features (e.g. reporting, business back office).

Oracle Endeca

Strengths

• Aligned with Oracle’s long-term goals to make it the e-commerce reference for enterprise search.
• Out-of-the-box features for business users (back office).

Weaknesses

• Separate index. No integration with Lucene.
• More constrained API. Possibly more difficult to adapt to diverse business needs.
• Smaller adoption / community.

Q&A