DBSight: Instant Scalable Full Text Search on Any Database, for Any Page

DBSight Introduction


DESCRIPTION

A tool for creating database search quickly, with high performance and scalability, easy maintenance, flexible result rendering, and API or JavaScript integration.


Page 1: DBSight Introduction

DBSight

Instant Scalable Full Text Search on Any Database, for Any Page

Page 2: DBSight Introduction

Everything Easy!

• Create Search with SQL
• Integrate Search with JavaScript
• Automatically managed
• Lots of ways to customize
  • Mostly via UI

Page 3: DBSight Introduction

Some Customers

• Websites
  • Workopolis.com, biggest job site in Canada
  • Twenga.com, busy
  • Current.com
• Corporate
  • eBay
  • Costco
  • FactSet
  • Genentech
• Consulting
  • Computer Sciences Corporation
• Federal
  • Federal Procurement Data System
• Banking
  • European Central Bank, International Counterfeit Deterrence Centre (ECB-ICDC)

Page 4: DBSight Introduction

Features

• Add Search when you want to
  • Database Independent
  • Language Independent
  • Very Easy to Create
  • Very Easy to Modify
  • Very Easy to Monitor/Maintain
• Feature Rich
  • Facet Search
  • Real Time Search
  • Flexible ranking
  • …
• High Performance
• Linearly Scalable

Page 5: DBSight Introduction

The Pain to Add Search

• What slow SQL do you use to search?
  select * from tableA where column1 like '%abc%' or column2 like '%abc%' or …
• Some databases have search features, but they are still
  • Not easy to customize
  • Without facet search
  • Database-specific solutions
• Some searches are Lucene based, but they
  • Do not cover the development life cycle
  • Are hard to maintain
  • May not scale as your data grows
  • Are too closely coupled with your program
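To make the first pain point concrete, here is a minimal sketch of why a leading-wildcard `LIKE` is slow compared with an index lookup. It uses Python's built-in `sqlite3` with a made-up `tableA`; the sample rows and helper names are assumptions for illustration, and the toy inverted index only stands in for what a Lucene-style engine does, not for DBSight's actual implementation.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample table; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO tableA (id, column1, column2) VALUES (?, ?, ?)",
    [
        (1, "red abc widget", "small"),
        (2, "blue gadget", "contains abc inside"),
        (3, "green gizmo", "large"),
    ],
)


def like_search(term):
    """Leading-wildcard LIKE: no ordinary index helps, so every row is scanned."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT id FROM tableA WHERE column1 LIKE ? OR column2 LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


def build_inverted_index():
    """Map each lowercase token to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, c1, c2 in conn.execute("SELECT id, column1, column2 FROM tableA"):
        for token in f"{c1} {c2}".lower().split():
            index[token].add(row_id)
    return index


index = build_inverted_index()


def index_search(term):
    """Token lookup: one dictionary access instead of a full table scan."""
    return sorted(index.get(term.lower(), set()))
```

Note the difference in semantics: `LIKE '%abc%'` matches substrings, while the toy index matches whole tokens. A real engine closes that gap with configurable analyzers.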

Page 6: DBSight Introduction

DBSight Design Goals

• Very fast to create and adjust
• "Knobs" to tune search
• Features beyond basic search
  o Facet Search
  o Results ranking by attributes like "price", "time"!
• Minimal administration
• Low total cost of ownership
  • Off-the-shelf, no expensive consulting fees
• Flexible
  • Customizable analyzers, similarities, data retrieval, UI scaffolding, different output formats, APIs.

Page 7: DBSight Introduction

2 steps to Create Search

1. Select with SQL
2. Generate Search Configuration, Results Template

Page 8: DBSight Introduction

Simple to Create

• Web UI to set the SQL that retrieves content

Page 9: DBSight Introduction

Simple to Create

• Scaffolding to generate search result template

Page 10: DBSight Introduction

Simple to Create

• Keep it DRY: Don't Repeat Yourself
• Make full use of existing metadata
  • From your SQL
  • Generate most Lucene configuration
  • Generate most search options
  • Generate most rendering templates
• UI to fine-tune customization
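As a rough illustration of reusing metadata from the SQL, the sketch below derives a default field configuration from the columns of a user-supplied query. It uses Python's `sqlite3`; the `product` table, the `infer_fields` helper, and the shape of the generated configuration are all hypothetical and do not reflect DBSight's real configuration format.

```python
import sqlite3

# Hypothetical source table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO product (id, name, price) VALUES (1, 'Widget', 2.5)")


def infer_fields(sql):
    """Derive a default per-column search configuration from a query's result."""
    cur = conn.execute(sql)
    names = [col[0] for col in cur.description]  # column names from query metadata
    sample = cur.fetchone()
    if sample is None:
        raise ValueError("query returned no rows to sample")
    fields = {}
    for name, value in zip(names, sample):
        if isinstance(value, (int, float)):
            # Numbers default to sortable attributes, not full-text fields.
            fields[name] = {"type": "number", "searchable": False, "sortable": True}
        else:
            # Everything else defaults to analyzed full-text.
            fields[name] = {"type": "text", "searchable": True, "sortable": False}
    return fields
```

Calling `infer_fields("SELECT id, name, price FROM product")` yields one entry per column, which a UI could then expose for fine-tuning instead of asking the user to write the configuration by hand.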

Page 11: DBSight Introduction

Why DBSight?

• A whole solution
  • No consulting fees
  • UI to manage everything
  • Set it up and let it run. No babysitting.
• Agnostic of programming languages or frameworks
• Create and maintain with basic SQL
• Built-in usage statistics
• Scalable
  • Linearly scalable sharded search
  • Separated indexing and searching
• Many customization points
  • Search results templates are easy to customize
  • API for deep integration

Page 12: DBSight Introduction

DBSight covers SDLC

• No existing production-ready solution on the market covers the whole software development life cycle.
  o DBSight makes search a separate concern
  o Changes easily when the database schema changes
  o Enterprise ready
    • Monitoring
    • Portable across Dev => Test => Stage => Production environments

Page 13: DBSight Introduction

DBSight – Loosely Coupled

• Database Independent
• Programming language Independent
• Framework Independent
• Works during the whole software life cycle
  • Allows frequent adjustment
  • Easily re-create the whole index
• Linearly scalable for high concurrency and for high data volume

Page 14: DBSight Introduction

DBSight – Enterprise Ready

• Easy to move between deployment environments
  o Development, Testing (QA), Staging, Production
• Secure
  o Access control
  o Sensitive database passwords
• Package-able solution
  o Import/Export configuration
  o Customizable enterprise-specific scaffolding

Page 15: DBSight Introduction

DBSight – SQL Friendly

• Incremental Indexing
  o Handle New/Updated/Deleted records
  o Find Deleted Records Efficiently!
    • Support Hard Deleted Records
    • Support Soft Deleted Records
• Flexible User Defined SQL
  o Star-schema like content retrieval
  o No too-smart auto discovery
• Efficient
  o Multi-threaded data retrieval
  o Caching to minimize database load
  o Customizable number of SQL connections
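One common way to implement incremental indexing over SQL is a high-water mark on a modification timestamp, a flag column for soft deletes, and a primary-key comparison for hard deletes. The sketch below shows that pattern with Python's `sqlite3`. The `item` table, its `modified_at` and `is_deleted` columns, and the dictionary standing in for the search index are assumptions for illustration; the slides do not say how DBSight does this internally.

```python
import sqlite3

# Hypothetical source table with a modification timestamp and a soft-delete flag.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT, "
    "modified_at INTEGER, is_deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO item (id, title, modified_at, is_deleted) VALUES (?, ?, ?, ?)",
    [(1, "first", 100, 0), (2, "second", 100, 0), (3, "third", 100, 0)],
)

index = {}  # stand-in for a search index: id -> title


def sync(last_sync):
    """Apply all changes made after last_sync and return the new high-water mark."""
    high_water = last_sync
    changed = conn.execute(
        "SELECT id, title, modified_at, is_deleted FROM item "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_sync,),
    ).fetchall()
    for row_id, title, modified_at, is_deleted in changed:
        if is_deleted:
            index.pop(row_id, None)   # soft delete: row still exists, flag is set
        else:
            index[row_id] = title     # new or updated record
        high_water = max(high_water, modified_at)

    # Hard deletes leave no row behind, so compare primary keys instead.
    live_ids = {row[0] for row in conn.execute("SELECT id FROM item")}
    for gone in set(index) - live_ids:
        del index[gone]
    return high_water
```

Each scheduled run passes in the mark returned by the previous run, so only rows changed since then are read in full; the hard-delete check reads primary keys only.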

Page 16: DBSight Introduction

Easy to maintain

• Scheduled Jobs
  o Incremental indexing
  o Re-create the index
  o Build spell-checking, synonym, and stop-word dictionaries
• Web UI to
  o Monitor the indexing process
  o Monitor search usage

Page 17: DBSight Introduction

Facet Search

• Not all facet searches are the same!
  • Single Value
  • Multiple Value
  • Range Value
• Fast!
• Memory Efficient!
• More Features!
  • List most-used facets according to usage!
  • Sum()/Avg() functions

Page 18: DBSight Introduction

Fast Facet Search

• In-memory facets
• Several caches for more speed
  • Cache for top facets
  • Cache for recent facets
• Automatic pre-warming
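The caching idea above can be sketched as a two-tier cache: a pinned set of top facets computed at startup (the pre-warm step) plus a small least-recently-used cache for everything else. The `FacetCache` class and its parameters are hypothetical, written only to illustrate the pattern; DBSight's real cache design is not described in the slides.

```python
from collections import OrderedDict


class FacetCache:
    """Two-tier cache: pre-warmed top facets plus an LRU cache of recent ones."""

    def __init__(self, compute, top_keys, recent_size=2):
        self._compute = compute
        self._recent = OrderedDict()
        self._recent_size = recent_size
        # Pre-warm: compute the most-used facets before the first request arrives.
        self._top = {key: compute(key) for key in top_keys}

    def get(self, key):
        if key in self._top:
            return self._top[key]
        if key in self._recent:
            self._recent.move_to_end(key)      # mark as most recently used
            return self._recent[key]
        value = self._compute(key)
        self._recent[key] = value
        if len(self._recent) > self._recent_size:
            self._recent.popitem(last=False)   # evict the least recently used entry
        return value
```

Here `compute` is whatever function builds the facet counts for a field; pinning the top facets means the most common requests never pay that cost after startup.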

Page 19: DBSight Introduction

Examples of Facet Search

• Basic Facet Search
  • Example: Category
    • Classic (23 matches)
• Multi-Valued Facet Search
  • Example: Tag, or Tag Cloud. Several tags for one record
• Configurable dynamic facet grouping
  o Example: Price Range
    • $2 ~ $3 (7 matches)
• Average/Sum for each facet
  o Example: Average price for the Year Range
    • 1970 ~ 1980 (6 matches, average price $1,245.34)
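The facet types listed above can be sketched in a few lines over in-memory records. The sample records, field names, and helper functions below are assumptions made up for illustration; they show what each facet type computes, not how DBSight computes it.

```python
from collections import Counter, defaultdict

# Hypothetical records; values are illustrative only.
records = [
    {"category": "Classic", "tags": ["sedan", "v8"], "price": 2.5, "year": 1972},
    {"category": "Classic", "tags": ["coupe"], "price": 2.0, "year": 1978},
    {"category": "Modern", "tags": ["sedan"], "price": 7.0, "year": 1995},
]


def single_value_facet(rows, field):
    """Count matches per value when each record has exactly one value."""
    return Counter(row[field] for row in rows)


def multi_value_facet(rows, field):
    """Count matches per value when a record can carry several values (tags)."""
    return Counter(value for row in rows for value in row[field])


def range_groups(rows, field, bounds):
    """Group records into half-open ranges [low, high)."""
    groups = defaultdict(list)
    for row in rows:
        for low, high in bounds:
            if low <= row[field] < high:
                groups[(low, high)].append(row)
                break
    return groups


def range_facet_avg(rows, field, bounds, value_field):
    """For each range, return (match count, average of value_field)."""
    result = {}
    for key, members in range_groups(rows, field, bounds).items():
        values = [member[value_field] for member in members]
        result[key] = (len(members), sum(values) / len(values))
    return result
```

For example, `range_facet_avg(records, "year", [(1970, 1980), (1990, 2000)], "price")` gives the match count and average price for each year range, the same shape as the "1970 ~ 1980" example on the slide.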

Page 20: DBSight Introduction

Feature: Separated Indexing and Searching

• Problem: Search Pauses!
  o Indexing is CPU and disk intensive
  o Searching is CPU, disk, and memory intensive
  o Resource competition
• Solution: Separated Indexing and Searching
  o Different JVM processes
    • Easier to manage resources via JVM settings
  o Indexing and searching can be on different machines
    • Improved performance
    • No hiccups from CPU, memory, or disk resource contention
    • Cluster of searching nodes for better scalable performance!

Page 21: DBSight Introduction

DBSight Architecture

• Crawl database via user defined SQL
  • Multiple database tables
  • Support star-schema like outer joins
• Create and Maintain Lucene index
  • Incremental Indexing
  • Re-Creating Indexing
• Serve Search results via
  • user defined templates
    • XML/HTML/JSON/JSONP
  • API, protocol buffer for Java and other languages
• Linear Scalability
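Of the output formats listed above, JSON and JSONP differ only in a wrapper: JSONP pads the JSON body with a caller-supplied JavaScript function name so a page on another domain can load it through a script tag. The `render` function below is a hypothetical sketch of that difference, not DBSight's template or API layer.

```python
import json
import re

# Accept only plain identifiers (optionally dotted) as JSONP callback names.
_CALLBACK = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$.]*$")


def render(results, fmt="json", callback=None):
    """Serialize search results as JSON, or as JSONP wrapped in a callback."""
    body = json.dumps({"total": len(results), "results": results})
    if fmt == "json":
        return body
    if fmt == "jsonp":
        # Validating the name keeps callers from injecting arbitrary script.
        if callback is None or not _CALLBACK.match(callback):
            raise ValueError("invalid JSONP callback name")
        return f"{callback}({body});"
    raise ValueError(f"unsupported format: {fmt}")
```

A page would request the JSONP form with its own function name, for example `render(hits, fmt="jsonp", callback="showResults")`, and the browser would call `showResults` with the result object.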

Page 23: DBSight Introduction

Avoids Global GC Pause!

• Separated indexing and searching processes
• Fast index update
• Avoids memory GC pause!
• No user waiting, even while the index is updating!
• More scalable

• Java has its shortcomings.
• Global stop-the-world garbage collection simply keeps users waiting.
• Common solutions, open source solutions, and home-grown solutions often fail to address this issue, or hope you will not notice it.

Page 25: DBSight Introduction

DBSight Common Setup

• Embedded
• Multiple Indexes on a single node
• Single Node, Indexing + Searching
• Two Nodes, Separated Indexing and Searching
  • Setup for LAN
  • Setup for WAN
• Cluster of Searching nodes via Replication, one Indexing node
• Cluster of Sharded Nodes, each with Indexing + Searching
• Cluster of Searching nodes via Sharding, one or several Indexing nodes
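In the sharded setups above, a query is typically sent to every shard and the per-shard top results are merged into one ranked list (scatter-gather). The sketch below shows that merge step with Python's `heapq`. The shard contents, the term-count scoring, and the function names are assumptions for illustration; real shards would be separate search nodes queried over the network.

```python
import heapq
from itertools import islice


def search_shard(shard, term, k):
    """Return the top-k (score, doc_id) hits from one shard, best first."""
    hits = []
    for doc in shard:
        score = doc["text"].lower().split().count(term)  # toy score: term frequency
        if score > 0:
            hits.append((score, doc["id"]))
    return heapq.nlargest(k, hits)


def search_all(shards, term, k):
    """Scatter the query to every shard, then merge the sorted partial results."""
    partials = [search_shard(shard, term, k) for shard in shards]
    # Each partial list is sorted descending, so a k-way merge keeps global order.
    merged = heapq.merge(*partials, reverse=True)
    return list(islice(merged, k))
```

Because each shard returns only its own top k, the merge cost depends on the number of shards and k, not on total data volume, which is one reason sharded search can scale close to linearly as nodes are added.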
