SQL for Elasticsearch


Slides of the talk Jodok Batlogg gave at the Elasticsearch Meetup in Berlin


SQL on Elasticsearch?

How it all started

You know, for search... querying 24 000 000 000 records in 900 ms

@jodok

6 ES master nodes (c1.xlarge)

40 ES nodes per zone (m1.large, 8 EBS volumes)

6-node Hadoop cluster + spot instances

3 AP server / MC (c1.xlarge)

Elasticsearch as Primary Storage?

NoSQL Roadshow 2013 Jodok Batlogg

• Security Model?

• Transactions?

• Data security?

• Toolsets?

• Larger Computations?

• Availability?

DISTRIBUTED DATASTORE WITH SQL. SIMPLE. RELIABLE. SCALABLE.

Open Source (Apache 2.0)

shared nothing

is highly available and cheap to operate.

not NOSQL but SQL

NOFS but distributed BLOBs

CRATE DATA – Module overview

Layers: Client, Query, Data Aggregation, Network/Cluster, Storage

Modules: CRATE Dashboard, Python, Java, Ruby, DB-API, SQLAlchemy, CRATE Shell, ES native, Transport, FB Presto SQL Parser, Query planner, Bulk import/export, BLOB streaming, BLOB streaming support, Distributed SQL, Distributed reduce, Data transformation and reindex support, ES Scatter/Gather, ES Transport protocol, ES Discovery and state, ES Sharding, Netty, Lucene, BLOB storage, ES

Legend: 3rd party Open Source Modules, CRATE

START A CLUSTER IN 1 MIN

https://crate.io

BLOB Storage

Distributed Accurate Aggregations

Partitioned Tables

Import/Export

Update by Query

Insert by Query

Integrated Admin-UI
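
The slides name scatter/gather, distributed reduce and accurate aggregations without showing how they fit together. The following is a minimal sketch of the general idea, not Crate's actual implementation: each shard computes a partial aggregate (count and sum) over its own rows, and a coordinating node merges the partials. Merging counts and sums gives an exact average, whereas averaging per-shard averages would not. All names here (`PartialAgg`, `scatter`, `gather`, the sample rows) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartialAgg:
    """Partial aggregate computed locally on one shard (hypothetical type)."""
    count: int = 0
    total: float = 0.0

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value

    def merge(self, other: "PartialAgg") -> "PartialAgg":
        # Counts and sums can be combined without losing information.
        return PartialAgg(self.count + other.count, self.total + other.total)


def scatter(shards, key, value):
    """Scatter phase: every shard groups its own rows into partial aggregates."""
    partials = []
    for rows in shards:
        local = {}
        for row in rows:
            local.setdefault(row[key], PartialAgg()).add(row[value])
        partials.append(local)
    return partials


def gather(partials):
    """Gather phase: the coordinating node merges the partials per group."""
    merged = {}
    for local in partials:
        for group, agg in local.items():
            merged[group] = merged.get(group, PartialAgg()).merge(agg)
    return {g: {"count": a.count, "avg": a.total / a.count}
            for g, a in merged.items()}


# Hypothetical data: one table spread over three shards.
shards = [
    [{"country": "AT", "amount": 10.0}, {"country": "DE", "amount": 30.0}],
    [{"country": "AT", "amount": 20.0}],
    [{"country": "DE", "amount": 50.0}, {"country": "DE", "amount": 10.0}],
]
result = gather(scatter(shards, "country", "amount"))
```

In SQL terms this corresponds roughly to `SELECT country, count(*), avg(amount) FROM t GROUP BY country`, with the table name and columns assumed for illustration.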

How is Crate Data different from Elasticsearch?

Thank you

Jodok Batlogg, @jodok, jodok@crate.io

github.com/crate, #crate / freenode, @cratedata

Demo Video

http://bigdatanerd.files.wordpress.com/2011/12/cap-theorem.jpg

• Basically Available - you always get a response

• Soft State - it’s not consistent all the time.

• Eventually Consistent - it becomes consistent at a later point in time

BASE & CAP
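
The three BASE properties can be illustrated with a toy model. This is a sketch under simplifying assumptions (in-memory replicas, a single global version counter, an explicit `sync` step), not a description of how Elasticsearch or Crate replicate data; `Replica`, `Cluster` and `sync` are hypothetical names.

```python
class Replica:
    """One copy of the data; it may lag behind the other copies."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def read(self, key):
        # Basically available: a read always answers, possibly with stale data.
        entry = self.data.get(key)
        return entry[1] if entry else None


class Cluster:
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.version = 0

    def write(self, key, value, replica=0):
        # Soft state: the write lands on one replica first,
        # so the copies temporarily disagree.
        self.version += 1
        self.replicas[replica].data[key] = (self.version, value)

    def sync(self):
        # Eventually consistent: every replica adopts the newest version of each key.
        newest = {}
        for r in self.replicas:
            for key, entry in r.data.items():
                if key not in newest or entry[0] > newest[key][0]:
                    newest[key] = entry
        for r in self.replicas:
            r.data.update(newest)


cluster = Cluster(3)
cluster.write("user:1", "alice", replica=0)
before = [r.read("user:1") for r in cluster.replicas]  # replicas disagree
cluster.sync()
after = [r.read("user:1") for r in cluster.replicas]   # replicas agree
```

Before `sync` only the first replica returns the new value; afterwards all three do.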
