insidehpc
DESCRIPTION
In this slidecast, Robin Purohit of Clustrix describes the company's leading scale-out SQL database engineered for the cloud. "Clustrix provides the scale, flexibility, simplicity, availability, and raw power that have given both enterprise and fast-growth organizations the ability to innovate faster -- and drive those innovations to market sooner than their competition. As the most mature of the primary databases, Clustrix is the leading scale-out SQL database engineered for the cloud. With Clustrix, organizations can scale transactions, run real-time analytics, and simplify operations." Learn more: http://www.clustrix.com Watch the presentation video: http://inside-bigdata.com/2013/09/06/clustrix-scaleout-sql-database-engineered-cloud/
The Leading Scale-out SQL Database Engineered for the Cloud
Robin Purohit
CEO and President
SCALE-OUT DATABASES ARE THE RIGHT APPROACH
UNLESS YOU HAVE UNLIMITED MONEY TO SPEND
NoSQL NewSQL Hadoop
FOR HYPER-SCALE WEB AND MOBILE APPLICATIONS
Cloud Makes It Possible to Do This Quickly and Pay-as-you-go
Great Idea Billions of Transactions and Rows
Smarter Application
Ad Hoc Reporting
SCALE-OUT SQL DATABASE FOR OPERATIONAL DATA
MASSIVE TRANSACTION VOLUME
REAL-TIME ANALYTICS
ACID, SQL AND MYSQL
SELF-MANAGING
BUILT-IN INSTRUMENTATION
SCALE-OUT SQL
Add nodes as demand grows
Automated recovery on failure
OPERATIONAL DATABASE
E-commerce
EXAMPLE APPLICATION SEGMENTS
BATTLE-TESTED LESSONS
Consumer Web Advertising Analytics
BUSTING THE MYTH - SQL CAN SCALE
• 20 million+ users / 70,000+ TPS
• Write-heavy workload; 1TB+ writes / day
Massive Transaction Scale Real-Time Analytics
MIXED WORKLOADS
IF YOU DON’T BELIEVE US – BELIEVE GOOGLE
F1 Based on “SPANNER” for AdWords
http://www.theregister.co.uk/2013/08/30/google_f1_deepdive/
“100s of applications on over 100TB serving up 100s of thousands of requests per second
+ SQL queries that scans tens of trillions of data rows a day”
HOW TO CHOOSE THE RIGHT TOOL FOR THE JOB?
E-COMMERCE EXAMPLE (SQL NORMALIZATION + JOIN = GOOD)
Customers (many)
Products (many or few, and may require flexibility)
Orders (many)
Reviews (many)
Problem is naturally relational - Orders, Reviews are for products by customers
What questions do you have?
• Do you want to know all reviews for a product along with the customer who wrote it (Product X Review X Customer)?
• What about most popular products in San Francisco, or last 10 orders by a customer?
What flexibility do you need?
• Maybe all products have different attributes
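The questions above can be sketched against a normalized schema. This is a minimal illustration using Python's built-in sqlite3 module rather than Clustrix itself; the table names, column names, and sample rows are assumptions made up for the example and do not come from the slides.

```python
import sqlite3

# Hypothetical normalized schema; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders    (id INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customers(id),
                        product_id  INTEGER REFERENCES products(id),
                        ordered_at  TEXT);
CREATE TABLE reviews   (id INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customers(id),
                        product_id  INTEGER REFERENCES products(id),
                        body TEXT);
INSERT INTO customers VALUES
  (1, 'Ann', 'San Francisco'), (2, 'Bob', 'San Francisco'), (3, 'Cy', 'Boston');
INSERT INTO products VALUES (1, 'Lamp'), (2, 'Desk');
INSERT INTO orders VALUES
  (1, 1, 1, '2013-09-01'), (2, 2, 1, '2013-09-02'),
  (3, 1, 2, '2013-09-03'), (4, 3, 2, '2013-09-04'), (5, 3, 2, '2013-09-05');
INSERT INTO reviews VALUES (1, 1, 1, 'Bright'), (2, 3, 2, 'Sturdy');
""")

def reviews_for_product(product_id):
    # Product x Review x Customer: each review with the customer who wrote it.
    return conn.execute("""
        SELECT p.name, r.body, c.name
        FROM reviews r
        JOIN products p ON p.id = r.product_id
        JOIN customers c ON c.id = r.customer_id
        WHERE p.id = ?
        ORDER BY r.id""", (product_id,)).fetchall()

def popular_products_in(city):
    # Products ranked by number of orders placed by customers in one city.
    return conn.execute("""
        SELECT p.name, COUNT(*) AS n
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        JOIN products p ON p.id = o.product_id
        WHERE c.city = ?
        GROUP BY p.id, p.name
        ORDER BY n DESC, p.name""", (city,)).fetchall()

def last_orders(customer_id, limit=10):
    # Most recent orders for one customer, newest first.
    return conn.execute("""
        SELECT o.id, p.name, o.ordered_at
        FROM orders o
        JOIN products p ON p.id = o.product_id
        WHERE o.customer_id = ?
        ORDER BY o.ordered_at DESC, o.id DESC
        LIMIT ?""", (customer_id, limit)).fetchall()
```

Each question maps to one declarative query over the same four tables, which is the point of the "normalization + join" slide. Products with widely varying attributes are the part this fixed schema does not address.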
WHAT DATA and WHAT QUESTIONS?
How SIMPLY do the QUESTIONS need to be answered?
MAP REDUCE OR SQL?
And how many lines of code?
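To make the lines-of-code comparison concrete, here is a rough sketch that answers one question ("how many orders per product?") twice: once in a hand-written map/shuffle/reduce style and once as a single SQL statement. It runs in plain Python with sqlite3, not on Hadoop or Clustrix, and the sample data and function names are hypothetical.

```python
import sqlite3
from collections import defaultdict
from functools import reduce

# Hypothetical order records: (order_id, product_name).
ORDERS = [(1, "Lamp"), (2, "Lamp"), (3, "Desk"), (4, "Desk"), (5, "Desk")]

# Map/reduce style: the application spells out every phase itself.
def map_phase(orders):
    # Emit one (key, 1) pair per order.
    return [(product, 1) for _, product in orders]

def shuffle_phase(pairs):
    # Group the emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the values collected for each key.
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in groups.items()}

def orders_per_product_mapreduce(orders):
    return reduce_phase(shuffle_phase(map_phase(orders)))

# SQL style: the same question is one declarative statement.
def orders_per_product_sql(orders):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    rows = conn.execute(
        "SELECT product, COUNT(*) FROM orders GROUP BY product").fetchall()
    conn.close()
    return dict(rows)
```

Both functions return the same counts. The difference is that the SQL version leaves grouping and execution strategy to the database, while the map/reduce version encodes them in application code.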
WHEN do you want the QUESTIONS answered?
How COMPLEX is the Question?
NoSQL: Key-Value, Document
NewSQL: e.g. Clustrix
Warehousing Analytics: Hadoop, Vertica, Redshift
Query Complexity
In Memory Analytics
Reads and Writes Real-Time Analytics Batch Analytics
milliseconds seconds minutes hours
ETL
Hadoop Key-Value
SQL Warehousing
Vertica
SIZE and FLEXIBILITY and QUERIES
SIZE FLEXIBILITY
NewSQL 10s of TBs
100s of TBs
Petabytes Key-Value Hadoop
Document / Tabular
Relational Schema, Online schema changes
Schema-less
NEWSQL
Rows with different columns
QUERY ABILITY
Simple lookup
Indexed lookup
Joins and complex Analytics
With Flexibility, you Lose the sophisticated SQL Query optimizer
RIGHT TOOL FOR THE JOB
NoSQL NewSQL Hadoop Columnar
OPERATIONAL DATA BATCH ANALYSIS
With A Lot More SQL
Clustrix Technical Resources
docs.clustrix.com