Real World NoSQL (by Chris Yuen)

Real World NoSQL x Big Data

Overview

Introduction

Motivation for NoSQL

The NoSQL landscape

Experience sharing

HBase

MongoDB

Cassandra

Tying it up – how does it really matter

Motivation

Too much data – the need to “scale out”

CAP theorem

Performance

RDBMS joins are slow

Denormalization

Key value data store

Alternative data representation

Schemaless “No SQL”

Document data store
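The "joins are slow, so denormalize into a key value store" point can be sketched in a few lines. This is a toy illustration only: plain Python dicts stand in for the relational tables and for the key value store, and every name here (`users`, `orders`, `kv_store`, the `user:<id>` key format) is an assumption made for the example, not something from the talk.

```python
# Toy illustration: dicts and lists stand in for tables and a key value store.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
orders = [
    {"order_id": 100, "user_id": 1, "item": "book"},
    {"order_id": 101, "user_id": 1, "item": "pen"},
    {"order_id": 102, "user_id": 2, "item": "lamp"},
]


def orders_with_join(user_id):
    """Normalized read: scan orders and look up the user row for each match."""
    return [
        {"name": users[o["user_id"]]["name"], "item": o["item"]}
        for o in orders
        if o["user_id"] == user_id
    ]


def build_denormalized_store():
    """Write-time work: embed each user's items in one document per key."""
    store = {}
    for uid, user in users.items():
        store[f"user:{uid}"] = {
            "name": user["name"],
            "items": [o["item"] for o in orders if o["user_id"] == uid],
        }
    return store


kv_store = build_denormalized_store()


def orders_denormalized(user_id):
    """Denormalized read: a single key lookup, no join at query time."""
    return kv_store[f"user:{user_id}"]
```

The trade-off is that the join cost moves to write time: whenever a user or an order changes, the embedded document has to be rebuilt or patched.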

HBase

Builds on top of HDFS

Consistent “big-data” database

Automatically scales out

HBase

… but we didn’t use it in the end

HBase

A nightmare to set up and maintain

Depends on Hadoop, HDFS, Zookeeper

No secondary index

“Table” alteration requires downtime

Not spectacular latency for OLTP usage
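To make "no secondary index" concrete: when a store can only get and put by row key, the application has to maintain its own index table. The sketch below is an assumption-laden toy, not the HBase client API. `KeyValueTable`, `save_user`, and `find_user_by_email` are hypothetical names invented for the example.

```python
class KeyValueTable:
    """Hypothetical stand-in for a table that supports only get/put by row key."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


users = KeyValueTable()           # primary table, keyed by user id
users_by_email = KeyValueTable()  # hand-maintained "index", keyed by email


def save_user(user_id, name, email):
    """Write the row and keep the index table in step with it."""
    old = users.get(user_id)
    if old is not None and old["email"] != email:
        # Overwrite the stale index entry so old emails stop resolving.
        users_by_email.put(old["email"], None)
    users.put(user_id, {"name": name, "email": email})
    users_by_email.put(email, user_id)


def find_user_by_email(email):
    """Two key lookups instead of one indexed query."""
    user_id = users_by_email.get(email)
    return None if user_id is None else users.get(user_id)
```

Note that the two writes in `save_user` are not atomic here; in a real distributed store, keeping the index consistent with the data is part of the application's burden.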

MongoDB

De-facto “big-data” “NoSQL” database

Document based data representation

MongoDB

A good balance of “traditional” usage and “NoSQL” usage

Supports secondary index

Range query

Can do table scan
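The difference between a range query served by a secondary index and a table scan can be sketched without a database. This is a conceptual toy in plain Python, not MongoDB's API or its actual index implementation: the `documents` list, the `age` field, and both function names are assumptions for illustration.

```python
import bisect

# A small "collection" of documents; the field names are made up.
documents = [
    {"_id": 1, "age": 31},
    {"_id": 2, "age": 19},
    {"_id": 3, "age": 45},
    {"_id": 4, "age": 27},
]


def table_scan(lo, hi):
    """Examine every document: cost grows with the collection size."""
    return [d["_id"] for d in documents if lo <= d["age"] <= hi]


# Secondary index: (age, _id) pairs kept in sorted order.
age_index = sorted((d["age"], d["_id"]) for d in documents)


def index_range_query(lo, hi):
    """Binary-search both ends of the range, then read one contiguous slice."""
    start = bisect.bisect_left(age_index, (lo,))
    end = bisect.bisect_right(age_index, (hi, float("inf")))
    return [doc_id for _, doc_id in age_index[start:end]]
```

Both functions return the same set of ids for a given range; the indexed version returns them in index (age) order and touches only the matching entries.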

MongoDB

“Big-data” features: sharding, replica set

MongoDB

… but it got ugly pretty fast

Devil’s in the details

Replica set management fiasco

Sharding is difficult to set up and poorly implemented

https://github.com/kizzx2/mongolab

MongoDB

Reality – it doesn’t scale beyond one machine

Replica set

Cassandra

Column Family data store

More “NoSQL” than MongoDB, with fewer features

Column data store – strictly key/value query

Cassandra

Auto-sharding just works

Replica set requires 0 configuration

Append only, LSM-tree based storage format

Good for SSD

High insert throughput

For storing analytic data
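The "append only, LSM-tree based" idea behind the high insert throughput can be shown with a very small sketch. This is a teaching toy under stated assumptions, not Cassandra's storage engine: `TinyLSM` and `memtable_limit` are hypothetical names, everything lives in memory, and compaction, commit logs, and deletes are left out.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store: writes buffer in memory, then flush
    to immutable sorted runs that are never modified in place."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # recent writes, mutable
        self.sstables = []   # flushed sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write only touches the in-memory table; no existing run is rewritten.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Append a new sorted, immutable run and start a fresh memtable.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Writes stay cheap because they are sequential appends; the cost shows up on reads, which may have to consult several runs until a background merge (not shown) combines them.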

Cassandra

Has rudimentary support for secondary index

Difficult to do table scan or range scan

Requires a substantial application / paradigm shift

Real World Implications

Why does NoSQL matter to Big Data?

Schemaless storage model

Performance

Scalability

Rapidly incorporate unstructured new data sources without extensive planning

How to Choose

Maintenance / Scalability

Supported operations

OLAP vs. OLTP

Thank You

Chris Yuen

http://cfc.kizzx2.com

http://github.com/kizzx2

@kizzx2

chris@kizzx2.com
