Who am I?
Vitaly Kushner
• Founder of Astrails
• 19 years of industry experience
• Building Web Apps since 2005
Wednesday, June 16, 2010
NoSQL
when, why and how?
Vitaly Kushner
astrails.com
Yahoo
Amazon
Rackspace
Digg
LinkedIn
Everybody
NoSQL
WTF is NoSQL?
And why should you care?
Non relational
Document based
Key-Value store
Schema-less
Distributed
BASE is not ACID
Graph DB
Column-based
Why & When
Massive Data Volume
100K servers in a cluster
Twitter: 7+ TB/day
High query workload
MongoDB: 8M operations/sec
Flexible Schema
on-the-fly schema changes
Massive Scale
Everyone wants availability
RDBMS can deliver
high price
Not ACID anymore
CAP theorem
Pick two
• Consistent
• Available
• Partition tolerant
Scale
How?
Throw hardware money at it!
Par-ti-tion
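One way to picture partitioning: hash each key to decide which server owns it. The sketch below is only an illustration, assuming simple modulo hashing; the server names and the `shard_for` and `partition` helpers are hypothetical and not taken from any of the databases discussed here. Real systems often use consistent hashing instead, so that fewer keys move when servers are added.

```python
import hashlib

# Hypothetical server names, for illustration only.
SERVERS = ["db0", "db1", "db2", "db3"]

def shard_for(key: str, servers=SERVERS) -> str:
    """Pick the server that owns `key` by hashing the key."""
    # md5 is used only because it is stable across processes;
    # Python's built-in hash() is salted per process for strings.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

def partition(keys, servers=SERVERS):
    """Group keys by the server that owns them."""
    shards = {server: [] for server in servers}
    for key in keys:
        shards[shard_for(key, servers)].append(key)
    return shards
```

Every client that uses the same hash and the same server list routes a given key to the same server, with no central lookup.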
MySQL + Memcached = “square wheel” Cassandra
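To make the "square wheel" point concrete, here is a rough sketch of the caching logic an application ends up writing by hand in a MySQL + Memcached setup. Both stores are replaced by plain dictionaries so the example runs on its own; `get_user` and `update_user` are hypothetical helper names, and real code would talk to the actual client libraries.

```python
# In-memory stand-ins: `database` plays the role of MySQL,
# `cache` plays the role of Memcached. Assumption for illustration only.
database = {"vitaly": {"company": "astrails"}}
cache = {}

def get_user(name):
    """Read through the cache; fall back to the database on a miss."""
    if name in cache:
        return cache[name]
    row = database.get(name)
    if row is not None:
        cache[name] = row  # populate the cache for the next reader
    return row

def update_user(name, row):
    """Write to the database, then invalidate the cached copy."""
    database[name] = row
    cache.pop(name, None)
```

Every read and write path in the application has to follow this discipline, which is the bookkeeping a distributed store is meant to take over.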
How?
Which one?
• document based
• column or key-value store
• advanced storage schemas
Cassandra
• built by Facebook
• very high write throughput
• OLTP
• automatic horizontal scaling
• no single point of failure
HBase
• Apache project
• Consistent
• Optimized for analytics (OLAP)
• Has single point of failure
MongoDB
• probably easiest to move to from SQL
• document based
• on-demand queries
• automatic sharding
• no single-node durability
CouchDB
• document based
• map-reduce querying/filtering in JavaScript
• has some replication and scaling problems
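The map-reduce querying style can be sketched as follows. CouchDB views are written in JavaScript; this Python stand-in only illustrates the idea of a map step that emits key/value pairs and a reduce step that combines them. The documents and the `map_company`, `reduce_count`, and `run_view` names are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents, for illustration only.
docs = [
    {"_id": "vitaly", "company": "astrails"},
    {"_id": "michael", "company": "astrails"},
    {"_id": "someone", "company": "example"},
]

def map_company(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc["company"], 1

def reduce_count(values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)

def run_view(docs, map_fn, reduce_fn):
    """Group the emitted pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(values) for key, values in groups.items()}
```

Calling `run_view(docs, map_company, reduce_count)` counts documents per company, the kind of question a SQL `GROUP BY` would answer.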
Redis
• key-value store
• advanced data types: list, set
• atomic operations
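What "advanced data types" and "atomic operations" buy you can be sketched with a toy in-memory store. This is not a Redis client and not how Redis is implemented; `ToyStore` is a hypothetical class whose method names are only loosely modeled on the Redis commands INCR, LPUSH, and SADD, with a lock standing in for atomicity.

```python
import threading
from collections import defaultdict

class ToyStore:
    """In-memory stand-in for illustration; not a Redis client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = defaultdict(int)
        self._lists = defaultdict(list)
        self._sets = defaultdict(set)

    def incr(self, key):
        """Atomically increment a counter and return the new value."""
        with self._lock:
            self._counters[key] += 1
            return self._counters[key]

    def lpush(self, key, value):
        """Push a value onto the head of a list; return the new length."""
        with self._lock:
            self._lists[key].insert(0, value)
            return len(self._lists[key])

    def sadd(self, key, member):
        """Add a member to a set; return True if it was not there before."""
        with self._lock:
            added = member not in self._sets[key]
            self._sets[key].add(member)
            return added
```

Because each operation is a single atomic step, concurrent clients can bump a counter or push to a queue without a read-modify-write race in application code.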
Schema
Users: {
  vitaly: {
    email: [email protected],
    company: astrails,
    password: secret
  },
  michael: {
    email: [email protected],
    company: astrails,
    password: superduper
  },
  ...
}
UsersByEmail: {
  "[email protected]": "vitaly",
  "[email protected]": "michael",
  ...
}
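The two structures above imply that the application keeps the email index in sync itself, since a key-value store can only look records up by key. A minimal sketch, using plain dictionaries in place of the store; the helper names and the example.com addresses are hypothetical:

```python
users = {}           # primary data: user name -> record
users_by_email = {}  # hand-maintained index: email -> user name

def add_user(name, email, company):
    """Store the record and its index entry together."""
    users[name] = {"email": email, "company": company}
    users_by_email[email] = name

def change_email(name, new_email):
    """Update the record and move its index entry."""
    old_email = users[name]["email"]
    del users_by_email[old_email]
    users[name]["email"] = new_email
    users_by_email[new_email] = name

def find_by_email(email):
    """Two lookups: index first, then the primary record."""
    name = users_by_email.get(email)
    return users.get(name) if name is not None else None
```

For example, after `add_user("vitaly", "vitaly@example.com", "astrails")`, calling `find_by_email("vitaly@example.com")` returns the same record as `users["vitaly"]`. Any write that touches the email has to update both structures, which is the price of dropping SQL indexes.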
Migrations
Start Slow
NoSQL can help you
WTF is NoSQL?
Vitaly Kushner
astrails.com
@astrails @vkushner
Q & A