NoSQL: Need and Evolution
Shubham Kumar MakeMyTrip
Who am I
1.Evolution of NoSQL
2.Use Cases
3.Challenges
4.Path Ahead
Abstract
PreSQL
• 60's: Integrated Data Store
• Navigational database (navigating through links to data)
• Used for flight reservations
• Pointers + records: chased manually
• Hard to evolve the schema
• High performance
SQL
• 70's
• Separate data from code
• Pointer chasing done programmatically; high-level code
• Shift from Assembly to C
• Move to SQL
• Data outlasts any particular implementation
NoSQL
• Late 1970's: the decade where Unix originated, with the intent of being a file system
• dbm from Ken Thompson in 1979
• Berkeley DB
• Takes off with 1995 and the 2000s decade
What is NoSQL
A database which does not adhere to the traditional relational database management system (RDBMS) structure.
Why NoSQL
q Scalability and Performance
q Cost
q Data Modeling: Use Cases
Scalability and Performance
q Horizontal scalability is better than vertical
q Hardware is getting cheaper and processing power is increasing
q Less operational complexity than RDBMS solutions
q Most solutions offer automatic sharding etc. by default
Why NoSQL : Motives and Drivers contd..
Cost
q Scale (as NoSQL now offers) traditionally came with a hefty cost
q Commodity hardware vs. software licenses, upgrades, maintenance
q This made organizations look for alternatives and drove the need for a cost-effective scale-out option
Why NoSQL : Evaluate Solutions from Non-Functional Aspects
• Internal partitioning
• Automated flexible data distribution
• Hot swappable nodes
• Replication-style
• Automated failover strategy
Data Modeling: Use Cases
User and Items
User and Items : Option 1
User and Items : Option 2
User and Items : Option 3
User and Items : Option 4
User and Items : Option Best
Composite Keys
SELECT Values WHERE state = "CA:*"
SELECT Values WHERE city = "CA:San Francisco*"
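The composite-key queries above rely on the store keeping keys in sorted order, so a prefix scan returns one contiguous range. A minimal sketch (illustrative names; a TreeMap stands in for an order-preserving key-value store such as an ordered Cassandra partitioner or HBase):

```java
import java.util.*;

// Emulate composite-key range scans ("state:city:user" keys) with an
// ordered map, the way a sorted key-value store would serve them.
public class CompositeKeyScan {
    static NavigableMap<String, String> store = new TreeMap<>();

    // Return all values whose key starts with the given prefix,
    // e.g. "CA:" for every city in California.
    static List<String> scanPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : store.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) break; // past the prefix range
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        store.put("CA:Los Angeles:u1", "alice");
        store.put("CA:San Francisco:u2", "bob");
        store.put("CA:San Francisco:u3", "carol");
        store.put("NY:New York:u4", "dave");

        System.out.println(scanPrefix("CA:"));              // [alice, bob, carol]
        System.out.println(scanPrefix("CA:San Francisco")); // [bob, carol]
    }
}
```

Because keys sort lexicographically, "state" and "state:city" queries are both just prefix scans over the same column of keys.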
Aggregation
Aggregation With Atomicity
Inverted Indexes
MyBlog 3NF Relational Structure
{
  "_id" : ObjectId("508d27069cc1ae293b36928d"),
  "title" : "This is the title",
  "body" : "This is the body text.",
  "tags" : [
    ObjectId("508d35349cc1ae293b369299"),
    ObjectId("508d35349cc1ae293b36929a"),
    ObjectId("508d35349cc1ae293b36929b"),
    ObjectId("508d35349cc1ae293b36929c")
  ],
  "created_date" : ISODate("2012-10-28T12:41:39.110Z"),
  "author_id" : ObjectId("508d280e9cc1ae293b36928e"),
  "category_id" : ObjectId("508d29709cc1ae293b369295"),
  "comments" : [
    ObjectId("508d359a9cc1ae293b3692a0"),
    ObjectId("508d359a9cc1ae293b3692a1"),
    ObjectId("508d359a9cc1ae293b3692a2")
  ]
}
Document Based Db : Sol-1
Document Based Db : Sol-2
{
  "_id" : ObjectId("508d27069cc1ae293b36928d"),
  "title" : "This is the title",
  "body" : "This is the body text.",
  "tags" : [ "chocolate", "spleen", "piano", "spatula" ],
  "created_date" : ISODate("2012-10-28T12:41:39.110Z"),
  "author_id" : ObjectId("508d280e9cc1ae293b36928e"),
  "category_id" : ObjectId("508d29709cc1ae293b369295"),
  "comments" : [
    {
      "subject" : "This is comment 1",
      "body" : "This is the body of comment 1.",
      "author_id" : ObjectId("508d345f9cc1ae293b369296"),
      "created_date" : ISODate("2012-10-28T13:34:23.929Z")
    },
    {
      "subject" : "This is comment 2",
      "body" : "This is the body of comment 2.",
      "author_id" : ObjectId("508d34739cc1ae293b369297"),
      "created_date" : ISODate("2012-10-28T13:34:43.192Z")
    },
    {
      "subject" : "This is comment 3",
      "body" : "This is the body of comment 3.",
      "author_id" : ObjectId("508d34839cc1ae293b369298"),
      "created_date" : ISODate("2012-10-28T13:34:59.336Z")
    }
  ]
}
Data Modeling: What SQL has been for
q Concurrency, Consistency, Integrity
q Summations, Aggregations, Groupings
q Schema says: what questions can I answer?
Data Modeling
q A plain key-value store is very powerful and fits most NoSQL use cases
q Hierarchical or graph-like data modeling and processing
q Values like maps of maps of maps
q Document databases that store arbitrarily complex objects
q Document-based indexing data stores are a huge success
At times software applications are not limited to SQL's constraints. This led to data models like:
q Key/Value Stores: Redis, MemcacheDb, Voldemort etc.
q Wide Column Stores / Column Families: Cassandra, Hadoop (HBase), Hypertable, Cloudera etc.
q Document-Based Stores: Solr/Lucene, MongoDb, CouchDb, TerraStore etc.
q Graph Data Stores: Neo4J, GraphBase, FlockDb etc.
Why NoSQL : Motives and Drivers contd..
q Schema says: what are the questions?
q Data modeling is based on the set of queries
q Exploit de-normalization and duplication
q Use aggregates
q Manage joins with app logic + aggregation + de-normalization etc.
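Since the store itself offers no JOIN, the join either disappears through de-normalization or moves into application code. A hypothetical sketch of the latter (the "posts" and "authors" names are illustrative, not from any real schema):

```java
import java.util.*;

// With no server-side JOIN, the application stitches related
// records together itself after fetching them independently.
public class AppSideJoin {

    // The "join": resolve a post's author_id against a separately
    // fetched authors map.
    static String byline(Map<String, String> post, Map<String, String> authors) {
        return post.get("title") + " by " + authors.get(post.get("author_id"));
    }

    public static void main(String[] args) {
        // Two "tables" fetched independently from the store.
        Map<String, String> authors = Map.of("a1", "Alice", "a2", "Bob");
        List<Map<String, String>> posts = List.of(
                Map.of("title", "Hello", "author_id", "a1"),
                Map.of("title", "World", "author_id", "a2"));

        for (Map<String, String> post : posts) {
            System.out.println(byline(post, authors)); // "Hello by Alice", "World by Bob"
        }
    }
}
```

De-normalizing the author name into each post would remove even this lookup, at the cost of duplicated data to keep in sync.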
NoSQL: Path Ahead
q ACID equivalence (Neo4J, CouchDb etc.)
q Transaction support
q Atomicity
q API definitions stabilizing over time
q Migration from SQL to NoSQL
q Community support
NoSQL: Path Ahead contd.. Enterprise Adoption and Challenges
q NoSQL looks good largely for unstructured data
q SQL is the best choice for a broad range of traditional workloads
NoSQL: Path Ahead contd..
q Work with a SQL DB for creation/updates etc.
q Archive the data in NoSQL for query/analysis etc.
NoSQL: Path Ahead contd..
Shout out loud: Hybrid ACID + BASE
They are not alternatives but supplements
Stump the chump
References
q Nancy Lynch and Seth Gilbert, "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services", ACM SIGACT News, Volume 33 Issue 2 (2002), pp. 51-59
q "Brewer's CAP Theorem", julianbrowne.com, retrieved 02-Mar-2010
q "Brewers CAP theorem on distributed systems", royans.net
q "CAP Twelve Years Later: How the 'Rules' Have Changed", on-line resource
q E. Brewer, "Towards Robust Distributed Systems", Proc. 19th Ann. ACM Symp. Principles of Distributed Computing (PODC 00), ACM, 2000, pp. 7-10; on-line resource
q D. Abadi, "Problems with CAP, and Yahoo's Little Known NoSQL System", DBMS Musings, blog, 23 Apr. 2010; on-line resource
q C. Hale, "You Can't Sacrifice Partition Tolerance", 7 Oct. 2010; on-line resource
q Facebook: Scaling Out, on-line resource
q Gemstone: The Hardest Problems In Data Management, on-line resource
q The Log-Structured Merge-Tree (research paper)
q CodeProject: Consistent Hashing, on-line resource
q HighlyScalable: NoSQL Data Modeling Techniques, on-line resource
q eBay Tech Blog: Cassandra Data Modeling Best Practices, on-line resource
q John D Cook: ACID vs BASE, on-line resource
q Merkle Trees
q The Phi Accrual Failure Detector (research paper)
Backup Slides
Some Funda-mentals
CAP Theorem
At most two of the three properties can be satisfied simultaneously in a shared/distributed system:
q Consistency
q Availability
q Tolerance to Network Partitions
CAP : Pictorially
Explanation
Use case: Scaling Web Apps
Critical facts:
• Network outages are common
• Customer shopping carts, email search, social network queries can tolerate stale data
How:
Compromise on consistency in order to remain available, rather than disrupting user service during outages.
q Rather than requiring consistency after every transaction, it is enough for the database to eventually be in a consistent state
q Brewer's CAP theorem says you have no choice if you want to scale up
Explanation
Explanation contd..
Sharp Contrast : High Speed Financial Application
q Highly Transactional
q Consistent
q Automated
q Can’t live with Eventual consistency
ACID vs BASE
ACID
q Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
q Consistent: A transaction cannot leave the database in an inconsistent state.
q Isolated: Transactions cannot interfere with each other.
q Durable: Completed transactions persist, even when servers restart etc.
Some Funda-mentals contd..
BASE
q Basically Available
q Soft-state
q Eventual consistency
Consistent Hashing
• A common way to load balance.
• The machine chosen to cache object o is:
• hash(o) mod n, where n is the total number of machines
Consistent Hashing contd..
q Adding a machine to the cache means hash(o) mod (n + 1)
q Removing a machine from the cache means hash(o) mod (n - 1)
q Result of either: disaster
q Machines get swamped by the redistribution of keys
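The scale of that disaster is easy to quantify: under hash(o) mod n placement, adding a single node remaps roughly n/(n+1) of all keys. A small sketch (the key counts are illustrative; any hash behaves the same way):

```java
public class ModNRemap {
    // Count how many of `total` keys land on a different machine when
    // the cluster grows from n to n+1 nodes under hash(o) mod n placement.
    static int movedKeys(int n, int total) {
        int moved = 0;
        for (int k = 0; k < total; k++)
            if (k % n != k % (n + 1)) moved++;
        return moved;
    }

    public static void main(String[] args) {
        // With 4 nodes, adding one remaps 4/5 = 80% of all keys --
        // the redistribution storm consistent hashing was designed to avoid.
        System.out.println(movedKeys(4, 10000) + " of 10000 keys moved");
        // prints: 8000 of 10000 keys moved
    }
}
```

Consistent hashing, shown next, reduces this to only the keys owned by the node that joined or left.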
Consistent Hashing contd..
• Commonly, a hash function (e.g. MD5) will map a value into a 128-bit key, 0 to 2^128 - 1 (or even 32 bits, as shown next).
Consistent Hashing contd..
Consistent Hashing contd..
Both key and machine are hashed with the same function
Consistent Hashing contd..
Adding a Node
Consistent Hashing contd..
Removing a Node
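The add/remove behavior on the ring can be sketched in a few lines. This is a minimal illustrative ring (no virtual nodes or replication; String.hashCode stands in for MD5): both keys and machines are hashed onto the same circle, and a key is served by the first machine at or after its hash, wrapping around at the end.

```java
import java.util.*;

// Minimal consistent-hash ring: a key belongs to the first node
// clockwise from its position on the ring.
public class HashRing {
    private final NavigableMap<Integer, String> ring = new TreeMap<>();

    private int hash(String s) {            // stand-in for MD5; any stable hash works
        return s.hashCode() & 0x7fffffff;   // force non-negative
    }

    void addNode(String node)    { ring.put(hash(node), node); }
    void removeNode(String node) { ring.remove(hash(node)); }

    String nodeFor(String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue(); // wrap around
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addNode("nodeA");
        ring.addNode("nodeB");
        ring.addNode("nodeC");
        String before = ring.nodeFor("user42");
        ring.removeNode("nodeB");  // only keys that lived on nodeB move
        String after = ring.nodeFor("user42");
        System.out.println(before + " -> " + after);
    }
}
```

Removing a node only reassigns the keys that node owned to its clockwise successor; all other key-to-node mappings are untouched, which is the whole point of the technique.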
Replication Cassandra : Multi DC 1
Replication Cassandra : Multi DC 2
Use Case 1
• E-commerce Site
• Problem: Record user preferences, e.g. location, IP, currency selected, source of traffic, and multiple other dynamic values
• Solution: In a column-family based structure, keep it simple:
• UserId_Key: Pref1_Name:Value1, Pref2_Name:Value2, ... PrefN_Name:ValueN
Use Case 1
RowKey: 1350136093705_6501082438199894
=> (column=1350136093764, value=-3242432#911167901131523, timestamp=1350136093766000)
=> (column=1350283322499, value=GOI#200701231712126570, timestamp=1350283322502001)
=> (column=1350283566051, value=GOI#200703221605283033, timestamp=1350283566054001)
=> (column=1350749595676, value=GOI#200805261514037199, timestamp=1350749595677001)
=> (column=1350785230322, value=BOM#200701251747233158, timestamp=1350785230324001)
-------------------
RowKey: 1354499614310_10861558002828044
=> (column=1354499614368, value=TRV#201104071059204768, timestamp=1354499614370000, ttl=1728000)
-------------------
RowKey: 1349760150553_6114662943774777
=> (column=1349760152066, value=BLR#200802111324575807, timestamp=1349760152068001)
-------------------
RowKey: 1349805109805_6167423558533191
=> (column=1349805111833, value=TRV#312254274337517, timestamp=1349805111835001)
-------------------
RowKey: 1354435656227_7908056941568359
=> (column=1354435656367, value=IDR#200701211254519381, timestamp=1354435656369000, ttl=1728000)
-------------------
RowKey: 1347648097261_15570089270962881
=> (column=1347648097304, value=DEL#201101192008115545, timestamp=1347648097307000)
Use Case 1 » Get

private Map<String, String> getPreferences(Keyspace keySpace, String userId,
        String... preferenceNames) throws IOException, CharacterCodingException {
    // Read the named preference columns of one user row via a Hector slice query
    SliceQuery<String, String, String> rsq = HFactory.createSliceQuery(keySpace,
            StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
    rsq.setColumnFamily(USER_PREFERENCE);
    rsq.setKey(userId);
    rsq.setColumnNames(preferenceNames);
    QueryResult<ColumnSlice<String, String>> orows = rsq.execute();
    Map<String, String> preferenceMap = new LinkedHashMap<String, String>();
    for (HColumn<String, String> column : orows.get().getColumns()) {
        preferenceMap.put(column.getName(), column.getValue());
    }
    return preferenceMap;
}
Use Case 1 » Save

// Write one preference column for a user row, with a TTL
Mutator<String> m = HFactory.createMutator(keySpace, StringSerializer.get());
HColumn<String, String> userPreferences = HFactory.createColumn(colkey, colvalue,
        StringSerializer.get(), StringSerializer.get());
userPreferences.setTtl(ttlUserPreferences);
m.addInsertion(rowkey, USER_PREFERENCE, userPreferences);
m.execute();
Use Case 2
• Online Travel Site
• Problem: Need different metrics for a city's hotels, e.g.:
• Hotels booked in the last X time
• Hotels last viewed in the last Y time
• Hotels left with Z inventory
Use Case 2
RowKey: 2d323436353731
=> (super_column=911167901297486,
     (column=6c6173747669657765646d657373616765, value=VIEWED#Last viewed 23 hour(s) ago., timestamp=1354962852610000)
     (column=6c6173747669657765646d657373616762, value=Inventory#20, timestamp=1354962852610000)
     (column=6c6173747669657765646d657373616769, value=Bookings#8, timestamp=135496282610000))
-------------------
RowKey: 58524f
=> (super_column=200903041759196196,
     (column=6c617374626f6f6b65646d657373616765, value=Booked#Last booked 1 day(s) ago., timestamp=1347781187842000)
     (column=6c6173747669657765646d657373616765, value=VIEWED#Last viewed 2 hours ago., timestamp=1347707080147000))
=> (super_column=200903041848352230,
     (column=6c6173747669657765646d657373616765, value=VIEWED#Last viewed 1 day(s) ago., timestamp=1347266107708000))
Use Case 2

// Read all super columns (hotels) of one city row and flatten them into documents
SuperSliceQuery<String, String, String, String> superQuery = HFactory.createSuperSliceQuery(
        getKeySpace(), StringSerializer.get(), StringSerializer.get(),
        StringSerializer.get(), StringSerializer.get());
superQuery.setColumnFamily(SUPER_SOCIAL_MESSAGE).setKey(cityCode);
QueryResult<SuperSlice<String, String, String>> result = superQuery.execute();
List<HSuperColumn<String, String, String>> superColumns = result.get().getSuperColumns();
if (superColumns != null) {
    for (HSuperColumn<String, String, String> superColumn : superColumns) {
        Map<String, String> messages = new HashMap<String, String>();
        List<HColumn<String, String>> columns = superColumn.getColumns();
        if (columns != null) {
            for (HColumn<String, String> column : columns) {
                messages.put(column.getName(), column.getValue());
            }
        }
        /* Build the equivalent doc */
        document.addField(superColumn.getName(), messages);
        documents.add(document);
    }
}
Pig Script : MR

<document>
  <pigscript start="-16" end="-43200" start1="-1441" end1="-10080" start2="0" end2="-15" start3="0" end3="-1440">

    <comment>Delete All Messages</comment>
    <query><![CDATA[rows0 = LOAD 'cassandra://LH/HotelMessage' USING com.mmt.solr.hotels.cassandra.CassandraStorage() as (key:chararray, cols:bag{T:tuple(name:chararray, value:chararray)});]]></query>
    <query><![CDATA[cols0 = FOREACH rows0 GENERATE key as key, flatten($1) as (name:chararray, value:chararray);]]></query>
    <query><![CDATA[userhotel0 = FOREACH cols0 GENERATE key as key, com.mmt.solr.hotels.cassandra.ByteBufferToString($1) as name, com.mmt.solr.hotels.cassandra.ByteBufferToString($2) as value;]]></query>
    <query><![CDATA[uriCounts0 = FOREACH userhotel0 GENERATE key as citycode, com.mmt.solr.hotels.cassandra.ToBag(TOTUPLE(name, null));]]></query>

    <comment>Last Viewed start 15 minutes to 30 days ago</comment>
    <query><![CDATA[rows = LOAD 'cassandra://LH/LastViewedHotels?slice_start=#start&slice_end=#end&limit=1024&reversed=true' USING com.mmt.solr.hotels.cassandra.CassandraStorage() as (key:chararray, cols:bag{T:tuple(name:long, value:chararray)});]]></query>
    <query><![CDATA[cols = FOREACH rows GENERATE key as key, flatten($1) as (name:long, value:chararray);]]></query>
    <query><![CDATA[userhotel = FOREACH cols GENERATE key as key, com.mmt.solr.hotels.cassandra.LongToHours($1) as name, com.mmt.solr.hotels.cassandra.ByteBufferToString($2) as value;]]></query>
    <query><![CDATA[userhotelByCity = FOREACH userhotel GENERATE key as key, flatten($1) as name, flatten(org.apache.pig.piggybank.evaluation.string.Split(value, '#', 2)) as (citycode:chararray, hotelid:chararray);]]></query>
    <query><![CDATA[groupByhotels = GROUP userhotelByCity BY hotelid;]]></query>
    <query><![CDATA[uriCounts = FOREACH groupByhotels {
        D = LIMIT userhotelByCity 1;
        GENERATE flatten(D.citycode) as citycode, com.mmt.solr.hotels.cassandra.ToBag(
            TOTUPLE(group, com.mmt.solr.hotels.cassandra.StringAppend('VIEWED#Last viewed ', D.name, ' ago.')));
    };]]></query>

    <comment>Last Booked 1 to 8 days ago</comment>
    <query><![CDATA[rows1 = LOAD 'cassandra://LH/BookedHotels?slice_start=#startA&slice_end=#endA&limit=1024&reversed=true' USING com.mmt.solr.hotels.cassandra.CassandraStorage() as (key:chararray, cols:bag{T:tuple(name:long, value:chararray)});]]></query>
    <query><![CDATA[cols1 = FOREACH rows1 GENERATE key as key, flatten($1) as (name:long, value:chararray);]]></query>
    <query><![CDATA[userhotel1 = FOREACH cols1 GENERATE key as key, com.mmt.solr.hotels.cassandra.LongToHours($1) as name, com.mmt.solr.hotels.cassandra.ByteBufferToString($2) as value;]]></query>
    <query><![CDATA[userhotelByCity1 = FOREACH userhotel1 GENERATE key as key, flatten($1) as name, flatten(org.apache.pig.piggybank.evaluation.string.Split(value, '#', 2)) as (citycode:chararray, hotelid:chararray);]]></query>
    <query><![CDATA[groupByhotels1 = GROUP userhotelByCity1 BY hotelid;]]></query>
    <query><![CDATA[uriCounts1 = FOREACH groupByhotels1 {
        D = LIMIT userhotelByCity1 1;
        GENERATE flatten(D.citycode) as citycode, com.mmt.solr.hotels.cassandra.ToBag(
            TOTUPLE(group, com.mmt.solr.hotels.cassandra.StringAppend('Booked#Last booked ', D.name, ' ago.')));
    };]]></query>
Criteria to Evaluate NoSQL Solutions
• Internal partitioning
• Automated flexible data distribution
• Hot swappable nodes
• Replication-style
• Automated failover strategy