calin-andrei-burloiu
DESCRIPTION
We needed a bridge between the real-time tier, where we used Couchbase, and the batch tier, built on Hadoop. When we couldn’t find a suitable option, we built our own: Couchdoop, an open-source Hadoop connector for Couchbase. Based on our experience with Couchdoop, we will discuss best practices for creating connectors between Hadoop and NoSQL databases. We’ll address the challenges we encountered while developing Couchdoop and share how we tuned it for performance. Together with Bigstep, we will also show how much throughput can be squeezed from a Hadoop connector. We have benchmarked Couchdoop, and we’ll talk about the behavior you can expect and the tweaks that can improve the performance of your big data setup.
Two-tier Architecture
Real-time Tier (Couchbase)
• Detects user intent
• Gives next best recommendation or deal
Data Bridge (Couchdoop)
Batch Tier (Hadoop)
• Recommends products
User events
Recommendations
Importing Data
{ "user": "Rudy", "action": "view", "product": "Fender Guitar" }
{ "user": "Rudy", "action": "click", "product": "Guitar Amplifier" }
{ "user": "Emma", "action": "buy", "product": "Blue Skirt" }
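Once events like these have been imported, a batch job will typically need them grouped per user. The sketch below is only an illustration in plain Python, not Couchdoop code: it assumes the events are available as one JSON object per line, and the function name `group_events_by_user` is hypothetical.

```python
import json
from collections import defaultdict


def group_events_by_user(lines):
    """Parse one JSON event per line and group (action, product) pairs by user.

    Assumption: each non-blank line is a JSON object with the fields
    "user", "action" and "product", as in the slide's sample events.
    """
    events_by_user = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        events_by_user[event["user"]].append((event["action"], event["product"]))
    return dict(events_by_user)


# The three sample events from the slide, one per line.
sample_lines = [
    '{"user": "Rudy", "action": "view", "product": "Fender Guitar"}',
    '{"user": "Rudy", "action": "click", "product": "Guitar Amplifier"}',
    '{"user": "Emma", "action": "buy", "product": "Blue Skirt"}',
]
grouped = group_events_by_user(sample_lines)
```

In a real Hadoop job the grouping would be done by the shuffle phase rather than an in-memory dictionary; the sketch only shows the shape of the data a recommender would consume.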
Couchdoop
Machine Learning Recommendations
Hadoop
IMPORT
HDFS
{ "user": "Rudy", "recommendations": [ ["Ibanez Acoustic Guitar", 450], ["Guitar Tuner", 120], ["Sound Mixer", 30] ] }
EXPORT
Exporting Data
Couchdoop
Machine Learning Recommendations
Hadoop
{ "user": "Rudy", "recommendations": [ ["Ibanez Acoustic Guitar", 450], ["Guitar Tuner", 120], ["Sound Mixer", 30] ] }
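Exporting a document like this to a key-value store means choosing a document key and serializing the value. The following is a minimal sketch in plain Python and does not use Couchdoop's API: the `recs::` key prefix and the function name `to_key_value` are assumptions made up for the example.

```python
import json


def to_key_value(doc, key_prefix="recs::"):
    """Turn a recommendation document into a (key, JSON string) pair.

    Assumption: the key is the hypothetical prefix plus the user name;
    a real deployment would pick its own key scheme.
    """
    key = key_prefix + doc["user"]
    value = json.dumps(doc, ensure_ascii=False, sort_keys=True)
    return key, value


# The sample recommendation document from the slide.
doc = {
    "user": "Rudy",
    "recommendations": [
        ["Ibanez Acoustic Guitar", 450],
        ["Guitar Tuner", 120],
        ["Sound Mixer", 30],
    ],
}
key, value = to_key_value(doc)
```

Keying the document by user lets the real-time tier fetch a user's recommendations with a single lookup.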
Update
Updating Data
Couchdoop
Machine Learning Recommendations
Hadoop