Data Science Bootcamp, Day 1
Presented By: Chetan Khatri, Volunteer Teaching Assistant, Data Science Lab.
Guidance By: Prof. Devji D. Chhanga, University of Kachchh
Agenda
An Introduction to Data Science with an Industrial Perspective
An Introduction to Distributed Systems
CAP Theorem
Collections Framework in Java
Querying in Distributed MySQL
Example
● Facebook CEO Mark Zuckerberg wants to analyze data: how many daily users are there, and how many daily messages?
● Assume that Facebook's data centers are located in California, Germany, Japan, Bangalore, and Kenya.
● Assume that Facebook uses MySQL as its RDBMS for data storage. How can he get these numbers?
● SQL?
Querying in Distributed MySQL (contd.)
Let's think! Start by writing the query for a single node.
Mark wants an analytics chart that shows ratios of customer behaviour, such as how many users are churning (leaving) his platform, which includes both Facebook and WhatsApp.
Assume the table structures are as below.
user_master
user_id (PK)
created_on (DATE)
last_updated_on (DATE)
transaction_master
trans_id (PK)
user_id (FK)
timespan (DATE)
Querying in Distributed MySQL (contd.)
1) Daily Users
Desired Output:
Date Daily Users
12-05-2016 32,00,000
13-05-2016 21,00,854
14-05-2016 22,54,246
15-05-2016 32,51,230
Query:
SELECT last_updated_on AS "Date", COUNT(user_id) AS "Daily Users" FROM user_master GROUP BY last_updated_on ORDER BY last_updated_on;
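The query above answers the question for a single node. One possible way to extend it to the five data centers is to run the same query on each node and add up the per-date counts in the application. The sketch below shows only that merge step in Java. The class name, the method name, and the sample counts are hypothetical, the dates are written as YYYY-MM-DD so that sorted order is chronological, and it assumes each user row is stored in exactly one data center (otherwise users would be counted twice).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: merges "Daily Users" results fetched from several MySQL nodes.
class DailyUsersMerge {

    // Each input map is one node's result set: date -> user count.
    // A TreeMap keeps the merged dates in sorted order.
    static Map<String, Long> merge(List<Map<String, Long>> perNodeCounts) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> nodeResult : perNodeCounts) {
            for (Map.Entry<String, Long> row : nodeResult.entrySet()) {
                // Add this node's count to whatever is already stored for the date.
                total.merge(row.getKey(), row.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up counts standing in for two nodes' query results.
        Map<String, Long> california = Map.of("2016-05-12", 1200L, "2016-05-13", 900L);
        Map<String, Long> bangalore = Map.of("2016-05-12", 800L, "2016-05-14", 500L);
        System.out.println(merge(List.of(california, bangalore)));
        // prints {2016-05-12=2000, 2016-05-13=900, 2016-05-14=500}
    }
}
```

Summing works here because COUNT is additive across disjoint partitions. A count of distinct users would need the user ids themselves, not just the per-node totals.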
Querying in Distributed MySQL (contd.)
2) Daily Messages by User
Desired Output:
User Date Messages
Drew Houston 12-08-2016 700
Satya Nadella 12-08-2016 652
Sundar Pichai 12-08-2016 352
Tim Cook 12-08-2016 154
Query:
Homework!
Collections Framework in Java
How would you think about HashSet in Java?
How would you think about ArrayList in Java?
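As a starting point for the two questions above, here is a minimal sketch contrasting the two collections. An ArrayList keeps insertion order and duplicates and supports access by index, while a HashSet stores each value once and has no index. The class name, method name, and user ids are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class contrasting ArrayList and HashSet.
class CollectionsDemo {

    // Copying a list into a HashSet drops duplicates, so size() is the distinct count.
    static int countDistinct(List<Integer> userIds) {
        Set<Integer> unique = new HashSet<>(userIds);
        return unique.size();
    }

    public static void main(String[] args) {
        // Login events: user 101 logged in twice.
        List<Integer> logins = new ArrayList<>();
        logins.add(101);
        logins.add(102);
        logins.add(101);
        logins.add(103);

        System.out.println(logins.size());         // 4: duplicates are kept
        System.out.println(logins.get(2));         // 101: access by position
        System.out.println(countDistinct(logins)); // 3: each user counted once
    }
}
```

A rough rule of thumb: use an ArrayList when order or duplicates matter, and a HashSet when the question is "have I seen this value before?", since its contains check takes constant time on average.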
Concurrency in Java
Why threading?
How can it help you optimize performance / throughput?
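One way to explore these questions is to split a job into independent pieces and hand them to a thread pool. The sketch below sums message counts from several partitions in parallel using the standard ExecutorService API. The class name, the method name, the pool size cap of 4, and the sample numbers are assumptions chosen for illustration. For inputs this small the threads will not make anything faster; a gain is only plausible when each piece of work is large or waits on I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo: sum message counts from several partitions in parallel.
class ParallelCount {

    static long countAll(List<int[]> partitions) {
        // At least 1 thread (a pool of size 0 is not allowed), at most 4.
        int threads = Math.max(1, Math.min(partitions.size(), 4));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per partition; each task sums its own array only.
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int[] partition : partitions) {
                tasks.add(() -> {
                    long sum = 0;
                    for (int value : partition) {
                        sum += value;
                    }
                    return sum;
                });
            }
            // invokeAll blocks until every task has finished.
            long total = 0;
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<int[]> partitions = List.of(new int[] {1, 2, 3}, new int[] {10, 20});
        System.out.println(countAll(partitions)); // prints 36
    }
}
```

The tasks share no mutable state, so no locking is needed. Each one returns its own partial sum and the calling thread combines them.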
Q & A Session
Questions, please!
Thank you
Chetan Khatri, Volunteer Teaching Assistant, Data Science Lab, University of Kachchh.
Email: [email protected]
Github Data Science Lab: https://github.com/dskskv
CCCS936 Repository: https://github.com/dskskv/CCCS936