Introduction Overview Administrivia Process Overview Q & A Conclusion References Files Vita
Big Data: Data Analysis Boot Camp
Introduction and Overview
Chuck Cartledge, PhD
29 March 2019
© Old Dominion University
Table of contents (1 of 1)
1 Introduction
    The global view
2 Overview
    The world from 50,000 feet.
    Text
3 Administrivia
    Miscellaneous and necessary things
4 Process Overview
5 Q & A
6 Conclusion
7 References
8 Files
9 Vita
The global view
Big Data: Data Analysis Boot Camp
We will cover aspects common to all Big Data investigations, including: defining Big Data, surveying tools and techniques for processing Big Data, and visualizing selected aspects of Big Data.
The emphasis of the camp is to understand what Big Data data analysis is, beyond the marketing hype of the 3Vs of volume, variety, and velocity.
Image from [1].
More detailed information at: https://www.odu.edu/cepd/bootcamps/data-analysis
The world from 50,000 feet.
Things we’ll be covering over the next three days:
Friday
1 Administrivia
2 What is BD?
3 What is R?
4 Looking at the built-in iris and Titanic datasets
Saturday
1 Visualizing data with different packages
2 Exploring cluster analysis (of different types)
3 Linear regression and some variants
4 Classification techniques
5 Text analysis
6 Serial vs. parallel processing
Sunday
1 R limitations
2 R and Hadoop
3 R and SQL and No-SQL DBMs
4 Hands-on with real-world crime data
5 Wrap-up
Text
Not required reading, but referenced throughout our time together.
Learning Predictive Analytics with R (LPAR)
Big Data Analytics with R (BDAR)
Not necessary, but really helpful.
Text
Code samples
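There are lots, and they look like this:
# Load the package that provides the 1970 US city crime data
library(cluster.datasets)
data(all.us.city.crime.1970)
crime <- all.us.city.crime.1970
# Plot columns 5 through 10 against one another
plot(crime[5:10])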
Available in a separate file embedded in each presentation.
Miscellaneous and necessary things
All things related to paperwork.
Parking – front and back, without permits
Breaks – yes, we’ll have them
Lunch – yes, places nearby: right at the main light to “fast food”
Text books – recommended but not necessary; they have good ideas and techniques
Non-credit option
Credit option – two additional assignments
Hours – 9 AM to 5 PM with a break for lunch
Sunday access – yes, check in with security
Soft copies – all presentations and software are available
Computer logins and passwords – will be coordinated
Break room – across the hall
Bathrooms – around the elevator
Other things as well.
Miscellaneous and necessary things
Soft copies available from Internet
All information (presentations, scripts, and data) is available on your VM desktop (static)
All information is available via the Internet (dynamic)
Errata updated nightly
I’m not a web designer, nor do I play one on TV.
http://www.cs.odu.edu/~ccartled/Teaching/
How do Data Wrangling, Analysis, and Visualization fit together?
Notionally, there are three distinct phases in data analysis.
1 Data wrangling – getting the raw data into a usable form
2 Data analysis – evaluating and understanding the data
3 Data visualization – presenting the analytical results in an intelligible manner
Management continues across all phases. The other phases may overlap.
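The three phases can be sketched in a few lines of R. This is only an illustration, not taken from the camp's scripts: it assumes the built-in iris dataset, and the variable names flowers and petal_means are made up for the example.

```r
# Illustrative walk-through of the three phases using R's built-in iris data.
# The names "flowers" and "petal_means" are hypothetical, chosen for this sketch.

# 1. Data wrangling: keep complete rows and only the columns of interest.
flowers <- iris[complete.cases(iris),
                c("Species", "Petal.Length", "Petal.Width")]

# 2. Data analysis: mean petal measurements for each species.
petal_means <- aggregate(cbind(Petal.Length, Petal.Width) ~ Species,
                         data = flowers, FUN = mean)
print(petal_means)

# 3. Data visualization: present the per-species means as a bar chart.
barplot(petal_means$Petal.Length,
        names.arg = as.character(petal_means$Species),
        ylab = "Mean petal length (cm)",
        main = "Mean petal length by species")
```

In a real investigation the wrangling step is usually the largest of the three; here it is a single line only because iris is already clean.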
Q & A time.
Q: What is the square root of 4b²?
A: To be or not to be.
What have we covered?
Where we are.
Where we’re going.
How we’ll get there.
Now!! On to exploring the world of Big Data!
References (1 of 1)
[1] Vangie Beal, Big Data, https://www.webopedia.com/TERM/B/big_data.html, 2017.
Files of interest
1 Code snippets
library(cluster.datasets)
data(all.us.city.crime.1970)
crime = all.us.city.crime.1970
plot(crime[5:10])
Who am I?
Father
Husband (only 42 years, but it seems longer)
PhD, Computer Science, 2014
CAPT, USN retired 2004 (31+ years)
Professional software developer (38 years)
A perennial student
1st computer: 1970, donated ICBM guidance computer, machine code, paper/mylar tape, and drum memory
Interests: autonomic systems, real-time applications, distributed processing, long-term preservation of digital data, Big Data