Milind Bhandarkar, Hadoop Solutions Architect (Thanks: Todd Papaioannou, VP Cloud Architecture)
By SearchNetMedia
HADOOP & THE FUTURE OF CLOUD COMPUTING
THE POWER OF HADOOP
WHAT’S HAPPENING
- Big Data is here!
- self-expression
- transient content
- unstructured data
Flickr: sub_lime79
CUTTING THROUGH THE NOISE
Flickr: Lomo-Cam
Location Social Relationships Science Understanding
User Interests
access audience blogs communication computer internet
mass media people networking technology
BIG DATA
Flickr: NASA Goddard Photo and Video
TURNING DATA INTO INSIGHTS
machine learning
time series
content clustering
factorization models
logic regression
Flickr: NASA Goddard Photo and Video
algorithms
user interest prediction
ad inventory modeling
MAKING IT RELEVANT
Flickr: ogimogi
HADOOP: LIGHTNING-FAST
science + big data + insight = personal relevance = VALUE
TECHNOLOGY
Flickr: DDFic
BEHIND EVERY CLICK
HADOOP & CLOUD
Flickr: Got Sarah
THE HADOOP ECOSYSTEM
THE PLATFORM EFFECT
and other Early Adopters: Scale and productize Hadoop
Apache Hadoop
Orgs with Internet Scale Problems: Add tools / frameworks, enhance Hadoop
Mainstream / Enterprise adoption: Fund further development, enhancements
Enhance Hadoop Ecosystem
Service Providers: Grow ecosystem - Training, support, enhancements
Virtuous Circle!
• Investment -> Adoption
• Adoption -> Investment
HADOOP IS GOING MAINSTREAM
2007 2008 2009 2010
The Datagraph Blog
HADOOP AT YAHOO!
“Where Science meets Data”
HADOOP CLUSTERS: Tens of thousands of servers
PRODUCTS
- Data Analytics
- Content Optimization
- Content Enrichment
- Yahoo! Mail Anti-Spam
- Advertising Products
- Ad Optimization
- Ad Selection
- Big Data Processing & ETL
APPLIED SCIENCE
- User Interest Prediction
- Ad inventory prediction
- Machine learning - search ranking
- Machine learning - ad targeting
- Machine learning - spam filtering
FROM PROJECT TO CORE PLATFORM
Today: 38K Servers, 170 PB Storage, 1M+ Monthly Jobs
[Chart: Thousands of Servers and Petabytes by year, 2006 to 2010]
Research
Science Impact
Daily Production
“Behind every click”
CANARY IN THE COAL MINE
» Heterogeneous mix of workloads
» Low latency jobs – near time
» Migrating legacy applications
» Application programmers more mainstream, not early adopters
More and varied workloads and users
Processing is increasing, data increasing faster
© Flickr: floridapfe
» Infrastructure utilization
Takeaway
» Scale
» Security
» Workload management
» Performance and utilization
» Developer support
Deployment & Development
» Enterprise readiness at unparalleled scale
LOOKING AHEAD FOR HADOOP
YAHOO!’S VISION: OPEN SOURCE CLOUD
Open Source Benefits
» Avoid technological dead ends
» Leverage community contributions
» Workforce already trained
Ongoing contributions Yahoo!’s adoption of open source
Future contributions
Cloud serving
Storage
WHAT DOES THE FUTURE HOLD?
By Elsie
MORE BIG
By Bionic Teaching
DATA IN THE CLOUD
By Fadilfb
PRIVATE CLOUDS
By Zachstern
HYBRID CLOUDS
By Calop
AUTOMATION
CLOUD FABRICS
QUESTIONS?