evert-lammerts
DESCRIPTION
This was the first of two introduction presentations to the first Hadoop Hackathon at SARA, the Dutch center for High Performance Computing and Networking.
SARA Hadoop Hackathon, December 7, 2010
DJOERD HIEMSTRA (UTwente)
EDGAR MEIJ (UvA)
2002: Nutch*
2004: MR/GFS**
2006: Hadoop
* http://nutch.apache.org/
** http://labs.google.com/papers/mapreduce.html http://labs.google.com/papers/gfs.html
http://wiki.apache.org/hadoop/PoweredBy
2010: A Hype in Production
Super computing
Cluster computing
Grid computing
Cloud computing
GPU computing
http://www.sara.nl/
[Figure: two diagrams relating Data and Computation, one labeled "Expensive! :-(" and the other "Cheaper! :-)"]
Ref: Luiz André Barroso and Urs Hölzle, Google Inc. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
[Figure: cluster layout with a NameNode and a JobTracker, plus eight worker nodes each labeled "DN TT"]
DN = DataNode
TT = TaskTracker
File, Map, Shuffle, Reduce, Output
$ echo "${email#*@}, ${name}"
$ sort
$ wc -l
ewi.utwente.nl, 1
gmail.com, 2
nbic.nl, 1
nikhef.nl, 3
sara.nl, 1
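The Map, Shuffle, Reduce analogy on this slide can be sketched in a few lines of plain Python. This is only an illustration, not Hadoop code: the participant list is invented so that the counts match the slide, and the function names (map_phase, shuffle_phase, reduce_phase, count_domains) are hypothetical.

```python
from collections import defaultdict

# Hypothetical sign-up list of (name, email) pairs, invented for illustration.
PARTICIPANTS = [
    ("user1", "user1@gmail.com"),
    ("user2", "user2@nikhef.nl"),
    ("user3", "user3@sara.nl"),
    ("user4", "user4@nikhef.nl"),
    ("user5", "user5@ewi.utwente.nl"),
    ("user6", "user6@gmail.com"),
    ("user7", "user7@nbic.nl"),
    ("user8", "user8@nikhef.nl"),
]


def map_phase(records):
    """Map: emit one (domain, name) pair per record, like the echo step."""
    for name, email in records:
        # Same effect as ${email#*@}: drop everything up to the first '@'.
        yield email.split("@", 1)[1], name


def shuffle_phase(pairs):
    """Shuffle: group values by key and order the keys, like sort."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Reduce: count the values collected for each key, like wc -l per key."""
    return [(key, len(values)) for key, values in grouped]


def count_domains(records):
    """Chain the three phases over an iterable of (name, email) records."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    for domain, count in count_domains(PARTICIPANTS):
        print(f"{domain}, {count}")
```

In a real Hadoop job the map and reduce functions would run in parallel on the TaskTrackers and the framework would perform the shuffle; here everything runs in one process to keep the data flow visible.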
From: Hadoop, The Definitive Guide (2nd Edition), Tom White
Today
09.30 - 09.50 Welcome & Introduction
09.50 - 10.15 Map/Reduce @ University of Twente
10.15 - 10.30 Kick-off hackathon
14.00 - 15.00 Optional: SARA tour
10.30 - 17.00 Hackathon
17.00 - 17.30 Results and closing