COMP 2903A34s – Google and the Wisdom of Clouds
Danny Silver
JSOCS, Acadia University
Stephen Baker
BusinessWeek, December 24, 2007
• Senior writer at BusinessWeek, covering technology
• Named Pittsburgh bureau manager in 1992
• Reporter for the El Paso Herald-Post
• Chief economic reporter for The Daily Journal in Caracas, Venezuela
• Bachelor's degree from the University of Wisconsin and a master's from the Columbia University Graduate School of Journalism
Scaling Up to the Web
• Christophe Bisciglia – Google, “What would you do if you had 1000 times more data?”
• Google’s network of servers must process mountains of data in microseconds
• They are experts at using the “cloud”
• Parallel processing is the key
• This requires special training …
http://www.youtube.com/watch?v=6uJUnEK0NmA&feature=related
Google 101
• Bisciglia and Univ. of Washington
• Course covers programming at the scale the cloud allows
• Led to partnership with IBM and courses at other universities
• Delivers to “students, researchers and entrepreneurs the immense power of Google-style computing”
Move to the Cloud
• Signals fundamental shift in how information is handled
• Like move from home generator to common electrical utility
• It changes the way we view computing
• Requires a completely different business model
Massive Growth of the Cloud
• Amazon, Yahoo, Microsoft, IBM, Google are all supplying cloud services
• In 2007 Google added 4 new data centers at $600M per center
• Let’s take a tour: http://www.youtube.com/watch?v=zRwPSFpLX8I
• Microsoft has a massive new cloud center as well: http://www.youtube.com/watch?v=K3b5Ca6lzqE
MapReduce
• Programming model and associated implementation for processing and generating large data sets
• Many real-world tasks can be expressed in this model
• Programs are automatically parallelized and executed on a large cluster of machines
• The run-time system takes care of:
– partitioning the input data
– scheduling the program's execution
– handling machine failures
– inter-machine communication
http://labs.google.com/papers/mapreduce.html
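To make the programming model concrete, here is a minimal single-process sketch of the classic word-count example, written in Python. It is not Google's implementation: the names map_fn, reduce_fn and run_mapreduce are hypothetical, and the real run-time system would spread the map and reduce calls over many machines rather than run them in one loop.

```python
from collections import defaultdict

def map_fn(doc_id, text):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: combine all values that were emitted for one key."""
    return word, sum(counts)

def run_mapreduce(documents, map_fn, reduce_fn):
    """Toy driver (hypothetical): map, group by key, then reduce."""
    intermediate = defaultdict(list)
    # Map phase: each document is processed independently of the others.
    for doc_id, text in documents.items():
        for key, value in map_fn(doc_id, text):
            # Grouping ("shuffle"): collect all values that share a key.
            intermediate[key].append(value)
    # Reduce phase: each key is reduced independently of the others.
    return dict(reduce_fn(key, values) for key, values in intermediate.items())

docs = {"d1": "the cloud is the key", "d2": "the cloud scales"}
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
```

The programmer writes only map_fn and reduce_fn; because neither shares state between calls, a cluster can run them in any order and on any machine.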
MapReduce
• Programmers without experience in parallel and distributed systems can use such systems
• MapReduce runs on a large cluster of commodity machines and is highly scalable
• Hundreds of MapReduce programs have been implemented
• Over one thousand MapReduce jobs are executed on Google's clusters every day.
http://labs.google.com/papers/mapreduce.html
Also see Hadoop (open source): http://wiki.apache.org/hadoop/HadoopMapReduce
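The partitioning and scheduling that the framework hides from the programmer can be imitated on one machine. The sketch below is an assumption-laden illustration, not Hadoop's or Google's API: partition, map_task and parallel_word_count are hypothetical names, and a Python thread pool stands in for the cluster's worker machines (it shows the structure, not real speed-up).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition(lines, num_splits):
    """Split the input into num_splits roughly equal chunks."""
    return [lines[i::num_splits] for i in range(num_splits)]

def map_task(chunk):
    """One worker's job: count the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts

def parallel_word_count(lines, num_workers=4):
    """Hypothetical driver: partition, run the map tasks, merge results."""
    splits = partition(lines, num_workers)
    # The thread pool stands in for the cluster's worker machines.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(map_task, splits))
    # Merge step: add the per-worker counts into one total.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total

lines = ["google runs mapreduce", "hadoop runs mapreduce", "mapreduce scales"]
totals = parallel_word_count(lines, num_workers=2)
```

A real framework would also re-run a map task whose machine failed; that is safe here because map_task depends only on its own chunk.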
Future?
• Intro to Hadoop: http://vimeo.com/8689411
• Hadoop World conference: http://vimeo.com/7108908
• Google is now on Caffeine: http://www.theregister.co.uk/2010/09/09/google_caffeine_explained/
• http://www.youtube.com/watch?v=kKTrK3V1FpE