Is There Room for Another Elephant in Tucson? Tucson Java User Group Andrew Lenards


DESCRIPTION

Would you like to scale data-intensive tasks horizontally? Would you like an open source project that gives you that foundation? Well, there is one: Apache Hadoop. It's a Java software framework for supporting data-intensive distributed applications. The framework was inspired by Google's papers on their MapReduce framework and the Google File System.

Who uses Hadoop? Here's a short list: Yahoo!, A9.com, LinkedIn, Facebook, ImageShack, eHarmony, Hulu, Last.fm, and The New York Times. The highest profile user, Yahoo!, is also a major contributor to the project. They use it extensively in their web search and advertising divisions.

In this talk, titled "Is there room for another elephant in Tucson?", Andrew Lenards will tell us about Hadoop and describe how it could be applied to several practical problems, even if you aren't as big as Google.


Page 1: Is There Room For Another Elephant In Tucson

Is There Room for Another Elephant in Tucson?

Tucson Java User Group
Andrew Lenards

Page 2: Is There Room For Another Elephant In Tucson

andrew.lenards@gmail
UA grad, Dec 2001
former teaching assistant UA CS
former instructor UA CS
reformed .NET developer
10 years on/off coding Java
Co-founder UA Student ACM

Active in:
• Tucson Java User Group
• Tucson Startup Drinks

Semi-active in:
• Tucson Free Unix Group
• Ubuntu Arizona Local Community

Page 3: Is There Room For Another Elephant In Tucson

Why do I care?

Hadoop holds the TeraSort, MinuteSort, and GraySort benchmarks...

Captured TeraSort in 2008
Captured MinuteSort, GraySort in 2009

Metrics (http://sortbenchmark.org/)
GraySort: Sort rate (TBs / minute) achieved while sorting a very large amount of data (currently 100 TB minimum).
MinuteSort: Amount of data that can be sorted in 60.00 seconds or less.
TeraSort*: Elapsed time to sort 10^12 bytes of data.

* Now deprecated

Page 4: Is There Room For Another Elephant In Tucson

Establishing our setting...

Page 5: Is There Room For Another Elephant In Tucson

Differing World Views

Page 6: Is There Room For Another Elephant In Tucson

Brendan's quote...

"When a computer fails, they don't bother trying to find it and fix it.  They just add more machines."

-- former UA student, former Google intern, circa 2000

Page 7: Is There Room For Another Elephant In Tucson

End of Moore's Law?

Page 8: Is There Room For Another Elephant In Tucson

End of Moore's Law?

Page 9: Is There Room For Another Elephant In Tucson

Scaling Up: Endangered Practice?

Polar Bear: Endangered Species

Page 10: Is There Room For Another Elephant In Tucson

DATA!

Page 11: Is There Room For Another Elephant In Tucson

DATA!

Page 12: Is There Room For Another Elephant In Tucson

Zettabytes???

• The New York Stock Exchange generates about one terabyte of new trade data per day.
• Facebook hosts approximately 10 billion photos, taking up one petabyte of storage.
• Ancestry.com stores around 2.5 petabytes of data.
• The Internet Archive stores around 2 petabytes of data, and is growing at a rate of 20 terabytes per month.
• The Large Hadron Collider will produce about 15 petabytes of data per year.

The "digital universe" is estimated to reach 1.8 zettabytes by 2011.

Source: "Hadoop: The Definitive Guide", Tom White

Page 13: Is There Room For Another Elephant In Tucson
Page 14: Is There Room For Another Elephant In Tucson

MapReduce: the Abstraction

Page 15: Is There Room For Another Elephant In Tucson

MapReduce

Introduced in a 2004 Google paper: "MapReduce: Simplified Data Processing on Large Clusters"

"Structured as functional programming meets distributed processing" (Aaron Kimball, Cloudera)

Designed for batch processing, not for interactive use

Page 16: Is There Room For Another Elephant In Tucson

MapReduce + RDBMS, not versus

             Traditional RDBMS            MapReduce
Data Size    Gigabytes                    Petabytes
Access       Interactive & batch          Batch
Updates      Read and write many times    Write once, read many times
Structure    Static schema                Dynamic schema
Integrity    High                         Low
Scaling      Nonlinear                    Linear

Source: "Hadoop: The Definitive Guide", Tom White

Page 17: Is There Room For Another Elephant In Tucson

Shared-state makes everything hard...

Sharing requires the use of communication mechanisms between processes (which we know complicates things).

The MapReduce abstraction limits communication to keep its benefits.

Mappers do not need to communicate with each other.
Reducers do not need to communicate with each other.

Page 18: Is There Room For Another Elephant In Tucson

Shared Nothing Architecture

Introduced in a 1986 paper by Michael Stonebraker on distributed computing architectures, but it also applies to large-scale web applications.

Note: Stonebraker was co-author of "MapReduce: A Major Step Backwards" in January 2008.

Page 19: Is There Room For Another Elephant In Tucson

Functional inspiration, but not dogmatic

Functions w/ no side-effects are pure functions
• Map is an n-to-n operation
• Fold is an n-to-1 operation (often called a "reduce")

With MapReduce, we define a problem in Mappers & Reducers

However, a Mapper can produce more than 1 key per element. And a Reducer may produce many values. So the abstraction is not married to the functional model.
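
To make the n-to-n / n-to-1 distinction concrete, here is a minimal Python sketch. It is not Hadoop code; the function names (`square`, `word_mapper`) and the sample data are illustrative assumptions only.

```python
from functools import reduce


def square(x):
    # A pure function: no side effects, same output for the same input.
    return x * x


numbers = [1, 2, 3, 4]

# Map: n inputs produce n outputs.
squares = list(map(square, numbers))

# Fold (reduce): n inputs collapse to 1 output.
total = reduce(lambda acc, x: acc + x, squares, 0)


def word_mapper(line):
    # A MapReduce-style Mapper is looser than a functional map:
    # one input element may yield zero, one, or many (key, value) pairs.
    return [(word, 1) for word in line.split()]


pairs = word_mapper("to be or not to be")
```

The single input line yields six pairs, which is why the slide says the abstraction is inspired by, but not bound to, the functional model.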

Page 20: Is There Room For Another Elephant In Tucson

Partitioning work...

MapReduce scales out horizontally by breaking large files into chunks (or blocks) and bringing the computation to the data (data locality).

The "blocks" are the input to Mappers, so work partitioning is implicit in the system.
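
A rough sketch of implicit partitioning, in Python rather than Hadoop's own API. The names `split_into_blocks` and `count_newlines` and the tiny block size are assumptions for illustration; a real system also has to deal with records that straddle block boundaries, which this sketch sidesteps by counting single bytes.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into consecutive fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def count_newlines(block):
    """Stand-in for a Mapper: each call sees only its own block."""
    return block.count(b"\n")


data = b"alpha\nbeta\ngamma\ndelta\n"

# The caller never assigns work by hand: one unit of work per block.
blocks = split_into_blocks(data, block_size=8)
per_block = [count_newlines(block) for block in blocks]

# Combining the per-block results gives the answer for the whole file.
total_lines = sum(per_block)
```

Each block can be processed independently, so the per-block calls could run on whichever machines already hold those blocks.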

Page 21: Is There Room For Another Elephant In Tucson

Raising the level of abstraction

MapReduce allows you to focus on the problem and lets the library deal w/ the messy details.

An understanding of the high-level domain and the low-level details no longer needs to exist within the same human form.

Page 22: Is There Room For Another Elephant In Tucson

MapReduce Usage

Page 23: Is There Room For Another Elephant In Tucson

Example usage...

• Distributed Grep
• Word Count / Count URL Frequency
• Inverted Index
• Term-Vector per Website
• Reverse Web-Link Graph
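
As one illustration from this list, an inverted index can be sketched as a Mapper that emits (word, document) pairs and a Reducer that collects the documents per word. This is a single-process Python sketch, not Hadoop code; the function names, file names, and documents are made up for the example.

```python
from collections import defaultdict


def map_inverted_index(doc_id, text):
    """Emit (word, doc_id) once per distinct word in the document."""
    return [(word, doc_id) for word in sorted(set(text.lower().split()))]


def reduce_inverted_index(word, doc_ids):
    """Collapse all document ids seen for a word into one sorted list."""
    return (word, sorted(set(doc_ids)))


def run(documents):
    # Group mapper output by key; this stands in for the framework's shuffle.
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, emitted_id in map_inverted_index(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(reduce_inverted_index(word, ids) for word, ids in grouped.items())


# Hypothetical documents, for illustration only.
documents = {
    "a.txt": "hadoop stores data",
    "b.txt": "hadoop processes data in batch",
}
index = run(documents)
```

Looking up `index["hadoop"]` then returns every document that mentions the word.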

Page 24: Is There Room For Another Elephant In Tucson

Apache Web Server Logfiles

Suppose we want to do a simple analysis of visits per host.

An abstract view of the data flow would be:

<k1, v1> -> Mapper -> <k2, v2> -> Reducer -> <k3, v3>

or

(<line-number>, <line>) --> Mapper --> (<hostname>, 1)
(<hostname>, 1) --> Reducer --> (<hostname>, count)

Page 25: Is There Room For Another Elephant In Tucson

crawl-66-249-71-34.googlebot.com - - [16/Aug/2009:04:40:36 -0700] "GET /tree/home.pages/searchTOL?taxon=Arna&Submit2=Find&startline=26 HTTP/1.1" 200 14693 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
astdenis-105-1-18-94.w81-248.abo.wanadoo.fr - - [16/Aug/2009:04:40:36 -0700] "GET /onlinecontributors/img/quicknav/RightArrow.png HTTP/1.1" 200 321 "http://www.tolweb.org/Echinodermata" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6)"
localhost.localdomain - - [16/Aug/2009:04:40:36 -0700] "GET /onlinecontributors/app?service=external&page=ViewBranchOrLeaf&sp=SSiphonophorida&sp=S8149&sp=S HTTP/1.1" 200 5894 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)"
llf531045.crawl.yahoo.net - - [16/Aug/2009:04:40:36 -0700] "GET /Siphonophorida/8149 HTTP/1.0" 200 5894 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)"
localhost.localdomain - - [16/Aug/2009:04:40:36 -0700] "GET /onlinecontributors/app?service=external&page=ViewBranchOrLeaf&sp=SCampanulotes&sp=S73605&sp=S HTTP/1.1" 200 8571 "-" "Mozilla/4.0"
65.55.108.238 - - [16/Aug/2009:04:40:36 -0700] "GET /Campanulotes/73605 HTTP/1.1" 200 8571 "-" "Mozilla/4.0"
astdenis-105-1-18-94.w81-248.abo.wanadoo.fr - - [16/Aug/2009:04:40:36 -0700] "GET /tree/img/magnify.gif HTTP/1.1" 200 124 "http://www.tolweb.org/Echinodermata" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6)"
localhost.localdomain - - [16/Aug/2009:04:40:37 -0700] "GET /onlinecontributors/app?service=external&page=ViewBranchOrLeaf&sp=SHomo&sp=S16418&sp=S HTTP/1.1" 200 10283 "-" "Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)"
crawler5108.ask.com - - [16/Aug/2009:04:40:37 -0700] "GET /Homo/16418 HTTP/1.0" 200 10283 "-" "Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)"
astdenis-105-1-18-94.w81-248.abo.wanadoo.fr - - [16/Aug/2009:04:40:37 -0700] "GET /tree/img/tinylink.png HTTP/1.1" 200 207 "http://www.tolweb.org/Echinodermata" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6)"
msnbot-65-55-105-180.search.msn.com - - [16/Aug/2009:04:40:37 -0700] "GET /onlinecontributors/app?page=ImageGallery&service=external&sp=l27570&state:ImageGallery=ZH4sIAAAAAAAAAFvzloG1nJeBgYGJgYEtLz8l1TOluIiBLyuxLFEvJzEvXc8nPy%2FduvvJhDP9yveZGBi9GFjLEnNKUyuKGAQQivxKc5NSi9rWTJXlnvKgG2hURQEDGGRfKhdgYODNTU3JTHTOSSwu9swrAZoviNAKFEhNTy0SerRgyffGdgugFZ4wKwoZ6hgYQaYAAKhZ4XSlAAAA HTTP/1.1" 200 7899 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"

Page 26: Is There Room For Another Elephant In Tucson

Input to the Mappers...

(0, "crawl-66-249-71-34.googlebot.com - - [16/Aug...")
(1, "localhost.localdomain - - [16/Aug/2009:04:40...")
(2, "crawler5108.ask.com - - [16/Aug/2009:04:40:3...")
(3, "msnbot-65-55-105-180.search.msn.com - - [16/...")
(4, "astdenis-105-1-18-94.w81-248.abo.wanadoo.fr ...")
...

Page 27: Is There Room For Another Elephant In Tucson

Output from Mappers, Input to the Reducers...

("com.googlebot.crawl-66-249-71-34", 1)
("com.googlebot.crawl-66-249-71-34", 1)
("com.ask.crawler5108", 1)
("com.ask.crawler5108", 1)
("com.msn.search.msnbot-65-55-220-136", 1)
("com.msn.search.msnbot-65-55-105-180", 1)
("com.msn.search.msnbot-65-55-220-136", 1)
("com.msn.search.msnbot-65-55-220-136", 1)
("fr.wanadoo.abo.w81-248.astdenis-105-1-18-94", 1)

Page 28: Is There Room For Another Elephant In Tucson

Output from Reducers...

("com.ask.crawler5108", 2)
("com.googlebot.crawl-66-249-71-34", 2)
("com.msn.search.msnbot-65-55-220-136", 3)
("com.msn.search.msnbot-65-55-105-180", 1)
("fr.wanadoo.abo.w81-248.astdenis-105-1-18-94", 1)
...
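
The visits-per-host flow can be sketched in a few lines of Python. This is a single-process stand-in for what the framework does across a cluster, not the code used in the talk: the function names are assumptions, the log lines are abbreviated versions of the sample entries, and numeric IP addresses are not treated specially (they would simply be reversed like any other dotted name).

```python
from collections import defaultdict


def reverse_hostname(host):
    # "crawler5108.ask.com" -> "com.ask.crawler5108", as on the slides.
    return ".".join(reversed(host.split(".")))


def map_visit(line_number, line):
    """Mapper: (line-number, line) -> (hostname, 1)."""
    host = line.split(" ", 1)[0]
    yield (reverse_hostname(host), 1)


def reduce_visits(host, counts):
    """Reducer: (hostname, [1, 1, ...]) -> (hostname, count)."""
    return (host, sum(counts))


def run_job(lines):
    # Group mapper output by key; the framework does this between the phases.
    grouped = defaultdict(list)
    for line_number, line in enumerate(lines):
        for key, value in map_visit(line_number, line):
            grouped[key].append(value)
    return [reduce_visits(key, grouped[key]) for key in sorted(grouped)]


# Abbreviated log lines (requests shortened for readability).
log_lines = [
    'crawl-66-249-71-34.googlebot.com - - [16/Aug/2009:04:40:36 -0700] "GET /tree HTTP/1.1" 200 14693',
    'crawler5108.ask.com - - [16/Aug/2009:04:40:37 -0700] "GET /Homo/16418 HTTP/1.0" 200 10283',
    'crawler5108.ask.com - - [16/Aug/2009:04:40:37 -0700] "GET /Homo/16418 HTTP/1.0" 200 10283',
]
results = run_job(log_lines)
```

Reversing the hostname puts the top-level domain first, so sorted output groups hosts from the same organization together.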

Page 29: Is There Room For Another Elephant In Tucson

Using the analysis...

We know that analysis of logfiles is a particularly well-suited problem for MapReduce. But what do companies use the resulting analysis for?

Rackspace's mail division, Mailtrust, used Hadoop for processing email logs. They used an ad hoc query to determine the geographic distribution of their users. Then they scheduled this MapReduce job to run monthly and used it to help decide where to place new mail servers in their data centers.

Source: "Hadoop: The Definitive Guide", Tom White

Page 30: Is There Room For Another Elephant In Tucson

A Yellow Elephant Enters...

Page 31: Is There Room For Another Elephant In Tucson

Apache Hadoop Project

3 years old... Grew out of the Lucene & Nutch projects.

"[I]n a nutshell... Hadoop provides: a reliable shared storage and analysis system."
-- "Hadoop: The Definitive Guide", Tom White

Storage: Hadoop Distributed Filesystem (HDFS)
Analysis: MapReduce implementation

... and a small ecosystem of supporting sub-projects

Page 32: Is There Room For Another Elephant In Tucson

Hadoop's assumptions

• Hardware is going to fail
• Access is going to be in batch processing, so high throughput trumps low latency data access
• Data sets are large; files will be gigabytes to terabytes in size
• Write-once-read-many is the file access needed by applications
• Moving computation is cheaper than moving data
• Must be portable from one platform to another (both software & hardware)

Page 33: Is There Room For Another Elephant In Tucson

Coke/Pepsi, Google/Hadoop

There is nearly a one-to-one mapping between the Google architecture and Apache Hadoop

Page 34: Is There Room For Another Elephant In Tucson

Google/Hadoop Decoder Ring

Google                      Hadoop
MapReduce                   Hadoop MapReduce
Google Filesystem (GFS)     HDFS
BigTable                    HBase
Chubby Lock System          ZooKeeper
Sawzall                     Pig
....                        ....

Page 35: Is There Room For Another Elephant In Tucson

NameNode, DataNodes

Only one dedicated machine runs the NameNode software service for an entire cluster. Each machine in a cluster runs the DataNode software service.

The NameNode plays the role of arbitrator & metadata repository (for HDFS). User data never flows through the NameNode.

The NameNode maintains the file system namespace.

Any change to the file system namespace or its properties is recorded by the NameNode.

Yes, this means there is a Single Point of Failure.
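
The split between metadata and user data can be pictured with a toy model in Python. Nothing here is the HDFS API: the classes `ToyNameNode` and `ToyDataNode`, the helper `read_file`, the block ids, and the path are all hypothetical and exist only to show who holds what.

```python
class ToyNameNode:
    """Holds metadata only: which blocks make up a file and where they live."""

    def __init__(self):
        self.namespace = {}   # path -> ordered list of block ids
        self.locations = {}   # block id -> names of DataNodes holding a copy
        self.edit_log = []    # every namespace change is recorded here

    def add_file(self, path, block_locations):
        self.namespace[path] = [block_id for block_id, _ in block_locations]
        for block_id, nodes in block_locations:
            self.locations[block_id] = list(nodes)
        self.edit_log.append(("add", path))

    def locate(self, path):
        return [(block_id, self.locations[block_id]) for block_id in self.namespace[path]]


class ToyDataNode:
    """Holds the actual bytes of the blocks assigned to it."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


def read_file(name_node, data_nodes, path):
    # The client asks the NameNode *where* the blocks are,
    # then fetches the bytes directly from the DataNodes.
    chunks = []
    for block_id, holders in name_node.locate(path):
        chunks.append(data_nodes[holders[0]].blocks[block_id])
    return b"".join(chunks)


data_nodes = {name: ToyDataNode(name) for name in ("dn1", "dn2", "dn3")}
data_nodes["dn1"].blocks["blk_0"] = b"hello "
data_nodes["dn2"].blocks["blk_0"] = b"hello "   # redundant copy
data_nodes["dn2"].blocks["blk_1"] = b"world"
data_nodes["dn3"].blocks["blk_1"] = b"world"    # redundant copy

name_node = ToyNameNode()
name_node.add_file("/logs/access.log", [("blk_0", ["dn1", "dn2"]), ("blk_1", ["dn2", "dn3"])])
content = read_file(name_node, data_nodes, "/logs/access.log")
```

In this model, losing `name_node` loses the only map from paths to blocks even though every byte still sits on the DataNodes, which is the single point of failure the slide points out.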

Page 36: Is There Room For Another Elephant In Tucson

Big Files, Narrow Access Pattern

HDFS is optimized to store LARGE files, on the order of Gigabytes.  

Files are written to disk, start-to-finish, and are then immutable.

Files are read from disk, start-to-finish, by client applications (like MapReduce jobs).

Files are redundantly stored.

Page 37: Is There Room For Another Elephant In Tucson

HDFS

Filesystem is an unfortunate name because it makes us think about files and directories.  We really should think about HDFS as a 'dataset system.'

Page 38: Is There Room For Another Elephant In Tucson

JobTracker, TaskTracker

JobTracker runs on the NameNode
TaskTracker runs on each DataNode

JobTracker pushes out work to available TaskTrackers in the cluster. It attempts to keep the computation close to the data (again, data locality). But if it cannot find an available TaskTracker with the data block needed for the task, it will attempt to schedule the task on a machine in the same rack.

So, this means that JobTracker is "rack-aware" (or, that it understands the network topology of the cluster).
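
The placement preference described above can be sketched as a small Python function. This is not the JobTracker's actual scheduler: `pick_tracker`, the tracker names, and the rack map are hypothetical, and the final "off-rack" fallback is an assumption added so the sketch always returns something when a tracker is free.

```python
def pick_tracker(block_hosts, free_trackers, rack_of):
    """Return (tracker, locality), preferring data-local, then rack-local placement."""
    # First choice: a free tracker that already holds the block.
    for tracker in free_trackers:
        if tracker in block_hosts:
            return tracker, "data-local"

    # Second choice: a free tracker on the same rack as a copy of the block.
    block_racks = {rack_of[host] for host in block_hosts}
    for tracker in free_trackers:
        if rack_of[tracker] in block_racks:
            return tracker, "rack-local"

    # Assumed fallback (not stated on the slide): any free tracker at all.
    if free_trackers:
        return free_trackers[0], "off-rack"
    return None, "none"


# Hypothetical topology: two machines on rack A, one on rack B.
rack_of = {"tt1": "rackA", "tt2": "rackA", "tt3": "rackB"}

local = pick_tracker(["tt1"], ["tt1", "tt3"], rack_of)
same_rack = pick_tracker(["tt1"], ["tt3", "tt2"], rack_of)
elsewhere = pick_tracker(["tt1"], ["tt3"], rack_of)
```

The `rack_of` map is what "rack-aware" amounts to in this sketch: without it, the second preference could not be evaluated.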

Page 39: Is There Room For Another Elephant In Tucson

Re-execute slow running tasks

To avoid the "Convoy Effect", slow running tasks may be reassigned for execution by another DataNode holding the data for a block.

This means that failing or slow hardware will not hold up the rest of the computations for the job.

Re-execution of tasks can be done when "speculative-execution" is enabled.

Page 40: Is There Room For Another Elephant In Tucson

Hadoop Ecosystem

• HBase
  A distributed column-oriented database (BigTable impl)
• Hive
  A distributed data warehouse.
• Pig
  A data flow language & execution environment for exploring very large datasets.
• ZooKeeper
  A distributed, highly available coordination service.
• Chukwa
  A distributed data collection & analysis system.
• Avro
  A data serialization system for efficient, cross-language RPC, and persistent data storage

Page 41: Is There Room For Another Elephant In Tucson

Where to next?

• Cloudera Training Videos
  http://cloudera.com/hadoop-training
• Hadoop: The Definitive Guide, Tom White, O’Reilly/Yahoo!
• Intro to Parallel Programming & MapReduce
  http://code.google.com/edu/parallel/mapreduce-tutorial.html
• Google Papers: MapReduce, Google Filesystem, BigTable
• Trending Topics
  http://www.trendingtopics.org/
• Tutorials everywhere!

Page 42: Is There Room For Another Elephant In Tucson

Acknowledgments

iPlant Collaborative for a job (& allowing me to research Hadoop)

Cloudera and Aaron Kimball for training videos

Tom White for "Hadoop: The Definitive Guide"

Page 43: Is There Room For Another Elephant In Tucson

Photo Acknowledgments

S#01: http://www.flickr.com/photos/kitkaphotogirl/3186255594/sizes/o/

S#02: taken by Alex Yelich

S#05: http://www.flickr.com/photos/toptechwriter/322770006/sizes/o/ http://bit.ly/29hzL1 & http://bit.ly/3G592M

S#07: http://jonasboner.com/talks/state_youre_doing_it_wrong/pictures/moores_law.jpg

S#08: http://milwaukee.indymedia.org/en/2006/05/205520.shtml

S#09: http://www.flickr.com/photos/ucumari/1203329752/sizes/l/

S#10: http://www.flickr.com/photos/t/236605/sizes/l/

S#11: Andrew Lenards

S#13: http://www.flickr.com/photos/bbaltimore/1412386/sizes/o/

S#44: http://www.flickr.com/photos/autumn_bliss/414160235/sizes/o/

Page 44: Is There Room For Another Elephant In Tucson
Page 45: Is There Room For Another Elephant In Tucson

http://creativecommons.org/licenses/by-nc-sa/3.0/

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site.