Hadoop is an Apache project. It is an efficient choice for storing and processing large data sets across clusters of machines.
Structured, Unstructured and Complex Data Management
Amit Chaudhary 11MCA03
Karthik Iyer 11MCA05
Hadoop
What is this?
Structure of this
Is this unknown thing right for me?
Where is this used?
Any idea? (Idea SIM card)
What is Hadoop?
It is an open source project of the Apache Software Foundation for processing large data sets
It was inspired by Google’s MapReduce and Google File System (GFS) papers
It was originally conceived by Doug Cutting
It is named after his son’s toy elephant
Large Data Means?
1000 Kilobytes = 1 Megabyte
1000 Megabytes = 1 Gigabyte
1000 Gigabytes = 1 Terabyte
1000 Terabytes = 1 Petabyte
1000 Petabytes = 1 Exabyte
1000 Exabytes = 1 Zettabyte
1000 Zettabytes = 1 Yottabyte
1000 Yottabytes = 1 Brontobyte (informal name, not an official unit)
1000 Brontobytes = 1 Geopbyte (informal name, not an official unit)
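The ladder of units above is just repeated multiplication by 1000. As a rough illustration, the sketch below converts between bytes and the named units; the function names `bytes_in` and `humanize` are made up for this example, and only the standard units up to the yottabyte are included.

```python
# Decimal storage units, each 1000 times the previous one.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]

def bytes_in(unit):
    """Return the number of bytes in one `unit`, using factors of 1000."""
    return 1000 ** (UNITS.index(unit) + 1)

def humanize(n_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value, name = float(n_bytes), "byte"
    for unit in UNITS:
        if value < 1000:
            break
        value /= 1000
        name = unit
    return f"{value:g} {name}s"

print(bytes_in("terabyte"))
print(humanize(3 * bytes_in("petabyte")))
```

A petabyte is therefore a 1 followed by 15 zeros in bytes, which is the scale at which a single machine stops being a realistic place to keep the data.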
So what’s the big deal?
Scalable: New nodes can be added as needed, without changing data formats
Flexible: It is schema-less, and can absorb any type of data, structured or not, from any number of sources
Fault tolerant: The system redirects work to another node if a node fails
Hadoop = HDFS + MapReduce
HDFS: For storing massive datasets using low-cost storage
MapReduce: A programming model, introduced by Google, for processing massive datasets in parallel across a cluster
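To make the MapReduce half of this concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps, using the classic word-count task. It is plain Python and does not use Hadoop's actual API; the function names are invented for illustration, and on a real cluster the framework would run the map and reduce steps on many machines at once.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run map over every document, then shuffle, then reduce per key."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data big clusters", "big elephant"])
print(counts)
```

The map and reduce functions never share state, which is what lets a framework spread them over many nodes.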
HDFS
It is a fault-tolerant storage system
Able to store huge amounts of information
It creates clusters of machines and coordinates work among them
If one fails, it continues to operate the cluster without losing data or interrupting work, by shifting work to the remaining machines in the cluster
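The idea of shifting work to the remaining machines can be sketched in a few lines. This is a toy model, not how Hadoop schedules work internally: the node names, the task names and the "give it to the least-loaded survivor" rule are all assumptions made for the example.

```python
def reassign(assignments, failed):
    """Return new assignments with the failed node's tasks spread over survivors."""
    survivors = {node: list(tasks)
                 for node, tasks in assignments.items() if node != failed}
    if not survivors:
        raise RuntimeError("no machines left in the cluster")
    for task in assignments.get(failed, []):
        # Hand each orphaned task to the currently least-loaded survivor.
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(task)
    return survivors

# Hypothetical cluster: three nodes, five tasks.
cluster = {"node-a": ["t1", "t2"], "node-b": ["t3"], "node-c": ["t4", "t5"]}
after_failure = reassign(cluster, "node-c")
print(after_failure)
```

No task is lost when `node-c` disappears; its tasks simply reappear on the nodes that are still running.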
HDFS
It manages storage on the cluster by breaking incoming files into pieces, called blocks
Stores each of the blocks redundantly across the pool of servers
By default, it keeps three complete copies of each file by copying each block to three different servers
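The block-and-replica scheme described above can be illustrated with a small sketch. It makes two simplifying assumptions: the block size is tiny so the example is readable, and replicas are placed round-robin, whereas real HDFS uses much larger blocks and its own placement policy. The helper names and server names are hypothetical.

```python
def split_into_blocks(data, block_size):
    """Break a file's bytes into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(num_blocks, servers, replication=3):
    """Assign each block index to `replication` distinct servers, round-robin."""
    if replication > len(servers):
        raise ValueError("need at least as many servers as replicas")
    return {
        block: [servers[(block + r) % len(servers)] for r in range(replication)]
        for block in range(num_blocks)
    }

blocks = split_into_blocks(b"x" * 10, block_size=4)
placement = place_replicas(len(blocks), ["s1", "s2", "s3", "s4"])
print(placement)

# Simulate losing one server: every block should still have copies elsewhere.
remaining = {b: [s for s in hosts if s != "s2"] for b, hosts in placement.items()}
print(remaining)
```

Because every block lives on three different servers, losing any single server still leaves at least two copies of each block.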
How does this work?
Which companies are using it?
LinkedIn
Walt Disney
Wal-Mart
General Electric
Nokia
Bank of America
Foursquare
Hadoop at Foursquare
Foursquare: Mobile + Location + Social Networking
Is this unknown thing right for me?