HADOOP ONLINE TRAINING @ TRAINING ICON Online Training | Corporate Training CONTACT US: TRAINING ICON INDIA +91-9666900051 USA : +1-408-791-8864 [email protected] www.trainingicon.com

Big data Hadoop an Introduction Online Training @ Training Icon


DESCRIPTION

Training Icon is a global interactive learning company started by proven industry experts with the aim of providing quality training in the latest IT technologies. Training Icon has a pool of expert trainers worldwide across all of these technologies, and offers training services to major IT companies and to individual students worldwide. About our faculty: we have excellent Hadoop instructors who have real-time experience as well as expertise in online training. We offer you: 1. Interactive learning at the learner's convenience 2. Industry-savvy trainers 3. "Real-time" practical scenarios 4. Learn right from your place 5. Customized course curriculum 6. 24/7 server access 7. Support after training and certification guidance 8. Resume preparation and interview assistance 9. Recorded versions of sessions. We also provide online training on SAP HANA, SAP BASIS, SAP ABAP, SAP FICO, SAP BI, SAP BO, SAP OIL & GAS, SAP All Modules, ORACLE APPS ALL MODULES, ORACLE DBA, WORKDAY, HADOOP, JAVA, MS DYNAMICS AXAPTA, ALL ETL TOOLS, WEBLOGIC. Experience the quality of our online training. For a free demo, please contact Rathan, INDIA: +91-9666900051, US: +1 408-791-8864, Email id: [email protected] http://www.trainingicon.com


Page 1: Big data Hadoop an Introduction Online Training @ Training Icon

HADOOP ONLINE TRAINING @ TRAINING ICON

Online Training | Corporate Training CONTACT US: TRAINING ICON INDIA +91-9666900051 USA : +1-408-791-8864 [email protected]

Page 2: Big data Hadoop an Introduction Online Training @ Training Icon

Why Hadoop, and what is it?

A tool to process big data

Page 3: Big data Hadoop an Introduction Online Training @ Training Icon

What is BIG Data ?

Facebook, Google+, etc.

Machines, too, generate lots of data.

We are having an online discussion right now; even the number of participants in this conference will be recorded as data.

Page 4: Big data Hadoop an Introduction Online Training @ Training Icon

What is BIG Data ? ..continued

Exponential growth of data poses challenges to Google, Yahoo, Microsoft, and Amazon.

They need to go through TBs and PBs of data.

Which websites and books were popular? What kind of ads appeal to users?

Existing tools became inadequate to process such large data sets.

Page 5: Big data Hadoop an Introduction Online Training @ Training Icon

Why is the data so BIG ?

Until a couple of decades ago: floppy disks

From then on: CD/DVD drives

Half a decade ago: hard drives (500 GB)

Now: hard drives (1 TB) are available in abundance

Page 6: Big data Hadoop an Introduction Online Training @ Training Icon

Why is the data so BIG ?

So WHAT ?

Even the technology to read has taken a leap.

Page 7: Big data Hadoop an Introduction Online Training @ Training Icon

Why is the data so BIG ?

Year   Device            Volume    Data transfer speed   Time to process
1990   Optical drive     1370 MB   4.4 MB/s              about 5 minutes
2012   1 TB SATA drive   1 TB      100 MB/s              over 2.5 hours

Page 8: Big data Hadoop an Introduction Online Training @ Training Icon

How to handle such BIG ?

One BIG elephant, or numerous small chickens?

Page 9: Big data Hadoop an Introduction Online Training @ Training Icon

How to handle such BIG ?

The concept of torrents

Reduce time to read by reading it from multiple sources simultaneously.

Imagine if we had 100 drives, each holding one hundredth of the data. Working in parallel, we could read the data in less than two minutes.
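The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units (1 TB = 1,000,000 MB), the 100 MB/s transfer speed from the table, and ideal parallelism with no coordination overhead. The function name `read_time_seconds` is made up for illustration.

```python
def read_time_seconds(volume_mb: float, speed_mb_per_s: float, drives: int = 1) -> float:
    """Time to read a data set split evenly across `drives` drives read in parallel.

    Assumes ideal parallelism: every drive holds an equal share and there is
    no network or coordination overhead (an illustrative simplification).
    """
    if drives < 1:
        raise ValueError("drives must be at least 1")
    return (volume_mb / drives) / speed_mb_per_s


ONE_TB_IN_MB = 1_000_000  # assumption: decimal units (1 TB = 1,000,000 MB)

# One 1 TB drive read end to end at 100 MB/s.
single = read_time_seconds(ONE_TB_IN_MB, 100)

# 100 drives, each holding one hundredth of the data, read simultaneously.
parallel = read_time_seconds(ONE_TB_IN_MB, 100, drives=100)

print(f"single drive: {single / 3600:.1f} hours")
print(f"100 drives:   {parallel / 60:.1f} minutes")
```

Under these assumptions a single drive needs a few hours, while 100 drives working in parallel need about 100 seconds, which matches the "less than two minutes" quoted above.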

Page 10: Big data Hadoop an Introduction Online Training @ Training Icon

How to handle such BIG ? -- Issues

How do we handle a system's ups and downs?

How to combine the data from all the systems ?

Page 11: Big data Hadoop an Introduction Online Training @ Training Icon

Problem 1 : System's Ups and Downs

Commodity hardware is used for data storage and analysis

Chances of failure are very high

So, keep redundant copies of the same data across several machines

If one machine fails, you still have the copy on another

Google came up with a file system GFS (Google File System) which implemented all these details.
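The idea of redundant copies can be sketched in a few lines. This is a toy illustration only: the round-robin placement, the function names, and the machine and block names are all assumptions made for the example, and it does not reproduce how GFS or HDFS actually place replicas.

```python
def place_replicas(blocks, machines, replication=2):
    """Assign each block to `replication` distinct machines, round-robin.

    Hypothetical placement policy, chosen only because it is easy to follow.
    """
    if replication > len(machines):
        raise ValueError("need at least as many machines as replicas")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [machines[(i + k) % len(machines)] for k in range(replication)]
    return placement


def readable_blocks(placement, failed):
    """Return the blocks that still have a copy on at least one working machine."""
    return {
        block
        for block, hosts in placement.items()
        if any(host not in failed for host in hosts)
    }


machines = ["m1", "m2", "m3", "m4"]
blocks = ["b0", "b1", "b2", "b3", "b4"]
placement = place_replicas(blocks, machines, replication=2)

# With two copies of every block, losing any single machine loses no data.
print(readable_blocks(placement, failed={"m2"}) == set(blocks))

# Losing two machines that hold both copies of a block does lose that block.
print(sorted(readable_blocks(placement, failed={"m1", "m2"})))
```

With two copies per block, any single machine failure is survivable; tolerating more simultaneous failures requires more copies.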

Page 12: Big data Hadoop an Introduction Online Training @ Training Icon

Problem 2 : How to combine the data ?

We analyze data across different machines, but how do we merge the results to get a meaningful outcome?

Yes, all (or some) of the data has to travel across the network. Only then can the data be merged.

Doing this is notoriously challenging.

Again, Google had an answer: MapReduce
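The merging problem can be pictured with a small sketch in which each machine has already analyzed its own share of the data and only the partial results are combined. The machine names and visit counts below are invented for illustration, and in a real cluster the partial results would travel over the network rather than sit in one dictionary.

```python
from collections import Counter

# Hypothetical per-machine results: each machine counted page visits
# in its own share of the logs.
partial_results = {
    "machine-1": Counter({"books": 3, "news": 1}),
    "machine-2": Counter({"books": 2, "sports": 4}),
    "machine-3": Counter({"news": 5}),
}


def merge_counts(partials):
    """Combine per-machine counts into one overall count, key by key."""
    total = Counter()
    for counts in partials:
        total.update(counts)  # Counter.update adds counts instead of replacing them
    return total


merged = merge_counts(partial_results.values())
print(merged.most_common(1))
```

No single machine sees "news" as its most visited category by a wide margin across the whole data set; that only becomes visible once the partial counts are merged.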

Page 13: Big data Hadoop an Introduction Online Training @ Training Icon

MapReduce

Provides a programming model that abstracts the problem away from disk reads and writes, transforming it into a computation over keys and values.

Two phases

Map

Reduce

Page 14: Big data Hadoop an Introduction Online Training @ Training Icon

So what is Hadoop ? An operating system ?

Provides

1. A reliable shared storage system

2. Analysis system

Page 15: Big data Hadoop an Introduction Online Training @ Training Icon

History of Hadoop

Google was the first to launch GFS and MapReduce.

In 2004 they published a paper announcing this brand-new technology to the world.

The technology was already well proven inside Google by 2004.

MapReduce paper by Google

Page 16: Big data Hadoop an Introduction Online Training @ Training Icon

History of Hadoop

Doug Cutting saw an opportunity and led the charge to develop an open source version of this MapReduce system, called Hadoop.

Soon after, Yahoo and others rallied around to support this effort.

Hadoop is now a core part of the infrastructure at Facebook, Yahoo, LinkedIn, Twitter …

Page 17: Big data Hadoop an Introduction Online Training @ Training Icon

History of Hadoop

Google                Hadoop
GFS          ->       HDFS
MapReduce    ->       MapReduce

Page 18: Big data Hadoop an Introduction Online Training @ Training Icon

HDFS -- A Brief

Design: streaming very large files on a commodity cluster.

1. Very large files
MBs to PBs

2. Streaming
Write-once, read-many approach
Once a huge data set has been placed, we tend to use the data, not modify it
The time to read the whole data set is more important

3. Commodity cluster
No high-end servers
Yes, there is a high chance of failure (but HDFS is tolerant enough)
Replication is done

Page 19: Big data Hadoop an Introduction Online Training @ Training Icon

MapReduce -- A Brief

Large-scale data processing in parallel.

MapReduce provides:
Automatic parallelization and distribution
Fault tolerance
I/O scheduling
Status and monitoring

Two phases in MapReduce:
Map
Reduce

Page 20: Big data Hadoop an Introduction Online Training @ Training Icon

MapReduce -- A Brief

Map phase
map (in_key, in_value) -> list(out_key, intermediate_value)
Processes an input key/value pair
Produces a set of intermediate pairs

Reduce phase
reduce (out_key, list(intermediate_value)) -> list(out_value)
Combines all intermediate values for a particular key
Produces a set of merged output values (usually just one)
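The two signatures above can be made concrete with a word count, the usual first MapReduce example. What follows is a single-process sketch in plain Python, not Hadoop's Java API: the helper `run_mapreduce` and the sample records are assumptions made for illustration, and the grouping step it performs in memory stands in for the work a real cluster does across machines.

```python
from collections import defaultdict


def map_fn(in_key, in_value):
    """map(in_key, in_value) -> list of (out_key, intermediate_value).

    in_key is a line number (unused here); in_value is the line of text.
    """
    return [(word.lower(), 1) for word in in_value.split()]


def reduce_fn(out_key, intermediate_values):
    """reduce(out_key, list(intermediate_value)) -> list(out_value)."""
    return [sum(intermediate_values)]


def run_mapreduce(records, map_fn, reduce_fn):
    """Hypothetical driver: run map, group values by key, then run reduce."""
    groups = defaultdict(list)
    for in_key, in_value in records:
        for out_key, value in map_fn(in_key, in_value):
            groups[out_key].append(value)  # collect all values for one key
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}


records = [
    (0, "big data needs big tools"),
    (1, "Hadoop processes big data"),
]
result = run_mapreduce(records, map_fn, reduce_fn)
print(result)
```

Here the map phase emits a `(word, 1)` pair per word, and the reduce phase sums the ones for each word, producing a single merged value per key as described above.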

http://www.excelonlineclasses.co.nr/

Page 21: Big data Hadoop an Introduction Online Training @ Training Icon

MapReduce -- A Brief

Page 22: Big data Hadoop an Introduction Online Training @ Training Icon

Hadoop Cluster

Page 23: Big data Hadoop an Introduction Online Training @ Training Icon

Hadoop Ecosystems

Page 24: Big data Hadoop an Introduction Online Training @ Training Icon

Pre-Requisites

Core Java

Acquaintance with Linux will help.

For a better experience: install Linux on your machine.

Page 25: Big data Hadoop an Introduction Online Training @ Training Icon

We offer you:

1. Interactive learning at the learner's convenience
2. Industry-savvy trainers
3. "Real-time" practical scenarios
4. Learn right from your place
5. Customized course curriculum
6. 24/7 server access
7. Support after training and certification guidance
8. Resume preparation and interview assistance
9. Recorded versions of sessions

Page 26: Big data Hadoop an Introduction Online Training @ Training Icon

Thank you

Your feedback is highly important for improving our course material.

For Free Demo Please Contact  Rathan, INDIA: +91-9666900051,  US: +1 408-791-8864, Email id: [email protected] http://www.trainingicon.com

Page 27: Big data Hadoop an Introduction Online Training @ Training Icon

Disclaimer

Training Icon Online Classes acknowledges the proprietary rights of the trademarks and product names of other companies mentioned in any of the training material, including but not limited to the handouts, written material, videos, PowerPoint presentations, etc. All such training materials are provided to our students for learning purposes only. Students shall not use such materials for their private gain, nor can they sell any such materials to a third party. Some of the examples provided in any such training materials may not be owned by us, and as such we do not claim any proprietary rights for the same. We do not guarantee, nor are we responsible for, such products and projects. We acknowledge that any such information or product that has been lawfully received from any third party source is free from restriction and without any breach or violation of law whatsoever.