Learners Point TRAINING BIG DATA ANALYTICS follow us learnerspoint.org UPSKILL NOW

Big Data Analytics - Learners Point

Big Data Analytics is the process of gathering, managing, and analyzing large sets of data (Big Data) to uncover patterns and other useful information. These patterns are a goldmine of information, and analysing them provides insights that organizations can use to make business decisions. This analysis is essential for large organizations like Facebook, which manages over a billion users every day and uses the data it collects to provide a better user experience.

An IBM listing states that the demand for data science and analytics roles is expected to grow from 364,000 to nearly 2,720,000 by 2020. According to a recent Forrester study, companies analyze only about 12% of the data at their disposal. The remaining 88% is ignored, mainly due to a lack of analytics and repressive data silos. Imagine the market share of Big Data if all companies started analysing 100% of the data available to them. There is therefore no better time than now for major companies to start investing in Big Data analytics. It is paramount that developers upskill themselves with analytical skills and get ready to take a share of the Big Data career pie.

Training summary

Located at the heart of Dubai

We are also associated with CPD UK, the premier accreditation service provider in the United Kingdom.

Learners Point is a well-recognized institute in corporate and individual training in the MENA region and has contributed to the career success of more than 110,000 professionals since its founding in 2001. We are ISO 9001:2015 quality management system certified.

Our training institute is licensed by the Government of Dubai, UAE, and our certifications are widely recognized by employers around the globe.

Who should attend the program?
• Freshers who would like to build a career in distributed computing (this is an introductory course)
• Lateral entrants who want to learn frameworks such as Spark, for which Hadoop knowledge is an added benefit
• Software developers and architects
• Analytics professionals
• Senior IT professionals
• Testing and mainframe professionals
• Data management professionals
• Business intelligence professionals
• Project managers
• Aspiring data scientists
• Graduates looking to build a career in Big Data analytics

Objectives

• Understanding the core concepts of Hadoop, including the Hadoop Distributed File System (HDFS) and MapReduce (MR)
• Understanding NoSQL databases such as HBase and Cassandra
• Understanding Hadoop ecosystem tools such as Hive, Pig, Sqoop, and Flume
• Acquiring knowledge of related aspects such as scheduling Hadoop jobs using Python, R, Ruby, etc.
• Developing batch analytics applications for a UK web-based news channel to promote news stories and engage customers with customized recommendations
• Integrating clickstream and sentiment analytics into the UK web-based news channel
• Deep knowledge of Hadoop across the ingestion phase (Flume and Sqoop), storage phase (HDFS and HBase), processing phase (MR, Hive, Pig, and Spark), cluster management (standalone and YARN), and integrations (HCatalog, ZooKeeper, and Oozie)
• Accelerated career growth
• Increased pay package due to Hadoop skills

Course outline

• Introducing big data & Hadoop
• Hadoop daemon processes
• HDFS (Hadoop distributed file system)
• Hadoop installation modes and HDFS
• Data analytics using Pentaho as an ETL tool
• Integrations
• Hadoop developer tasks
• Hadoop ecosystems

Introducing big data & Hadoop
Learning objective: You will be introduced to real-world Big Data problems and learn how to solve them with state-of-the-art tools. Understand how Hadoop improves on traditional processing with its outstanding features. You will get to know Hadoop's background and the different Hadoop distributions available in the market. Prepare the Unix box for the training.

Topics:
1.1 Big data introduction:
• What is big data
• Data analytics
• Big data challenges
• Technologies that support big data

1.2 Hadoop introduction:
• What is Hadoop?
• History of Hadoop
• Basic concepts
• Future of Hadoop
• The Hadoop distributed file system
• Anatomy of a Hadoop cluster
• Breakthroughs of Hadoop
• Hadoop distributions:
  • Apache Hadoop
  • Cloudera Hadoop
  • Hortonworks Hadoop
  • MapR Hadoop

Hands On: Install a virtual machine using VMPlayer on the host machine, and work with the basic Unix commands needed for Hadoop.

Hadoop daemon processes
Learning objective: You will learn what the different daemons are and what they do at a high level.

Topics:
• Name node
• Data node
• Secondary name node
• Job tracker
• Task tracker

Hands On:
• Create a Unix shell script to start all the daemons at once.
• Start HDFS and MR separately.

HDFS (Hadoop distributed file system)
Learning objective: You will get to know how to write and read files in HDFS. Understand how the name node, data node, and secondary name node take part in the HDFS architecture. You will also learn the different ways of accessing HDFS data.

Topics:
• Blocks and input splits
• Data replication
• Hadoop rack awareness
• Cluster architecture and block placement
• Accessing HDFS
  • Java approach
  • CLI approach

Hands On:
• Write a shell script that writes and reads files in HDFS. Change the replication factor at three levels. Use Java to work with HDFS.
• Run different HDFS commands, including admin commands.
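
To make the block and replication topics above concrete, here is a small Python sketch of the arithmetic. It assumes a 128 MB block size and a replication factor of 3, which are common Hadoop defaults but are configurable per cluster and per file; the function name `hdfs_footprint` is made up for this illustration and is not part of any Hadoop API.

```python
import math

def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Return (block count, total block replicas, raw storage in MB).

    block_size_mb and replication are assumed defaults; real clusters
    may be configured differently.
    """
    # A file is cut into fixed-size blocks; the last block may be smaller.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored `replication` times across data nodes.
    replicas = blocks * replication
    # The last block only occupies its actual size, so raw usage is
    # simply the file size multiplied by the replication factor.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb

blocks, replicas, raw_mb = hdfs_footprint(1000)
print(blocks, replicas, raw_mb)
```

Lowering the replication factor for a file reduces the raw storage proportionally, at the cost of fewer copies to recover from when a data node fails.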

Hadoop installation modes and HDFS
Learning objective: You will learn the different modes of Hadoop, set up pseudo-distributed mode from scratch, and work with its configuration. You will learn the functionality of the different HDFS operations and see a visual representation of HDFS read and write actions with their daemons, the name node and data node.

Topics:
• Local mode
• Pseudo-distributed mode
• Fully distributed mode
• Pseudo-distributed mode installation and configuration
• HDFS basic file operations

Hands On: Install VirtualBox Manager and install Hadoop in pseudo-distributed mode. Change the configuration files required for pseudo-distributed mode. Perform different file operations on HDFS.
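
As an illustration of the configuration work in pseudo-distributed mode, the following Python sketch generates the contents of two Hadoop site files. The property names `fs.defaultFS` and `dfs.replication` are standard Hadoop settings; the values shown (`hdfs://localhost:9000` and `1`) follow the usual single-node example and are assumptions, not values prescribed by this course. The helper `hadoop_site_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET

def hadoop_site_xml(properties):
    """Build a Hadoop-style <configuration> document from a dict."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")

# core-site.xml: where clients find the name node (example address).
core_site = hadoop_site_xml({"fs.defaultFS": "hdfs://localhost:9000"})
# hdfs-site.xml: a single machine can hold only one copy of each block.
hdfs_site = hadoop_site_xml({"dfs.replication": 1})

print(core_site)
print(hdfs_site)
```

In practice these files are usually edited by hand under the Hadoop configuration directory; generating them is only a convenient way to show their structure.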

Hadoop developer tasks
Learning objective: Understand the different phases in MapReduce, including the map, shuffle, sort, and reduce phases. Get a deep understanding of the life cycle of an MR job submitted to YARN. Learn about the distributed cache concept in detail with examples. Write a word count MR program and monitor the job using the Job Tracker and YARN console. Also learn about more use cases.

Topics:
• Basic API concepts
• The driver class
• The mapper class
• The reducer class
• The combiner class
• The partitioner class
• Examining a sample MapReduce program with several examples
• Hadoop's Streaming API

Hands On:
• Write an MR job from scratch, write different logic in the mapper and reducer, and submit the MR job in standalone and distributed mode.
• Write a word count MR job, calculate the average salary of employees who meet certain conditions, and calculate sales using MR.
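
The map, shuffle, sort, and reduce phases can be sketched in plain Python as a single-process simulation of word count. This is not the Hadoop Java API: the function names (`mapper`, `partitioner`, `reducer`, `run_job`) and the byte-sum partitioning rule are assumptions chosen to keep the example deterministic and self-contained.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1

def partitioner(key, num_reducers):
    """Pick a reducer for a key (a simple stand-in for hash partitioning)."""
    return sum(key.encode("utf-8")) % num_reducers

def reducer(key, values):
    """Reduce phase: sum all counts collected for one word."""
    return key, sum(values)

def run_job(lines, num_reducers=2):
    # Shuffle: route each mapped pair to a partition and group by key.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for key, value in mapper(line):
            partitions[partitioner(key, num_reducers)][key].append(value)
    results = {}
    for part in partitions:
        # Sort: each reducer sees its keys in sorted order.
        for key in sorted(part):
            word, total = reducer(key, part[key])
            results[word] = total
    return results

counts = run_job(["big data big insights", "data drives decisions"])
print(counts)
```

Because summing is associative, the same `reducer` logic could also serve as a combiner that pre-aggregates counts on the map side before the shuffle.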

Hadoop ecosystems
6.1 PIG - Learning objective: Understand the importance of Pig in the Big Data world, the Pig architecture, and Pig Latin commands for performing complex operations on relations, as well as Pig UDFs and aggregation functions with the piggybank library. Learn how to pass dynamic arguments to Pig scripts.

Topics:
• Pig concepts
• Install and configure Pig on a cluster
• Pig vs MapReduce and SQL
• Write sample Pig Latin scripts
• Modes of running Pig
• Pig UDFs

Hands On: Log in to the Pig Grunt shell to issue Pig Latin commands in different execution modes. Load and transform Pig relations lazily in different ways. Register a UDF in the Grunt shell and perform replicated join operations.
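
The idea behind a replicated join can be sketched in Python: the small relation is held in memory as a lookup table and the large relation is streamed past it, so no shuffle of the large side is needed. In Pig Latin this is requested with `USING 'replicated'` on a `JOIN`; the Python function and the sample data below are invented for illustration.

```python
def replicated_join(large_rows, small_rows, key):
    """Inner-join two lists of dicts on `key`, keeping the small side in memory."""
    # Build the in-memory lookup from the small relation.
    lookup = {}
    for row in small_rows:
        lookup.setdefault(row[key], []).append(row)
    # Stream the large relation and emit one merged row per match.
    for row in large_rows:
        for match in lookup.get(row[key], []):
            yield {**row, **match}

orders = [
    {"order_id": 1, "cust_id": "c1"},
    {"order_id": 2, "cust_id": "c2"},
    {"order_id": 3, "cust_id": "c9"},  # no matching customer
]
customers = [
    {"cust_id": "c1", "name": "Asha"},
    {"cust_id": "c2", "name": "Omar"},
]

joined = list(replicated_join(orders, customers, "cust_id"))
print(joined)
```

The approach only pays off when the small relation comfortably fits in memory on every worker.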

6.2 HIVE - Learning objective: Understand the importance of Hive in the Big Data world and the different ways of configuring the Hive metastore. Learn the different types of tables in Hive, how to optimize Hive jobs using partitioning and bucketing, and how to pass dynamic arguments to Hive scripts. You will also get an understanding of joins, UDFs, views, etc.

Topics:
• Hive concepts
• Hive architecture
• Installing and configuring Hive
• Managed tables and external tables
• Joins in Hive
• Multiple ways of inserting data into Hive tables
• CTAS, views, alter tables
• User-defined functions (UDFs) in Hive

Hands On: Execute Hive queries in different modes. Create internal and external tables. Optimize queries by creating tables with partitioning and bucketing. Run system-defined and user-defined functions, including explode and window functions.
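
Partitioning and bucketing both decide where a row is physically stored, which is why they help query optimization. The Python sketch below mimics that placement: a partition becomes a `column=value` directory, and a bucket is chosen from the bucketing column. The modulo rule and the `bucket_N` file names are simplifying assumptions; the hash function Hive actually uses depends on the column type and Hive version, and the table path is hypothetical.

```python
from collections import defaultdict

def layout(rows, table_dir, partition_col, bucket_col, num_buckets):
    """Group rows by the storage location they would be written to."""
    files = defaultdict(list)
    for row in rows:
        # Partitioning: one directory per distinct partition value.
        part_dir = f"{table_dir}/{partition_col}={row[partition_col]}"
        # Bucketing (simplified): integer key modulo the bucket count.
        bucket = row[bucket_col] % num_buckets
        files[f"{part_dir}/bucket_{bucket}"].append(row)
    return dict(files)

rows = [
    {"user_id": 10, "country": "AE"},
    {"user_id": 11, "country": "AE"},
    {"user_id": 12, "country": "UK"},
]
files = layout(rows, "/warehouse/clicks", "country", "user_id", 4)
for path in sorted(files):
    print(path, files[path])
```

A query filtering on `country` only needs to read one directory, and a join on `user_id` between two tables bucketed the same way can match bucket to bucket.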

6.3 SQOOP - Learning objectives: Learn how to import data, both fully and incrementally, from an RDBMS to HDFS and Hive tables, and how to export data from HDFS and Hive tables to an RDBMS. Learn the architecture of Sqoop import and export.

Topics:
• Sqoop concepts
• Sqoop architecture
• Connecting to an RDBMS
• Internal mechanism of import/export
• Import data from Oracle/MySQL to Hive
• Export data to Oracle/MySQL
• Other Sqoop commands

Hands On: Trigger a shell script that calls Sqoop import and export commands. Learn to automate Sqoop incremental imports by supplying the last value of the appended column. Run a Sqoop export from a Hive table directly to an RDBMS.
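
The bookkeeping behind an incremental append import can be sketched in Python. Sqoop exposes this through its `--incremental append`, `--check-column`, and `--last-value` options; the function below only simulates the logic on an in-memory list and does not call Sqoop. The table contents and the name `incremental_import` are hypothetical.

```python
def incremental_import(source_rows, check_column, last_value):
    """Return rows newer than last_value and the value to remember for next time."""
    # Only rows whose check column exceeds the stored value are new.
    new_rows = [r for r in source_rows if r[check_column] > last_value]
    # Remember the highest value seen so the next run starts after it.
    new_last = max((r[check_column] for r in new_rows), default=last_value)
    return new_rows, new_last

table = [{"id": 1}, {"id": 2}, {"id": 3}]

# First run: nothing imported yet, so start from 0.
batch1, last = incremental_import(table, "id", 0)

# A new row arrives in the source table before the second run.
table.append({"id": 4})
batch2, last = incremental_import(table, "id", last)

print(batch1, batch2, last)
```

Automating the import amounts to persisting `last` between runs, which is what a saved Sqoop job or a wrapper shell script would do.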

6.4 HBASE - Learning objectives: Understand the different types of NoSQL databases and the CAP theorem. Learn the DDL and CRUD operations of HBase. Understand the HBase architecture and the importance of ZooKeeper in managing HBase. Learn HBase column family optimization and client-side buffering.

Topics:
• HBase concepts
• ZooKeeper concepts
• HBase and region server architecture
• File storage architecture
• NoSQL vs SQL
• Defining schema and basic operations
• DDLs
• DMLs
• HBase use cases

Hands On: Create HBase tables using the shell and perform CRUD operations with the Java API. Change the column family properties and perform the sharding process. Also create tables with multiple splits to improve the performance of HBase queries.
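
Creating a table with multiple splits means dividing the row key space into regions up front, so reads and writes spread across region servers from the start. The Python sketch below shows how a row key maps to a region given sorted split keys; a region covers its start key inclusively and its end key exclusively. The split keys and the function `region_for` are assumptions for illustration, and plain string comparison stands in for HBase's byte-wise ordering.

```python
import bisect

def region_for(row_key, split_keys):
    """Return the index of the region that holds row_key.

    split_keys must be sorted; N split keys produce N + 1 regions.
    """
    # bisect_right places a key equal to a split point in the region
    # that starts at that split point (start inclusive, end exclusive).
    return bisect.bisect_right(split_keys, row_key)

splits = ["g", "n", "t"]  # four regions: [..g), [g..n), [n..t), [t..]

for key in ["apple", "g", "monkey", "zebra"]:
    print(key, "-> region", region_for(key, splits))
```

Choosing split points that match the real key distribution matters more than the number of splits; skewed keys still pile onto one region.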

6.5 OOZIE - Learning objectives: Understand the Oozie architecture and how to monitor Oozie workflows. Understand how coordinators and bundles work along with workflows in Oozie. Also learn the Oozie commands to submit, monitor, and kill a workflow.

Topics:
• Oozie concepts
• Oozie architecture
• Workflow engine
• Job coordinator
• Installing and configuring Oozie
• hPDL and XML for creating workflows
• Nodes in Oozie
• Action nodes and control nodes
• Accessing Oozie jobs through the CLI and web console
• Develop and run sample workflows in Oozie
• Run MapReduce programs
• Run Hive scripts/jobs

Hands on: Create a workflow for incremental Sqoop imports. Create workflows for Pig, Hive, and Sqoop exports. Also execute a coordinator to schedule the workflows.
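
The relationship between action nodes and control nodes can be sketched as a tiny Python workflow walker. In an Oozie workflow each action names an `ok` transition and an `error` transition, and the run finishes at an end or kill control node; the executor below imitates only that routing. The node names, the `succeed` and `fail` stand-in actions, and `run_workflow` itself are hypothetical, and real workflows are defined in XML rather than Python.

```python
def run_workflow(nodes, start):
    """Follow ok/error transitions until an "end" or "kill" control node."""
    trace = []
    current = start
    while current not in ("end", "kill"):
        action, ok_to, error_to = nodes[current]
        trace.append(current)
        try:
            action()
            current = ok_to      # action succeeded
        except Exception:
            current = error_to   # action failed
    trace.append(current)
    return trace

def succeed():
    pass

def fail():
    raise RuntimeError("simulated failure")

# Each entry: node name -> (action, ok transition, error transition).
good = {
    "sqoop-import": (succeed, "hive-load", "kill"),
    "hive-load": (succeed, "end", "kill"),
}
bad = {
    "sqoop-import": (fail, "hive-load", "kill"),
    "hive-load": (succeed, "end", "kill"),
}

print(run_workflow(good, "sqoop-import"))
print(run_workflow(bad, "sqoop-import"))
```

A coordinator adds the time or data trigger on top: it decides when a workflow like this one is started, not what happens inside it.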

6.6 FLUME - Learning objectives: Understand the Flume architecture and its components: source, channel, and sink. Configure Flume with socket and file sources and with HDFS and HBase sinks. Understand fan-in and fan-out architectures.

Topics:
• Flume concepts
• Flume architecture
• Installation and configuration
• Executing Flume jobs

Hands on: Create Flume configuration files and configure them with different sources and sinks. Stream Twitter data and create a Hive table.
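
The source, channel, and sink roles, and a replicating fan-out, can be sketched in Python with in-memory queues. Real Flume agents are configured through properties files, not code; the `Agent` class, its method names, and the two list-backed sinks standing in for HDFS and HBase are all assumptions for this illustration.

```python
from queue import Queue

class Agent:
    """One source feeding one channel per sink (replicating fan-out)."""

    def __init__(self, sinks):
        self.sinks = sinks
        # A channel buffers events between the source and its sink.
        self.channels = [Queue() for _ in sinks]

    def source_put(self, event):
        # Fan-out: the source copies each event into every channel.
        for channel in self.channels:
            channel.put(event)

    def drain(self):
        # Each sink takes events from its own channel and delivers them.
        for channel, sink in zip(self.channels, self.sinks):
            while not channel.empty():
                sink(channel.get())

hdfs_store, hbase_store = [], []  # stand-ins for HDFS and HBase sinks
agent = Agent([hdfs_store.append, hbase_store.append])

for event in ["event-1", "event-2"]:
    agent.source_put(event)
agent.drain()

print(hdfs_store, hbase_store)
```

Fan-in is the mirror image: several sources, often on different machines, writing into a channel that a single sink drains.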

Data analytics using Pentaho as an ETL tool
Learning objective: You will learn Pentaho Big Data best practices, guidelines, and techniques.

Topics:
• Data analytics using Pentaho as an ETL tool
• Big Data integration with zero coding required

Hands on:
• You will use Pentaho as an ETL tool for data analytics.

Integrations
Learning objective: You will see the different integrations among Hadoop ecosystem components in a data engineering flow, and understand how important it is to create a flow for the ETL process.

Topics:
• MapReduce and Hive integration
• MapReduce and HBase integration
• Java and Hive integration
• Hive - HBase integration

Hands On: Use storage handlers to integrate Hive and HBase. Integrate Hive and Pig as well.

Trainer

About our trainer
• 27+ years in the IT industry, 20 years in multinationals (IBM/Citibank/SAP)
• 10 years of industry experience in Data Science and Big Data Analytics
• 5 years of expertise in coin and enterprise blockchain technology
• Published a book, “BLOCKCHAIN IN LEGAL SYSTEM”
• 15+ years of expertise in banking and IT technology
• Specialized in Blockchain, Big Data, and Artificial Intelligence integration
• Specialized in Tier-4 data center design and executed the Asia-Pacific's largest data center in India
• Two-time Bravo Award winner at IBM; special awards and cash awards from Citibank
• Y2K Command Center Head and Asia Pacific Head for Unix Systems at IBM
• Founder of Necxury Blockchain Solutions; Blockchain Director at Decimus, Singapore
• Crypto coins and tokens developed and traded across the world
• IT infrastructure solutions architect
• Hands-on with 20+ operating systems (Unix, Windows, Linux, etc.), 15+ programming languages (C, Java, Golang, etc.), 10+ databases (Sybase, Oracle, SQL Server, etc., and key-value stores such as CouchDB and MongoDB), and many middleware products
• Conducted trainings for major corporates and resources from IBM, SAP, Samba, Bank of America, JPA, Wipro, Accenture, etc.
• Cloud, server, and storage architect
• Managed up to 3,000 engineers (server engineers, network engineers, DBAs, and system SMEs) from a single point
• System Security Head at Citicorp for EMEA

Student reviews

Aftab Ahammed: IT Professional
“I am delighted to join Learners Point for the Big Data Analytics course. Their extensive learning path helped me to excel across the entire modules of Big Data. Also, I was pretty much impressed with the trainer. He took ample time to explain the course content. Hats off to him!”

Jeeger Parekh: Finance Advisor
“Learners Point Training Institutes is one of the best platforms to learn the key concepts of Big Data Analytics. Big shout to the trainer who explained Hadoop, Administration, Testing and Analysis modules in a much easier manner. His training sessions were structured and easy to follow, plus he finished the course on time.”

+971 (04) 403 8000 | (04) 3266 880

[email protected]

#610 - Business Center, Burjman Metro Station Exit 4, Mashreq Bank Building - Dubai

learnerspoint.org
