LEARN WELL TECHNOCRAFT

DATA SCIENCE / MACHINE LEARNING SYLLABUS

8TH YEAR OF ACCOMPLISHMENTS

8411002339 / 7709292162
WWW.DW-LEARNWELL.COM
INFO@DW-LEARNWELL.COM
203, SUPREME CENTER, ITI ROAD, ABOVE PIZZA HUT, AUNDH, PUNE - 411007.

AUTHORIZED GLOBAL CERTIFICATION CENTER FOR MICROSOFT, ORACLE, IBM, AWS AND MANY MORE.

ACHIEVEMENTS FROM TRAINING

  • CANDIDATE WILL BE ABLE TO SHOW 2-3 YEARS EXPERIENCE AFTER TRAINING.
  • REAL TIME SCENARIOS, CASE STUDIES, PROJECTS INCLUDED.
  • REAL TIME DATA PROVIDED FOR PRACTICE.
  • SOFTWARE WILL BE INSTALLED ON CANDIDATE'S MACHINE.
  • INDIVIDUAL 1 ON 1 DISCUSSIONS FOR RESUME MODIFICATIONS.
  • LEARN FROM INDUSTRY EXPERTS.
  • GLOBAL CERTIFICATION PREPARATION.
  • APPEAR FOR GLOBAL CERTIFICATION AT LEARN WELL TECHNOCRAFT ITSELF.
  • GET DISCOUNTED CERTIFICATION VOUCHERS.
  • AUTHORIZED GLOBAL CERTIFICATION CENTER FOR PEARSON, PSI, KRYTERION.

Data Scientist Syllabus

Data Scientist/Analyst is nowadays the most sought-after role in the IT world. Businesses are generating vast amounts of data, and analyzing that data is a top priority.

In line with market requirements and the job descriptions for the data scientist role, we have designed a new course:

  • Complete R Programming: R is a data analysis language.
  • SAS: For each module of R, we will also cover SAS. (Optional)
  • Python: Python is a data analysis language. (Optional)
  • Hadoop: Basic to intermediate aspects of Hadoop.
  • Spark: Hadoop combined with Spark makes a great combination. (Optional)
  • Tableau: A visualization tool that helps in presenting reports and graphs to the business.
  • Excel/SQL: It is vital for a Data Scientist to work with Excel files and databases. (Optional)

R Programming

What is R?
  • Birth and Rise of R
  • Links for the necessary software
  • GUI of R: IDE and Statistical Analysis Interfaces
  • R Workspace
  • GUI of RStudio

Basic Operations in R
  • Expressions: Basic Idea
  • Constant Values: Numeric & Non-numeric
  • Arithmetic: Operations and BODMAS
  • Conditions: Equality, Greater Than, Less Than, etc.
  • Function Calls: Introduction to R Functions
  • Symbols & Assignment
  • Keywords: NA, Inf, NaN, NULL, TRUE, FALSE
  • Naming a Variable: Generally accepted conventions

Data Types & Data Structures in R
  • Basic data types
  • Basic data structures: Vector, Factor, Matrix, Data Frame, List

Subsetting in R
  • Vector Subsetting
  • c() function: Creation of Vectors
  • Using rep() and seq() functions
  • Using factor() to convert vectors to factors
  • Using data.frame() to create data frames
  • Metadata access: dimnames(), rownames(), colnames()
  • Using matrix() to create matrices
  • Using array() to create arrays
  • Subsetting data frames: row subset, column subset, using subset() function
  • Assigning to a subset

Additional Topics on Data Structures
  • The recycling rule: Arithmetic operations on vectors of unequal length
  • Type coercion: Character to Numeric
  • Automatic type coercion
  • Coercing factors: Using the as.factor() function
  • Changing factor levels
  • Using is.na() to detect NA
  • Subsetting factors
  • Attributes: attributes(), attr() and names() functions
  • Classes: Idea of OOP in R
  • Dates: As a special class
  • Formulas: As a special class
  • Exploring Objects: summary(), str(), dim() functions
  • Generic functions

Data Import & Export
  • Text formats: Reading Delimited Files
  • read.table() function
  • Using read.fwf() function for fixed width files
  • Using readLines() for reading lines
  • Using write.csv() function to store data as CSV files
  • Reading Excel file: Package XLConnect
  • Reading SPSS file: Package foreign
  • Reading SAS data file: Package sas7bdat
  • Database connection: The idea of ODBC connections in Windows
  • RODBC package: Create and query a database from R
  • Basic SQL

Control Structures & User Defined Functions
  • Conditional Statements
  • If statement: The Structure
  • If Else statement: The Structure
  • ifelse() function
  • Iteration & Looping
  • The for loop
  • The while loop
  • The repeat statement
  • lapply() function
  • sapply() function
  • apply() function
  • User defined function
  • Variable scoping: Global and Local Variables
  • Using user defined functions inside function definition

Charting with R
  • The plot function
  • plot.new() function: Generating new plot object
  • plot.window() function: Creating window
  • points() function: Plotting points
  • axis() function: Generating Axis
  • box() function: Creating enclosure
  • title() function: Assigning title
  • par() function: Fixing plotting parameters
  • lines() function: Adding connector lines
  • Multi figure layout: Creating multiple charts in the same window
  • hist() function: Plotting histograms

  • Kernel Density Plot: The non-parametric probability distribution
  • Comparing Groups via Kernel Density: Comparing two different probability distributions
  • Simple Bar Plot: Visualizing categorical data
  • Stacked Bar Plot: Understanding category composition
  • Grouped Bar Plot
  • Line Charts
  • Pie Charts
  • Boxplots: Understanding data distributions and outliers
  • Geo Charts
  • Motion Charts

  • Summary statistics for data
  • t tests: Comparing means
  • ANOVA: Comparing means and causal relations
  • Factor Analysis: Dimension Reduction technique
  • Cluster Analysis: Segmentation and Homogeneous groups of data

Analytics & Data Mining Using R
  • Linear Regression: Predicting from uni-linear causality
  • Logistic Regression: Predicting the probability of a binary outcome
  • Time Series Analysis: Automated ARIMA
  • Decision Trees: Conditional inference trees for classification and profiling

Analytics: Association Rule Mining Using R (Market Basket Analysis)
  • Introduction to Association learning
  • Different types of association algorithms
  • Apriori Algorithm: Support, Confidence and Lift
  • Market Basket Analysis
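As an illustration of the three Apriori measures named above, here is a minimal sketch of how support, confidence and lift can be computed for a single rule. It is written in Python (one of the optional modules) rather than R, and it is not the course's own material: the baskets, item names and function names are all made up for the example.

```python
# Toy market-basket data: every basket and item name here is hypothetical.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "jam"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Support of antecedent and consequent together, divided by support of the antecedent.

    Assumes the antecedent occurs in at least one basket (no zero-division guard).
    """
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence of the rule divided by the support of the consequent.

    A lift above 1 suggests the items occur together more often than chance;
    a lift below 1 suggests they occur together less often.
    """
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


if __name__ == "__main__":
    a, c = {"bread"}, {"milk"}
    print("support   :", round(support(a | c, transactions), 4))
    print("confidence:", round(confidence(a, c, transactions), 4))
    print("lift      :", round(lift(a, c, transactions), 4))
```

The Apriori algorithm itself goes further than this sketch: it searches for all itemsets whose support exceeds a chosen threshold and only then derives rules from them.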

Text Mining Using R
  • Introduction to Text Mining
  • Keyword search
  • Word cloud
  • Sentiment Analysis
  • Twitter Data Analysis: Case Study

SAS

Note: For each R programming module, the corresponding SAS topics will be covered, including Regression, Machine Learning and Sentiment Analysis.

Hadoop
  • Hadoop 1.0 overview and enhancements in Hadoop 2.0
  • Hadoop installation and setup using VirtualBox and the Hortonworks distro
  • Typical cluster architecture in Hadoop 1.0 vs Hadoop 2.0 (optional)
  • HDFS architecture
  • MR architecture
  • Java MR example (optional for interested students)
  • Sqoop hands-on with example
  • Pig hands-on with example
  • Hive hands-on with example

Note: We also have a separate, detailed Hadoop course that runs for 5-6 weekends.

Spark

Introduction to Spark
  • Installation & Overview
  • Reading data from text files
  • Basics of Spark and core concepts like RDD, caching, etc.
  • Understanding a few well-known programs like word count, and trying a few more
  • Trying out various APIs offered by Spark Core libraries

Spark SQL
  • Overview of Spark SQL
  • Using Hive metadata with Spark SQL
  • SchemaRDDs
  • Using various file formats like Parquet and JSON
  • Using Spark SQL and Hive UDFs

Spark ML
  • Overview of Spark ML
  • Understanding Vectors
  • Understanding Linear regression and running it with Spark ML
  • Understanding Logistic regression and running it with Spark ML
  • Running a Clustering example with Spark ML
  • Dimensionality reduction in Spark using Principal Component Analysis

Tableau

Overview:
  • User interface basics
  • Connecting to data
  • Dimensions vs. measures
  • Show Me
  • Marks card
  • Simple formatting
  • Building views
  • Building a dashboard

Connecting to Excel, CSV and Text Files:
  • Connecting to single or multiple tables
  • Connecting live versus importing the data
  • Editing data connections after initial connection
  • Data source filtering

Working with Data:
  • Flat Files (Excel, CSV, Access DB)
  • Relational Databases
  • ODBC Drivers
  • Live or Import Data Connection
  • Metadata Management
  • Multiple Data Connections
  • Creating and Refreshing an Extract

Analysis:
  • Hierarchies
  • Sorting
  • Grouping
  • Filtering
  • Aggregations
  • Trend lines
  • Page shelf
  • Forecasting

Calculation:
  • Aggregate Calculations
  • Row-Level Calculations
  • Quick Table Calculations

Dashboard:
  • Dashboard Objects
  • Filter Actions
  • URL Actions
  • Sizing
  • Tiled and Floating Sheets
  • Dynamic Sheet Titles

Tableau Server (subject to availability of license):
  • Publishing the Workbook
  • Scheduling Refresh Extract
  • Managing Authentication and Authorization
  • Monitoring Background Tasks
  • Automation of Reports

Note: We also have a separate, detailed Tableau course that runs for 5-6 weekends.

Formatting:
  • Row-banding
  • Number formatting
  • Text formatting
  • Shading
  • Labels

Analysis:
  • Tooltips

Also Available
  • Internships - Paid / Free
  • Internship certifications on successful completion
  • Final year College Projects on Latest Skills
  • Special Project batches
  • College Seminars

www.dw-learnwell.com