DATA SCIENCE/ MACHINE LEARNING SYLLABUS
8TH YEAR OF ACCOMPLISHMENTS
LEARN WELL TECHNOCRAFT
8411002339 / 7709292162
WWW.DW-LEARNWELL.COM
INFO@DW-LEARNWELL.COM
203, SUPREME CENTER, ITI ROAD, ABOVE PIZZA HUT, AUNDH, PUNE - 411007.
AUTHORIZED GLOBAL CERTIFICATION CENTER FOR MICROSOFT, ORACLE, IBM, AWS AND MANY MORE.
ACHIEVEMENTS FROM TRAINING
CANDIDATE WILL BE ABLE TO SHOW 2-3 YEARS EXPERIENCE AFTER TRAINING.
REAL TIME SCENARIOS, CASE STUDIES, PROJECTS INCLUDED.
REAL TIME DATA PROVIDED FOR PRACTICE.
SOFTWARE WILL BE INSTALLED ON CANDIDATE'S MACHINE.
INDIVIDUAL 1 ON 1 DISCUSSIONS FOR RESUME MODIFICATIONS.
LEARN FROM INDUSTRY EXPERTS.
GLOBAL CERTIFICATION PREPARATION.
APPEAR FOR GLOBAL CERTIFICATION AT LEARN WELL TECHNOCRAFT ITSELF.
GET DISCOUNTED CERTIFICATION VOUCHERS.
AUTHORIZED GLOBAL CERTIFICATION CENTER FOR PEARSON, PSI, KRYTERION.
Data Scientist Syllabus

Data Scientist/Analyst is currently one of the most in-demand roles in the IT world. Businesses are generating large volumes of data, and analyzing that data is a top priority.
In line with market requirements and the job descriptions for the data scientist role, we have designed a new course:
  Complete R Programming: R is a data analytics language.
  SAS: For each module of R, we will also cover SAS. (Optional)
  Python: Python is a data analytics language. (Optional)
  Hadoop: Basic to intermediate aspects of Hadoop.
  Spark: Hadoop combined with Spark makes a great combination. (Optional)
  Tableau: The visualization tool that helps in presenting reports and graphs to the business.
  Excel/SQL: It is vital for a Data Scientist to work with Excel files and databases. (Optional)
R Programming

What is R?
  Birth and Rise of R
  Links for the necessary software
  GUI of R: IDE and Statistical Analysis Interfaces
  R Workspace
  GUI of RStudio

Basic Operations in R
  Expressions: Basic Idea
  Constant Values: Numeric & Non-numeric
  Arithmetic: Operations and BODMAS
  Conditions: Equality, Greater Than, Less Than, etc.
  Function Calls: Introduction to R Functions
  Symbols & Assignment
  Keywords: NA, Inf, NaN, NULL, TRUE, FALSE
  Naming a Variable: Generally accepted conventions

Data Types & Data Structures in R
  Basic data types
  Basic data structures: Vector, Factor, Matrices, Data Frame, List

Subsetting in R
  Vector Subsetting
  c() function: Creation of Vectors
  Using rep() and seq() functions
  Using factor() to convert vectors to factors
  Using data.frame() to create data frames
  Meta data access: dimnames(), rownames(), colnames()
  Using matrix() to create matrices
  Using array() to create arrays
  Subsetting data frames: row subset, column subset, using subset() function
  Assigning to a subset

Additional Topics on Data Structures
  The recycling rule: Uneven arithmetic operation on vectors
  Type coercion: Character to Numeric
  Automatic Type coercion
  Coercing factors: Using as.factor() function
  Changing factor levels
  Subsetting factors
  Using is.na() to detect NA
  Attributes: attributes(), attr() and names() functions
  Classes: Idea of OOP in R
  Dates: As a special class
  Formulas: As a special class
  Exploring Objects: summary(), str(), dim() functions
  Generic functions
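
As an illustration of the recycling rule listed above, here is a minimal sketch in Python (one of the optional course languages) that mimics how R reuses the shorter vector during element-wise arithmetic. The function name `recycle_add` is hypothetical and chosen for this example only; it is not part of R or of any library.

```python
import warnings


def recycle_add(a, b):
    """Add two sequences element-wise, reusing the shorter one as R does."""
    # As in R, a zero-length operand gives a zero-length result.
    if not a or not b:
        return []
    n = max(len(a), len(b))
    # R warns when the longer length is not a multiple of the shorter one.
    if n % min(len(a), len(b)) != 0:
        warnings.warn("longer length is not a multiple of shorter length")
    # Index each sequence modulo its own length so the shorter one repeats.
    return [a[i % len(a)] + b[i % len(b)] for i in range(n)]


result = recycle_add([1, 2, 3, 4, 5, 6], [10, 20])
print(result)  # [11, 22, 13, 24, 15, 26]
```

The equivalent R expression would be `c(1, 2, 3, 4, 5, 6) + c(10, 20)`.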

Data Import & Export
  Text formats: Reading Delimited Files
  read.table() function
  Using read.fwf() function for fixed-width files
  Using readLines() for reading lines
  Using write.csv() function to store data as CSV files
  Reading Excel file: Package XLConnect
  Reading SPSS file: Package foreign
  Reading SAS data file: Package sas7bdat
  Database connection: The ideas of ODBC connecting in Windows
  RODBC package: Create and Query database from R
  Basic SQL

Control Structures & User-Defined Functions
  Conditional Statements
  If statement: The Structure
  If Else statement: The Structure
  ifelse() function
  Iteration & Looping
  The for loop
  The while loop
  The repeat statement
  lapply() function
  sapply() function
  apply() function
  User-defined functions
  Variable scoping: Global and Local Variables
  Using user-defined functions inside function definitions

Charting with R
  The plot function
  plot.new() function: Generating new plot object
  plot.window() function: Creating window
  points() function: Plotting points
  axis() function: Generating Axis
  box() function: Creating enclosure
  title() function: Assigning title
  par() function: Fixing plotting parameters
  lines() function: Adding connector lines
  Multi figure layout: Creating multiple charts in the same window
  hist() function: Plotting histograms
  Kernel Density Plot: The non-parametric probability distribution
  Comparing Groups via Kernel Density: Comparing two different probability distributions
  Simple Bar Plot: Visualizing categorical data
  Stacked Bar Plot: Understanding category composition
  Grouped Bar Plot
  Line Charts
  Pie Charts
  Boxplots: Understanding data distributions and outliers
  Geo Charts
  Motion Charts

  Summary statistics for data
  t tests: Comparing means
  Anova: Comparing means and causal relations
  Factor Analysis: Dimension Reduction technique
  Cluster Analysis: Segmentation and Homogeneous groups of data

Analytics & Data Mining Using R
  Linear Regression: Predicting from uni-linear causality
  Logistic Regression: Predicting the probability in binary-outcome situations
  Time Series Analysis: Automated ARIMA
  Decision Trees: Conditional inference trees for classification and profiling
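
To give a feel for the linear regression topic above, the following is a minimal sketch of a one-predictor least-squares fit written in plain Python. The function name `fit_line` and the sample data are assumptions made for this illustration; in the course itself the same fit would typically be done with R's `lm()` function.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data generated from y = 1 + 2x, so the fit should recover it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```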

Analytics: Association Rule Mining Using R (Market Basket Analysis)
  Introduction to Association learning
  Different types of association algorithms
  Apriori Algorithm: Support, Confidence and Lift
  Market Basket Analysis
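
The three measures named in the Apriori topic above can be sketched as follows. This is only an illustration in plain Python: the basket data and the function names are made up for the example, and the code computes the measures for a single rule rather than implementing the full Apriori candidate search.

```python
# Hypothetical transactions; each basket is the set of items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(lhs, rhs, baskets):
    """Of the baskets containing lhs, the fraction that also contain rhs."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)


def lift(lhs, rhs, baskets):
    """Confidence of lhs -> rhs relative to how common rhs is overall."""
    return confidence(lhs, rhs, baskets) / support(rhs, baskets)


# Rule under test: customers who buy bread also buy milk.
print(support({"bread", "milk"}, baskets))
print(confidence({"bread"}, {"milk"}, baskets))
print(lift({"bread"}, {"milk"}, baskets))
```

A lift below 1 suggests the two items appear together less often than they would if bought independently; a lift above 1 suggests a positive association.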

Text Mining Using R
  Introduction to Text Mining
  Keyword search
  Word cloud
  Sentiment Analysis
  Twitter Data Analysis: Case Study

SAS
Note: For each R programming module, the corresponding SAS topics will also be covered, including Regression, Machine Learning and Sentiment Analysis.

Hadoop
  Hadoop 1.0 overview and enhancements in Hadoop 2.0
  Hadoop installation and setup using VirtualBox and the Hortonworks distribution
  Typical cluster architecture in Hadoop 1.0 vs Hadoop 2.0 (optional)
  HDFS architecture
  MapReduce (MR) architecture
  Java MR example (optional for interested students)
  Sqoop hands-on with example
  Pig hands-on with example
  Hive hands-on with example
Note: We also have a separate, detailed Hadoop course that runs for 5-6 weekends.

Spark

Introduction to Spark
  Installation & Overview
  Reading data from text files
  Basics of Spark and core concepts like RDD, caching, etc.
  Understanding a few well-known programs such as word count, and trying a few more
  Trying out various APIs offered by Spark Core libraries

Spark SQL
  Overview of Spark SQL
  Using Hive metadata with Spark SQL
  SchemaRDDs
  Using various file formats like Parquet and JSON
  Using Spark SQL and Hive UDFs

Spark ML
  Overview of Spark ML
  Understanding Vectors
  Understanding Linear regression and running it with Spark ML
  Understanding Logistic regression and running it with Spark ML
  Running a Clustering example with Spark ML
  Dimensionality reduction in Spark using Principal Component Analysis
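
The word count program mentioned in the Spark introduction can be sketched without a cluster. The example below is plain Python, not Spark API code: it only illustrates the idea of splitting lines into words and then counting per word, which Spark distributes across machines. The function name `word_count` and the sample lines are assumptions made for this illustration.

```python
import re
from collections import Counter


def word_count(lines):
    """Count how often each word occurs across an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # "Map" step: split the lower-cased line into words.
        words = re.findall(r"[a-z']+", line.lower())
        # "Reduce" step: add this line's words to the running totals.
        counts.update(words)
    return counts


lines = ["to be or not to be", "that is the question"]
counts = word_count(lines)
print(counts["to"], counts["question"])  # 2 1
```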

Tableau

Overview:
  User interface basics
  Connecting to data
  Dimensions vs. measures
  Show Me
  Marks card
  Simple formatting
  Building views
  Building a dashboard

Connecting to Excel, CSV and Text Files:
  Connecting to single or multiple tables
  Connecting live versus importing the data

Working with Data:
  Editing data connections after initial connection
  Data source filtering
  Flat Files (Excel, CSV, Access DB)
  Relational Databases
  ODBC Drivers
  Live or Import Data Connection
  Metadata Management
  Multiple Data Connections
  Creating and Refreshing an Extract

Analysis:
  Hierarchies
  Sorting
  Grouping
  Filtering
  Aggregations
  Trend lines
  Page shelf
  Forecasting

Calculation:
  Aggregate Calculations
  Row-Level Calculations
  Quick Table Calculations

Dashboard:
  Dashboard Objects
  Filter Actions
  URL Actions
  Sizing
  Tiled and Floating Sheets
  Dynamic Sheet Titles

Tableau Server (subject to availability of license):
  Publishing the Workbook
  Scheduling Refresh Extract
  Managing Authentication and Authorization
  Monitoring Background Tasks
  Automation of Reports

Note: We also have a separate, detailed Tableau course that runs for 5-6 weekends.

Formatting:
  Row-banding
  Number formatting
  Text formatting
  Shading
  Labels
  Tooltips
Also Available
Internships - Paid / Free
Internship certifications on successful completion
Final year College Projects on Latest Skills
Special Project batches
College Seminars