19
Fundamentals of Machine Learning Instructor: Ekpe Okorafor 1. Accenture – Big Data Academy 2. Computer Science African University of Science & Technology

Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Fundamentals of Machine Learning

Instructor: Ekpe Okorafor1. Accenture – Big Data Academy

2. Computer Science African University of Science & Technology

Page 2: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Affiliations:• Accenture Digital – Big Data Academy

Principal & Faculty, Applied Intelligence

• African University of Science & Technology

Visiting Professor, Computer Science / Data Science

• Dallas, USA

Ekpe Okorafor, PhD

Email: [email protected]; [email protected]

Twitter: @EkpeOkorafor; @Radicube

• Big Data, Predictive & Adaptive Analytics

• Artificial Intelligence, Machine Learning

• Performance Modelling and Analysis

• Information Assurance and Cybersecurity.

• High Performance Computing & Network Architectures

• Distributed Storage & Processing

• Massively Parallel Processing & Programming

• Fault-tolerant Systems

Research Interests:

Page 3: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Objectives

Objectives

• What Machine Learning is

• When to Leverage Machine learning

• Machine Learning algorithms

• Machine Learning methodology

3

Page 4: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

What is Machine Learning

4

Machines are taking over!

Page 5: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

But Seriously, What is Machine Learning?

5

“Machine Learning is the science of getting computers to act without being

explicitly programmed.” – Andrew Ng (Coursera)

“A computer program is said to learn from experience E with respect to some

class of tasks T and performance measure P, if its performance at task in T,

as measured by P, improves with experience E.” – Tom M. Mitchell (1997)

Page 6: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

What are AI and ML?

6

• Artificial Intelligence (AI) is a branch or Computer Science that uses algorithms and techniques to

mimic human intelligence

• Machine Learning (ML) is one of several AI techniques for sophisticated cognitive tasks

Computer Science

Mathematical foundations

Algorithms and data structures

Artificial IntelligenceCommunication and security

Computer Architecture

Databases

……

Symbolic AL (e.g. Expert

Systems)

Probabilistic AI (e.g.

Search & optimization)

Machine LearningDecision trees

Bayesian inference

Deep learning

Reinforced learning

Support vector machines

Neural networks

Random forest

……

Page 7: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Machine Learning

7

• Machine Learning is a particularly interesting technique because it represents a paradigm shift

within AI

Traditional AI techniques

Data

Logic

Machine Learning

Data

Output

Output

Logic

• Static – hard-coded set of

steps and scenarios

• Rule Based – expert

knowledge

• No generalization – handling

special cases is difficult

• Dynamic – evolves with data,

finds new patterns

• Data driven – discovers

knowledge

• Generalization – adapts to

new situations and special

cases

Page 8: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Machine Learning - Example

8

Symbolic AI

“Let us sit down with the

world’s best chess player,

Ekpe Okorafor, and put his

knowledge into a computer

program”

Mathematical/Statistical AI Machine Learning Approach

• Example - Excelling at playing the game of chess

“Let us simulate all the

different possible moves and

the associated outcomes at

each single step and go with

the most likely to win”

“Let us show millions of

examples or real life and

simulated games (won and

lost) to the program, and

let it learn from

experience”

Page 9: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Machine Learning – When to use

9

Tasks programmers can’t describe

• Machine learning is particularly good at solving 2 types of problems

where other AI techniques fail

Complex multidimensional problems that can’t be

solved by numerical reasoning

Hand writing

Cognitive Reasoning

Weather Forecasting

Network Intrusion

Health Care Outcomes

Movie Recommendation

Page 10: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Machine Learning – Breaking it down

10

Supervised and Unsupervised Learning

• Supervised learning - we already know the answers we want (found in

past or completed data).

• Unsupervised learning - we want to find unknown structures or trends.

Input DataInformation + Answers

ResultOptimum Model

• Relationships

• Patterns

• Dependencies

• Hidden structures

Machine Learning

Page 11: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Supervised Learning

11

Page 12: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Unsupervised Learning

12

Page 13: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Supervised and Unsupervised Learning

13

REGRESSION:

Estimate continuous values

(Real-valued output)

Supervised Learning:

Predicting values. Known targets.

User inputs correct answers to learn from. Machine uses the information to guess new answers.

CLASSIFICATION:

Identify a unique class

(Discrete values, Boolean, Categories)

CLUSTER ANALYSIS:

Group into sets

Unsupervised Learning:

Search for structure in data. Unknown targets.

User inputs data with undefined answers. Machine finds useful information hidden in data

DENSITY ESTIMATION:

Approximate distribution

DENSITY REDUCTION:

Select relevant variables

Page 14: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Supervised and Unsupervised Learning

14

Supervised Learning:

Unsupervised Learning:

Regression• Linear Regression

• Ordinary Least Squares Regression

• LOESS (Local Regression)

• Neural Networks

Classification• Decision Trees

• K-Nearest Neighbors

• Support Vector Machine

• Logistic Regression

• Naïve Bayes

• Random Forests

Cluster Analysis• K-Means Clustering

• Hierarchical Clustering

Dimension Reduction• Principal Component Analysis (PCA)

• Linear Discriminant Analysis (LDA)

Page 15: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

What About Reinforcement Learning?

15

10 Mins

Page 16: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Machine Learning Application –

Recommender Systems

16

• Recommender systems deal with making recommendations based upon

previously collected data and leveraging ML techniques.

Content Based (Features)

Modified Linear Regression

Non-content Based (No Features)

Collaborative Filtering

Matrix Factorization

Page 17: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Train & Test Methodology

17

• ML techniques use a train + test system (commonly known as cross-

validation) before using findings in real situations.

TRAINING:Learn data properties

1. The machine makes conclusions by learning

from the data

2. It improves its model until optimal Performance

is reached

3. Using a Cost / Loss Function to measure

Accuracy. It repeats iterations until a

minimum Is reached.

TESTING:Test the properties

1. Apply the conclusions to new data and

compare results to know answers

2. The model does not change. It us just tested to

measure how good the machine did after training

3. Useful to detect overfitting. If good enough, it is ready

to be used

APPLICATION:Use the properties

• In a real situation, the answers are not known

• Apply the model conclusions to predict the

answers from the inputs. Use the answers in

whatever necessary

Page 18: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Additional Resources

18

• ML course at Coursera: https://www.coursera.org/learn/machine-learning/

• Toolbox scikit-learn: http://scikit-learn.org/stable/user_guide.html

• Caret Package: http://topepo.github.io/caret/index.html

• Python and R codes: http://www.analyticsvidhya.com/blog/2015/09/full-

cheatsheet-machine-learning-algorithms/

• Introductory Primer to Machine Learning: http://www.toptal.com/machine-

learning/machine-learning-theory-an-introductory-primer

Page 19: Fundamentals of Machine Learningindico.ictp.it/event/8329/session/7/contribution/27/material/slides/0.… · • ML techniques use a train + test system (commonly known as cross-validation)

Summary

19

• Machine Learning (ML) is one of several AI techniques for sophisticated

cognitive tasks

• Machine Learning is a particularly interesting technique because it

represents a paradigm shift within AI

• Machine learning is particularly good at solving 2 types of problems

where other AI techniques fail

• Tasks programmers can’t describe

• Complex multidimensional problems that can’t be solved by

numerical reasoning

• Machine Learning employs supervised and unsupervised learning

approaches

• ML techniques use a train + test system (commonly known as cross-

validation) before using findings in real situations.