Introduction to Artificial Intelligence - DGIST

Prof. Inkyu Moon, Dept. of Robotics Engineering, DGIST

Introduction to Artificial Intelligence

Introduction

What is the purpose of this course?

• We want our computers or robotic systems to act intelligently!

• We see that intelligent systems or AI algorithms are rapidly coming out of research laboratories and we want to use them to our advantage.

• What are the principles behind intelligent systems?

• How are they built?

• What are intelligent systems useful for?

• How do we choose the right tool for the job?

• These questions are answered in this course.

Introduction

• This course is an introduction to the field of artificial intelligence.

• In this course, I will try to explain the basics of artificial intelligence and eliminate the fear of artificial intelligence.

What are the prerequisites?

• Linear algebra, basic probability and statistics.

Introduction

What is covered in this course?

• It covers information-based learning, probability-based learning, regression, artificial neural networks, basic deep learning, evolutionary computation, and so on.

• In the course, students learn Bayesian reasoning, explore artificial neural networks, deep learning, and regression models, and solve a simple problem with a genetic algorithm.

• You will learn Python because it makes it easy to demonstrate artificial neural network theory: you will build your own neural network.

Introduction

Grading

• Midterm Exam: 40%

• Final Exam: 50%

• Homework/Class Participation: 10%

References

• M. Negnevitsky, Artificial Intelligence: A Guide to Intelligent Systems, Pearson Education, 2005.

• J. D. Kelleher, B. Mac Namee, and A. D'Arcy, Fundamentals of Machine Learning for Predictive Data Analytics, MIT Press, 2015.

• T. Rashid, Make Your Own Neural Network, CreateSpace Independent Publishing Platform, 2016.

• F. Chollet, Deep Learning with Python, Manning Publications, 2018.

Chapter 1: Introduction to Artificial Intelligence and Machine Learning

What is Artificial Intelligence (AI)?


• Artificial intelligence is a science that makes machines do things that would require intelligence if done by humans.

• A machine can be considered intelligent if it achieves human-level performance in some cognitive task.

• To build an intelligent machine, we have to capture, organize, and use human expert knowledge in some problem area. However, extracting knowledge from human experts is very difficult.

• Machine learning can accelerate this process and enhance the quality of knowledge by adding new rules or changing incorrect ones.

What is machine learning (ML)?

• Branch of Artificial Intelligence: design and development of algorithms that allow machines to evolve behaviors based on empirical data.

• As intelligence requires knowledge, it is necessary for the machines or robots to acquire knowledge.

Machine learning (ML) is a key technology for implementing brain-machine interfaces.


• Modern medical science collects huge amounts of information through many diagnostic machines.

• This information must be analyzed to extract insights.

• ML is a core element for predictive data analytics.

Example

Figure: Predictive data analytics moving from data to insight to decision. Data sources supply data, data analytics turns the data into insight, and decision making turns the insight into a decision.


• Art of building and using models that make predictions.

• Machine learning automatically learns a model of the relationship between the descriptive features and the target feature.

• This model is used to make predictions for new instances.

Predictive Data Analytics

Figure: The two steps in machine learning. (a) Training a model from a set of historical instances: a training dataset of descriptive features and a target feature is passed to a machine learning algorithm, which produces a prediction model. (b) Using a model to make predictions: a query instance is passed to the prediction model, which returns a prediction.


• Price Prediction: businesses can adjust their prices to maximize returns based on factors such as seasonal changes and shifting customer demand.

• Dosage Prediction: doctors and scientists can decide how much of a medicine or other chemical to include in a treatment.

• Risk Assessment: risk is one of the key influencers in almost every decision an organization makes.

Applications


• Propensity Modeling: most business decisions would be much easier to make if we could predict the propensity of individual customers to take particular actions.

• Diagnosis: doctors, engineers, and scientists regularly make diagnoses as part of their work.

• Document Classification: automatically classify documents into different categories (for example, email spam filtering and news sentiment analysis).


• Table 1 shows the mortgage dataset.

• Descriptive features: OCCUPATION, AGE, LOAN-SALARY RATIO

• Target feature: OUTCOME

Example

Table 1. Dataset of mortgages

ID  OCCUPATION    AGE  LOAN-SALARY RATIO  OUTCOME
1   industrial    34   2.96               repay
2   professional  41   4.64               default
5   industrial    48   3.80               default
6   industrial    61   2.52               repay
7   professional  37   1.50               repay
8   professional  40   1.93               repay




• Very simple prediction model for this dataset:

• This model can predict whether the applicant will repay the mortgage or default on it.
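The two steps, training and prediction, can be sketched on the rows of Table 1 that are visible in the text. This is only an illustration under stated assumptions, not the model from the slide: the rule form (predict default when LOAN-SALARY RATIO exceeds a threshold), the helper names `train_threshold` and `predict`, and the query value 3.10 are all hypothetical.

```python
# Visible rows of Table 1: (LOAN-SALARY RATIO, OUTCOME).
# Rows that are missing from the table in the text are left out.
training = [
    (2.96, "repay"),
    (4.64, "default"),
    (3.80, "default"),
    (2.52, "repay"),
    (1.50, "repay"),
    (1.93, "repay"),
]


def predict(threshold, ratio):
    """Step (b): apply the learned rule to one query instance."""
    return "default" if ratio > threshold else "repay"


def train_threshold(rows):
    """Step (a): pick the threshold with the fewest training errors.

    Candidate thresholds are midpoints between neighbouring ratio values.
    """
    ratios = sorted(r for r, _ in rows)
    candidates = [(a + b) / 2 for a, b in zip(ratios, ratios[1:])]

    def errors(t):
        return sum(predict(t, r) != outcome for r, outcome in rows)

    return min(candidates, key=errors)


threshold = train_threshold(training)
training_errors = sum(predict(threshold, r) != o for r, o in training)
print(f"learned threshold: {threshold:.2f}, training errors: {training_errors}")
print("query ratio 3.10 ->", predict(threshold, 3.10))  # hypothetical query
```

On these six rows the learned rule makes no training errors, which is what "consistent with the dataset" means here.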

Example


• Complex representation of the same problem:

ID  AMOUNT   SALARY  LOAN-SALARY RATIO  AGE  OCCUPATION    PROPERTY   TYPE  OUTCOME
1   245,100  66,400  3.69               44   industrial    farm       stb   repay
2   90,600   75,300  1.20               41   industrial    farm       stb   repay
3   195,600  52,100  3.75               37   industrial    farm       ftb   default
4   157,800  67,600  2.33               44   industrial    apartment  ftb   repay
5   150,800  35,800  4.21               39   professional  apartment  stb   default
7   193,100  73,200  2.64               38   professional  house      ftb   repay
8   215,000  77,600  2.77               17   professional  farm       ftb   repay
10  186,100  49,200  3.78               30   industrial    house      ftb   default
11  161,500  53,300  3.03               28   professional  apartment  stb   repay
12  157,400  63,900  2.46               30   professional  farm       stb   repay
13  210,000  54,200  3.87               43   professional  apartment  ftb   repay
15  143,200  65,300  2.19               32   industrial    apartment  ftb   default
16  203,000  64,400  3.15               44   industrial    farm       ftb   repay
17  247,800  63,800  3.88               46   industrial    house      stb   repay
19  213,300  61,100  3.49               21   industrial    apartment  ftb   default
21  154,000  48,900  3.15               49   professional  house      stb   repay
23  252,000  59,700  4.22               27   professional  house      stb   default
24  175,200  39,900  4.39               37   professional  apartment  stb   default
25  149,700  58,600  2.55               35   industrial    farm       stb   default

Table 2. More complex dataset of mortgages.
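As a quick check on the visible rows of Table 2, the sketch below tests whether any rule of the hypothetical form "LOAN-SALARY RATIO above a threshold means default" can classify every row correctly. The rule form is an assumption made for illustration only; the slide's own simple model is not reproduced in the text.

```python
# Visible rows of Table 2: (LOAN-SALARY RATIO, OUTCOME).
rows = [
    (3.69, "repay"), (1.20, "repay"), (3.75, "default"), (2.33, "repay"),
    (4.21, "default"), (2.64, "repay"), (2.77, "repay"), (3.78, "default"),
    (3.03, "repay"), (2.46, "repay"), (3.87, "repay"), (2.19, "default"),
    (3.15, "repay"), (3.88, "repay"), (3.49, "default"), (3.15, "repay"),
    (4.22, "default"), (4.39, "default"), (2.55, "default"),
]


def training_errors(threshold):
    """Count rows misclassified by: ratio > threshold means default."""
    return sum(
        ("default" if ratio > threshold else "repay") != outcome
        for ratio, outcome in rows
    )


# Try every midpoint between neighbouring ratios, plus one threshold below
# and one above all observed values.
ratios = sorted({ratio for ratio, _ in rows})
candidates = [ratios[0] - 1.0, ratios[-1] + 1.0]
candidates += [(a + b) / 2 for a, b in zip(ratios, ratios[1:])]

fewest = min(training_errors(t) for t in candidates)
print("fewest training errors for any threshold:", fewest)
```

The result is greater than zero: for example, row 15 defaults at a ratio of 2.19 while row 17 repays at 3.88, so no single threshold on the ratio is consistent with this dataset.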


• The previous prediction model is not consistent with this dataset.

• We have to find the new model as follows:

How Does Machine Learning Work?

• Search for the model that best captures the relationship between the descriptive features and the target feature in the dataset.

• Finding a model that is consistent with the training dataset is not sufficient:

• Large datasets are likely to contain noise, and prediction models that are consistent with noisy data will make incorrect predictions.

• The training dataset represents only a small sample of all possible instances in the domain, so machine learning is an ill-posed problem.

Machine Learning Algorithms


• Table 3 shows shopping habits of 5 customers.

• We need a prediction model for this dataset.

Example for ill-posed problem


• Three binary descriptive features: 2^3 = 8 possible combinations of descriptive feature values.

• For each of the 8 possible combinations, there are 3 possible target feature values: 3^8 = 6,561 possible prediction models.

Table 3. Shopping habits dataset

ID  BBY  ALC  ORG  GRP
1   no   no   no   couple
2   yes  no   yes  family
3   yes  yes  no   family
4   no   no   yes  couple
5   no   yes  yes  single
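The two counts for the shopping habits example (8 combinations and 6,561 models) can be reproduced with a short enumeration. The variable names are illustrative only.

```python
from itertools import product

feature_values = ["no", "yes"]                  # each descriptive feature is binary
target_values = ["single", "couple", "family"]  # levels of GRP seen in Table 3
n_features = 3                                  # BBY, ALC, ORG

# Every combination of descriptive feature values: 2^3 = 8.
combinations = list(product(feature_values, repeat=n_features))

# A prediction model assigns one target value to each combination,
# so the number of distinct models is 3^8.
n_models = len(target_values) ** len(combinations)

print(len(combinations), n_models)  # 8 6561
```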


• Table 4 illustrates the relationship between the possible combinations of descriptive feature values and the possible prediction models: the potential models are labeled M1 to M6,561.

Table 4. Relationship between descriptive feature value combinations and prediction models.

BBY  ALC  ORG  GRP  M1      M2      M3      M4      M5      · · ·  M6561
no   no   no   ?    couple  couple  single  couple  couple  · · ·  couple
no   no   yes  ?    single  couple  single  couple  couple  · · ·  single
no   yes  no   ?    family  family  single  single  single  · · ·  family
no   yes  yes  ?    single  single  single  single  single  · · ·  couple
yes  no   no   ?    couple  couple  family  family  family  · · ·  family
yes  no   yes  ?    couple  family  family  family  family  · · ·  couple
yes  yes  no   ?    single  family  family  family  family  · · ·  single
yes  yes  yes  ?    single  single  family  family  couple  · · ·  family


• The training dataset does not contain an instance for every possible combination of descriptive feature values: three combinations are not covered.

• 27 (= 3^3) potential models remain consistent with the training dataset.

BBY  ALC  ORG  GRP     M2      M4      M5      · · ·
no   no   no   couple  couple  couple  couple  · · ·
no   no   yes  couple  couple  couple  couple  · · ·
no   yes  no   ?       family  single  single  · · ·
no   yes  yes  single  single  single  single  · · ·
yes  no   no   ?       couple  family  family  · · ·
yes  no   yes  family  family  family  family  · · ·
yes  yes  no   family  family  family  family  · · ·
yes  yes  yes  ?       single  family  couple  · · ·

Table 5. Potential prediction models


• Three of these models, M2, M4, and M5, are shown in Table 5.

• Machine learning is an ill-posed problem: a single model cannot be determined from the training dataset alone.
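The count of 27 consistent models can be checked by brute force: enumerate all 3^8 possible models and keep those that agree with every row of Table 3. This is a minimal sketch; representing a model as a dictionary from feature combination to predicted GRP is an assumption made for illustration.

```python
from itertools import product

# Table 3: (BBY, ALC, ORG) -> GRP
training = {
    ("no", "no", "no"): "couple",
    ("yes", "no", "yes"): "family",
    ("yes", "yes", "no"): "family",
    ("no", "no", "yes"): "couple",
    ("no", "yes", "yes"): "single",
}
targets = ["single", "couple", "family"]
combinations = list(product(["no", "yes"], repeat=3))

# Enumerate all 3^8 models as dicts from combination to predicted GRP,
# keeping only those that agree with every training instance.
consistent = []
for assignment in product(targets, repeat=len(combinations)):
    model = dict(zip(combinations, assignment))
    if all(model[x] == y for x, y in training.items()):
        consistent.append(model)

unseen = [c for c in combinations if c not in training]
print(len(consistent), "consistent models;", len(unseen), "unseen combinations")
```

The training data pins down five of the eight combinations, so the only freedom left is in the three unseen ones.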












• What prediction should be returned for new queries?

• If a new customer buys baby food, alcohol, and organic vegetables, the multiple models will contradict each other.
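The contradiction can be made concrete by asking every consistent model for its prediction on this query. The sketch below builds the consistent models directly: each one copies the Table 3 labels and picks any target value for each unseen combination. The names used are illustrative.

```python
from collections import Counter
from itertools import product

# Table 3: (BBY, ALC, ORG) -> GRP
training = {
    ("no", "no", "no"): "couple",
    ("yes", "no", "yes"): "family",
    ("yes", "yes", "no"): "family",
    ("no", "no", "yes"): "couple",
    ("no", "yes", "yes"): "single",
}
targets = ["single", "couple", "family"]
combinations = list(product(["no", "yes"], repeat=3))
unseen = [c for c in combinations if c not in training]

# A consistent model copies the training labels and is free to choose any
# target value for each unseen combination.
consistent = [
    {**training, **dict(zip(unseen, choice))}
    for choice in product(targets, repeat=len(unseen))
]

query = ("yes", "yes", "yes")  # baby food, alcohol, organic vegetables
votes = Counter(model[query] for model in consistent)
print(votes)
```

The consistent models split evenly across the three target values for this query, so consistency with the training data alone gives no basis for choosing a prediction.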

Example

Figure: Multiple models that are consistent with the training dataset lead to a contradiction for the new query.


• The goal of machine learning is to find the prediction model that generalizes best.

• A machine learning algorithm must use some criteria to choose among the candidate models: what criteria should we use?

• There are many different machine learning algorithms, and each has its own criteria.

• The set of model selection criteria that an algorithm uses is known as its inductive bias.

• With an inductive bias, machine learning can induce a single best prediction model.


Two types of inductive bias

• Restriction bias constrains the set of potential models considered during the learning process.

• Preference bias guides the learning algorithm to prefer certain models over others.

• For example, let’s consider multivariable linear regression with gradient descent:

• It implements the restriction bias of only considering prediction models based on a linear combination of the descriptive feature values.

• It applies a preference bias over the order in which the linear models are considered, following gradient descent through the weight space.
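The two biases in the linear regression example can be seen in a small sketch. The data here is hypothetical (generated from y = 1 + 2*x1 - 3*x2 without noise), and the function names, learning rate, and step count are assumptions chosen for illustration rather than values from the course.

```python
# Hypothetical training data generated from y = 1 + 2*x1 - 3*x2 (no noise).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in X]


def predict(w, x):
    """Restriction bias: only linear combinations w0 + w1*x1 + w2*x2."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def gradient_descent(X, y, learning_rate=0.1, steps=5000):
    """Preference bias: follow the error gradient downhill from w = 0."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        errors = [predict(w, x) - t for x, t in zip(X, y)]
        # Gradient of half the mean squared error with respect to each weight.
        grad = [sum(errors) / n]
        for j in range(len(X[0])):
            grad.append(sum(e * x[j] for e, x in zip(errors, X)) / n)
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w


weights = gradient_descent(X, y)
print([round(wj, 3) for wj in weights])
```

`predict` can only ever express a linear model, which is the restriction. The update rule decides which linear models are visited and in what order, which is the preference.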

What can go wrong with machine learning?

Inappropriate inductive bias can lead to:

• Underfitting: the prediction model is too simple to represent the relationship between the descriptive features and the target feature.

• Overfitting: the prediction model is so complex that it fits the training dataset too closely and becomes sensitive to the noise in it.


• Consider the task to predict a person’s INCOME (target) based on AGE (single descriptive feature).

• Table 6 shows a training dataset for five people.

Underfitting & Overfitting Examples

Table 6. Age-income dataset

ID  AGE  INCOME
1   21   24,000
2   32   48,000
3   62   83,000
4   72   61,000
5   84   52,000

Figure: A visualization of Table 6, plotting INCOME (0 to 100,000) against AGE (0 to 100).


• The model in Figure (a) cannot fully capture the relationship between AGE and INCOME.

• The model in Figure (b) seems more complicated than necessary.

• Neither an underfitting nor an overfitting model generalizes well: it cannot make good predictions for new queries.


• The model in Figure (c) is a Goldilocks model: it strikes a good balance between the two.

• Goldilocks models can be found with proper inductive biases.
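The contrast between a model that is too simple and one that is too complex can be sketched on the Table 6 data. The figures themselves are not reproduced in the text, so the two models below are assumed stand-ins: a least-squares straight line as the simple model and a degree-4 polynomial through all five points as the complex one. The query age of 45 is hypothetical.

```python
# Table 6: (AGE, INCOME)
data = [(21, 24000), (32, 48000), (62, 83000), (72, 61000), (84, 52000)]
ages = [a for a, _ in data]
incomes = [i for _, i in data]


def fit_line(xs, ys):
    """Least-squares straight line; returns a prediction function."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: my + slope * (x - mx)


def fit_interpolant(xs, ys):
    """Lagrange polynomial (degree 4 here) through every training point."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return model


def mean_abs_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


line = fit_line(ages, incomes)
wiggly = fit_interpolant(ages, incomes)
print("line training error:       ", round(mean_abs_error(line, ages, incomes)))
print("interpolant training error:", round(mean_abs_error(wiggly, ages, incomes)))
# Compare the two models on a hypothetical query age that is not in the table.
print("predictions at AGE 45:", round(line(45)), round(wiggly(45)))
```

The interpolant reproduces every training point exactly, while the line leaves a large error. Zero training error is not evidence of good generalization: the interpolant has also fitted whatever noise the five points contain.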

Summary

• Machine learning (ML) works by searching through a set of potential models to find the prediction model that generalizes best.

• Machine learning is an ill-posed problem.

• ML uses two sources of information: the training dataset and the inductive bias assumed by the algorithm.

• Main families of ML algorithms:

1) Information-based learning

2) Similarity-based learning

3) Probability-based learning: GMM, EM, HMM

4) Error-based learning: regression, SVM, ANN
