Artificial Intelligence & Optimization Theory and Applications: Introduction to Artificial Intelligence and Machine Learning


Page 1: Introduction Artificial Intelligence and Machine Learning

Artificial Intelligence & Optimization Theory and Applications

Introduction: Artificial Intelligence and Machine Learning

Page 2: Introduction Artificial Intelligence and Machine Learning

• Web Intelligence and Data Mining Lab
• Web Data Extraction & Integration
• Information Retrieval and Extraction
• Data Mining and Machine Learning

https://sites.google.com/site/jahuichang/

Professor, Department of Computer Science and Information Engineering, National Central University
Member, Taoyuan City Research, Development and Evaluation Commission, 2015.3 - Now

Review committee member, AI discipline, Ministry of Science and Technology, 2014.10 - Now
Standing Director, TAAI, 2004.1 - Now
Director, ACLCLP, 2012.1 - Now

Page 3: Introduction Artificial Intelligence and Machine Learning

Outline

Artificial Intelligence

Knowledge Reasoning

Machine Learning

Pattern Recognition / NLP: IE+IR+MT

Statistical Reasoning

Problem Solving / Search Algorithm

Uncertainty Reasoning

Expert System / Logic Programming

Probability / Bayesian Network

Supervised learning

Page 4: Introduction Artificial Intelligence and Machine Learning

What is AI?

• Thinking vs. Acting • Humanly vs. Rationally

Thinking humanly Thinking rationally

Acting humanly Acting rationally

Page 5: Introduction Artificial Intelligence and Machine Learning

What is AI?

• Humanly

– Thinking humanly: cognitive modeling

– Acting humanly: Turing Test

• Rationally

– Thinking rationally: "laws of thought"

– Acting rationally: rational agent

Page 6: Introduction Artificial Intelligence and Machine Learning

AI prehistory

• Philosophy: logic, methods of reasoning, mind as a physical system, foundations of learning, language, rationality

• Mathematics: formal representation and proof, algorithms, computation, (un)decidability, (in)tractability, probability

• Economics: utility, decision theory

• Neuroscience: physical substrate for mental activity

• Psychology: phenomena of perception and motor control, experimental techniques

• Computer engineering: building fast computers

• Control theory: design systems that maximize an objective function over time

• Linguistics: knowledge representation, grammar

Page 7: Introduction Artificial Intelligence and Machine Learning

Abridged history of AI

• 1943 McCulloch & Pitts: Boolean circuit model of the brain
• 1950 Turing's "Computing Machinery and Intelligence"
• 1956 Dartmouth meeting: "Artificial Intelligence" adopted
• 1952–69 "Look, Ma, no hands!"
• 1950s Early AI programs, including Samuel's checkers program, Newell & Simon's Logic Theorist, Gelernter's Geometry Engine
• 1965 Robinson's complete algorithm for logical reasoning
• 1966–73 AI discovers computational complexity; neural network research almost disappears
• 1969–79 Early development of knowledge-based systems
• 1980– AI becomes an industry
• 1986– Neural networks return to popularity
• 1987– AI becomes a science
• 1995– The emergence of intelligent agents

Page 8: Introduction Artificial Intelligence and Machine Learning

AI Renaissance

• The Internet, intranets, and the AI renaissance, Intelligence 1997

– Daniel E. O’Leary

• Why artificial intelligence is enjoying a renaissance

– The Economist, 2017

Page 9: Introduction Artificial Intelligence and Machine Learning

AI: State of the Art

• Deep Blue defeated the reigning world chess champion Garry Kasparov in 1997

• IBM's Watson Supercomputer Destroys Humans in Jeopardy, 2011

• Human vs. machine: Google AlphaGo wins and Lee Sedol resigns, 2016

Page 10: Introduction Artificial Intelligence and Machine Learning

Self-Driving Cars and the Internet of Vehicles

• Self-Driving: No hands across America, 1995, 2017

– Google's self-driving car effort and its spin-off Waymo

– Samsung and Audi

– BlackBerry's R&D center in Canada

– Volkswagen enters the sharing economy with its new brand Moia

– BMW tests driverless cars in Munich

– UK startup Oxbotica

• A 5-minute video explaining why self-driving cars outperform human drivers

Page 11: Introduction Artificial Intelligence and Machine Learning

Progress in Machine Translation

Google Translate adopts deep learning with neural networks

Page 12: Introduction Artificial Intelligence and Machine Learning

Part I Artificial Intelligence

Two Chapters

Part II Problem Solving

Four Chapters

Part III Knowledge and Reasoning

Six Chapters

Part IV Uncertain Knowledge and Reasoning

Five Chapters

Part V Learning

Four Chapters

Part VI Communicating, Perceiving, and Acting

Natural Language Processing

Natural Language for Communication

Perception (vision)

Robotics

Artificial Intelligence: A Modern Approach

Page 13: Introduction Artificial Intelligence and Machine Learning

Outline

Artificial Intelligence

Knowledge Reasoning

Machine Learning

Pattern Recognition / NLP: IE+IR+MT

Statistical Reasoning

Problem Solving / Search Algorithm

Uncertainty Reasoning

Expert System / Logic Programming

Probability / Bayesian Network

Supervised learning

Page 14: Introduction Artificial Intelligence and Machine Learning

SOLVING PROBLEMS BY SEARCH

Video lectures on YouTube: https://www.youtube.com/playlist?list=PLAwxTw4SYaPlqMkzr4xyuD6cXTIgPuzgn

Page 15: Introduction Artificial Intelligence and Machine Learning

Some games

• 8-queens

• Bridge

• 8-puzzle

• Sudoku

• Shortest path

• Coloring

• Dice …


Page 16: Introduction Artificial Intelligence and Machine Learning


Problem Types: Terminologies

• Fully vs. Partially Observable

• Deterministic vs. Stochastic

• Discrete or Continuous

• Benign vs. Adversarial

Examples: Checkers, Poker, Robotic Car, Vacuum

Page 17: Introduction Artificial Intelligence and Machine Learning

Route Finding Problem

Page 18: Introduction Artificial Intelligence and Machine Learning

Search Algorithms

• Uniform-Cost Search

• Depth-First Search

• Breadth-First Search

• A* (A-Star) Search Algorithm

• Local Beam Search

• Hill Climbing

• Simulated Annealing Search
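A minimal Python sketch of one of these strategies (breadth-first search) on a toy graph; the `graph` dictionary and `neighbors` callback are illustrative assumptions, not course code.

```python
from collections import deque

def breadth_first_search(start, goal, neighbors):
    """Return a shortest path (by edge count) from start to goal, or None.
    `neighbors` maps a state to its successor states (hypothetical helper)."""
    frontier = deque([[start]])
    explored = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:
            return path
        for nxt in neighbors(state):
            if nxt not in explored:
                explored.add(nxt)
                frontier.append(path + [nxt])
    return None

# Toy route-finding graph (made-up states)
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(breadth_first_search("A", "D", lambda s: graph[s]))  # ['A', 'B', 'D']
```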

Page 19: Introduction Artificial Intelligence and Machine Learning

Local Search in Continuous Spaces

Page 20: Introduction Artificial Intelligence and Machine Learning

Outline

Artificial Intelligence

Knowledge Reasoning

Machine Learning

Pattern Recognition / NLP: IE+IR+MT

Statistical Reasoning

Problem Solving / Search Algorithm

Uncertainty Reasoning

Expert System / Logic Programming

Probability / Bayesian Network

Supervised learning

Page 21: Introduction Artificial Intelligence and Machine Learning

Knowledge Reasoning

Page 22: Introduction Artificial Intelligence and Machine Learning

Why do we need logics?

• Problem-solving agents cannot infer unobserved information.

• We want an algorithm that reasons in a way that resembles reasoning in humans.

• Logic a.k.a. Symbolic Reasoning

Page 23: Introduction Artificial Intelligence and Machine Learning

Wumpus World PEAS

• Performance measure
  – gold +1000, death -1000
  – -1 per step, -10 for using the arrow

• Environment
  – Shooting kills the wumpus if you are facing it
  – Shooting uses up the only arrow

• Sensors:
  – Stench, Breeze, Glitter, Bump, Scream

• Actuators:
  – Left turn, Right turn, Forward, Grab, Shoot
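The PEAS description above can be captured as a plain data structure; the following dict is a hypothetical encoding for illustration only, not code from the course.

```python
# Hypothetical PEAS description of the Wumpus World as a plain dict,
# mirroring the bullets above.
wumpus_peas = {
    "performance": {"gold": +1000, "death": -1000, "per_step": -1, "arrow": -10},
    "environment": ["shooting kills the wumpus if faced", "only one arrow"],
    "sensors": ["Stench", "Breeze", "Glitter", "Bump", "Scream"],
    "actuators": ["TurnLeft", "TurnRight", "Forward", "Grab", "Shoot"],
}
```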

Page 24: Introduction Artificial Intelligence and Machine Learning

Topics for Logic Reasoning

• Knowledge-based agents

• Logic in general - models and entailment

• Propositional (Boolean) logic

• Equivalence, validity, satisfiability

• Inference rules and theorem proving for Horn clauses

– forward chaining

– backward chaining

– resolution

Page 25: Introduction Artificial Intelligence and Machine Learning

Expert Systems

Page 26: Introduction Artificial Intelligence and Machine Learning

Logic in general

• Logics are formal languages for representing information such that conclusions can be drawn
  – Syntax defines the sentences in the language
  – Semantics defines the "meaning" of sentences, i.e., the truth of a sentence in a world

• Entailment means that one thing follows from another: KB ╞ α
  – Knowledge base KB entails sentence α if and only if α is true in all worlds where KB is true, i.e., M(KB) ⊆ M(α)

• Models are formally structured worlds with respect to which truth can be evaluated
  – M(α) is the set of worlds where α is true
  – M(KB) is the set of all worlds where KB is true
  – Think of KB and α as a collection of constraints

Page 27: Introduction Artificial Intelligence and Machine Learning

Inference

• KB ├i α: sentence α can be derived from KB by procedure i

• Soundness: i is sound if whenever KB ├i α, it is also true that KB╞ α

• Completeness: i is complete if whenever KB╞ α, it is also true that KB ├i α

Page 28: Introduction Artificial Intelligence and Machine Learning

Propositional logic: Syntax and Semantics

• Propositional logic is the simplest logic – it illustrates the basic ideas

• The proposition symbols S1, S2, etc. are sentences
  – If S is a sentence, ¬S is a sentence (negation)
  – If S1 and S2 are sentences, S1 ∧ S2 is a sentence (conjunction)
  – If S1 and S2 are sentences, S1 ∨ S2 is a sentence (disjunction)
  – If S1 and S2 are sentences, S1 ⇒ S2 is a sentence (implication)
  – If S1 and S2 are sentences, S1 ⇔ S2 is a sentence (biconditional)
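A minimal sketch of this recursive syntax as nested Python tuples, with a truth evaluator for a given model; the tuple encoding and the `pl_true` name are illustrative assumptions, not the textbook's code.

```python
# Sentences are either a symbol ("P") or a tuple (operator, operands...).
def pl_true(sentence, model):
    if isinstance(sentence, str):          # proposition symbol, e.g. "P"
        return model[sentence]
    op, *args = sentence
    if op == "not":
        return not pl_true(args[0], model)
    if op == "and":
        return all(pl_true(a, model) for a in args)
    if op == "or":
        return any(pl_true(a, model) for a in args)
    if op == "implies":
        return (not pl_true(args[0], model)) or pl_true(args[1], model)
    if op == "iff":
        return pl_true(args[0], model) == pl_true(args[1], model)
    raise ValueError(f"unknown operator {op}")

# P => (P v Q) is true when P is true
print(pl_true(("implies", "P", ("or", "P", "Q")), {"P": True, "Q": False}))  # True
```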

Page 29: Introduction Artificial Intelligence and Machine Learning

Reasoning Patterns

• How do we know KB╞ α?

– Model checking, O(2^n)

– Application of inference rules

• Inference Rules

– Modus Ponens

– And-Elimination

• Monotonicity

– If KB ╞ α, then KB ∧ β ╞ α

Page 30: Introduction Artificial Intelligence and Machine Learning

Inference by Model Checking

• Depth-first enumeration of all models is sound and complete

• For n symbols, the time complexity is O(2^n) and the space complexity is O(n)

(The enumeration branches by assigning true, then false, to each variable P.)
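A sketch of such a depth-first truth-table enumeration for checking KB ╞ α, assuming KB and α are encoded as Python predicates over a model dict (an illustrative encoding, not the textbook's code).

```python
def tt_entails(kb, alpha, symbols, model=None):
    """Depth-first enumeration of all 2^n models; O(2^n) time, O(n) space."""
    model = model or {}
    if not symbols:
        # alpha must hold in every model that makes KB true
        return alpha(model) if kb(model) else True
    first, rest = symbols[0], symbols[1:]
    return (tt_entails(kb, alpha, rest, {**model, first: True}) and
            tt_entails(kb, alpha, rest, {**model, first: False}))

kb = lambda m: m["P"] and (not m["P"] or m["Q"])   # P and (P => Q)
alpha = lambda m: m["Q"]
print(tt_entails(kb, alpha, ["P", "Q"]))           # True: KB entails Q
```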

Page 31: Introduction Artificial Intelligence and Machine Learning

Application of Inference Rules

• Two sentences are logically equivalent iff true in same set of models: α ≡ β iff α╞ β and β╞ α

You need to know these (discrete math).

Page 32: Introduction Artificial Intelligence and Machine Learning

Resolution

• Inference rule for CNF: sound and complete!*

  (A ∨ B ∨ C), ¬A ⊢ (B ∨ C)

  "If A or B or C is true, but not A, then B or C must be true."

  (A ∨ B ∨ C), (¬A ∨ D ∨ E) ⊢ (B ∨ C ∨ D ∨ E)

  "If A is false then B or C must be true, or if A is true then D or E must be true; hence, since A is either true or false, B or C or D or E must be true."

  (A ∨ B), (¬A ∨ B) ⊢ (B ∨ B) ≡ B (simplification)

* Resolution is "refutation complete" in that it can prove the truth of any entailed sentence by refutation.
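A sketch of the resolution rule itself, with clauses as sets of string literals and "-A" standing for ¬A (an illustrative encoding); it reproduces the resolvents discussed above.

```python
def negate(lit):
    """Flip the sign of a literal: A <-> -A."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (sets of literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

print(resolve({"A", "B", "C"}, {"-A", "D", "E"}))  # [{'B', 'C', 'D', 'E'}]
print(resolve({"A", "B"}, {"-A", "B"}))            # [{'B'}]  (B v B simplifies to B)
```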

Page 33: Introduction Artificial Intelligence and Machine Learning

Efficient Propositional Inference

• Two families of efficient algorithms for propositional inference:

Complete backtracking search algorithms

• DPLL algorithm (Davis, Putnam, Logemann, Loveland)

Incomplete local search algorithms

• WalkSAT algorithm
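A compact WalkSAT-style local-search sketch for illustration; the clause encoding, the parameters p and max_flips, and the toy formula are assumptions, and the routine is incomplete by design (failure does not prove unsatisfiability).

```python
import random

def walksat(clauses, symbols, p=0.5, max_flips=10_000):
    """Clauses are sets of literals; '-A' denotes the negation of 'A'."""
    model = {s: random.choice([True, False]) for s in symbols}
    truth = lambda lit: model[lit.lstrip("-")] ^ lit.startswith("-")
    for _ in range(max_flips):
        unsatisfied = [c for c in clauses if not any(truth(l) for l in c)]
        if not unsatisfied:
            return model                      # satisfying assignment found
        clause = random.choice(unsatisfied)
        if random.random() < p:
            sym = random.choice(list(clause)).lstrip("-")   # random walk step
        else:
            def score(s):                     # greedy step: flip the best symbol
                model[s] = not model[s]
                n = sum(any(truth(l) for l in c) for c in clauses)
                model[s] = not model[s]
                return n
            sym = max((l.lstrip("-") for l in clause), key=score)
        model[sym] = not model[sym]
    return None                               # gave up (formula may still be satisfiable)

clauses = [{"A", "B"}, {"-A", "C"}, {"-B", "-C"}]
print(walksat(clauses, ["A", "B", "C"]))
```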

Page 34: Introduction Artificial Intelligence and Machine Learning

Summary for Knowledge Reasoning

• Logical agents apply inference to a knowledge base to derive new information and make decisions

• Basic concepts of logic:
  – syntax: formal structure of sentences
  – semantics: truth of sentences wrt models
  – entailment: necessary truth of one sentence given another
  – inference: deriving sentences from other sentences
  – soundness: derivations produce only entailed sentences
  – completeness: derivations can produce all entailed sentences

• Resolution is complete for propositional logic; forward and backward chaining are linear-time and complete for Horn clauses

• Propositional logic lacks expressive power

Page 35: Introduction Artificial Intelligence and Machine Learning

Outline

Artificial Intelligence

Knowledge Reasoning

Machine Learning

Pattern Recognition / NLP: IE+IR+MT

Statistical Reasoning

Problem Solving / Search Algorithm

Uncertainty Reasoning

Expert System / Logic Programming

Probability / Bayesian Network

Supervised learning

Page 36: Introduction Artificial Intelligence and Machine Learning

UNCERTAINTY REASONING

Bayesian Networks

Page 37: Introduction Artificial Intelligence and Machine Learning

Bayesian Networks

• A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions

• Syntax:
  – a directed, acyclic graph (link ≈ "directly influences")
  – a conditional distribution for each node given its parents: P(Xi | Parents(Xi))

Page 38: Introduction Artificial Intelligence and Machine Learning

Compactness

• A CPT for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values

• Each row requires one number p for Xi = true (the number for Xi = false is just 1 - p)

• If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers
  – i.e., it grows linearly with n, vs. O(2^n) for the full joint distribution
  – For the burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 - 1 = 31)
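The burglary-net count can be checked directly; the parent counts below assume the standard five-node structure (B, E → A → J, M) from the textbook example.

```python
# Number of CPT entries needed per node is 2^(number of Boolean parents).
parents = {"Burglary": 0, "Earthquake": 0, "Alarm": 2, "JohnCalls": 1, "MaryCalls": 1}
cpt_numbers = sum(2 ** k for k in parents.values())   # 1 + 1 + 4 + 2 + 2 = 10
full_joint = 2 ** len(parents) - 1                    # 2^5 - 1 = 31
print(cpt_numbers, full_joint)                        # 10 31
```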

Page 39: Introduction Artificial Intelligence and Machine Learning

Semantics

The full joint distribution is defined as the product of the local conditional distributions:

P(X1, …, Xn) = ∏_{i=1}^{n} P(Xi | Parents(Xi))

e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e)
= P(j | a) P(m | a) P(a | ¬b, ¬e) P(¬b) P(¬e)
= 0.9 × 0.7 × 0.001 × 0.999 × 0.998
≈ 0.00063
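The same product can be checked numerically with the CPT entries quoted above; the variable names below are just illustrative labels.

```python
# Evaluating the factored joint for the burglary example.
p_j_given_a, p_m_given_a = 0.9, 0.7
p_a_given_not_b_not_e, p_not_b, p_not_e = 0.001, 0.999, 0.998
joint = p_j_given_a * p_m_given_a * p_a_given_not_b_not_e * p_not_b * p_not_e
print(round(joint, 5))  # 0.00063
```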

Page 40: Introduction Artificial Intelligence and Machine Learning

Local semantics

• Local semantics: each node is conditionally independent of its nondescendants given its parents

Page 41: Introduction Artificial Intelligence and Machine Learning

Markov blanket

• Each node is conditionally independent of all others given its Markov blanket: parents + children + children's parents

Page 42: Introduction Artificial Intelligence and Machine Learning

Inference by Enumeration

Page 43: Introduction Artificial Intelligence and Machine Learning

Exact Inference vs. Approximate Inference

• Speeding Up Inference

– Pull out terms

– Maximize Independence

– Variable Elimination

• Approximate Inference by

– Sampling

– Rejection sampling

– Gibbs sampling
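A minimal rejection-sampling sketch on a hypothetical two-node network Rain → Sprinkler with made-up CPT values, just to show the sample-then-reject pattern behind approximate inference.

```python
import random

P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}   # made-up CPT

def sample_once():
    """Draw one joint sample by sampling each node given its parent."""
    rain = random.random() < P_RAIN
    sprinkler = random.random() < P_SPRINKLER_GIVEN_RAIN[rain]
    return rain, sprinkler

# Keep only samples consistent with the evidence Sprinkler = True.
accepted = [rain for rain, sprinkler in (sample_once() for _ in range(100_000)) if sprinkler]
print(sum(accepted) / len(accepted))   # estimate of P(Rain | Sprinkler = True)
```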

Page 44: Introduction Artificial Intelligence and Machine Learning

Outline

Artificial Intelligence

Knowledge Reasoning

Machine Learning

Pattern Recognition / NLP: IE+IR+MT

Statistical Reasoning

Problem Solving / Search Algorithm

Uncertainty Reasoning

Expert System / Logic Programming

Probability / Bayesian Network

Supervised learning

Page 45: Introduction Artificial Intelligence and Machine Learning

MACHINE LEARNING

a.k.a. Predictive analytics, Data science

Page 47: Introduction Artificial Intelligence and Machine Learning

Topics from PRML

• Introduction
• Probability Distributions
• Linear Models for Regression
• Linear Models for Classification
• Neural Networks
• Kernel Methods
• Sparse Kernel Machines
• Graphical Models
• Approximate Inference

Page 48: Introduction Artificial Intelligence and Machine Learning

Topics from Deep Learning

• Applied Math and ML Basics
  – Linear Algebra
  – Probability and Information Theory
  – Numerical Computation
  – Machine Learning Basics

• Deep Networks: Modern Practices
  – Deep Feedforward Networks
  – Regularization for Deep Learning
  – Optimization for Training Deep Models
  – Convolutional Neural Networks
  – Sequence Modeling: RNN

Page 49: Introduction Artificial Intelligence and Machine Learning

Learning from Examples

» Supervised Learning
  » Classification
  » Regression
  » Sequence Labeling
  » Structure Learning
  » Object Recognition
  » Summarization

» Unsupervised Learning
  » Clustering

» Semi-supervised Learning
  » Use both labeled and unlabeled training data

» Active Learning
  » Choose the critical data points to be labeled

» Distant Learning

Page 50: Introduction Artificial Intelligence and Machine Learning

Classification: Definition

• Given a collection of records (training set)
  – Each record contains a set of attributes; one of the attributes is the class.

• Find a model for the class attribute as a function of the values of the other attributes.

• Goal: previously unseen records should be assigned a class as accurately as possible.
  – A test set is used to determine the accuracy of the model.
  – Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.

Page 51: Introduction Artificial Intelligence and Machine Learning

Classification Example

Training Set:

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Test Set:

Refund  Marital Status  Taxable Income  Cheat
No      Single          75K             ?
Yes     Married         50K             ?
No      Married         150K            ?
Yes     Divorced        90K             ?
No      Single          40K             ?
No      Married         80K             ?

Training Set → Learn Classifier → Model; the model is then applied to the Test Set.
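A sketch of the learn-then-predict loop on the training table above, using scikit-learn's decision tree purely as an illustrative choice (the slides do not prescribe any particular learner).

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Training records copied from the table above (income in thousands).
train = pd.DataFrame({
    "Refund":  ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No"],
    "Marital": ["Single", "Married", "Single", "Married", "Divorced",
                "Married", "Divorced", "Single", "Married", "Single"],
    "Income":  [125, 100, 70, 120, 95, 60, 220, 85, 75, 90],
    "Cheat":   ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"],
})
X = pd.get_dummies(train[["Refund", "Marital", "Income"]]).astype(int)
model = DecisionTreeClassifier().fit(X, train["Cheat"])

# First record of the test set: Refund=No, Single, 75K.
test = pd.DataFrame({"Refund": ["No"], "Marital": ["Single"], "Income": [75]})
X_test = pd.get_dummies(test).reindex(columns=X.columns, fill_value=0).astype(int)
print(model.predict(X_test))   # predicted class label for that record
```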

Page 52: Introduction Artificial Intelligence and Machine Learning

Clustering: Definition

• Given a set of points, with a notion of distance between points, group the points into some number of clusters, so that
  – Members of a cluster are close/similar to each other
  – Members of different clusters are dissimilar

• Usually:
  – Points are in a high-dimensional space
  – Similarity is defined using a distance measure
    • Euclidean (data points)
    • Cosine (vectors)
    • Jaccard (sets)
    • Edit distance (strings), …

J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org
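A minimal k-means sketch using the Euclidean distance listed above; the data points, k, and iteration count are made-up illustrations (no empty-cluster handling in this sketch).

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]   # init at random points
    for _ in range(iters):
        # assign each point to its nearest center (Euclidean distance)
        labels = np.argmin(np.linalg.norm(points[:, None] - centers[None, :], axis=2), axis=1)
        # move each center to the mean of its assigned points
        centers = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

points = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.1, 4.9], [9.0, 0.0]])
labels, centers = kmeans(points, k=3)
print(labels)   # cluster index per point; the isolated point ends up alone
```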

Page 53: Introduction Artificial Intelligence and Machine Learning

Example: Clusters & Outliers

[Scatter plot of 2-D points showing several clusters and an outlier]

J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org

Page 54: Introduction Artificial Intelligence and Machine Learning

Structural Learning: A Unified Framework

Page 55: Introduction Artificial Intelligence and Machine Learning

Supervised Learning

• Linear Regression

• Binary Classification

• Bayesian Networks

• Minimize Errors

• Maximize Likelihood
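For linear regression, minimizing squared error coincides with maximizing a Gaussian likelihood; below is a small synthetic example using NumPy's least-squares solver (the data and true weights are made up).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=50)   # noisy line y = 3x + 2

X = np.column_stack([x, np.ones_like(x)])            # design matrix [x, 1]
w, *_ = np.linalg.lstsq(X, y, rcond=None)            # minimizes squared error
print(w)                                             # roughly [3.0, 2.0]
```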

Page 56: Introduction Artificial Intelligence and Machine Learning

Bayesian Networks

• Definition:

– If the network structure of the model is a directed acyclic graph, the model represents a factorization of the joint probability of all random variables.

– Conditional Probability Tables

– Smoothing

– Inference

P(x1, x2, …, xn) = ∏_{i=1}^{n} p(xi | parent(xi))

Page 57: Introduction Artificial Intelligence and Machine Learning

Continuous Optimization Problem

• General form:

  minimize_x f(x)
  subject to g_i(x) ≤ 0, i = 1, …, m
             h_j(x) = 0, j = 1, …, p

  where
  – f(x): R^n → R is the objective function to be minimized over the variable x,
  – g_i(x) ≤ 0 are called inequality constraints, and
  – h_j(x) = 0 are called equality constraints.

• Special cases:
  – Unconstrained optimization: m = p = 0
  – Linearly constrained optimization: g_i(x) and h_j(x) are linear
  – Nonlinearly constrained optimization
  – Quadratic programming
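A sketch of this general form with SciPy's general-purpose `minimize` routine (an illustrative solver choice, not prescribed by the slides); the objective and constraints are made-up examples.

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1) ** 2 + (x[1] - 2) ** 2            # objective f(x)
constraints = [
    # g(x) = x0 + x1 - 3 <= 0, written as 3 - (x0 + x1) >= 0 (SciPy's "ineq" convention)
    {"type": "ineq", "fun": lambda x: 3 - (x[0] + x[1])},
    # h(x) = x0 - x1 = 0
    {"type": "eq",   "fun": lambda x: x[0] - x[1]},
]
result = minimize(f, x0=np.zeros(2), constraints=constraints)
print(result.x)   # optimum on the line x0 = x1 within x0 + x1 <= 3, about [1.5, 1.5]
```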

Page 58: Introduction Artificial Intelligence and Machine Learning

Quadratic Programming

• The objective of quadratic programming is to find an n-dimensional vector x that will

  minimize_x (1/2) xᵀQx + cᵀx
  subject to Ax ≤ b

• Given:
  – a real-valued, n-dimensional vector c,
  – an n × n real symmetric matrix Q,
  – an m × n real matrix A, and
  – an m-dimensional real vector b
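The same QP written out for small made-up Q, c, A, b and solved with the same general-purpose routine for illustration (a dedicated QP solver would normally be preferred).

```python
import numpy as np
from scipy.optimize import minimize

Q = np.array([[2.0, 0.0], [0.0, 2.0]])    # symmetric positive-definite (made-up)
c = np.array([-2.0, -5.0])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

objective = lambda x: 0.5 * x @ Q @ x + c @ x
# A x <= b expressed as b - A x >= 0 for SciPy's "ineq" convention
constraints = [{"type": "ineq", "fun": lambda x: b - A @ x}]
result = minimize(objective, x0=np.zeros(2), constraints=constraints)
print(result.x)   # about [0.25, 1.75] for these values
```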

Page 59: Introduction Artificial Intelligence and Machine Learning

Summary

• Related Courses
  – AI, Neural Networks, Data Mining, Machine Learning, Optimization, Graphical Models, Deep Learning

• Optimization Problems
  – Combinatorial Optimization Problems
    • Search for values for each variable
    • Constraint Satisfaction Problems
  – Continuous Optimization Problems
    • Batch / Minibatch / Stochastic Gradient Descent

• Inference
  – Symbolic Logic vs. Statistical Models
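A minimal minibatch gradient-descent sketch for least-squares regression, tying the optimization and learning threads together; the learning rate, batch size, and synthetic data are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=200)      # synthetic regression data

w = np.zeros(3)
lr, batch = 0.1, 32
for step in range(500):
    idx = rng.choice(len(X), batch, replace=False)    # sample a minibatch
    Xb, yb = X[idx], y[idx]
    grad = 2 / batch * Xb.T @ (Xb @ w - yb)           # gradient of mean squared error
    w -= lr * grad                                    # gradient-descent update
print(w)   # close to w_true
```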

Page 60: Introduction Artificial Intelligence and Machine Learning

Questions & Answers