© Copyright 2017 by Dr. Chih Lai Page: 1

Machine Learning & Deep Learning

Graduate Program in Software, School of Engineering

University of St. Thomas

Dr. Chih Lai

Chih Lai, Ph.D., Associate Professor

• Teaching Experience: Dr. Lai is an associate professor with GPS. He has taught courses in Machine Learning and Deep Learning, Data Mining and Predictive Analytics, and Healthcare Analytics. In 2010 he was a visiting professor in the Informatics Department at Trier University of Applied Sciences in Germany. Dr. Lai also taught an Operating Systems course in the Computer Science Department at Oregon State University.

• Research and Publications: Dr. Lai’s research interests include Machine Learning and Deep Learning on multimedia data (numerical, images, videos, text). Dr. Lai has published many technical papers in IEEE and ACM conferences and journals. Dr. Lai has received teaching and research grants from Amazon, Microsoft, and the University of St. Thomas. Please visit http://www.linkedin.com/pub/chih-lai-ph-d/3/2b6/193 for details.

• Industry Experience: Before joining UST, Dr. Lai was a principal software engineer working on a new aircraft collision avoidance system (ADS-B), which the FAA has mandated for installation on most aircraft by 2020. Dr. Lai received three U.S. patents and three European patents, all related to aircraft collision avoidance algorithms. Dr. Lai also worked with Medtronic and has pending patents on monitoring and evaluating Parkinson’s patients. Other industry experience includes building a network gateway between IBM and Novell networks.


Data = New Oil

• 75 billion connected devices, 50 trillion GB of data.

Data is the new oil!!

INTERNET of THINGS


• Google is rethinking everything with machine learning at the core.

• Use of deep learning at Google is growing exponentially.

Everything Is About…

Google Trending Over Last 5 Years.

Nov / Dec 2017


Outline

• Proliferation of Data (Structured and Unstructured) / Internet of Things.

• Major topics in ML + DL.

  • How do machines learn?

• Avoid Over-Learning & Regularization.

• Support Vector Machine (SVM) and Kernel Methods.

• Neural Network, Deep Learning.

• Convolutional Neural Network (CNN).

  • Autoencoder (AE), Denoising Autoencoder (DAE).

  • Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM).

  • Reinforcement Learning.

• ML + DL Computing Platforms.


Machine Learning

• The machine derives a weight (importance) for each predictor in order to predict the target.

• Initialize a random weight for each predictor, then…

• Try to reduce the error (i.e., the cost).

• You define “error”.

X / Predictors / Features / Independent Variables

Y / Target / Dependent / Response

Weights = 0.08 -1.48 0.5 -0.01 9.7 ???

X    Y
1    $4
2    $8
3    $12

W1 = 0    W1 = 2    W1 = 4
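
The idea of trying weights and reducing the error can be sketched on the three (X, Y) pairs from the slide. This is only an illustration, assuming a one-weight model Y = W1 * X, mean squared error as the definition of "error", plain gradient descent as the way to reduce it, and a fixed starting weight and step size so the run is repeatable. None of these choices are stated on the slide.

```python
# Toy data from the slide: X -> Y (in dollars)
xs = [1.0, 2.0, 3.0]
ys = [4.0, 8.0, 12.0]

def mse(w):
    """Mean squared error of the one-weight model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of mse(w) with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

# Error at the three candidate weights shown on the slide.
for candidate in (0.0, 2.0, 4.0):
    print(f"W1 = {candidate:.0f}  MSE = {mse(candidate):.3f}")

# Gradient descent: start somewhere and repeatedly step downhill.
w = 0.0                # assumption: fixed start instead of a random one
learning_rate = 0.05   # assumption: hand-picked step size
for _ in range(50):
    w -= learning_rate * gradient(w)
print(f"learned W1 = {w:.4f}")
```

The error shrinks as W1 moves from 0 toward 4, and the loop settles on the weight that fits all three rows exactly.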


How Do We Avoid Over-Learning?

• Avoid overfitting (w/ a simpler model) but still preserve good prediction accuracy.

• By identifying the really “important” attributes for prediction.

• Lasso (and Ridge) regularization.

• Balance error vs. model complexity.

Importance
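
How Lasso singles out the "important" attributes can be sketched with a small coordinate-descent solver. This is a minimal illustration, not the course's implementation: the data are made up, there is no intercept, and the objective is assumed to be (1/2n) * ||y - Xw||^2 + alpha * ||w||_1.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso(X, y, alpha, n_iters=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iters):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z
    return w

# Made-up data: y depends strongly on feature 0 and only weakly on feature 1.
X = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y = [3.0 * a + 0.1 * b for a, b in X]

w_plain = lasso(X, y, alpha=0.0)   # no penalty: ordinary least squares
w_lasso = lasso(X, y, alpha=0.5)   # penalty large enough to drop feature 1
print("alpha = 0.0:", [round(v, 3) for v in w_plain])
print("alpha = 0.5:", [round(v, 3) for v in w_lasso])
```

With the penalty switched on, the weak predictor's weight becomes exactly zero while the strong one is only shrunk, giving a simpler model at the cost of a little extra error. Ridge, by contrast, shrinks weights toward zero without setting them exactly to zero.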

• Adding more predictors may improve prediction quality. WHY???

Logistic Regression, Predicting Depression

[Figure: confusion matrix (CFM) and accuracy, Depressed vs. Non-depressed]

P1, R1, F1 ==> 0.74663 0.74948 0.74805

[Predictor labels in figure: 9, DIABETES; 4, ALZH; 10, Isch-Heart; female]
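
The three numbers reported above fit together if P1 and R1 are read as precision and recall (an assumption, since the slide does not spell out the abbreviations). F1 is then their harmonic mean, which can be checked directly:

```python
precision = 0.74663  # P1 from the slide
recall = 0.74948     # R1 from the slide

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.5f}")
```

The result agrees with the 0.74805 shown on the slide to five decimal places.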


Support Vector Machine (SVM)

• SVM in high-dimensional space. WHY???

  • Transform data from low-D to high-D.

• We can find a linear solution in high-D

  • = a non-linear decision boundary (D.B.) in low-D.

  • Linear model = simpler model.

• Dimension ↑ ⇒ Accuracy ↑

• But computationally more expensive???

• Use an SVM kernel!!

https://www.youtube.com/watch?v=3liCbRZPrZA&feature=youtu.be
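
Why a kernel avoids the extra cost can be sketched with a degree-2 polynomial kernel. The kernel choice, the function names, and the sample points are illustrative assumptions, not taken from the slide. The point is that the kernel, computed in the original 2-D space, equals a dot product in a 3-D space that is never built explicitly.

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel computed directly in the original 2-D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Explicit map from 2-D to the 3-D space that poly_kernel uses implicitly."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    """Ordinary dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
print(poly_kernel(x, z))      # kernel value; no high-D vectors are built
print(dot(phi(x), phi(z)))    # same value via the explicit high-D mapping
```

Both lines print the same number up to floating-point rounding. For kernels whose implicit space is very large, only the first route stays cheap.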


MPG + Weight, RBF Kernel (Radial Basis Function)

RBF w/ C = 20

RBF w/ C = 1

RBF w/ C = 0.1

C ↑ ⇒ regularization ↓, complexity ↑, error ↓

C ↓ ⇒ regularization ↑, complexity ↓, error ↑
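
For reference, the RBF kernel itself is a similarity score that decays with squared distance. The sketch below covers only the kernel, not the SVM fit or the C parameter. The gamma value and the scaled (MPG, weight) points are hypothetical and are not the data behind the plots.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """RBF similarity: 1.0 for identical points, decaying toward 0 with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Hypothetical (MPG, weight) pairs, scaled to comparable ranges.
car_a = (0.30, 0.80)
car_b = (0.32, 0.78)   # close to car_a
car_c = (0.90, 0.10)   # far from car_a

print(rbf_kernel(car_a, car_a))   # identical points
print(rbf_kernel(car_a, car_b))   # nearby points score close to 1
print(rbf_kernel(car_a, car_c))   # distant points score lower
```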


Brain / Artificial Neurons

• Neurons respond slowly compared to electrical circuits.

• But there is massively parallel computation in our brains.

  • 10^11 neurons in the brain, w/ 10^4 connections per neuron.

[Figure labels: Feature Combination]
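
An artificial neuron can be sketched as a weighted combination of its input features followed by a squashing function. The sigmoid activation and all numbers below are illustrative assumptions rather than values from the slides.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Combine the input features into one weighted sum, then squash it."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

features = [0.5, -1.0, 2.0]   # made-up input features
weights = [0.1, -1.5, 0.5]    # made-up weights; training would adjust these
print(neuron(features, weights, bias=0.0))
```

A layer is many such neurons reading the same inputs, and a deep network stacks layers so that later neurons combine the combinations produced by earlier ones.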


Human Intelligence vs. Artificial Intelligence

• Computers execute programs for users.

  • Powerful if we know the algorithm / solution, not useful if we don’t!!

• The solution is often unknown / unclear for many tasks.

  • Recognizing a stop sign, nerve component, GO, etc.

• Finding / predicting similar business / health (future) events.


Modeling Stock Market Curve

• Closely follows the original curve.


Endless Possibilities…

• Meet Shelley, the AI that is learning to write horror stories for Halloween

  • https://www.weforum.org/agenda/2017/10/robots-used-to-feature-in-horror-stories-now-they-re-writing-them-for-halloween/

• Spooky Author Identification

  • https://www.kaggle.com/c/spooky-author-identification


Great, Let’s Go “DEEP”



Sigmoid, Tanh, ReLU Experiments

• Change 1-layer sigmoid to 8-layer sigmoid. Compare to ReLU.

More neurons dropped out than tanh.

http://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.08596&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification
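
The three activations in the experiment can be written out directly. The last part of the sketch illustrates one commonly cited reason (not stated on the slide) why a deep sigmoid stack trains more slowly than a ReLU one: backpropagation multiplies one activation slope per layer, and the sigmoid slope never exceeds 0.25.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    return math.tanh(z)

def relu(z):
    return max(0.0, z)

def sigmoid_grad(z):
    """Slope of the sigmoid; its maximum is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise (the neuron is 'off')."""
    return 1.0 if z > 0 else 0.0

for z in (-2.0, 0.0, 2.0):
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.3f}  tanh={tanh(z):+.3f}  relu={relu(z):.1f}")

# Product of per-layer slopes across an 8-layer stack.
layers = 8
best_case_sigmoid = sigmoid_grad(0.0) ** layers   # 0.25 multiplied 8 times
active_relu = relu_grad(1.0) ** layers            # stays at 1 while neurons are active
print(best_case_sigmoid, active_relu)
```

Even in the best case, the sigmoid signal reaching the first layer is scaled by 0.25^8, while an active ReLU path passes it through unchanged. A ReLU neuron with a negative input passes nothing at all.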


Convolutional Neural Network (CNN)

• https://research.googleblog.com/2014/09/building-deeper-understanding-of-images.html

• What is this??

Live demo at: http://scs.ryerson.ca/~aharley/vis/conv/flat.html
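
The core operation of a CNN layer can be sketched as a small filter sliding over an image. The image and the filter weights below are made up for illustration. In a real CNN the filter weights are learned from data rather than chosen by hand.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 sliding-window filter.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Made-up 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked vertical-edge filter, for illustration only.
kernel = [
    [-1, 1],
    [-1, 1],
]
for row in conv2d(image, kernel):
    print(row)
```

The output is non-zero only in the middle column, where the dark and bright halves meet, so this filter acts as a vertical-edge detector.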


No More Human Feature Engineering

• Hand-crafted features require EXTENSIVE programming experience in image processing.

• Feature Learning + Classification together.

  • Features are extracted automatically & outperform hand-crafted rules.

https://www.mathworks.com/videos/deep-learning-for-computer-vision-120997.html


Distracted Driver Detection

• Classify the driver’s behavior into one of 10 classes.


Features Learned by CNN

https://www.mathworks.com/videos/deep-learning-for-computer-vision-120997.html


CNN Performance 99.1%


Ultrasound Nerve Segmentation


Google AlphaGo

• AlphaGo relies on a CNN without using max pooling!!
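
For context, max pooling is the downsampling step many image CNNs place after convolution. The sketch below uses a made-up feature map and assumes the common 2x2 window with stride 2. It shows what the operation keeps and what it throws away.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block (stride 2)."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]

# Made-up 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
print(max_pool_2x2(feature_map))
```

Each 2x2 block collapses to a single number, so the output no longer records where inside the block the strongest response occurred.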


The Most Famous CNN: AlexNet. Do I Care???

• Alex Krizhevsky, 2012 ImageNet Competition

  • http://www.vlfeat.org/matconvnet/models/beta16/imagenet-caffe-alex.mat

https://groups.google.com/forum/#!topic/caffe-users/cUD3IF5NMOk

Transfer Learning


Some Popular ML+DL Tools

[Table: comparison of popular ML + DL tools, including GPU support. Source: IEEE Computing Edge, April 2017, p. 12]
