Learning with neural networksxiaowei/ai_materials... · Network: Loss function: Example •Training...

Learning with neural networks

Dr. Xiaowei Huang

https://cgi.csc.liv.ac.uk/~xiaowei/

Up to now,

• Traditional Machine Learning Algorithms

• Deep learning • Introduction to Deep Learning

• Functional view and features

Topics

• Forward and backward computation

• Back-propogation and chain rule

• Regularization

Initialisation

• Collect annotated data

• Define model and initialize randomly

• Predict based on current model• In neural network jargon “forward propagation”

• Evaluate predictions and update model weights

Forward computations

Backward computations

• Collect gradient data

• Predict based on current model• In neural network jargon “backpropagation”

Update weight

Recall: Training Objective

Given training corpus {𝑋, 𝑌} find optimal parameters

Loss function

Prediction

Ground truth

accumulated loss

Find an optimal model parameterised over

Forward Computation

Example

Network:

Loss function:

Example

• Training data (a single sample)• given inputs 0.05 and 0.10, we want

the neural network to output

0.01 and 0.99.

Example

0.01 and 0.99.

Example

0.01 and 0.99.

Backward Propogation

Recall: Minimizing with multiple dimensional inputs • We often minimize functions with multiple-dimensional inputs

• For minimization to make sense there must still be only one (scalar) output

Functions with multiple inputs

• Partial derivatives

measures how f changes as only variable xi increases at point x

• Gradient generalizes notion of derivative where derivative is wrt a vector

• Gradient is vector containing all of the partial derivatives denoted

Note: In the training objective case,

f is the loss

the parameter x is

Optimization through Gradient Descent

• As with many model, we optimize our neural network with Gradient Descent

• The most important component in this formulation is the gradient

• Backpropagation to the rescue• The backward computations of network return the gradients

• How to make the backward computations

Weight Update

Example

Chain rule

• Assume a nested function, 𝑧 = 𝑓(𝑦) and y= g(x)

• Chain Rule for scalars 𝑥, 𝑦, 𝑧

• When 𝑥∈ R𝑚,𝑦∈ R𝑛, 𝑧∈R

• i.e., gradients from all possible paths

Chain rule

• When 𝑥∈ R𝑚,𝑦∈ R𝑛, 𝑧∈R

Chain rule

• When 𝑥∈ R𝑚,𝑦∈ R𝑛,𝑧∈R

Chain rule

• When𝑥∈ R𝑚,𝑦∈ R𝑛,𝑧∈R

Chain rule

• Chain Rule for scalars 𝑥, 𝑦, 𝑧 :

• When 𝑥∈ R𝑚,𝑦∈ R𝑛,𝑧∈R

• i.e., gradients from all possible paths • or in vector notation

• is the Jacobian

The Jacobian

• When 𝑥∈ R3, 𝑦∈ R2

Chain rule in practice

• f(y) =sin(𝑦) ,𝑦=𝑔(𝑥) =0.5𝑥2

Backpropagation

General Workflow

Regularization as Constraints

Recall: what is regularization?

• In general: any method to prevent overfitting or help the optimization

• Specifically: additional terms in the training optimization objective to prevent overfitting or help the optimization

Recall: Overfitting

• Key: empirical loss and expected loss are different

• Smaller the data set, larger the difference between the two

• Larger the hypothesis class, easier to find a hypothesis that fits the difference between the two • Thus has small training error but large test error (overfitting)

• Larger data set helps

• Throwing away useless hypotheses also helps (regularization)

Regularization as hard constraint

• Training objective

• When parametrized

Adding constraints

Regularization as hard constraint

• When 𝛺 measured by some quantity 𝑅

• Example: 𝑙2 regularization Adding constraints

Regularization as soft constraint

• The hard-constraint optimization is equivalent to soft-constraint

for some regularization parameter 𝜆∗ > 0

• Example: 𝑙2 regularization

Adding constraints

Regularization as soft constraint

• Showed by Lagrangian multiplier method

• Suppose 𝜃∗ is the optimal for hard-constraint optimization

• Suppose 𝜆∗ is the corresponding optimal for max

Adding constraints

Learning with neural networksxiaowei/ai_materials... · Network: Loss function: Example •Training...

Documents

Chapter 10 Chapter 10 Neural Network Neural Network

Statistical & Data Analysis Using Neural Network · · 2015-03-12Statistical & Data Analysis Using Neural Network ... The Biological Perspective of Neural Networks Neural Network

Neural Network Modeling of Spectral Embeddingpdfs.semanticscholar.org/a6d0/f4e824e9fc65bbec5d1ada2cd4a1c4939411.pdf2 Neural Network for Manifold Learning Neural network can be expressed

Biological Neural Network & Nonlinear Dynamics Biological Neural Network Similar Neural Network to Real Neural Networks Membrane Potential Potential of

02 Fundamentals of Neural Network - myreaders.info · 1.1 Why Neural Network SC - Neural Network – Introduction Neural Networks follow a different paradigm for computing.

Chapter 9 Chapter 9 Neural Network Neural Network

Neural Network Toolbox - Hacettepe Universitysolen/Matlab/MatLab/Matlab - Neural... · Neural Network Toolbox For Use with MATLAB ... Neural Network Design Book.....xxii Acknowledgments.....xxiii

Explanation-Based Neural Network Learning for Robot …papers.nips.cc/paper/614-explanation-based-neural-network-learnin… · explanation-based neural network learning (EBNN),

Kuniaki Yoshitomiyossvr0.ed.kyushu-u.ac.jp/jpn/pdf/E-JUST_Yoshitomi.pdf · artificial neural network(ANN) • An . artificial neural network (ANN), usually called "neural network"

Designing Perceptron Three-Layered Neural Network … - 104_Designing... · Designing Perceptron Three-Layered Neural Network for ... portfolio via fuzzy neural network to ... if

A Neural Network Architecture for Visual Selectiongalton.uchicago.edu/~amit/Papers/netdet.pdf · A Neural Network Architecture for Visual Selection ... Neural Network Architecture

Artificial Neural Network (Back-Propagation Neural Network)

ELEC 677: Recurrent Neural Network Applications & Recurrent Neural Network … · 2016-11-08 · Recurrent Neural Network Applications & Recurrent Neural Network Language Models Lecture

Pairwise Neural Network Classifiers with Probabilistic Outputspapers.nips.cc/paper/883-pairwise-neural-network...Pairwise Neural Network Classifiers with Probabilistic Outputs David

Neural Network High Precision Processing for Astronomical ... · NEURAL NETWORK HIGH PRECISION PROCESSING FOR ASTRONOMICAL IMAGES ... and optical engineer- ... Neural Network High

A Neural-Network Architecture for Syntax Analysis - Neural ...faculty.ist.psu.edu › vhonavar › Papers › ieeetnnchen.pdfing a neural associative memory. The proposed neural-network

Neural Network Society, Japanese Neural Network Society 15

- Tutorial - Neuromorphic Computing with Spintronics...• Deep neural network • Recurrent neural network –Reservoir computing • Hopfield network • Stochastic neural network

Convolutional Neural Networks - University of Liverpoolxiaowei/ai_materials...Convolutional Neural Networks Dr. Xiaowei Huang xiaowei/ Up to now, •Traditional Machine Learning Algorithms

Neural Network Oleh Danny Manongga. 2 Overview Basics of Neural Network Advanced Features of Neural Network Applications I-II Summary