INTRODUCTION TO MACHINE LEARNING
NEURAL NETWORKS I
Mingon Kang, Ph.D.
Department of Computer Science @ UNLV
Begin with a single Perceptron
Ref: Dr. Manuela Veloso, CMU
Threshold Functions
Ref: Dr. Manuela Veloso, CMU
Boolean Functions and Perceptron
Ref: Dr. Manuela Veloso, CMU
Discussion in Perceptrons
A single perceptron works well to classify a linearly separable set of inputs.
Multi-layer perceptrons were proposed as a "solution" for representing nonlinearly separable functions (1950s).
Many local minima – the optimization problem is non-convex.
Neural network research stalled through the 1970s.
Ref: Dr. Manuela Veloso, CMU
XOR?
$\mathbf{X} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix}$, $g(x) = \max(x, 0)$
$\mathbf{W}_1 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, $\mathbf{W}_2 = [1, -2]$, $c = [0, -1]$
The network computes $f(\mathbf{x}) = \mathbf{W}_2^\top \, g(\mathbf{W}_1^\top \mathbf{x} + c)$, which yields XOR on the rows of $\mathbf{X}$.
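A quick numpy check of this construction (a sketch; inputs are taken row-wise and the variable names are mine):

    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # all four XOR inputs
    W1 = np.array([[1, 1], [1, 1]])                 # first-layer weights
    c = np.array([0, -1])                           # first-layer bias
    W2 = np.array([1, -2])                          # second-layer weights

    h = np.maximum(0, X @ W1 + c)                   # g(x) = max(x, 0), elementwise
    y = h @ W2                                      # network output
    print(y)                                        # [0 1 1 0] -- XOR of each row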
Neural Network
Feedforward Networks or Multilayer Perceptrons (MLPs)
E.g., for a classifier:
$y = f^{*}(\boldsymbol{x})$ maps an input $\boldsymbol{x}$ to a category $y$.
A feedforward network defines a mapping $y = f(\boldsymbol{x}; \boldsymbol{\theta})$ and learns the parameter values $\boldsymbol{\theta}$ that best approximate $f^{*}$.
Neural Network
Called networks because they:
Involve composing together many different functions
Are associated with a directed acyclic graph that represents how the functions are composed
Mostly chain structures:
$f(x) = f^{(3)}(f^{(2)}(f^{(1)}(x)))$
$f^{(i)}$: the $i$-th layer of the network
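As a toy illustration (the layer functions below are placeholders of mine, not from the slides), the chain structure is plain function composition:

    # f(x) = f3(f2(f1(x))): each "layer" is just a function applied in turn
    def f1(x):
        return 2 * x            # first layer

    def f2(x):
        return x + 1            # second layer

    def f3(x):
        return max(0.0, x)      # third layer

    def f(x):
        return f3(f2(f1(x)))    # the composed network

    print(f(1.5))               # 4.0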
Neural Network
[Figure: a feedforward network with an input layer, hidden layers, and an output layer (first, second, third layers); depth is the number of layers, width the number of units per layer]
Design of a NN model
Architecture of the network:
How many layers
How the layers are connected to each other
How many units are in each layer
Activation function: how to compute the hidden-layer values
Cost function: what to optimize in the model
Optimizer: how to optimize the model
A minimal sketch putting these choices together is shown below.
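Assuming TensorFlow/Keras is available, the four design choices map onto code like this (the widths, activations, loss, and optimizer are illustrative assumptions, not the slides' prescriptions):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),                     # architecture: 10 input features
        tf.keras.layers.Dense(32, activation="relu"),    # one hidden layer, width 32, ReLU activation
        tf.keras.layers.Dense(2, activation="softmax"),  # output layer with 2 units
    ])
    model.compile(optimizer="adam",                      # optimizer: how to optimize
                  loss="categorical_crossentropy")       # cost function: what to optimize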
Weights and Bias
Labeling the Weights of a Neural Network
[Figure: a small network with its weights and biases labeled, e.g. $b_1^3$]
$w_{jk}^l$ is the weight from the $k$-th neuron of the $(l-1)$-th layer to the $j$-th neuron of the $l$-th layer
$b_j^l$ is the bias of the $j$-th neuron of the $l$-th layer
Weighted Input and Activation of a Neuron
The weighted input of the $j$-th neuron of the $l$-th layer is $z_j^l$:
$z_j^l = \sum_k w_{jk}^l a_k^{l-1} + b_j^l$
The activation of the $j$-th neuron of the $l$-th layer is $a_j^l$:
$a_j^l = f(z_j^l)$, where $f$ is the activation function, so
$a_j^l = f\!\left(\sum_k w_{jk}^l a_k^{l-1} + b_j^l\right)$
In matrix notation, the activation becomes:
$a^l = f(w^l a^{l-1} + b^l)$
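A minimal numpy sketch of the matrix form (the layer sizes and random weights are illustrative assumptions):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    a_prev = rng.normal(size=3)        # a^{l-1}: activations of a 3-unit layer
    w = rng.normal(size=(4, 3))        # w^l: weights into a 4-unit layer
    b = rng.normal(size=4)             # b^l: biases of the 4-unit layer

    z = w @ a_prev + b                 # weighted input z^l
    a = sigmoid(z)                     # activation a^l = f(z^l)
    print(a.shape)                     # (4,)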
Example
Ref: https://visualstudiomagazine.com/articles/2013/05/01/neural-network-feed-forward.aspx
Activation Functions
• Sigmoid activation: $a_j^l = \sigma(z_j^l) = \dfrac{1}{1 + e^{-z_j^l}}$
• Softmax activation: $a_j^l = \dfrac{e^{z_j^l}}{\sum_k e^{z_k^l}}$
• Tanh activation: $\tanh(z_j^l) = \dfrac{e^{z_j^l} - e^{-z_j^l}}{e^{z_j^l} + e^{-z_j^l}}$
• Rectified linear activation (ReLU): $\max(0, z_j^l)$, the maximum of $0$ and $z_j^l$
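Straightforward numpy versions of these activations (a sketch; the stability shift in softmax is a standard trick, not from the slides):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def softmax(z):
        e = np.exp(z - np.max(z))      # subtract max(z) for numerical stability
        return e / e.sum()

    def tanh(z):
        return np.tanh(z)              # equals (e^z - e^-z) / (e^z + e^-z)

    def relu(z):
        return np.maximum(0.0, z)

    z = np.array([-1.0, 0.0, 2.0])
    print(sigmoid(z), softmax(z), tanh(z), relu(z))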
Activation Functions
[Figures: plots of the activation functions above]
Universal Approximation Theorem
One hidden layer (with enough units) is sufficient to represent (not learn) an approximation of any continuous function to an arbitrary degree of accuracy.
So why go deeper?
A shallow net may need (exponentially) more width.
A shallow net may overfit more.
Cost Functions
• Quadratic cost: $C = \frac{1}{2n} \sum_x \lVert y(x) - a^L(x) \rVert^2$
• Binary cross-entropy cost: $C = -\frac{1}{n} \sum_x \left[\, y(x) \ln a^L(x) + \bigl(1 - y(x)\bigr) \ln\bigl(1 - a^L(x)\bigr) \,\right]$
• Negative log-likelihood cost: $C = -\ln a_y^L$
• A cost function must satisfy two conditions (for backpropagation):
• It can be written as an average $C = \frac{1}{n} \sum_x C_x$ over cost functions $C_x$ for individual training examples.
• The per-example costs $C_x$, and consequently the cost $C$, must be a function of the outputs of the neural network.
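Minimal numpy versions of the three costs (a sketch; the array shapes and names are my assumptions, with y and a_L holding the targets and network outputs for n examples):

    import numpy as np

    def quadratic_cost(y, a_L):
        # C = 1/(2n) * sum_x ||y(x) - a_L(x)||^2
        n = len(y)
        return np.sum((y - a_L) ** 2) / (2 * n)

    def binary_cross_entropy(y, a_L):
        # C = -1/n * sum_x [ y ln(a_L) + (1 - y) ln(1 - a_L) ]
        n = len(y)
        return -np.sum(y * np.log(a_L) + (1 - y) * np.log(1 - a_L)) / n

    def nll_cost(a_L, true_class):
        # C = -ln(a^L_y): negative log-probability assigned to the true class
        return -np.log(a_L[true_class])

    print(nll_cost(np.array([0.1, 0.3, 0.5, 0.1]), 3))  # 2.302..., as on the next slide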
Examples of Neg Log-likelihood Cost
Given the posterior probabilities and the ground truth:
A set of output probabilities: e.g., [0.1, 0.3, 0.5, 0.1]
Ground truth: e.g., [0, 0, 0, 1]
Likelihood of the true class:
0*0.1 + 0*0.3 + 0*0.5 + 1*0.1 = 0.1
NLL: -ln(0.1) = 2.3
If the ground truth is [0, 0, 1, 0]:
0*0.1 + 0*0.3 + 1*0.5 + 0*0.1 = 0.5
NLL: -ln(0.5) = 0.69
Output Types
Discussion
Construct a neural network for the MNIST data:
28×28-pixel images and 10 labels
Construct a neural network for a binary classification problem:
10 features and 2 labels
Construct a neural network for predicting tomorrow's temperature in degrees:
10 variables
One possible set of architectures is sketched below.
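For instance, assuming TensorFlow/Keras is available (the hidden widths, activations, and losses are illustrative choices, not prescribed by the slides):

    import tensorflow as tf
    from tensorflow.keras import layers

    # MNIST: 28*28 = 784 inputs, 10 classes -> softmax output, cross-entropy cost
    mnist = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    mnist.compile(optimizer="adam", loss="categorical_crossentropy")

    # Binary classification: 10 features, 2 labels -> a single sigmoid output
    binary = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    binary.compile(optimizer="adam", loss="binary_crossentropy")

    # Regression: 10 variables -> a single linear output, quadratic cost
    regression = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="linear"),
    ])
    regression.compile(optimizer="adam", loss="mse")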