INTRODUCTION TO NEURAL NETWORKS
SOME CONTENT COURTESY OF PROFESSOR ANDREW NG OF STANFORD UNIVERSITY
IQS2: Spring 2013
Neuron
- The basic unit in a neural network is a perceptron (or simply “neuron”)
- The x_i values are the inputs to the neuron
  - Each blue circle represents a single input unit
Neuron
- The x_i values are the inputs to the neuron
  - Each blue circle represents a single input unit
- The θ_i values are the weights
  - There is a weight corresponding to every connection between an input unit and a neuron
  - The weights represent the “strength” of the connection
Activation Function
- The function g is the activation function
  - The notation g_θ(x) indicates that the value of g depends on both the input it receives (so x is shorthand for the collection of the x_i) and the weights (so θ is shorthand for the collection of the θ_i)
  - g_θ(x) is the output of the neuron
- The input to the neuron from a given input unit is always multiplied by the corresponding weight
  - So the neuron in our previous slide receives the input value θ_1 x_1 from the first input unit
Activation Function (cont.)
- The total input value the neuron receives is the sum of the values received from the individual inputs
  - I used the term “shorthand” earlier, but that’s not quite correct
  - If we write the inputs as a column vector, and call it x (note: no subscript)
  - And similarly, if we write the weights as a column vector, and call it θ
  - Then the total input to the neuron is the matrix product θᵀx
In Pictures (sort of)
- If x = (x_1, x_2, x_3)ᵀ and θ = (θ_1, θ_2, θ_3)ᵀ
- Then the input to the neuron (i.e., the input to the activation function g) is θᵀx = θ_1 x_1 + θ_2 x_2 + θ_3 x_3
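As a concrete illustration (an addition to the slides, with made-up numbers), here is a minimal NumPy sketch of that computation:

```python
import numpy as np

# Inputs and weights as vectors (1-D arrays standing in for column vectors)
x = np.array([0.5, -1.0, 2.0])      # the x_i values
theta = np.array([0.1, 0.4, -0.2])  # the θ_i weights

# Total input to the neuron: θᵀx = θ_1*x_1 + θ_2*x_2 + θ_3*x_3
z = theta @ x
print(z)  # -0.75
```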
What does g look like?
- Well, for us, it looks like the sigmoid (logistic) function: g(z) = 1 / (1 + e^(−z))
- Written using the previous notation: g_θ(x) = 1 / (1 + e^(−θᵀx))
[Figure: plot of g(z), an S-shaped curve rising from 0 toward 1]
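A minimal Python sketch of this activation (our own illustration; the slides show only the formula and its plot):

```python
import numpy as np

def sigmoid(z):
    """The logistic activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# g squashes any real input into the interval (0, 1)
print(sigmoid(-5.0))  # ~0.0067
print(sigmoid(0.0))   # 0.5
print(sigmoid(5.0))   # ~0.9933
```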
Why This Function
- Why would we want an activation function with a range between 0 and 1?
  - Well, what is it that a neural network is supposed to do?
- Why can’t we just have g(z) = z (that is, g_θ(x) = θᵀx)? (See the sketch below.)
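One standard answer to the second question (our addition; the slide leaves it as a prompt): if g is linear, composing layers collapses to a single linear map, so extra layers add no expressive power. A small NumPy sketch with made-up weights:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

# Two "layers" with the linear activation g(z) = z (weights are made up)
Theta1 = rng.normal(size=(4, 3))
Theta2 = rng.normal(size=(2, 4))

two_layer = Theta2 @ (Theta1 @ x)         # computed layer by layer
collapsed = (Theta2 @ Theta1) @ x         # one equivalent linear map
print(np.allclose(two_layer, collapsed))  # True
```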
Back to the Neuron
- The basic unit in a neural network is a perceptron (or simply “neuron”)
- The neuron receives inputs (the x_i values), processes them, and produces an output (the g_θ(x) value)
Basics
- There can be a lot more than a single neuron in a neural network
- There can be more than 3 inputs to a neuron (but showing 257 would crowd the picture)
- We generally write the inputs as a column vector, which we denote by x (note: no subscript), as earlier
- Similarly, if we consider our parameters as a column vector, we denote it simply as θ
Neural Network
[Figure: a feed-forward network diagram with an input layer, a hidden layer, and an output layer]
Vectorized Implementation
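The equations for this slide did not survive extraction. As a stand-in, here is a hedged sketch of vectorized forward propagation for a one-hidden-layer network like the one pictured; the layer sizes, the weight-matrix names Theta1 and Theta2, and the omission of bias units are our own simplifications:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Theta1, Theta2):
    """Vectorized forward propagation through one hidden layer.

    x:      input vector, shape (n,)
    Theta1: hidden-layer weights, shape (h, n)
    Theta2: output-layer weights, shape (k, h)
    """
    a1 = x                     # input-layer activations
    a2 = sigmoid(Theta1 @ a1)  # each hidden neuron computes g(θᵀx)
    a3 = sigmoid(Theta2 @ a2)  # each output neuron does the same on a2
    return a3

# Example with made-up sizes: 3 inputs, 4 hidden units, 2 outputs
rng = np.random.default_rng(1)
Theta1 = rng.normal(size=(4, 3))
Theta2 = rng.normal(size=(2, 4))
print(forward(np.array([1.0, 0.5, -0.3]), Theta1, Theta2))
```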
Cost Equation
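The equation on this slide is likewise not preserved. For reference, the cost commonly used in Ng’s course for a network with K output units and m training examples (regularization term omitted) is:

```latex
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \left[ y_k^{(i)} \log\left(h_\Theta(x^{(i)})\right)_k
       + \left(1 - y_k^{(i)}\right) \log\left(1 - \left(h_\Theta(x^{(i)})\right)_k\right) \right]
```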
Gradient Checking
- One variable: dJ/dθ ≈ (J(θ + ε) − J(θ − ε)) / (2ε)
- Two variables: approximate ∂J/∂θ_1 and ∂J/∂θ_2 by perturbing one variable at a time
- n variables: do this with each and every partial! (See the sketch after this list.)
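A minimal sketch of the n-variable case (our own code; the function J and the test point are made up):

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Approximate every partial of J at theta with a centered difference:
    (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        bump = np.zeros_like(theta)
        bump[i] = eps
        grad[i] = (J(theta + bump) - J(theta - bump)) / (2 * eps)
    return grad

# Example: J(θ) = θ_0² + 3·θ_1, whose true gradient at (2, 5) is (4, 3)
J = lambda t: t[0] ** 2 + 3 * t[1]
print(numerical_gradient(J, np.array([2.0, 5.0])))  # ≈ [4. 3.]
```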
Cost Equation
Gradient Computation using Back Propagation
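The derivation from this slide is not preserved either. As a hedged sketch, here is backpropagation for a single training example through the one-hidden-layer sigmoid network from the forward-propagation sketch above, assuming the cross-entropy cost; the names delta2/delta3 echo Ng’s conventions, but the code itself is our own and bias units are again omitted:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(x, y, Theta1, Theta2):
    """Gradients of the cross-entropy cost for one training example."""
    # Forward pass
    a1 = x
    a2 = sigmoid(Theta1 @ a1)
    a3 = sigmoid(Theta2 @ a2)

    # Backward pass: per-layer "error" terms
    delta3 = a3 - y                                  # output layer
    delta2 = (Theta2.T @ delta3) * a2 * (1.0 - a2)   # hidden layer

    # Partial derivatives of the cost with respect to each weight
    grad2 = np.outer(delta3, a2)   # same shape as Theta2
    grad1 = np.outer(delta2, a1)   # same shape as Theta1
    return grad1, grad2
```

The numerical_gradient routine from the gradient-checking sketch can then be used to verify these partials, which is exactly the point of the previous slide.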
Summary
- We’ve seen the basic concepts involved in neural networks
- We’ve discussed optimization of functions of several variables
- We’ve discussed the back propagation algorithm
- We’ve discussed using gradient checking to verify that back propagation is working
  - And have possibly been traumatized by it