Artificial intelligence in data science: Backpropagation. Janos Török, Department of Theoretical Physics, September 30, 2021.


Page 1: Backpropagation Janos Török

Artificial intelligence in data science
Backpropagation

Janos Török

Department of Theoretical Physics

September 30, 2021

Page 2

Fully connected neural networks

- Ideas from Piotr Skalski (practice), Pataki Bálint Ármin (lecture) and HMKCode (lecture)

Page 3

Fully connected neural networks

- Model:
  - Inputs ($x_j$) or, for hidden layer $l$: $A^{l-1}_j$
  - Weight $w^l_{ij}$
  - Bias $b^l_i$
  - Weighted sum of input and bias: $z^l_i = \sum_j A^{l-1}_j w^l_{ij} + b^l_i$
  - Activation function (nonlinear) $g$: $A^l_i = g(z^l_i)$

Yang et al., 2000.
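The model above can be sketched in a few lines of NumPy; the layer sizes, the random values, and the sigmoid choice for $g$ are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    # example nonlinear activation g (an assumed choice)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
A_prev = rng.random(3)        # activations A^{l-1}_j (or the inputs x_j)
W = rng.random((4, 3))        # weights w^l_ij for a layer of 4 neurons
b = rng.random(4)             # biases b^l_i

z = W @ A_prev + b            # weighted sum z^l_i = sum_j A^{l-1}_j w^l_ij + b^l_i
A = sigmoid(z)                # activations A^l_i = g(z^l_i)
```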

Page 4

Feed forward

- Example
- We have an output; how do we change the weights and biases to achieve the desired output?
- Error L

Page 5

Backpropagation

- $\Delta W = -\alpha \frac{\partial L}{\partial W}$
- $W$ is a large three-dimensional matrix
- Chain rule!
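A minimal sketch of the update rule $\Delta W = -\alpha\,\partial L/\partial W$; the learning rate and the numeric values are assumed for illustration.

```python
import numpy as np

alpha = 0.1                                  # learning rate (assumed value)
W = np.array([[0.5, -0.3], [0.2, 0.8]])      # current weights (illustrative)
dL_dW = np.array([[0.1, 0.0], [-0.2, 0.4]])  # gradient from backpropagation

delta_W = -alpha * dL_dW                     # Delta W = -alpha * dL/dW
W = W + delta_W                              # gradient-descent update step
```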

Page 6

Backpropagation

- Chain rule
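A minimal numeric check of the chain rule (all values below are illustrative assumptions, not figures from the slides): for $L(w) = \frac{1}{2}(wx - t)^2$ the chain rule gives $\partial L/\partial w = (wx - t)\,x$, which can be compared against a finite difference.

```python
# L(w) = 0.5 * (y - t)^2 with y = w * x: chain rule gives
# dL/dw = dL/dy * dy/dw = (y - t) * x
x, t, w = 3.0, 1.0, 0.4

y = w * x
analytic = (y - t) * x                 # chain-rule gradient

eps = 1e-6                             # central finite-difference check
numeric = (0.5 * ((w + eps) * x - t) ** 2
           - 0.5 * ((w - eps) * x - t) ** 2) / (2 * eps)
```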

Page 7

Backpropagation: Example

- From HMKCode
- Note that there is no activation function (it would just add one more step in the chain rule)
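A feedforward pass for a small linear network in the spirit of the HMKCode example can be sketched as follows; the 2-2-1 layout and the weight values are illustrative assumptions, not the original figures.

```python
import numpy as np

# 2 -> 2 -> 1 linear network (no activation function), illustrative values
x = np.array([2.0, 3.0])              # input
W1 = np.array([[0.1, 0.2],            # first-layer weights (assumed)
               [0.3, 0.4]])
W2 = np.array([0.5, 0.6])             # second-layer weights (assumed)

h = W1 @ x                            # hidden layer: just the weighted sum
prediction = W2 @ h                   # output
```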

Page 8

Backpropagation: Example

- Weights

Page 9

Backpropagation: Example

- Feedforward

Page 10

Backpropagation: Example

- Error from the desired target
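One common choice for the error $L$ from the desired target is the squared error; the numbers below are illustrative assumptions, and the 1/2 prefactor is a convention that cancels when taking the derivative.

```python
# Squared error between prediction and desired target (illustrative numbers)
target = 1.0
prediction = 1.48
L = 0.5 * (prediction - target) ** 2
```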

Page 11

Backpropagation: Example

- Prediction function

Page 12

Backpropagation: Example

- Gradient descent

Page 13

Backpropagation: Example

- Chain rule

Page 14

Backpropagation: Example

- Chain rule

Page 15

Backpropagation: Example

- Chain rule

Page 16

Backpropagation: Example

- Chain rule

Page 17

Backpropagation: Example

- Summarized

Page 18

Backpropagation: Example

- Summarized in matrix form
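One full backpropagation step in matrix form might be sketched as below, assuming a linear 2-2-1 network with squared-error loss; all numeric values are illustrative assumptions.

```python
import numpy as np

# One backprop step in matrix form for a linear 2 -> 2 -> 1 network
x = np.array([2.0, 3.0])                 # input (illustrative)
t = 1.0                                  # desired target
W1 = np.array([[0.1, 0.2], [0.3, 0.4]])  # weights (assumed values)
W2 = np.array([0.5, 0.6])
alpha = 0.05                             # learning rate (assumed)

h = W1 @ x                               # feedforward
y = W2 @ h
delta = y - t                            # dL/dy for L = 0.5 * (y - t)^2

dL_dW2 = delta * h                       # chain rule, output layer
dL_dW1 = np.outer(delta * W2, x)         # chain rule, hidden layer

W2 -= alpha * dL_dW2                     # gradient-descent updates
W1 -= alpha * dL_dW1
```

After the update, a second feedforward pass gives a prediction closer to the target, which is a quick sanity check of the signs in the chain rule.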

Page 19

Backpropagation: Multiple data points

- Generally $\Delta$ is a vector with dimension equal to the number of training data points.
- The error can be the average of the per-point errors, so repeat the equations below for all training points and average the changes (the part after $\alpha$).
- Fortunately numpy does not care about the number of dimensions, so instead of the elementwise multiplication in the matrices on the right we can use the dot product.
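The batched version can be sketched as follows: stacking the $N$ training points as rows lets the per-point multiplications and the batch average collapse into matrix products, with no explicit Python loop (all shapes and values are illustrative assumptions).

```python
import numpy as np

# Batch of N training points for a linear 2 -> 2 -> 1 network
rng = np.random.default_rng(1)
N = 5
X = rng.random((N, 2))                   # N inputs, one per row
T = rng.random(N)                        # N desired targets
W1 = rng.random((2, 2))
W2 = rng.random(2)

H = X @ W1.T                             # (N, 2): hidden layer for all points at once
Y = H @ W2                               # (N,): all predictions
delta = Y - T                            # (N,): per-point errors

# the batch average of the per-point gradients is itself a dot product
dL_dW2 = delta @ H / N
dL_dW1 = np.outer(W2, delta @ X) / N
```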

Page 20

How many layers?

- A neural network with at least one hidden layer is a universal approximator (it can represent any function).

Do Deep Nets Really Need to be Deep? Jimmy Ba, Rich Caruana.