Face Recognition
# whoami Rubén Martínez Sánchez
• Twitter: @eldarsilver
• Computer Engineer (Universidad Politécnica Madrid)
• Security Researcher (Pentester)
• Certified Ethical Hacker (CEH)
• Member of MundoHacker (TV Show)
• Master Data Science Datahack
• Cloudera Developer Training for Apache Spark
• Cloudera Developer Training for Apache Hadoop
# ls Agenda
• Face Recognition
• How to train your Network: Triplet Loss
• How to create your Dataset:
• Types of triplets
• Offline vs Online Triplet Mining
• Online Triplet Mining wins
• Hardening: Spatial Transformer Network
• Demo
• New paths: Spiking Neural Networks
# cat Face_Recognition
• Introduction
Face Recognition Pipeline: Image → Haar Cascade (face detection) → CNN + Transfer Learning + STN → Embedding → Identity
Easily expandable: adding a new identity only requires computing its embedding.
• Haar Cascades
# cat Face_Recognition
• Convolutional Layer
A kernel is applied over a region of the image (N×N pixels), computing an
element-wise product between each pixel of that region and the corresponding
kernel weight, and summing the resulting values.
Stride (S) (1 by default).
Number of kernels.
Kernel size (F).
Zero Padding (P) (zeros added around the image).
Feature map size = (N − F + 2P)/S + 1
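The output-size formula can be checked with a short helper (plain Python; the example layer sizes are hypothetical):

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size of a convolution: (N - F + 2P) / S + 1."""
    return (n - f + 2 * p) // s + 1

# 32x32 input, 5x5 kernel, padding 2, stride 1 -> 32 ("same" padding)
print(conv_output_size(32, 5, p=2, s=1))  # 32
# 28x28 input, 3x3 kernel, no padding, stride 2 -> 13
print(conv_output_size(28, 3, p=0, s=2))  # 13
```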
# cat Face_Recognition
• Convolutional Layer
# cat Face_Recognition
• Pooling Layer
This layer reduces the dimensionality of each feature map while retaining the most
important information.
Types: MaxPooling, AvgPooling, SumPooling, etc.
It provides translation invariance, depending on the size of the Receptive Field.
In a convolutional neural network, each unit in a hidden layer is connected only
to a small number of units in the previous layer. This region is called the
Receptive Field (the region of the input space that affects a particular unit
of the network).
R_k = R_(k−1) + (f_k − 1) · ∏_{i=1}^{k−1} s_i, with R_0 = 1,
where f_k is the filter size in layer k and s_i is the stride in layer i.
With a small receptive field, the effects of a pooling operator are only felt
towards deeper layers.
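The recurrence above can be sketched in a few lines (plain Python; the example layer stack is hypothetical):

```python
def receptive_field(layers):
    """Receptive field of a stack of conv/pool layers.

    layers: list of (filter_size f_k, stride s_k) tuples.
    Implements R_k = R_(k-1) + (f_k - 1) * prod(s_i for i < k), R_0 = 1.
    """
    r, jump = 1, 1  # jump = product of the strides of the previous layers
    for f, s in layers:
        r += (f - 1) * jump
        jump *= s
    return r

# Hypothetical stack: two 3x3 convs (stride 1), then a 2x2 max-pool (stride 2)
print(receptive_field([(3, 1), (3, 1), (2, 2)]))  # 6
```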
# cat Triplet_Loss
• Definition
[Diagram: the anchor, positive and negative images each pass through the same CNN (shared weights), producing three embeddings that feed the Triplet Loss.]
The loss of a triplet (a, p, n):
L = max(d(a, p) − d(a, n) + margin, 0)
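As a sketch, the loss of a single triplet in NumPy (the toy 2-D embeddings are illustrative, not real face embeddings):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """L = max(d(a, p) - d(a, n) + margin, 0) with Euclidean distance."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same identity, close to the anchor
n = np.array([1.0, 0.0])   # different identity, far from the anchor
print(triplet_loss(a, p, n))  # 0.0 -> easy triplet, no loss
```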
# cat How_to_create_your_Dataset
• Types of triplets
Easy triplets: d(a, p) + margin < d(a, n)
Semihard triplets: d(a, p) < d(a, n) < d(a, p) + margin
Hard triplets: d(a, n) < d(a, p)
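The three conditions can be expressed as a small classifier (a sketch; the distance values in the example are made up):

```python
def triplet_type(d_ap, d_an, margin=0.2):
    """Classify a triplet by its anchor-positive / anchor-negative distances."""
    if d_an < d_ap:
        return "hard"      # negative is closer to the anchor than the positive
    if d_an < d_ap + margin:
        return "semihard"  # negative falls inside the margin
    return "easy"          # loss is already zero

print(triplet_type(0.3, 0.2))  # hard
print(triplet_type(0.3, 0.4))  # semihard (0.3 < 0.4 < 0.5)
print(triplet_type(0.3, 0.9))  # easy
```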
# cat How_to_create_your_Dataset
• Offline Triplet Mining
At the start of each epoch, compute all the embeddings and build a list of hard or semihard triplets.
Create batches of N triplets:
Compute 3N embeddings + the loss of these N triplets + backpropagation.
Not efficient.
• Online Triplet Mining
For each batch of N inputs:
Compute only N embeddings → up to N³ candidate triplets mined within the batch.
More efficient.
# cat Online_Triplet_Mining_Wins
• Strategies
Batch of size N.
Batch all.
Batch hard.
• Batch all
Selects all the valid triplets and averages the loss over them.
It discards easy triplets (their loss is zero and would pull the mean towards 0).
# cat Online_Triplet_Mining_Wins
• Batch hard
It selects the hardest positive and the hardest negative for each anchor.
Hardest positive:
Compute the pairwise distance matrix.
Compute a 2D mask of valid pairs (a, p): a tf.bool `Tensor` with shape
[batch_size, batch_size]. mask_positive[a, p] is True if a and p are distinct
and have the same label.
posit_dist = tf.multiply(mask_positive, pairwise_distance) → sets to 0 the pairs
where label(a) != label(p) or a == p.
hardest_posit = tf.reduce_max(posit_dist, axis=1)
Hardest negative:
Get the pairwise distance matrix.
Compute a 2D mask of valid pairs (a, n): they must have different labels.
For each row → add the row maximum to the invalid pairs (a, n), so they are
never selected as the minimum:
max_row = tf.reduce_max(pairwise_distance, axis=1, keepdims=True)
neg_dist = pairwise_distance + max_row * (1 - mask_negative)
hardest_negat = tf.reduce_min(neg_dist, axis=1)
Triplet Loss with Online Triplet Mining:
tf.reduce_mean(tf.maximum(hardest_posit + margin - hardest_negat, 0.0))
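The batch-hard computation can be mirrored in NumPy as a self-contained sketch (illustrative, not the original TensorFlow implementation; the toy embeddings and labels are made up):

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Batch-hard triplet loss: hardest positive and negative per anchor."""
    # Pairwise Euclidean distance matrix, shape [N, N]
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))

    same = labels[:, None] == labels[None, :]      # same-label mask
    not_self = ~np.eye(len(labels), dtype=bool)    # a != p

    # Hardest positive: maximum distance over valid (a, p) pairs
    hardest_pos = (dist * (same & not_self)).max(axis=1)

    # Hardest negative: minimum distance over valid (a, n) pairs;
    # push invalid pairs above the row maximum so they are never selected
    max_row = dist.max(axis=1, keepdims=True)
    hardest_neg = (dist + max_row * same).min(axis=1)

    return np.maximum(hardest_pos + margin - hardest_neg, 0.0).mean()

emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.1, 0.0]])
labels = np.array([0, 0, 1, 1])
print(batch_hard_triplet_loss(emb, labels))  # 0.0: classes are well separated
```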
# cat Hardening_with_Spatial_Transformer_Networks
• Intro
Goal → apply a learned geometric transformation to an input.
The parameters of the transformation are learnt with the backpropagation
algorithm.
Properties:
Modular.
Specific Transformation for each input.
Trainable with Backpropagation.
Components:
Localisation Network.
Grid Generator.
Sampler.
# cat Hardening_with_Spatial_Transformer_Networks
• Localisation Network
Goal → a DNN or CNN that estimates the parameters Θ of a spatial transformation
from the input feature map.
Components:
Input: feature map of shape (h, w, c).
Output: transformation matrix Θ of shape (6,).
Affine transformations:
P′ = [a b; d e] · [x; y] + [c; f] = [ax + by + c; dx + ey + f]
In homogeneous coordinates:
[a b c; d e f; 0 0 1] · [x; y; 1] = [ax + by + c; dx + ey + f; 1]
# cat Hardening_with_Spatial_Transformer_Networks
• Grid Generator
Goal → output a parameterised sampling grid.
Creates a normalized meshgrid G of the same size as U → the set of target indices (x_t, y_t).
Applies the affine transformation to this meshgrid:
[x_s; y_s] = [θ11 θ12 θ13; θ21 θ22 θ23] · [x_t; y_t; 1]
Source: https://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf
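The grid generator can be sketched in NumPy, assuming normalized target coordinates in [−1, 1] (an illustrative sketch, not the paper's implementation):

```python
import numpy as np

def affine_grid(theta, h, w):
    """Apply a 2x3 affine matrix theta to a normalized target meshgrid.

    Returns the source coordinates (x_s, y_s) for every target pixel
    (x_t, y_t), as in the Spatial Transformer grid generator.
    """
    # Normalized target coordinates in [-1, 1]
    xt, yt = np.meshgrid(np.linspace(-1, 1, w), np.linspace(-1, 1, h))
    ones = np.ones_like(xt)
    grid = np.stack([xt.ravel(), yt.ravel(), ones.ravel()])  # shape (3, h*w)
    src = theta @ grid                                       # shape (2, h*w)
    return src.reshape(2, h, w)

identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
src = affine_grid(identity, 4, 4)
# With the identity transform, the source grid equals the target grid
```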
# cat Hardening_with_Spatial_Transformer_Networks
• Grid Generator
Source: https://www.datahack.es/tensorflow-potenciar-convoluciones-spatial-transformers/
# cat Hardening_with_Spatial_Transformer_Networks
• Sampler
Goal → Produce the sampled output V using the initial feature map, the
transformed meshgrid and a differentiable interpolation function (bilinear).
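A minimal bilinear sampler for a single point of a single-channel image might look like this (a sketch; the STN applies it to the whole transformed grid at once):

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample img at continuous coordinates (x, y) with bilinear interpolation.

    The interpolation is differentiable w.r.t. the sampling coordinates,
    which is what lets the STN train end-to-end with backpropagation.
    """
    h, w = img.shape
    x0, y0 = max(int(np.floor(x)), 0), max(int(np.floor(y)), 0)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = (1 - dx) * img[y0, x0] + dx * img[y0, x1]
    bottom = (1 - dx) * img[y1, x0] + dx * img[y1, x1]
    return (1 - dy) * top + dy * bottom

img = np.arange(16, dtype=float).reshape(4, 4)
print(bilinear_sample(img, 1.5, 1.5))  # 7.5: average of 5, 6, 9 and 10
```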
# ./Demo_Effect
# cat Spiking_Neural_Networks
• Biological Retina
• Source: https://slideplayer.com/slide/7469065
# cat Spiking_Neural_Networks
• Leaky Integrate and Fire Model
Membrane Capacitance Cm, Membrane Voltage Vm, Membrane Resistance Rm,
Conductance G = 1 / Rm, Equilibrium Voltage Ve (between -30 mV and -90 mV),
Voltage Threshold Vth, Voltage Reset Vr, Current Im.
Spike → Vm > Vth
Change of the cell voltage over time when an external current Im is applied
(τ = Cm · Rm):
dV/dt = (−(Vm − Ve) + Im · Rm) / τ
# cat Spiking_Neural_Networks
• Leaky Integrate and Fire Model
If our time step is Δt:
V(t + Δt) ≈ V(t) + (dV/dt) · Δt
Behavior of the LIF Neuron:
if Vm(t) > Vth:
    Vm(t + Δt) = Vr
else:
    Vm(t + Δt) = Vm(t) + Δt · (−(Vm(t) − Ve) + Im · Rm) / τ
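The update rule above can be simulated with a simple Euler loop (a sketch; the membrane parameter values are illustrative, not from the slides):

```python
def simulate_lif(i_m, t_max=100.0, dt=0.1, c_m=1.0, r_m=10.0,
                 v_e=-65.0, v_th=-50.0, v_r=-65.0):
    """Euler simulation of the leaky integrate-and-fire neuron.

    dV/dt = (-(Vm - Ve) + Im * Rm) / tau, with tau = Cm * Rm.
    Emits a spike and resets Vm to Vr whenever Vm crosses Vth.
    Returns the number of spikes fired in t_max milliseconds.
    """
    tau = c_m * r_m
    v = v_r
    spikes = 0
    for _ in range(int(t_max / dt)):
        if v > v_th:
            v = v_r        # reset after a spike
            spikes += 1
        else:
            v += dt * (-(v - v_e) + i_m * r_m) / tau
    return spikes

print(simulate_lif(i_m=2.0))  # suprathreshold current -> repeated spikes
print(simulate_lif(i_m=1.0))  # subthreshold: Vm settles at -55 mV, 0 spikes
```

With Im = 1.0 the voltage relaxes towards Ve + Im·Rm = −55 mV, below threshold, so the neuron stays silent; with Im = 2.0 the steady state (−45 mV) is above Vth and the neuron fires periodically.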
# poweroff
Thank you for your curiosity!!