Neural Network for counting by hand...

Neural Network for counting by hand gesture

Cristian D. Angulo L. - 6219701

Department of Computer Science, Faculty of Science and Technology, Assumption University in Thailand,

SC6611-Neural Networks and Deep Learning

Cristian Angulo

Neural Network for counting by hand gesture

Hand gesture recognition • Sign Language Digits • Dataset from kaggle.com Image processing • Convert images into int

class Net(nn.Module):

def __init__(self, num_channels):

super(Net, self).__init__()

self.num_channels = num_channels

self.conv1 = nn.Conv2d(3, self.num_channels, 3, stride=1, padding=1)

self.conv2 = nn.Conv2d(self.num_channels, self.num_channels*2, 3, stride=1, padding=1)

self.conv3 = nn.Conv2d(self.num_channels*2, self.num_channels*4, 3, stride=1, padding=1)

self.fc1 = nn.Linear(self.num_channels*4*8*8, self.num_channels*4)

self.fc2 = nn.Linear(self.num_channels*4,6)

def forward (self, x):

#image 3x64x64

x = self.conv1(x) #num_channels x 64 x 64

x = F.relu(F.max_pool2d(x, 2)) #it is going to divide the image into2

x = self.conv2(x) #num_channels*2 x 32 x 32

x = F.relu(F.max_pool2d(x, 2)) #num_channels*2 x 16 x 16

#flatten

x = x.view(-1, self.num_channels*4*8*8)

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

x = F.log_softmax(x, dim=1)

return x

STRUCTURE OF THE NEURAL NETWORK

Three convolutional layers

Two fully connected layers

Cristian Angulo

#image 3x64x64

#flatten

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

return x

Cristian Angulo

#image 3x64x64

#flatten

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

return x

1. First convolutional >Input: 3 >Output: n 2. Second convolutional >Input: n >Output: 2*n 3. Third convolutional >Input: 2*n >Output: 4*n

Cristian Angulo

#image 3x64x64

#flatten

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

return x

1. First convolutional >Input: 3 x 64 x 64 >Output: n x 64 x 64 2. Second convolutional >Input: n x 32 x 32 >Output: 2n x 32 x 32 3. Third convolutional >Input: 2n x 16 x 16 >Output: 4n x 16 x 16

Cristian Angulo

#image 3x64x64

#flatten

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

return x

1. First convolutional >Input: 3 x 64 x 64 >Output: n x 64 x 64 First Max Pooling: >Input: n x 64 x 64 >Input: n x 32 x 32 2. Second convolutional >Input: n x 32 x 32 >Output: 2n x 32 x 32 Second Max Pooling: >Input: 2n x 32 x 32 >Input: 2n x 16 x 16 3. Third convolutional >Input: 2n x 16 x 16 >Output: 4n x 16 x 16 Third Max Pooling: >Input: 4n x 16 x 16 >Input: 4n x 8 x 8

Cristian Angulo

#image 3x64x64

#flatten

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

return x

1. First fully connected >Input: n x 4 x 8 x 8 >Output: n x 4 2. Second fully connected >Input: n x 4 >Output: 6

Cristian Angulo

#image 3x64x64

#flatten

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

return x

Activation Function

ReLU or Rectified Linear Unit

Cristian Angulo

#image 3x64x64

#flatten

x = self.fc1(x)

x = F.relu(x)

x = self.fc2(x)

#log_sofmax

return x

sofmax

This function uses an alternative formulation to compute the output and gradient correctly

Cristian Angulo

TENSOR OF THE IMAGES

Cristian Angulo

class SIGNSDataset (Dataset):

def __init__(self, base_dir, split='train', transform=None):

path = os.path.join(base_dir, '{}_signs'.format(split))

files = os.listdir (path)

self.filenames = [ os.path.join(path,f) for f in files if

f.endswith('.jpg')]

self.targets = [int(f[0]) for f in files]

self.transform = transform

def __len__(self):

return len(self.filenames)

def __getitem__(self, idx):

image = Image.open(self.filenames[idx])

if self.transform:

image = self.transform(image)

return image, self.targets[idx]

SIGNSDataset

• It will create a tuple with two objects; the first object is the tensor of the image, and the second object is the attribute value of the image.

>input: route of the images >output: tensor

SIGNSDataset

Some parameters of this object are: >path where the files are. >split: training, validation, test >transform: ToTensor()

Cristian Angulo

f.endswith('.jpg')]

def __len__(self):

if self.transform:

SIGNSDataset

DataLoader is one of the utilities of Pytorch to work with datasets. DataLoader represents a Python iterable over a dataset

Cristian Angulo

f.endswith('.jpg')]

def __len__(self):

if self.transform:

trainset = SIGNSDataset (route, split= 'train', transform=

transforms.ToTensor())

dataloader = DataLoader(trainset, batch_size=32)

input3 = []

for inputs, targets in dataloader:

out = make_grid(inputs)

imshow(out)

print (targets)

input3 = inputs

break tensor([1, 2, 1, 1, 3, 2, 0, 2, 1, 4, 3, 5, 1, 4, 1, 4, 1, 5, 4, 2, 3, 4, 1, 2, 3, 3, 0, 1, 4, 0, 5, 5])

IMPLEMENTATION

Cristian Angulo

Loss Function

The different between the outputs and the targets.

for epoch in range (num_epochs):

print('Epoch {}/{}'.format(epoch+1, num_epochs))

print('-'*10)

running_loss = RunningMetric() #loss

running_acc = RunningMetric() #accuracy

inputs, targets = inputs.to(device), targets.to(device)

optimizer.zero_grad()

outputs = net(inputs)

_, preds = torch.max(outputs, 1)

loss = loss_fn(outputs, targets)

loss.backward() #gradient

optimizer.step() #update the weights with the gradient

batch_size = inputs.size()[0]

running_loss.update(loss.item()*batch_size, batch_size)

running_acc.update(torch.sum(preds == targets).float(), batch_size)

print("Loss: {:.4f} Acc: {:.4f}". format(running_loss(),running_acc()))

loss_fn = nn.NLLLoss()

IMPLEMENTATION

Cristian Angulo

Back propagation

During the backwards pass, gradients from each node are averaged

print('-'*10)

IMPLEMENTATION

Cristian Angulo

Update the weights

The optimizer updates the weights and uses some parameters such as: >learning rate and momentum

print('-'*10)

TRAIN THE NEURAL NETWORK

Cristian Angulo

1. Neural network: structure

2. Optimizer: update the weights

3. Epoch: training times

print('-'*10)

loss_fn = nn.NLLLoss()

optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

net.parameters()

num_epochs = 50

RESULTS

Cristian Angulo

print('-'*10)

Epoch: 100/100

Loss: 0.1346 Accuracy: 0.9630

Training time: 0:01:59.808481

CONCEPTS

1. Convolutional layer

2. Max Pooling

3. Fully connected layer

4. Activation function

5. Forward propagation

6. Back propagation

7. Weights and updates

8. Learning rate

9. Epochs

10. Loss function

11. Accuracy

12. Image processing

REFERENCE: Prof.Ms.M.N.Bansode, Prof.S.D.Jambhale Dr. S. K. Dixit, Neural Network based finger counting Technique, International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February-2014

Neural Network for counting by hand...

Documents

intuckeec.comintuckeec.com/noticedownload.php?view=Voters list - final 325.pdf · ajithkumar cv ajithkumar r ali a anand k ane-esh k b anil pm anil prasad.m anil v anilkumar anilkumar

Diagnostic Imaging Dataset Annual Statistical Release … · Diagnostic Imaging Dataset Annual Statistical ... Diagnostic Imaging Dataset Annual Statistical Release 2013 ... by GPs

ATOMIC PHYSICS Dr. Anilkumar R. Kopplkar Associate Professor S.S. Margol College, SHAHABAD Gulbarga,Karnatka- India

TABLE OF CASES REPORTED Abdul Wahab Anilkumar, C. V.

Dataset Shift

The highD Dataset: A Drone Dataset of Naturalistic Vehicle

Collecting and annotating the large continuous action dataset · uous action dataset (LCA). This dataset contains depictions of 24 action classes. The video for this dataset was ﬁlmed

R TUTORIAL LECTURE - chirayukong.github.iochirayukong.github.io/infsci2725/resources/R_TUTORIAL_LECTURE.pdf · Machine learning kaggle titanic ... Go to kaggle.com > Titanic: Machine

Coding ADO.Net DataSet Objects ISYS 512. DataSet Object A DataSet object can hold several tables and relationships between tables. A DataSet is a set

Mining and Utilizing Dataset Relevancy from Oceanographic ...commons.esipfed.org/sites/default/files/MUDROD_esip_v3.pdf · dataset relevancy from oceanographic dataset metadata, usage

Predictive modeling competitions - cas.confex.com · 20/10/2011 1 Anthony Goldbloom CEO, Kaggle e-mail anthony.goldbloom@kaggle.com twitter @antgoldbloom Predictive modeling competitions

SED2012 Dataset

intuckeec.com List - Full.pdf · ajithkumar cv ajithkumar r ali a amruthalal anand k aneesh k b pm anil prasad.m v anilkumar anilkumar k anilkumar p anilkumar s anilkumar s anilkumar

Spring 2017 Info Session - penndsg.compenndsg.com/slides/2017-01-26-info-session-slides.pdf · Recommended KaggleCompetitions • Titanic: Machine Learning from Disaster • kaggle.com/c/titanic

ANILKUMAR R SUROTIA-Presentation Recruitment Consultant

Anilkumar ao ra kptcl1 A Glimpse at Electricity Act 2003 - Anil Kumar K AO, Regulatory Affairs, KPTCL

Dynamic Tournament Design: An Application to … · Dynamic Tournament Design: An Application to Prediction Contests ... data from 17 prediction contests hosted by the website Kaggle.com

Attribute analysis with synthetic dataset for person re ...with VGG-19 network to perform dataset attribute analysis, consequently, a more suitable dataset (reliable dataset) is constructed

Supporting Information S1. The benchmark dataset consists ... · 1 Supporting Information S1. The benchmark dataset consists of a positive dataset and a negative dataset . The positive

SDSS Dataset and SkyServer Workloadsdatasys.cs.iit.edu/reports/2006_SDSS-DataSet-and-SkyServer-Workloads-v03.pdfSDSS Dataset and SkyServer Workloads Ioan Raicu Page 1 of 9 SDSS Dataset