semantic segmentation using cnn

Embed Size (px)

Citation preview

  • 7/24/2019 semantic segmentation using cnn

    1/4

    What is machine learning?

    Machine learning is a subfield of computer science that evolved from the study of patternrecognition and computational learning theory in artificial intelligence. In 1959, Arthur amuel

    defined machine learning as a !"ield of study that gives computers the ability to learn #ithout being

    e$plicitly programmed!. Machine learning e$plores the study and construction of algorithms thatcan learn from and ma%e predictions on data. uch algorithms operate by building a model from

    e$ample inputs in order to ma%e data&driven predictions or decisions.

    'ypes of machine learning?

    upervised learning( 'he computer is presented #ith e$ample inputs and their desired outputs,given by a !teacher!, and the goal is to learn a general rule that maps inputs to outputs.

    )nsupervised learning( *o labels are given to the learning algorithm, leaving it on its o#n to findstructure in its input. )nsupervised learning can be a goal in itself +discovering hidden patterns in

    data or a means to#ards an end +feature learning.

    Machine -earning Algorithms(

    Artificial neural net#or%s

    An artificial neural net#or% +A** learning algorithm, usually called !neural net#or%! +**, is a

    learning algorithm that is inspired by the structure and functional aspects of biological neuralnet#or%s. omputations are structured in terms of an interconnected group of artificial neurons,

    processing information using a connectionist approach to computation. Modern neural net#or%s arenon&linear statistical data modeling tools. 'hey are usually used to model comple$ relationships

    bet#een inputs and outputs, to find patterns in data, or to capture the statistical structure in an

    un%no#n /oint probability distribution bet#een observed variables.

    0o# do A**s #or%?

    An artificial neuron is an imitation of a human neuron

    An artificial neuron is an imitation of a human neuron

  • 7/24/2019 semantic segmentation using cnn

    2/4

    'he output is a function of the input, that is affected by the #eights, and the transfer

    functions.

    'hree types of layers( Input, 0idden, and utput(

  • 7/24/2019 semantic segmentation using cnn

    3/4

    Problem Statement:Semantic Segmentation Using convolutional neuralnetwork.

    "segmentation"is a partition of an image into several !coherent! parts, but withoutany attempt at

    understanding #hat these parts represents. ne of the most famous #or%s +but definitely not the

    first is hi and Mali% !*ormali2ed uts and Image egmentation! 3AMI 4. 'hese #or%s

    attempt to define !coherence! in terms of lo#&level cues such as color, te$ture and smoothness of

    boundary. 6ou can trace bac% these #or%s to the 7estalt theory.

    n the other have "semantic segmentation"attempts to partition the image into semantically

    meaningful parts, andto classify each part into one of pre&determined classes. 6ou can also achieve

    the same goal by classifying each pi$el +rather than the entire image8segment. In that case you are

    doing pi$elise classification, #hich leads to the same end result but in a slightly different path...

    o, I suppose you can say that !semantic segmentation!, !scene labeling! and !pi$el#ise

    classification! are basically trying to achieve the same goal( semantically understanding the role of

    each pi$el in the image.

    "or different methods of semantic segmentation follo# survey paper attached in mail(

    Convolutional Neural NetworkIn machine learning, a convolutional neural net#or% +**, or onv*et is a type of feed&for#ard

    artificial neural net#or% #here the individual neurons are tiled in such a #ay that they respond to

    overlapping regions in the visual field. onvolutional net#or%s #ere inspired by biological

    processes and are variations of multilayer perceptrons designed to use minimal amounts ofpreprocessing. 'hey have #ide applications in image and video recognition, recommender systemsand natural language processing.

    A multilayer perceptron+M-3 is a feedfor#ard artificial neural net#or% model that maps sets of

    input data onto a set of appropriate outputs. An M-3 consists of multiple layers of nodes in adirected graph, #ith each layer fully connected to the ne$t one. :$cept for the input nodes, each

    node is a neuron +or processing element #ith a nonlinear activation function. M-3 utili2es asupervised learning techni;ue called bac%propagation for training the net#or%. M-3 is a

    modification of the standard linear perceptron and can distinguish data that are not linearly

    separable.When used for image recognition, convolutional neural net#or%s +**s consist of multiple layers

    of small neuron collections #hich process portions of the input image, called receptive fields. 'he

    outputs of these collections are then tiled so that they overlap, to obtain a better representation of

    the original image< this is repeated for every such layer. 'iling allo#s **s to tolerate translation

    of the input image.

    onvolutional net#or%s may include local or global pooling layers, #hich combine the outputs of

    neuron clusters. 'hey also consist of various combinations of convolutional and fully connected

    layers, #ith point#ise nonlinearity applied at the end of or after each layer. 'o reduce the number of

    free parameters and improve generalisation, a convolution operation on small regions of input is

    introduced. ne ma/or advantage of convolutional net#or%s is the use of shared #eight inconvolutional layers, #hich means that the same filter +#eights ban% is used for each pi$el in the

    layer< this both reduces memory footprint and improves performance.

    http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdfhttps://en.wikipedia.org/wiki/Gestalt_psychologyhttps://en.wikipedia.org/wiki/Gestalt_psychologyhttp://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
  • 7/24/2019 semantic segmentation using cnn

    4/4

    ome time delay neural net#or%s also use a very similar architecture to convolutional neural

    net#or%s, especially those for image recognition and8or classification tas%s, since the tiling of

    neuron outputs can be done in timed stages, in a manner useful for analysis of images.

    ompared to other image classification algorithms, convolutional neural net#or%s use relatively

    little pre&processing. 'his means that the net#or% is responsible for learning the filters that in

    traditional algorithms #ere hand&engineered. 'he lac% of dependence on prior %no#ledge and

    human effort in designing features is a ma/or advantage for **s.

    Algorithm for emantic egmentation )sing convolutional neural net#or%(