ACKNOWLEDGEMENT
I am grateful to God Almighty for giving me the courage and strength
to complete my seminar successfully. I am thankful to our beloved principal
Prof. Shahir V K and our respected Head of the Department of Computer
Science and Engineering Mr. Gireesh T K, for their parental guidance and
support.
I would like to thank our seminar coordinators Ms. Janitha Krishnan
and Ms. Greeshma K for their innovative suggestions and assistance in
times of need. I gratefully acknowledge the excellent and incessant help given
by our faculty and my guide Mohammed Jaseem N, Assistant Professor,
Department of Computer Science & Engineering, in carrying out this work. I am
thankful for his valuable guidance and enduring encouragement throughout this
study.
I also remember with thanks the timely help and constant
encouragement offered by the other faculty members of AWH Engineering
College, and by my friends and parents. I express my sense of gratitude to the
Department of Computer Science & Engineering, AWH Engineering College, for
providing me with the facilities to complete my work.
DEEPA JOHNY
ABSTRACT
The first scheme, “Model of Saliency-Based Visual Attention”, presents a
visual attention system inspired by the behavior and the neuronal architecture
of the early primate visual system. Multiscale image features are
combined into a single topographical saliency map. A dynamical neural
network then selects attended locations in order of decreasing saliency. The
system breaks down the complex problem of scene understanding by rapidly
selecting, in a computationally efficient manner, conspicuous locations to be
analyzed in detail. The second scheme, “Bilayer Segmentation of Webcam
Videos”, presents an automatic segmentation algorithm for video frames
captured by a (monocular) webcam that closely approximates depth
segmentation from a stereo camera. The frames are segmented into foreground
and background layers that comprise a subject (participant) and other objects
and individuals. The algorithm produces correct segmentations even in the
presence of large background motion with a nearly stationary foreground. The
last scheme, “Exploring Visual and Motion Saliency for Automatic Video
Object Extraction”, presents a saliency-based video object extraction (VOE)
framework. The framework aims to automatically extract foreground objects of
interest without any user interaction or the use of any training data (i.e., not
limited to any particular type of object).
CONTENTS
1. INTRODUCTION 1
2. LITERATURE SURVEY 10
2.1 MODEL OF SALIENCY-BASED VISUAL ATTENTION
SYSTEM 12
2.1.1 Extraction of Early Visual Features 14
2.1.2 The Saliency Map 15
2.2 BILAYER SEGMENTATION OF WEBCAM VIDEOS 18
2.2.1 Notation 20
2.2.2 Motons 20
2.2.3 Shape Filters 22
2.2.4 The Tree-Cube Taxonomy 23
2.2.5 Random Forests vs Booster of Trees vs Ensemble of
Boosters 25
2.2.6 Layer Segmentation 27
3. EXPLORING VISUAL AND MOTION SALIENCY FOR
AUTOMATIC VIDEO OBJECT EXTRACTION 28
3.1 AUTOMATIC OBJECT MODELING AND EXTRACTION 29
3.1.1 Determination of Visual Saliency 29
3.1.2 Extraction of Motion-Induced Cues 30
3.2 CONDITIONAL RANDOM FIELD FOR VOE 33
3.2.1 Feature Fusion via CRF 34
3.2.2 Preserving Spatio-Temporal Consistency 35
4. COMPARISON 38
5. CONCLUSION 41
REFERENCES 42
GLOSSARY 43
LIST OF FIGURES
2.1 Motons 21
2.2 Shape Filters 22
2.3 The tree-cube taxonomy 25
LIST OF ABBREVIATIONS
1. FOA : Focus of Attention
2. SM : Saliency Map
3. WTA : Winner-Take-All neural network
4. DoG : Difference of Gaussians
5. EM : Expectation Maximization
6. LLR : Log Likelihood Ratio
7. ARC : Adaptive Reweighting and Combining
8. RF : Random Forests
9. BT : Booster of Trees
10. EB : Ensemble of Boosters
11. GB : Gentle Boost
12. CRF : Conditional Random Field
13. EBT : Ensemble of Booster Trees
14. VOE : Video Object Extraction
15. HOG : Histogram of Oriented Gradients
16. GMM : Gaussian Mixture Models