Chenlei Guo Liming Zhang Image Processing 2010

A Novel Multiresolution Spatiotemporal Saliency Detection Model and Its Applications in Image and Video Compression


Outline
Introduction
Phase Spectrum of Quaternion Fourier Transform (PQFT)
Detect Proto-Objects in the Spatiotemporal Saliency Map
Hierarchical Selectivity (HS)
Experiment Result
Applications in Image and Video Coding
Conclusions and Discussions

Introduction
Most traditional object detectors need training
Graph-based visual saliency detection can be very powerful but it demands a very high computational cost
Most of the models only consider static images

Phase Spectrum of Quaternion Fourier Transform (PQFT) (1/3)
Locations with less periodicity or less homogeneity create pop-out proto-objects in the reconstruction of the image's phase spectrum
An early saliency detection model: PFT

Why phase? When the waveform is a positive or negative pulse, its phase-only reconstruction contains the largest spikes at the jump edges of the input pulse, because many varying sinusoidal components are located there. In contrast, when the input is a single sinusoidal component of constant frequency, there is no distinct spike in the reconstruction. The less periodic or less homogeneous a location is in comparison with the entire waveform, the more it pops out. The same rule applies to two-dimensional signals such as images. [12] pointed out that the amplitude spectrum specifies how much of each sinusoidal component is present, and the phase information specifies where each of the sinusoidal components resides within the image. Locations with less periodicity or less homogeneity in the vertical or horizontal orientation create the pop-out proto-objects in the reconstruction of the image, which indicates where the object candidates are located.
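The pulse-versus-sinusoid argument can be checked numerically. The sketch below is a minimal illustration in Python with NumPy, not code from the authors: it keeps only the phase of a 1-D Fourier spectrum, reconstructs the signal, and compares the two cases. The signal length, the pulse position, and the helper names phase_only_reconstruction and peak_to_mean are assumptions chosen for the example.

```python
import numpy as np

def phase_only_reconstruction(signal):
    """Rebuild a 1-D signal from its Fourier phase, with every amplitude set to 1."""
    spectrum = np.fft.fft(signal)
    magnitude = np.abs(spectrum)
    # Bins that are numerically empty carry no reliable phase, so they are dropped.
    keep = magnitude > 1e-9 * magnitude.max()
    unit_phase = np.zeros_like(spectrum)
    unit_phase[keep] = spectrum[keep] / magnitude[keep]
    return np.abs(np.fft.ifft(unit_phase))

def peak_to_mean(values):
    """Ratio of the largest value to the average value: a crude 'pop-out' score."""
    return float(values.max() / values.mean())

n = 256
t = np.arange(n)

# Case 1: a positive pulse whose jump edges sit at samples 100 and 139.
pulse = np.zeros(n)
pulse[100:140] = 1.0

# Case 2: a single sinusoid of constant frequency (8 full cycles).
sinusoid = np.sin(2 * np.pi * 8 * t / n)

pulse_rec = phase_only_reconstruction(pulse)
sine_rec = phase_only_reconstruction(sinusoid)

pulse_peak = int(np.argmax(pulse_rec))
print("pulse: peak at sample", pulse_peak, "peak/mean =", peak_to_mean(pulse_rec))
print("sinusoid: peak/mean =", peak_to_mean(sine_rec))
```

For the pulse, the largest value of the reconstruction should land at one of the two jump edges, while the reconstruction of the sinusoid stays a smooth rectified wave with no dominant spike.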

Early work => PFT model + 4 channels = PQFT

Quaternion Representation (2/3)
Define the input image captured at time t as F(t)
r(t), g(t), b(t) are the color channels of F(t)
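The formulas on this slide did not survive extraction. As a hedged sketch, the code below builds four channels of the kind a quaternion image q(t) = M(t) + RG(t)μ1 + BY(t)μ2 + I(t)μ3 would combine: a motion channel, two color-opponent channels, and an intensity channel. The channel definitions follow a common opponent-color formulation and are assumptions here, as are the function name quaternion_channels and the use of the immediately preceding frame for motion.

```python
import numpy as np

def quaternion_channels(frame, previous_frame):
    """Return (M, RG, BY, I) for RGB frames of shape (H, W, 3) with values in [0, 1].

    All formulas below are assumed definitions, not taken from the slides.
    """
    r, g, b = (frame[..., k].astype(float) for k in range(3))

    # Broadly tuned color channels.
    R = r - (g + b) / 2
    G = g - (r + b) / 2
    B = b - (r + g) / 2
    Y = (r + g) / 2 - np.abs(r - g) / 2 - b

    RG = R - G                    # red/green opponency
    BY = B - Y                    # blue/yellow opponency
    I = (r + g + b) / 3           # intensity

    # Motion: absolute intensity change with respect to the earlier frame.
    I_prev = previous_frame.astype(float).mean(axis=-1)
    M = np.abs(I - I_prev)
    return M, RG, BY, I

# A static, pure red frame: no motion, strong red/green response.
red = np.zeros((4, 4, 3))
red[..., 0] = 1.0
M, RG, BY, I = quaternion_channels(red, red)
```

Passing the same frame twice, as in the usage lines, gives a motion channel of zeros, which reduces the model to the static-image case.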

Calculate the Saliency Map By PQFT (3/3)

2-D Gaussian filter
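The calculation named on this slide can be sketched as follows, assuming the four channels (motion, two color-opponent channels, intensity) are already available as 2-D arrays. The sketch treats the quaternion image as two complex planes so that two ordinary 2-D FFTs stand in for the quaternion Fourier transform, sets the quaternion modulus to one so that only phase remains, transforms back, and smooths the squared magnitude with a 2-D Gaussian. The function names, the default sigma, and the small-modulus guard are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian smoothing (image sides must exceed the kernel length)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    smooth = lambda v: np.convolve(v, kernel, mode="same")
    first_pass = np.apply_along_axis(smooth, 0, image)
    return np.apply_along_axis(smooth, 1, first_pass)

def pqft_saliency(motion, rg, by, intensity, sigma=3.0):
    """Phase-only reconstruction of a four-channel (quaternion) image."""
    # Two complex planes: q = f1 + f2 * mu2, with mu1 played by the imaginary unit.
    f1 = motion + 1j * rg
    f2 = by + 1j * intensity
    F1, F2 = np.fft.fft2(f1), np.fft.fft2(f2)

    # Quaternion modulus per frequency bin; dividing by it leaves only phase.
    modulus = np.sqrt(np.abs(F1) ** 2 + np.abs(F2) ** 2)
    keep = modulus > 1e-9 * modulus.max()      # guard against empty bins
    scale = np.zeros_like(modulus)
    scale[keep] = 1.0 / modulus[keep]

    r1 = np.fft.ifft2(F1 * scale)
    r2 = np.fft.ifft2(F2 * scale)
    energy = np.abs(r1) ** 2 + np.abs(r2) ** 2
    return gaussian_blur(energy, sigma)

# Toy input: a small bright patch on a uniform background, intensity channel only.
size = 64
intensity = np.full((size, size), 0.5)
intensity[40:45, 20:25] = 1.0
zeros = np.zeros((size, size))

saliency = pqft_saliency(zeros, zeros, zeros, intensity)
peak_row, peak_col = np.unravel_index(np.argmax(saliency), saliency.shape)
print("most salient location:", int(peak_row), int(peak_col))
```

In the toy input the uniform background is perfectly homogeneous, so the saliency peak should fall on or next to the patch.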

Detect Proto-Objects (1/3)

Alpha (2/3)

Gamma (3/3)

How PQFT Selects Visual Resolution
PQFT simulates the human visual system (HVS)


Hierarchical Selectivity
Set hierarchical level

Experiment Results
Video Sequence
Natural Images
Psychological Patterns

Video Sequence (1/3)

Video Sequence (2/3)

Video Sequence (3/3)

Natural Image

Evaluation Method - ROC
True Positive Rate (TPR), False Positive Rate (FPR)
Receiver Operating Characteristic (ROC)
ROC curve = TPR plotted against FPR
ROC area = area beneath the ROC curve
The larger the ROC area, the better the predictive power of a saliency map.
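The ROC evaluation can be sketched in a few lines. The example below is an illustrative assumption about the procedure, not the authors' evaluation code: it thresholds a saliency map at evenly spaced levels, measures TPR on the target points (1) and FPR on the background points (0) of a binary map, and integrates the resulting curve with the trapezoid rule. The function name roc_area and the number of thresholds are hypothetical.

```python
import numpy as np

def roc_area(saliency_map, binary_map, num_thresholds=101):
    """Area under the ROC curve of a saliency map against a 0/1 ground-truth map.

    The binary map must contain both target (1) and background (0) points.
    """
    saliency = np.asarray(saliency_map, dtype=float).ravel()
    target = np.asarray(binary_map).ravel().astype(bool)

    # Sweep from the strictest threshold to the most permissive one.
    thresholds = np.linspace(saliency.max(), saliency.min(), num_thresholds)

    tpr, fpr = [0.0], [0.0]            # the curve starts at (FPR, TPR) = (0, 0)
    for threshold in thresholds:
        detected = saliency >= threshold
        tpr.append(detected[target].mean())     # share of target points detected
        fpr.append(detected[~target].mean())    # share of background points detected

    tpr, fpr = np.array(tpr), np.array(fpr)
    # Trapezoid rule over the (FPR, TPR) points.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Ground truth: a 3x3 target inside an 8x8 background.
truth = np.zeros((8, 8), dtype=int)
truth[2:5, 3:6] = 1

perfect_area = roc_area(truth.astype(float), truth)   # saliency equals the target
inverted_area = roc_area(1.0 - truth, truth)           # saliency marks only background
print(perfect_area, inverted_area)
```

A saliency map that matches the target exactly gives the largest possible area, and one that highlights only the background gives the smallest, which is the ordering the slide's last line relies on.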

SR => spectral residual (uses the spectral residual of the amplitude spectrum to calculate the image's saliency map)
PFT => phase spectrum of Fourier transform
STB => Saliency Toolbox (a set of models proposed to simulate the behavior of the eyes)
NVT => Neuromorphic Vision C++ Toolkit (a bottom-up model used to simulate human visual attention)

Evaluate saliency map

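Data set: target (1) and background (0) form a binary map, which is compared with the saliency map
TPR => percentage of target points that the saliency map marks as fixations
FPR => percentage of background points that the saliency map marks as fixations

Psychological Patterns (1/3)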

Psychological Patterns (2/3)

Psychological Patterns (3/3)

Applications in Image and Video Coding
Multiresolution Wavelet Domain Foveation Model (MWDF)
Evaluate the performance of the HS-MWDF model in image and video compression

Multiresolution Wavelet Domain Foveation Model (MWDF)
JPEG 2000 has included region-of-interest (RoI) coding in its drafts
A better way to find the RoI: use Hierarchical Selectivity

Multiresolution Wavelet Domain Foveation Model (MWDF)

The Performance of HS-MWDF in Image Compression
We use the HS-MWDF model as a front end before standard compression (JPEG 2000)
Set

n fov => we only use the first n OCAs found by PQFT
Auto fov => let the program itself decide the number of foveas

The Performance of HS-MWDF in Video Compression

Conclusion and Discussion
Extend the PFT model to the PQFT model
The PQFT model is independent of parameters and prior knowledge, and is fast enough to meet real-time requirements
Develop a model called HS-MWDF as a front end before the image/video encoder
Problems:
Can't deal with closure patterns well
Only considers bottom-up information
Insert the model into the image/video encoders

References
