Distributed Video Coding
Arko Barman
Computer Vision & Artificial Intelligence Lab
Department of Electrical Engineering
Indian Institute of Science, Bangalore
New paradigm for video compression
Based on results from Information Theory proposed in the 1970s
A radical departure from traditional video compression techniques
Well suited for applications which require many encoders and a single decoder
Introduction
Low-complexity, low-power encoder
Possibly higher complexity decoder
Should achieve coding efficiency similar to that of conventional video compression techniques
Should try to achieve the Rate-Distortion performance of conventional schemes
Goals
Number of encoders usually much higher than number of decoders (usually one)
Partially overlapping areas in multiple video sequences
Should exploit correlation between multiple encoded video sequences at the decoder
Low-complexity encoders required
Decoder may be of higher complexity
Applications
Wireless low-power surveillance networks
Applications
Wireless Mobile Video
Both encoder and decoder must be of low cost and low complexity
Encoder must be Wyner-Ziv for low complexity
Decoder must be MPEG-x or H.26x for low complexity
A base station receives the Wyner-Ziv encoded bitstream from the transmitter, decodes it, re-encodes it as MPEG-x or H.26x and transmits it to the receiver
Applications
Multi-view Acquisition
Neighbouring cameras of a large camera-array capture overlapping, and hence, correlated video sequences
Independent encoding of videos in individual cameras
Joint decoding at a central station – must exploit correlation between different views
Used for image-based rendering (3D reconstruction with texture-mapping)
Applications
Multi-view Acquisition
Applications like tracking a person throughout an environment, monitoring of activities, tracking events and creating alarms
Multiple sensors with video-acquisition capabilities – must be of low-cost, low-power and low-complexity
Central decoding device with high computational capabilities and storage
Applications
Video-based Sensor Networks
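Entropy: $H(X) = -\sum_{x} p(x) \log_2 p(x)$
where $p(x) = \Pr[X = x]$ is the probability mass function of X
Joint Entropy: $H(X,Y) = -\sum_{x} \sum_{y} p(x,y) \log_2 p(x,y)$
Conditional Entropy: $H(X|Y) = -\sum_{x} \sum_{y} p(x,y) \log_2 p(x|y)$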
Background
Information Theory Fundamentals
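The entropy quantities on this slide can be checked numerically. The following is a minimal sketch, assuming a joint probability table `joint[i][j] = P(X=i, Y=j)`; the function names and the example table are illustrative and not part of the original slides. Conditional entropies are obtained through the chain rule H(X|Y) = H(X,Y) - H(Y).

```python
import math

def entropy(pmf):
    """Shannon entropy in bits of a probability mass function (list of probabilities)."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def entropy_summary(joint):
    """joint[i][j] = P(X=i, Y=j). Returns the entropies used on the slide, in bits."""
    px = [sum(row) for row in joint]            # marginal of X
    py = [sum(col) for col in zip(*joint)]      # marginal of Y
    h_xy = entropy([p for row in joint for p in row])
    h_x, h_y = entropy(px), entropy(py)
    return {
        "H(X)": h_x,
        "H(Y)": h_y,
        "H(X,Y)": h_xy,
        "H(X|Y)": h_xy - h_y,   # chain rule
        "H(Y|X)": h_xy - h_x,
    }

# Hypothetical example: two binary sources that agree 90% of the time.
example = [[0.45, 0.05],
           [0.05, 0.45]]
info = entropy_summary(example)
for name, value in info.items():
    print(f"{name} = {value:.4f} bits")
```

For strongly correlated sources such as this one, H(X|Y) is much smaller than H(X), which is the gap that distributed coding tries to exploit.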
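Lower bound on the bitrate of signals encoded and decoded separately: $R_X \ge H(X)$, $R_Y \ge H(Y)$
Lower bound on total bitrate with joint encoding and decoding: $R = R_X + R_Y \ge H(X,Y)$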
Background
Information Theory Fundamentals
Background
Slepian-Wolf Theorem
Consider two statistically dependent sequences X and Y separately encoded but jointly decoded
Is it possible to recover these dependent sequences with arbitrarily low reconstruction error probability?
In 1973, Slepian and Wolf determined possible rate combinations of RX and RY for reconstruction of X and Y with an arbitrarily small error probability
These bounds are given by the conditional entropies of the signals X and Y, and their joint entropy
Background
Slepian-Wolf Theorem
The bounds on the rates are determined to be $R_X \ge H(X|Y)$, $R_Y \ge H(Y|X)$ and $R_X + R_Y \ge H(X,Y)$
Background
Slepian-Wolf Theorem
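As a small illustration of the Slepian-Wolf rate region, the sketch below tests whether a rate pair satisfies the three conditions of the theorem: each rate at least the corresponding conditional entropy, and the sum at least the joint entropy. The function name `slepian_wolf_admissible` and the example joint distribution are assumptions made for this example.

```python
import math

def entropy(pmf):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def slepian_wolf_admissible(rate_x, rate_y, joint):
    """True if (rate_x, rate_y) lies in the Slepian-Wolf region for joint[i][j] = P(X=i, Y=j)."""
    h_x = entropy([sum(row) for row in joint])
    h_y = entropy([sum(col) for col in zip(*joint)])
    h_xy = entropy([p for row in joint for p in row])
    h_x_given_y = h_xy - h_y
    h_y_given_x = h_xy - h_x
    return (rate_x >= h_x_given_y
            and rate_y >= h_y_given_x
            and rate_x + rate_y >= h_xy)

# Hypothetical binary sources that agree 90% of the time (H(X|Y) is about 0.469 bits).
example = [[0.45, 0.05],
           [0.05, 0.45]]
print(slepian_wolf_admissible(0.5, 1.0, example))   # X sent well below H(X) = 1 bit
print(slepian_wolf_admissible(0.7, 0.7, example))   # sum rate below H(X,Y)
```

The first pair is admissible even though X is coded at half a bit per symbol, because Y is available at the joint decoder; the second pair fails only the sum-rate condition.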
Even when correlated sources are encoded independently, a total bitrate equal to the joint entropy is enough
Theoretically, separate encoding in distributed video coding schemes need not incur any loss in compression efficiency compared to conventional video coding techniques
Defines an achievable rate region for reconstruction of dependent sequences with arbitrarily small probability of error
Background
Slepian-Wolf Theorem
Background
Wyner-Ziv Theorem
In 1976, Wyner and Ziv studied a special case of Slepian-Wolf coding corresponding to the rate point $(R_X, R_Y) = (H(X|Y), H(Y))$
Deals with source coding of a sequence X considering the sequence Y (known as side information) to be available at the decoder
Known as lossy compression with decoder side information
Background
Wyner-Ziv Theorem
Source values X encoded without access to side information Y
Decoder has access to Y and obtains a reconstruction of the source values
Some distortion D between the source and its reconstruction is acceptable
The Wyner-Ziv rate-distortion function is the achievable lower bound on the bitrate for a distortion D
Background
Wyner-Ziv Theorem
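Mathematically, $R^{WZ}_{X|Y}(D) \ge R_{X|Y}(D)$ for $D \ge 0$
where $R^{WZ}_{X|Y}(D)$ is the Wyner-Ziv rate-distortion function (Y available at the decoder only) and $R_{X|Y}(D)$ is the minimum rate necessary to encode X when Y is also available at the encoder, i.e. the statistical dependency between X and Y is utilized while encoding X, for an average distortion D.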
Background
Wyner-Ziv Theorem
Note that for no distortion, i.e. D = 0, we get the same result as the Slepian-Wolf Theorem, i.e. $R^{WZ}_{X|Y}(0) = R_{X|Y}(0) = H(X|Y)$
The inequality of the Wyner-Ziv Theorem reduces to equality for Gaussian memoryless sources and a mean-squared error distortion function, i.e. $R^{WZ}_{X|Y}(D) = R_{X|Y}(D)$
Background
Wyner-Ziv Theorem
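For the Gaussian case with mean-squared error distortion, the rate has a closed form. The sketch below assumes jointly Gaussian X and Y with correlation coefficient `rho`, so that the conditional variance of X given Y is `var_x * (1 - rho**2)` and the rate is `max(0, 0.5 * log2(cond_var / D))`. The parameter names are illustrative choices, not taken from the slides.

```python
import math

def gaussian_rate(variance, distortion):
    """Rate-distortion function of a Gaussian source under MSE, in bits/sample."""
    if distortion >= variance:
        return 0.0
    return 0.5 * math.log2(variance / distortion)

def wyner_ziv_rate_gaussian(var_x, rho, distortion):
    """Rate needed for X at MSE `distortion` when a jointly Gaussian Y
    (correlation coefficient rho) is available as side information."""
    cond_var = var_x * (1.0 - rho ** 2)   # variance of X given Y
    return gaussian_rate(cond_var, distortion)

# Hypothetical numbers: unit-variance source, strongly correlated side information.
without_side_info = gaussian_rate(1.0, 0.05)
with_side_info = wyner_ziv_rate_gaussian(1.0, 0.9, 0.05)
print(f"no side information:   {without_side_info:.3f} bits/sample")
print(f"with side information: {with_side_info:.3f} bits/sample")
```

Because the Wyner-Ziv inequality is tight in this case, the same rate applies whether or not the encoder also sees Y.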
In 1996, Zamir proved that for general statistics and a mean-squared error distortion function, the rate loss is at most 0.5 bits/sample, i.e. $R^{WZ}_{X|Y}(D) - R_{X|Y}(D) \le 0.5$
Combining with the Wyner-Ziv Theorem, we have $0 \le R^{WZ}_{X|Y}(D) - R_{X|Y}(D) \le 0.5$
Background
Wyner-Ziv Theorem
The term ‘distributed’ refers to the encoding operation mode and not location
Coding of two or more dependent sources in an independent way i.e. associating a separate independent encoder to each source
Independent bitstream sent from each encoder – signals are encoded without exploiting the correlation between them
A single decoder performs joint decoding of all received bitstreams using statistical dependencies between them
Distributed Source Coding
Basic Architecture
Pixel-domain Codec
Quantizer divides signal space into cells
A cell may consist of non-contiguous sub-cells mapped into the same quantizer index Q
Practical implementations of the Lloyd Algorithm for optimal vector quantizers either perform poorly or are prohibitively complex
Unfortunately, requiring contiguous code cells precludes optimality of quantizers in general
Basic Architecture
Quantization & Dequantization
Introduction of a rate measure that depends on both quantization index and side information divorces dimensionality of the quantizer from block length of Slepian-Wolf coder – fundamental requirement for practical system design
At high rates and certain other conditions, lattice quantizers are optimal for Wyner-Ziv Coding
Disconnected quantization cells need not be mapped into the same index
Asymptotically, there is no performance loss by not having access to the side information at the encoder
Basic Architecture
Quantization & Dequantization
Unconventional video coding system
Encodes individual frames independently, but decodes them conditionally
Only intra-frame processing required at encoder
Inter-frame processing only at decoder
Basic Architecture
Slepian-Wolf Encoder & Decoder
Previously decoded frames used as side information for decoding a Wyner-Ziv coded frame
Performance closer to conventional inter-frame coding (MPEG) than conventional intra-frame coding (Motion-JPEG)
Encoding may be in pixel domain or transform domain
Basic Architecture
Slepian-Wolf Encoder & Decoder
Slepian-Wolf codec can be implemented using any of the following:
DISCUS (DIstributed Source Coding Using Syndromes)
Turbo codes, like RCPT (Rate-Compatible Punctured Turbo code)
LDPC (Low-Density Parity-Check) codes
IRA (Irregular-Repeat-Accumulate) codes
Basic Architecture
Slepian-Wolf Encoder & Decoder
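The syndrome idea behind DISCUS-style coding can be illustrated with a well-known toy example, which is an assumption added here and not taken from the slides: X and Y are 3-bit words that differ in at most one position. The encoder sends only the 2-bit syndrome of X with respect to the repetition code {000, 111}; the decoder picks the word in the indicated coset that is closest to the side information Y.

```python
from itertools import product

def syndrome(word):
    """2-bit syndrome of a 3-bit word for the repetition code {000, 111}."""
    return (word[0] ^ word[1], word[1] ^ word[2])

def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))

def decode(received_syndrome, side_info):
    """Pick the word with the received syndrome that is closest to the side information."""
    coset = [w for w in product((0, 1), repeat=3) if syndrome(w) == received_syndrome]
    return min(coset, key=lambda w: hamming_distance(w, side_info))

x = (1, 0, 1)            # source word, seen only by the encoder
y = (1, 1, 1)            # side information at the decoder, differs from x in one bit
s = syndrome(x)          # only these 2 bits are transmitted instead of 3
print(s, decode(s, y))
```

The two words in each coset are at Hamming distance 3 from each other, so side information within distance 1 of X always identifies X uniquely.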
Encoding Techniques
Pixel-domain Codec using RCPT
A subset of frames, regularly spaced in the video sequence, selected as keyframes, K
Keyframes are encoded and decoded using conventional intraframe coding based on the 8x8 Discrete Cosine Transform (8x8 DCT)
Frames between keyframes are called “Wyner-Ziv frames”
Wyner-Ziv frames are intraframe-encoded but interframe-decoded
Encoding Techniques
Pixel-domain Codec
For each Wyner-Ziv frame, S, each pixel value is uniformly quantized with $2^M$ intervals
Subtractive dithering done to avoid contouring and improve subjective quality of reconstructed image
Sufficiently large block of quantizer indices q provided to Slepian-Wolf encoder
Encoding Techniques
Pixel-domain Codec
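The quantization step with subtractive dithering can be sketched as follows. This is a simplified illustration under stated assumptions: 8-bit pixels, a hypothetical parameter `m` giving `2**m` uniform intervals, and a pseudo-random dither that encoder and decoder regenerate from a shared seed. The mid-point `dequantize` is shown only to make the dither subtraction visible; in the codec described here the decoder reconstructs with the help of side information instead.

```python
import random

def make_dither(count, step, seed):
    """Pseudo-random dither, reproducible at the decoder from the same seed."""
    rng = random.Random(seed)
    return [rng.uniform(-step / 2, step / 2) for _ in range(count)]

def quantize(pixels, m, dither):
    """Uniform quantization of 8-bit pixels into 2**m intervals after adding dither."""
    levels = 2 ** m
    step = 256 / levels
    return [min(levels - 1, max(0, int((p + d) // step)))
            for p, d in zip(pixels, dither)]

def dequantize(indices, m, dither):
    """Mid-point reconstruction followed by subtraction of the same dither."""
    step = 256 / 2 ** m
    return [(q + 0.5) * step - d for q, d in zip(indices, dither)]

pixels = [64, 100, 128, 150, 191]
m = 2                                   # 4 intervals of width 64
dither = make_dither(len(pixels), 256 / 2 ** m, seed=1)
indices = quantize(pixels, m, dither)
print(indices, dequantize(indices, m, dither))
```

Away from the ends of the pixel range, the reconstruction error stays within half a quantization step, while the dither decorrelates the error from the image content and so reduces contouring.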
RCPT provides rate-flexibility
Rate adapts to changing statistics between side information and frame to be encoded
In this system, rate of RCPT is chosen by decoder and relayed to encoder through feedback
For each Wyner-Ziv frame, decoder generates side information, $\hat{S}$, by using previously decoded keyframes, and possibly previously decoded Wyner-Ziv frames
Encoding Techniques
Pixel-domain Codec using RCPT
To exploit side information, decoder assumes a statistical model of the ‘correlation channel’
Laplacian distribution of the difference between individual pixel values of S and $\hat{S}$ is assumed
Decoder estimates parameter of Laplacian distribution by observing the statistics from previously decoded frames
Encoding Techniques
Pixel-domain Codec using RCPT
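One way the Laplacian parameter could be estimated is sketched below, assuming the density f(d) = (alpha / 2) * exp(-alpha * |d|) for the difference d between a frame and its side information. The maximum-likelihood estimate is then alpha = 1 / mean(|d|). The function names and the sample values are hypothetical; the slides do not specify the estimator used.

```python
import math

def estimate_laplacian_alpha(decoded_frame, side_info):
    """Maximum-likelihood estimate of alpha from a previously decoded frame
    and the side information that was generated for it (flat pixel lists)."""
    residuals = [abs(s - y) for s, y in zip(decoded_frame, side_info)]
    mean_abs = sum(residuals) / len(residuals)
    if mean_abs == 0:
        raise ValueError("side information is identical to the frame; alpha is unbounded")
    return 1.0 / mean_abs

def laplacian_pdf(difference, alpha):
    """Density of the assumed correlation channel at a given pixel difference."""
    return 0.5 * alpha * math.exp(-alpha * abs(difference))

decoded = [10, 12, 14, 16]      # hypothetical pixel values
side = [11, 12, 12, 17]
alpha = estimate_laplacian_alpha(decoded, side)
print(alpha, laplacian_pdf(0, alpha), laplacian_pdf(3, alpha))
```

A larger alpha means the side information is more reliable, so the decoder can weight it more heavily and should need fewer parity bits.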
Turbo decoder combines side information and received parity bits to recover symbol stream
If decoder cannot reliably decode original symbols, it requests additional parity bits from encoder buffer through feedback
This “request-and-decode” process is repeated until an acceptable probability of symbol reconstruction error is achieved
Encoding Techniques
Pixel-domain Codec using RCPT
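The request-and-decode loop can be illustrated with a toy stand-in. The sketch below does not implement a turbo decoder: it assumes 7-bit blocks, uses the three parity checks of a Hamming(7,4) code as the bits held in the encoder buffer, and replaces turbo decoding by a brute-force search over words close to the side information. "Exactly one consistent candidate" plays the role of an acceptable error probability. All names are hypothetical.

```python
from itertools import combinations

N = 7  # block length of the toy example

# Parity check k covers position i when bit k of (i + 1) is set (Hamming(7,4) checks).
PARITY_MASKS = [[((i + 1) >> k) & 1 for i in range(N)] for k in range(3)]

def parity(bits, mask):
    return sum(b & m for b, m in zip(bits, mask)) % 2

def encode_to_buffer(source_bits):
    """Encoder side: compute all parity bits once and keep them in a buffer."""
    return [parity(source_bits, mask) for mask in PARITY_MASKS]

def candidates(side_info, max_errors):
    """All words within max_errors bit flips of the side information."""
    for weight in range(max_errors + 1):
        for positions in combinations(range(N), weight):
            word = list(side_info)
            for p in positions:
                word[p] ^= 1
            yield word

def request_and_decode(encoder_buffer, side_info, max_errors):
    """Request parity bits one at a time until a single candidate remains."""
    received = []
    while True:
        consistent = [w for w in candidates(side_info, max_errors)
                      if all(parity(w, PARITY_MASKS[k]) == bit
                             for k, bit in enumerate(received))]
        if len(consistent) == 1:
            return consistent[0], len(received)
        if len(received) == len(encoder_buffer):
            return None, len(received)                   # buffer exhausted, decoding failed
        received.append(encoder_buffer[len(received)])   # feedback request for one more bit

x = [1, 0, 1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 0, 0, 1]        # side information with one bit in error
decoded, bits_used = request_and_decode(encode_to_buffer(x), y, max_errors=1)
print(decoded, bits_used)
```

In this toy setting the number of requested bits depends on how unreliable the decoder assumes the side information to be, which mirrors the rate adaptation that the feedback channel provides in the real system.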
Using side information, decoder predicts the quantization bin q
For this, decoder needs to request $k \le M$ bits to establish which of the $2^M$ quantizer bins a pixel belongs to
With calculated values of $q'$ and $\hat{S}$, decoder calculates the MMSE reconstruction $S' = E[S \mid q', \hat{S}]$ of the original frame, S
Encoding Techniques
Pixel-domain Codec using RCPT
If the side information $\hat{S}$ lies within the reconstructed bin $q'$, then the reconstructed pixel $S'$ takes a value close to the side information value
Otherwise, $\hat{S}$ is outside the bin and the reconstruction function forces $S'$ to lie within the bin
Magnitude of reconstruction error limited to a maximum value determined by quantizer coarseness – perceptually desirable property since it eliminates large errors, which might be annoying to the viewer
Encoding Techniques
Pixel-domain Codec using RCPT
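The behaviour of the reconstruction function can be approximated by simple clamping. The sketch below is a simplification and not the exact MMSE estimate: it assumes 8-bit pixels and `2**m` uniform bins (hypothetical parameter `m`), keeps the side information when it falls inside the decoded bin, and otherwise moves it to the nearest bin boundary.

```python
def reconstruct_pixel(bin_index, side_info, m):
    """Clamp the side information to the decoded quantization bin.

    Simplified stand-in for the conditional-expectation reconstruction:
    the result always lies in the closed interval covered by the bin.
    """
    step = 256 / 2 ** m
    low = bin_index * step
    high = (bin_index + 1) * step
    return min(max(side_info, low), high)

m = 2                                   # 4 bins of width 64
print(reconstruct_pixel(1, 100, m))     # side information inside bin [64, 128]
print(reconstruct_pixel(1, 30, m))      # side information below the bin
print(reconstruct_pixel(1, 200, m))     # side information above the bin
```

Whatever the side information is, the result differs from any pixel in the decoded bin by at most one bin width, which is the bounded-error property described on the slide.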
Compared to conventional motion-compensated coding, pixel-domain WZ coding is much less complex
Motion estimation, prediction and DCT not required for encoding of WZ frames
Slepian-Wolf encoder requires two feedback shift registers and an interleaver
Encoding Techniques
Pixel-domain Codec using RCPT
Encoding Techniques
Transform-domain Codec
Block-wise DCT is applied to WZ frame W in the encoder to generate transformed signal X
Transform coefficients are grouped together to form coefficient bands $X_k$, where k denotes the coefficient number
Each transform coefficient band is then encoded independently
Encoding Techniques
Transform-domain Encoding using RCPT
For each $X_k$, coefficients are quantized using a uniform scalar quantizer with $2^{M_k}$ levels
Quantized symbols, $q_k$, are converted to fixed-length binary codewords
Corresponding bitplanes are blocked together forming bit-plane vectors
Each bit-plane vector coded by Slepian-Wolf encoder
Encoding Techniques
Transform-domain Encoding using RCPT
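The bit-plane step can be sketched as follows, assuming the quantized symbols of one coefficient band are non-negative integers that fit in a hypothetical number of bits `m_k`. Plane 0 holds the most significant bit of every symbol, and the inverse function shows how a decoder would regroup decoded planes into symbols.

```python
def to_bitplanes(symbols, m_k):
    """Split fixed-length m_k-bit symbols into bit-plane vectors, most significant first."""
    return [[(q >> bit) & 1 for q in symbols] for bit in range(m_k - 1, -1, -1)]

def from_bitplanes(planes):
    """Regroup bit-plane vectors (most significant first) into quantized symbols."""
    symbols = [0] * len(planes[0])
    for plane in planes:
        symbols = [(s << 1) | bit for s, bit in zip(symbols, plane)]
    return symbols

band_symbols = [5, 0, 3, 7]             # hypothetical quantized symbols, 3 bits each
planes = to_bitplanes(band_symbols, 3)
for index, plane in enumerate(planes):
    print(f"bit-plane {index}: {plane}")
print(from_bitplanes(planes))
```

Each bit-plane vector would then be passed separately to the Slepian-Wolf encoder.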
Slepian-Wolf coder is implemented using RCPT
RCPT, combined with feedback, provides rate flexibility which is essential in adapting to changing statistics between side information and frame to be encoded
Parity bits produced by turbo encoder are stored in a buffer
Buffer transmits a subset of these parity bits to decoder on request
Encoding Techniques
Transform-domain Encoding using RCPT
Decoder takes previously reconstructed frames to form side information $\hat{W}$, an estimate of W
Block-wise DCT of $\hat{W}$ is taken to generate $\hat{X}$
Transform coefficients from $\hat{X}$ are grouped together to form coefficient bands $\hat{X}_k$ (side information corresponding to $X_k$)
To be able to use $\hat{X}_k$ at the turbo decoder and reconstruction block, a statistical dependency model is assumed between $X_k$ and $\hat{X}_k$
Encoding Techniques
Transform-domain Decoding using RCPT
Given a coefficient band, the turbo decoder successively decodes bit-planes starting from most significant bit-plane
Decoder uses received subset of parity bits corresponding to that bit-plane and side-information to decode current bit-plane
If decoder cannot reliably decode the bits, it requests additional parity bits from the encoder buffer through feedback
Encoding Techniques
Transform-domain Decoding using RCPT
This “request-and-decode” process continues until an acceptable probability of reconstruction error is achieved
Probabilities generated for current bit-plane are used for decoding lower significance bit-planes
By using side information and successively decoding bitplanes, decoder needs to request $R_k \le M_k$ bits to decide which of the $2^{M_k}$ bins a transform coefficient belongs to
Encoding Techniques
Transform-domain Decoding using RCPT
When all bitplanes are decoded, bits are regrouped and the quantized symbol stream is reconstructed as $q'_k$
Reconstructed coefficient band is calculated as $X'_k = E[X_k \mid q'_k, \hat{X}_k]$
Assuming $q'_k$ is error free, this reconstruction function bounds the magnitude of reconstruction distortion to a maximum value depending on quantizer coarseness
Encoding Techniques
Transform-domain Decoding using RCPT
This property is desirable since it eliminates large positive or negative errors for a given transform coefficient
Fewer errors are perceptible to the viewer and subjective quality of reconstructed video is improved
Finally reconstructed WZ frame is generated by taking IDCT of the reconstructed coefficient bands
Encoding Techniques
Transform-domain Decoding using RCPT
Motion compensated side information is generated at the decoder
As a result, decoders are more complex than encoders
Here we consider every odd frame to be a keyframe and every even frame to be a WZ frame
Frames may or may not be decoded in their actual sequence (similar to conventional video coding techniques)
Side Information
Motion-Compensated Side Information
Side information for a WZ frame at time index t is generated by motion-compensated interpolation using decoded keyframes at time $t-1$ and $t+1$
Involves symmetrical bi-directional block matching, smoothness constraints for estimated motion and overlapped block motion compensation
Since next keyframe is needed for interpolation, frames are decoded out-of-order (similar to B frames in predictive video coding)
Side Information
Motion-compensated Interpolation (MC-I)
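A stripped-down version of motion-compensated interpolation is sketched below. It keeps only the symmetric bidirectional block matching: for each block of the frame to be interpolated, a motion vector is chosen so that the block displaced backwards in the previous keyframe best matches the block displaced forwards in the next keyframe, and the two are averaged. Smoothness constraints and overlapped block motion compensation are omitted, frames are plain lists of rows whose dimensions are assumed to be multiples of the block size, and all names are hypothetical.

```python
def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(block_a, block_b) for p, q in zip(ra, rb))

def mc_interpolate(prev_key, next_key, size=4, search=1):
    height, width = len(prev_key), len(prev_key[0])
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, size):
        for left in range(0, width, size):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    pt, pl = top - dy, left - dx     # block position in previous keyframe
                    nt, nl = top + dy, left + dx     # mirrored position in next keyframe
                    if (min(pt, pl, nt, nl) < 0
                            or max(pt, nt) + size > height
                            or max(pl, nl) + size > width):
                        continue                     # candidate leaves the frame
                    a = get_block(prev_key, pt, pl, size)
                    b = get_block(next_key, nt, nl, size)
                    cost = sad(a, b)
                    if best is None or cost < best[0]:
                        best = (cost, a, b)
            _, a, b = best                           # zero motion is always a valid candidate
            for i in range(size):
                for j in range(size):
                    out[top + i][left + j] = (a[i][j] + b[i][j]) / 2
    return out

def frame_with_square(left_col):
    """12x12 test frame with a bright 2x2 square in rows 5-6."""
    frame = [[0] * 12 for _ in range(12)]
    for r in (5, 6):
        for c in (left_col, left_col + 1):
            frame[r][c] = 255
    return frame

side_info = mc_interpolate(frame_with_square(4), frame_with_square(6))
```

In this synthetic example the square moves two pixels to the right between the keyframes, and the interpolated frame places it halfway, whereas plain averaging of the two keyframes would produce two half-intensity copies.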
To generate side information for the WZ frame at time index t, we estimate motion between the previously decoded WZ frame at time $t-2$ and the previously decoded keyframe at time $t-1$ using block matching and a smoothness constraint
Estimated motion is extrapolated to time t and side information is generated by performing overlapped motion compensation using pixel values from previous key frame
Side Information
Motion-compensated Extrapolation (MC-E)
Since a previously decoded WZ frame is used for motion estimation, reconstruction errors from all the previously decoded WZ frames can accumulate and degrade the reliability of motion compensation
Unlike MC-I, all frames can be decoded sequentially
Side Information
Motion-compensated Extrapolation (MC-E)
Simplified interpolation or extrapolation scheme to reduce decoder complexity at the expense of reduced compression efficiency
1. Average Interpolation (Ave-I): Side information for WZ frame is generated by averaging pixel values from keyframes at $t-1$ and $t+1$
2. Previous Frame Extrapolation (Prev-E): Previous keyframe is used directly as side information
Side Information
Low-complexity side information
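The two low-complexity schemes reduce to a few lines each. In this sketch frames are assumed to be lists of rows of pixel values, and the function names are illustrative.

```python
def average_interpolation(prev_key, next_key):
    """Ave-I: pixel-wise average of the two neighbouring keyframes."""
    return [[(a + b) / 2 for a, b in zip(row_prev, row_next)]
            for row_prev, row_next in zip(prev_key, next_key)]

def previous_frame_extrapolation(prev_key):
    """Prev-E: the previous keyframe is used directly as side information."""
    return [row[:] for row in prev_key]

prev_key = [[10, 20], [30, 40]]     # hypothetical 2x2 keyframes
next_key = [[20, 20], [50, 40]]
print(average_interpolation(prev_key, next_key))
print(previous_frame_extrapolation(prev_key))
```

Neither scheme performs motion estimation, which is why the decoder is cheaper but the side information is less accurate for moving content.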
Performance
Thank You