Mmc2k13 Parti 2


  • 7/29/2019 Mmc2k13 Parti 2

    1/88

    Video Coding and Evaluation

    Dr. DoTrong Tuan

    Department Communication Engineering

    School of Electronics and Telecommunications - SET@HUST


    MULTIMEDIA

    Exploring multimedia components (text, images, animation, sound, video)

    COMMUNICATIONS

    Effectively communicate

    Clear

    Exact

    Professional

    Make an impression

    What's this course all about?

    Video Coding/Compression Techniques

    Real-time Multimedia over IP Networks

    Multimedia over IP Networks QoS Monitoring and Evaluation


    Agenda

    Coding process

    Video coding standards

    Quality evaluation

    Open issues


    Introduction

    Why is video compression important?

    One movie video without compression

    720 x 480 pixels per frame

    30 frames per second

    Total 90 minutes

    32-bit color

    The full data quantity = 223.9488 GB!
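The figure above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `raw_video_bytes` is only illustrative, and GB is taken as 10^9 bytes, which is the convention that reproduces the slide's number.

```python
def raw_video_bytes(width, height, fps, minutes, bits_per_pixel):
    """Size in bytes of an uncompressed clip (no audio, no container overhead)."""
    frames = fps * minutes * 60
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * frames

size = raw_video_bytes(720, 480, 30, 90, 32)
print(size / 1e9, "GB")  # decimal gigabytes (1 GB = 10**9 bytes)
```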


    Introduction

    Video Formats and uncompressed Data Rates


    Motivation for Video Coding

    Compression for storage

    DVD, Blu-ray, CD

    Camera (flash, memory stick, hard disc, tape, ...)

    Hard disc

    Mobile phone (UMTS, 3GPP), PDA

    Compression for transmission

    DVB, satellite-TV, cable-TV

    Internet

    Video-on-demand

    Mobile phone (UMTS, 3GPP), video phone

    Digital cinema, 2K or 4K resolution


    What is Video Compression?


    ...Orange Juice Analogy...


    Introduction (2/2)

    What is the difference between video

    compression and image compression?

    Temporal Redundancy

    Coding method to remove redundancy

    Intraframe Coding

    Remove spatial redundancy

    Interframe Coding

    Remove temporal redundancy


    Desired Features

    Better compression

    Improved quality

    Interactivity and manipulation of content

    Error resilience

    Processing of content in the compressed domain

    Identification and selective coding/decoding of the object of interest

    Facilitate Search / Indexing (MPEG-7)


    Video Coding Standards

    Two Organizations

    ITU-T Video Coding Experts Group (VCEG)

    The International Telecommunication Union, Telecommunication Standardization Sector.

    => (H.261, H.262, H.263, H.264)

    ISO/IEC Moving Picture Experts Group (MPEG)

    International Organization for Standardization and the International Electrotechnical Commission.

    => (MPEG-1, MPEG-2, MPEG-4)


    Video coding standard timeline


    Where is MPEG used?

    MPEG-1 Standard (1991) (ISO/IEC 11172)

    Target bit-rate about 1.5 Mbps

    Typical image format CIF, no interlace

    Frame rate 24 ... 30 fps

    Main application: video storage for multimedia (e.g., on CD-ROM)

    MPEG-2 Standard (1994) (ISO/IEC 13818-2)

    Extension for interlace, optimized for TV resolution (NTSC: 704 x 480 Pixel)

    Image quality similar to NTSC, PAL, SECAM at 4-8 Mbps

    HDTV at 20 Mbps

    MPEG-4 Standard (1999) (ISO/IEC 14496-2)

    Object based coding

    Wide range of applications, with choices of interactivity, scalability, error resilience, etc.


    MPEG-4 Evolution


    ITU-T Rec. H.261

    International standard for ISDN picture phones and for video conferencing systems (1990)

    Image format: CIF (352 x 288) or QCIF (176 x 144), frame rate 7.5 ... 30 fps

    Bit-rate: multiple of 64 kbps (= ISDN channel), typically 128 kbps including audio.

    Picture quality: for 128 kbps acceptable with limited motion in the scene

    Stand-alone videoconferencing system or desktop videoconferencing system, integrated with PC


    ITU-T Rec. H.263

    International standard for picture phones over analog subscriber lines (1995)

    Image format usually CIF, QCIF or Sub-QCIF, frame rate usually below 10 fps

    Bit-rate: arbitrary, typically 20 kbps for PSTN

    Picture quality: with new options as good as H.261 (at half the rate)

    Software-only PC video phone or TV set-top box

    Widely used as compression engine for Internet video streaming

    H.263 is also the compression core of the MPEG-4 standard


    H.264/AVC - MPEG-4 Part 10

    Duplex real-time voice services over wired or wireless (such as UMTS) networks (bitrate below 1 Mb/s and low latency),

    Good or high quality video services for streaming over satellite, xDSL, or DVD (bitrate from 1 to 8 Mb/s and possibly large latency),

    Lower quality streaming for video services with lower bitrate such as over Internet (bitrate below 2 Mb/s and possibly large latency).


    Major Applications

    Applications                          Speed                                     Coding standards

    Digital television broadcasting       2 ... 5 Mbps (10 ... 20 Mbps for HD)      MPEG-2, H.264/AVC

    DVD video, HD-DVD, Blu-ray Disc       4 ... 8 Mbps (10 ... 20 Mbps for HD)      MPEG-2, H.264/AVC

    Internet video streaming              20 ... 600 kbps                           H.263, MPEG-4, or H.264/AVC

    Videoconferencing, video telephony    20 ... 320 kbps                           H.261, H.263, H.264/AVC

    Video over 3G wireless                20 ... 200 kbps                           H.263, MPEG-4, H.264/AVC


    R-D Performance of MPEG Codecs

    [Figure: PSNR (Y) versus bit rate (kbps) for MPEG-1, MPEG-2, MPEG-4 and H.264; PSNR axis from 32 to 50, bit-rate axis from 350 to 1050 kbps]


    Video Coding/Compression

    moving picture 1 moving picture 2


    Measure of the compression

    Original image: 256 x 256 x 8 bits

    Compressed image: 40,000 bits

    Bits/pixel = 40,000 / (256 x 256) = 0.61 bpp

    C.F. = 8 bpp / 0.61 bpp = 13.1
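The two measures on this slide are simple ratios, and a short sketch makes the arithmetic explicit. The function names are illustrative only.

```python
def bits_per_pixel(compressed_bits, width, height):
    """Average number of coded bits spent on each pixel."""
    return compressed_bits / (width * height)

def compression_factor(original_bpp, compressed_bpp):
    """Ratio of the original bit depth to the coded bits per pixel."""
    return original_bpp / compressed_bpp

bpp = bits_per_pixel(40_000, 256, 256)
cf = compression_factor(8, bpp)
print(round(bpp, 2), round(cf, 1))  # prints: 0.61 13.1
```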

    MPEG / H.26x video compression process flow


    Residue after motion compensation

    Pixel-wise difference w/o motion compensation

    Motion estimation

    Horse ride


    Motion Estimation


    Motion Estimation

    Helps in understanding the content of an image sequence

    For surveillance

    Helps reduce the temporal redundancy of video

    For compression

    Stabilizes video by detecting and removing small, noisy global motions

    For building a stabilizer in a camcorder


    Motion Estimation

    [Figure: a block in the t-th frame is compared within a search range in the (t-1)-th frame; the offset to the best matching position is the motion vector]


    Motion Compensation

    Consecutive frames in a video are similar: temporal redundancy exists.

    Temporal redundancy is exploited so that not every frame of the video needs to be coded independently as a new image. The difference between the reference frame and other frame(s) in the sequence will be coded.

    MC (Motion Compensation) aims to reduce the data transmitted by detecting the motion of objects.


    Motion Compensated Coding

    [Figure: encoder block diagram with the labels Current Frame, Previous Frame(s), Motion Estimation, Motion Vector, Motion Compensation Predictor, Residual Image, Encoder, NAL]

    Steps of Video compression based on Motion Compensation (MC):

    1. Motion Estimation (motion vector search).

    2. MC-based Prediction.

    3. Derivation of the prediction error, i.e., the difference.
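Steps 2 and 3 can be illustrated on a toy frame. The sketch below assumes the motion vector from step 1 is already known and uses plain nested lists for frames; `predict_block` and `residual` are hypothetical helper names, not part of any codec API.

```python
def predict_block(ref, top, left, mv, size):
    """Step 2: copy the size x size block that the motion vector points to in the reference frame."""
    dy, dx = mv
    return [row[left + dx:left + dx + size] for row in ref[top + dy:top + dy + size]]

def residual(cur_block, pred_block):
    """Step 3: prediction error, i.e. current block minus predicted block."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(cur_block, pred_block)]

# Toy 6x6 reference frame; in the current frame the content has moved one pixel to the right.
ref = [[(3 * y + 7 * x) % 16 for x in range(6)] for y in range(6)]
cur = [[ref[y][x - 1] if x > 0 else 0 for x in range(6)] for y in range(6)]

top, left, size = 2, 2, 2
cur_block = [row[left:left + size] for row in cur[top:top + size]]

no_mc = residual(cur_block, predict_block(ref, top, left, (0, 0), size))     # plain frame difference
with_mc = residual(cur_block, predict_block(ref, top, left, (0, -1), size))  # motion-compensated
print(no_mc, with_mc)
```

With the correct vector the residual is all zeros, so only the motion vector needs to be sent for this block; without compensation the difference image still carries energy.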


    CODEC Design


    The most intuitive method to remove temporal redundancy

    3-Dimensional DCT

    Remove spatiotemporal correlation

    Good for low motion video

    Bad for high motion video

    $$F(u,v,w) = \frac{2}{N}\, C(u)\, C(v)\, C(w) \sum_{t=0}^{N-1} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y,t)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N}\, \cos\frac{(2t+1)w\pi}{2N}$$
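Why the 3-D DCT suits low-motion video can be seen on a tiny cube of pixels. The sketch below evaluates the triple-sum formula directly for N = 4. The definition C(0) = 1/sqrt(2), C(k) = 1 otherwise is an assumption (the slide does not define C), as is the leading 2/N factor; neither affects the point, which is where the energy ends up.

```python
import math

def dct3(f):
    """Direct 3-D DCT of an N x N x N cube indexed f[t][y][x]; result indexed out[w][v][u]."""
    n = len(f)
    c = lambda k: 1 / math.sqrt(2) if k == 0 else 1.0  # assumed definition of C(.)
    cos = [[math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)] for k in range(n)]
    out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for w in range(n):
        for v in range(n):
            for u in range(n):
                s = sum(f[t][y][x] * cos[u][x] * cos[v][y] * cos[w][t]
                        for t in range(n) for y in range(n) for x in range(n))
                out[w][v][u] = (2 / n) * c(u) * c(v) * c(w) * s
    return out

def temporal_energy(coeffs):
    """Energy in all coefficients with temporal frequency w > 0."""
    return sum(val ** 2 for plane in coeffs[1:] for row in plane for val in row)

n = 4
frame = [[(x * y + x) % 5 for x in range(n)] for y in range(n)]
static = [frame for _ in range(n)]                                  # no motion: same frame repeated
moving = [[row[t:] + row[:t] for row in frame] for t in range(n)]   # content shifts every frame

print(temporal_energy(dct3(static)), temporal_energy(dct3(moving)))
```

For the static cube every coefficient with w > 0 vanishes (up to rounding), so only one plane of coefficients has to be coded; the moving cube spreads energy across the temporal frequencies.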


    Hybrid MC-DCT Video Encoder

    Intra-frame: encoded without prediction

    Inter-frame: predictively encoded => use quantized frames as reference for the residue


    Block Matching

    Search Window (in previous frame)

    Rectangle with the same coordinates as the current block in the current frame, extended by w pixels in each direction

    [Figure: an N x M block centered in an (N+2w) x (M+2w) search window]


    Block Matching

    Divide the current frame into small rectangular blocks

    Motion of each block is assumed to be uniform

    Find the best match for each block in the previous frame

    Calculate the motion vector (MV) between the current block and its counterpart in the previous frame

    Typical size for blocks: 16x16 pixels

    Maximum movement w: typically 8, 16 or 32

    Matching criteria:

    Mean Absolute Error (MAE)

    Mean Square Error (MSE)

    Sum of the Squared Error (SSE)

    MAE is preferred due to its simplicity


    Exhaustive Search (Full Search)

    All candidates within the search window are examined

    (2w+1)^2 positions should be examined

    Advantage: good accuracy, finds the best match

    Disadvantage: large amount of computation, (2w+1)^2 matches

    Impractical for real-time applications

    To avoid this complexity, we should reduce the number of search positions: Fast Block Matching Algorithms
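A minimal sketch of exhaustive block matching follows, assuming frames stored as nested lists and SAD as the matching criterion. The function names and the synthetic ramp frame are illustrative choices, not taken from the slides.

```python
def sad(cur, ref, top, left, dy, dx, n):
    """Sum of absolute differences between the current block and a displaced reference block."""
    return sum(abs(cur[top + y][left + x] - ref[top + dy + y][left + dx + x])
               for y in range(n) for x in range(n))

def full_search(cur, ref, top, left, n, w):
    """Test every displacement in [-w, w] x [-w, w]; return (best_mv, best_cost, positions_tested)."""
    height, width = len(ref), len(ref[0])
    best, best_cost, tested = (0, 0), float("inf"), 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            inside = (0 <= top + dy and top + dy + n <= height
                      and 0 <= left + dx and left + dx + n <= width)
            if not inside:
                continue  # candidate block would leave the frame
            tested += 1
            cost = sad(cur, ref, top, left, dy, dx, n)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost, tested

# Synthetic 16x16 frame in which every pixel value is distinct, then a shifted copy.
ref = [[16 * y + x for x in range(16)] for y in range(16)]
cur = [[ref[y + 1][x - 2] if y + 1 < 16 and x - 2 >= 0 else 0 for x in range(16)]
       for y in range(16)]

mv, cost, tested = full_search(cur, ref, top=6, left=6, n=4, w=3)
print(mv, cost, tested)  # motion vector as (dy, dx), its SAD, and the number of candidates
```

With w = 3 the search evaluates all (2w+1)^2 = 49 candidates, which is the cost the fast algorithms try to avoid.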


    Fast Motion Estimation

    Fast Search Method for Motion Estimation :

    Search pattern

    Initial search-point prediction

    Early termination

    Greatly reduces the computation load

    Most are sub-optimal solutions and may be trapped at local minima.


    Fast Motion Estimation

    Fast Search Method for Motion Estimation :

    Trade-off between:

    Quality

    Calculation complexity

    Error measure:

    MSE or SSE

    SAD is usually used

    2D Logarithmic Search

    Three Step Search (TSS)

    Diamond Search (DS)


    2D Logarithmic Search

    2D Logarithmic Search:

    Iterative comparison of the error measure at 5 neighboring points

    Logarithmic refinement (halving) of the search pattern if:

    Best match is in the center.

    Center of the search pattern touches the border of the search range.

    The number of steps and the total number of search points cannot be determined in advance, but the best and worst cases can be analyzed.


    Block Matching (Fast Algorithms)

    2-D Logarithmic Search

    Examine the central point and its four surroundings

    Distance from center: w/2

    Find the best match

    If the best match is in the center or on the boundaries, halve the distance from the center

    Examine five new points centered on the previous best

    When the distance is 1, use all 9 matches, find the best, and stop

    [Figure: 2-D logarithmic search example; the search points are numbered 1 to 4 by step]

    Started at (0,0)

    Best match:

    (0,0) -> (-2,0), (-2,0) -> (-2,-2), (-2,-2) -> (-1,-2)

    MV (-1,-2)
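The 2-D logarithmic search can be sketched as follows. To keep the example self-contained, the matching error is passed in as a callable `cost(dx, dy)`; that callable and the toy quadratic error surface are assumptions for illustration (in a codec it would be a SAD over two frames), so the path printed here is not the one from the slide's figure.

```python
def log_search_2d(cost, w):
    """2-D logarithmic search over displacements in [-w, w]^2; returns (best_mv, visited_centers)."""
    cx, cy = 0, 0
    step = max(w // 2, 1)
    path = [(cx, cy)]
    while step > 1:
        cands = [(cx, cy), (cx - step, cy), (cx + step, cy), (cx, cy - step), (cx, cy + step)]
        cands = [(x, y) for x, y in cands if abs(x) <= w and abs(y) <= w]
        bx, by = min(cands, key=lambda p: cost(*p))  # ties keep the earliest candidate (the center)
        on_border = abs(bx) == w or abs(by) == w
        if (bx, by) == (cx, cy) or on_border:
            step //= 2  # refine the pattern
        cx, cy = bx, by
        path.append((cx, cy))
    # Final stage at distance 1: compare all 9 points around the current center.
    cands = [(cx + dx, cy + dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
             if abs(cx + dx) <= w and abs(cy + dy) <= w]
    best = min(cands, key=lambda p: cost(*p))
    path.append(best)
    return best, path

# Toy error surface with a single minimum at displacement (3, -1).
best, path = log_search_2d(lambda dx, dy: (dx - 3) ** 2 + (dy + 1) ** 2, w=4)
print(best, path)
```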


    Block Matching (Fast Algorithms)

    Three-Step Search (3SS)

    9 points: central point and its 8 surroundings

    Distance: w/2

    Find the best match

    Use the previous best as center

    Halve the distance, select 8 new points

    Repeat the algorithm 3 times

    Examines 25 points

    Assumes a uniform distribution of MVs

    [Figure: three-step search example; the search points are numbered 1 to 3 by step]


    Three Step Search

    [Figure: three-step search example on a grid from -6 to 6 on both axes; the search points are numbered 1 to 3 by step]

    Started at (0,0)

    Best match:

    (0,0) -> (3,3), (3,3) -> (3,5), (3,5) -> (2,6)
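A sketch of the three-step search that walks the same path as the example above. The error measure is supplied as a callable `cost(x, y)`; the toy surface used here, with its minimum placed near (2, 6), is an assumption chosen so that the steps are easy to follow. Step sizes are obtained by halving and rounding up, which gives 3, 2, 1 for a search range of 6.

```python
import math

def three_step_search(cost, w):
    """Three-step search in [-w, w]^2; returns (best_mv, visited_centers, distinct_points_examined)."""
    cx, cy = 0, 0
    step = math.ceil(w / 2)
    path, seen = [(0, 0)], set()
    while True:
        cands = [(cx + dx, cy + dy) for dy in (-step, 0, step) for dx in (-step, 0, step)
                 if abs(cx + dx) <= w and abs(cy + dy) <= w]
        seen.update(cands)  # the center is shared between stages, so only 8 points are new
        cx, cy = min(cands, key=lambda p: cost(*p))
        path.append((cx, cy))
        if step == 1:
            return (cx, cy), path, len(seen)
        step = math.ceil(step / 2)

# Toy error surface (assumed): minimum near displacement (2, 6).
mv, path, examined = three_step_search(lambda x, y: abs(x - 2.4) + abs(y - 6), w=6)
print(mv, path, examined)
```

The count of distinct points comes out as 9 + 8 + 8 = 25, compared with (2w+1)^2 = 169 for a full search over the same range.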


    Three Step Search procedure

    The motion vector ?


    Block Matching (Fast Algorithms)

    Diamond Search (DS)

    Start with 9 points creating a large diamond, with the center of the search window in the center

    Find the best match, repeat with this best match as center

    If the best match is in the center, use 5 points creating a small diamond, find the best match; this is the target
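The diamond search loop described above can be sketched in a few lines. The two point patterns follow the usual large/small diamond shapes; the callable `cost(x, y)` and the quadratic toy surface are assumptions for illustration.

```python
LARGE = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2), (1, 1), (1, -1), (-1, 1), (-1, -1)]
SMALL = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def diamond_search(cost, w):
    """Diamond search in [-w, w]^2: repeat the large diamond until its center wins, then refine."""
    cx, cy = 0, 0

    def best_of(pattern):
        cands = [(cx + dx, cy + dy) for dx, dy in pattern
                 if abs(cx + dx) <= w and abs(cy + dy) <= w]
        return min(cands, key=lambda p: cost(*p))  # ties keep the center, listed first

    while True:
        bx, by = best_of(LARGE)
        if (bx, by) == (cx, cy):
            break
        cx, cy = bx, by
    return best_of(SMALL)

# Toy error surface with its minimum at displacement (3, -2).
mv = diamond_search(lambda x, y: (x - 3) ** 2 + (y + 2) ** 2, w=7)
print(mv)
```

Unlike the three-step search, the number of iterations is not fixed: the large diamond keeps moving as long as a strictly better point is found.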


    DS Example

    [Figure: diamond search example; the search points are numbered 1 to 4 by step]


    Cost Functions

    MAD: Mean Absolute Difference

    where

    - N is the side of the macroblock,

    - Cij and Rij are the pixels being compared in the current macroblock and the reference macroblock, respectively

    MSE: Mean Squared Error
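The formulas themselves did not survive on this slide. Written out with the symbols defined above (N, C_ij, R_ij), the standard definitions of the two cost functions are as follows; the index range 0 to N-1 is an assumed convention.

```latex
\mathrm{MAD} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| C_{ij} - R_{ij} \right|
\qquad
\mathrm{MSE} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left( C_{ij} - R_{ij} \right)^2
```

MAD needs only subtractions and absolute values, whereas MSE adds a multiplication per pixel.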


    The MPEG-1 Standard

    Group of Pictures

    Motion Estimation

    Motion Compensation

    Differential Coding

    DCT

    Quantization

    Entropy Coding


    Group of Pictures (1/2)

    I-frame (Intra-coded Frame)

    Coded within a single frame, e.g. using the DCT.

    This type of frame does not need a previous frame

    P-frame (Predictive Frame)

    One-directional motion prediction from a previous frame

    The reference can be either an I-frame or a P-frame

    Generally referred to as inter-frame

    B-frame (Bi-directional Predictive Frame)

    Bi-directional motion prediction from a previous and/or future frame

    The reference can be either an I-frame or a P-frame

    Generally referred to as inter-frame


    Group of Pictures (2/2)

    The distance between two nearest P-frames, or between a P-frame and an I-frame, is denoted by M

    The distance between two nearest I-frames is denoted by N
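The two parameters fully determine the frame-type pattern of a regular GOP in display order. A small sketch, with the illustrative function name `gop_pattern`:

```python
def gop_pattern(n, m):
    """Frame types, in display order, for one GOP with I-frame distance n and anchor distance m."""
    return "".join("I" if i == 0 else "P" if i % m == 0 else "B" for i in range(n))

print(gop_pattern(12, 3))  # prints: IBBPBBPBBPBB
print(gop_pattern(6, 1))   # prints: IPPPPP  (m = 1 means no B-frames)
```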


    MPEG-1 = JPEG + Motion Prediction + Rate Control

    Early motivation: to encode motion video at 1.5 Mbit/s for transport over T1 data circuits and for replay from CD-ROM

    Defines the decoder but not the encoder

    Frames (pictures)

    Intra-coded using JPEG

    Inter-coded using (interpolated) motion estimation & compensation, and JPEG for the residuals

    Predicted and Bi-directional

    Macro Blocks (MBs)

    16x16 pixel block

    Rate control

    buffer at each end


    MPEG-1 Motion Prediction

    Motion prediction = motion estimation + error compensation


    MPEG-1 DCT


    MPEG-1 Quantization


    MPEG-1 Entropy coding

    Run-length encoding (RLE)

    "this is an example of a huffman tree"

    Huffman Coding is a very popular method of entropy coding
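Both techniques named on this slide fit in a short sketch. It run-length encodes a symbol sequence and computes Huffman code lengths for the example sentence quoted above. This is a generic textbook construction, not the fixed code tables that MPEG-1 actually specifies; the function names are illustrative.

```python
import heapq
from collections import Counter
from itertools import groupby

def run_length_encode(symbols):
    """Collapse runs of equal symbols into (symbol, run_length) pairs."""
    return [(s, len(list(g))) for s, g in groupby(symbols)]

def huffman_code_lengths(text):
    """Code length in bits for each symbol, from a Huffman tree built with a min-heap."""
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    next_id = len(heap)  # unique tie-breaker so the dicts are never compared
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {s: length + 1 for s, length in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

print(run_length_encode([0, 0, 0, 5, 0, 0, 1]))

text = "this is an example of a huffman tree"
lengths = huffman_code_lengths(text)
total_bits = sum(lengths[ch] for ch in text)
print(total_bits, "bits versus", 8 * len(text), "bits at 8 bits per character")
```

The sentence has 36 characters, so fixed 8-bit coding needs 288 bits, while the Huffman code needs 135: frequent symbols such as the space receive short codewords.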


    The MPEG-2 Standard

    The main encoder structure is similar to that of the MPEG-1 standard

    Field/frame DCT coding

    Field/frame prediction mode selection

    Alternative scan order

    Various picture sampling formats

    User defined quantization matrix


    MPEG-2 = MPEG-1 + Improvements

    Color space: could support 4:2:2 and 4:4:4 coding

    Quantization: could have 9- or 10-bit precision for DC coefficients

    Concealment motion vectors: used when an intra-MB is lost

    Pan and Scan: supports display of different aspect ratios, e.g., 16:9

    Profiles and levels

    Profiles: define the tools or syntactical elements

    Levels: define the permissible ranges of parameters

    Interlace tools

    Scalable coding profiles

    System layer: defines two bit stream constructs

    Program stream (PS): modeled on MPEG-1 (backward compatibility)

    Transport stream (TS): more robust, does not need a common time base, designed for use in error-prone environments.


    MPEG Scalable Coding (SC)

    Non-scalable coding

    To optimize video quality at a given bit rate.

    Base and enhancement layer SC

    To optimize video quality at two given bit rates.

    SNR SC (different quantization accuracy)

    Temporal SC (different frame rates)

    Spatial SC (different spatial resolution)

    Fine granularity scalability (FGS)

    To optimize the video quality over a given bit rate range

    Also has base layer and enhancement layer

    Enhancement layer uses bit-plane coding

    Bit-plane coding considers each quantized DCT coefficient as a binary integer of several bits instead of a decimal integer of a certain value

    Frequency weighting and selective enhancement
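The idea behind bit-plane coding in the enhancement layer can be illustrated with a toy example. The sketch below treats coefficient magnitudes as 4-bit integers, splits them into planes from the most significant bit down, and shows what a decoder recovers when only the first planes arrive. The function names and sample values are illustrative; sign handling and the entropy coding of each plane are left out.

```python
def bit_planes(coeffs, num_bits):
    """Split non-negative integer magnitudes into bit planes, most significant plane first."""
    return [[(c >> b) & 1 for c in coeffs] for b in range(num_bits - 1, -1, -1)]

def reconstruct(planes, num_bits):
    """Rebuild magnitudes from however many leading planes were received."""
    values = [0] * len(planes[0])
    for k, plane in enumerate(planes):
        shift = num_bits - 1 - k
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values

coeffs = [10, 0, 6, 3, 1]
planes = bit_planes(coeffs, 4)
print(planes)
print(reconstruct(planes[:2], 4))  # truncated stream: coarse approximation
print(reconstruct(planes, 4))      # all planes: exact magnitudes
```

Because the stream can be cut after any plane, quality degrades gradually with the available bit rate instead of switching between two fixed operating points.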


    Field/Frame DCT Coding

    The field type DCT

    Fast motion video

    The frame type DCT

    Slow motion video


    Alternative Scan Order

    Zigzag scan order

    Frame DCT

    Alternative scan order

    Field DCT


    The MPEG-2 Encoder (1/2)

    Base Layer

    Basic quality requirement, e.g. SDTV

    Enhanced Layer

    High quality service, e.g. for HDTV

    (*) http://www.bbc.co.uk/rd/pubs/papers/paper_14/paper_14.shtml

    MPEG-4 = MPEG-2+Objects+Other Enhancements

    Objects (optional)

    Video (texture+shape), image, audio, speech, text, etc.

    Encoded using different techniques

    Transmitted independently

    Composited at the decoder using BInary Format for Scenes (BIFS)

    Improvements in MPEG-4 version 2

    Global motion compensation (GMC)

    Quarter pixel motion compensation

    Shape-adaptive DCT

    Why is MPEG-4 not as successful as MPEG-2?

    Not substantially better than MPEG-2

    Suffers from its sheer size and flexibility

    Issue of licensing

    MPEG-4 Error Resilience Tools

    Video packet resynchronization

    Previous coding standards: resynchronization markers are fixed at the beginning of each row of MBs

    MPEG-4: resynchronization markers are inserted at every K bits

    Data partitioning

    Partitions the data in a video packet into a motion part and a texture part separated by a motion boundary marker (MBM)

    Reversible variable length codes (RVLC)

    Finds the next resynchronization marker and decodes backwards

    Header extension code (HEC)

    The header information is repeated after the 1-bit HEC

    Unequal error protection technique (UEP)

    [Figure: layout of a video packet: Resync. marker | MB No. | QP | HEC | Repeated header info. | Motion data | MBM | DCT data]

    [Figure: I-VOP video packet: VP header | DC DCT data | AC DCT data; P-VOP video packet: VP header | Motion data | Texture data]

    New Features of H.264

    Multi-mode, multi-reference MC

    Motion vector can point out of image border

    1/4-, 1/8-pixel motion vector precision

    B-frame prediction weighting

    4x4 integer transform

    Multi-mode intra-prediction

    In-loop de-blocking filter

    UVLC (Universal Variable Length Coding)

    NAL (Network Abstraction Layer)

    SP-slices


    4x4 integer transform

    The transform is based on the DCT but with some fundamental differences:

    Integer transform (all operations can be carried out with integer arithmetic, without loss of accuracy).

    The inverse transform is fully specified in the H.264 standard and, if this specification is followed correctly, mismatch between encoders and decoders should not occur.


    The 4x4 DCT of an input array X is given by:

    The matrix multiplication can be factorized to the following equivalent form:


    CXC^T is a core 2-D transform. E is a matrix of scaling factors and the symbol ⊗ indicates that each element of (CXC^T) is multiplied by the scaling factor in the same position in matrix E (scalar multiplication rather than matrix multiplication). The constants a and b are as before; d is c/b (approximately 0.414).


    To simplify the implementation of the transform, d is approximated by 0.5.

    To ensure that the transform remains orthogonal, b also needs to be modified so that:


    The 2nd and 4th rows of matrix C and the 2nd and 4th columns of matrix C^T are scaled by a factor of 2 and the post-scaling matrix E is scaled down to compensate.

    This avoids multiplications by 1/2 in the core transform CXC^T, which would result in loss of accuracy using integer arithmetic.


    The final forward transform becomes:

    The transform is an approximation to the 4x4 DCT.

    Because of the change to factors d and b, the output of the new transform will not be identical to the 4x4 DCT.


    The inverse transform is given by:

    Y is pre-scaled by multiplying each coefficient by the appropriate weighting factor from matrix Ei.

    The forward and inverse transforms are orthogonal, i.e. T^-1(T(X)) = X.
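The matrices on these slides were images and did not survive, so the sketch below supplies them under stated assumptions: it uses the core matrix commonly given for the H.264 4x4 transform, with a = 1/2 and b = sqrt(2/5), and checks the orthogonality claim numerically. The inverse is written in the mathematically equivalent form Cf^T (Y ⊗ Ef) Cf rather than with the separate Ci and Ei matrices of the standard, and the sample block is arbitrary.

```python
import math

# Core forward matrix (assumed: the widely published H.264 4x4 integer transform matrix).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def transpose(m):
    return [list(r) for r in zip(*m)]

def core_transform(x):
    """Integer-only part: W = Cf X Cf^T (additions, subtractions and doublings only)."""
    return matmul(matmul(CF, x), transpose(CF))

a, b = 0.5, math.sqrt(2 / 5)
row_scale = [a, b / 2, a, b / 2]
EF = [[row_scale[i] * row_scale[j] for j in range(4)] for i in range(4)]  # post-scaling matrix

def forward(x):
    w = core_transform(x)
    return [[w[i][j] * EF[i][j] for j in range(4)] for i in range(4)]

def inverse(y):
    pre = [[y[i][j] * EF[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(transpose(CF), pre), CF)

x = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
w = core_transform(x)
restored = inverse(forward(x))
print(w)
print([[round(v, 6) for v in row] for row in restored])
```

In a real codec the post-scaling by Ef is folded into the quantizer, so the transform stage itself never leaves integer arithmetic.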

    Basic Macroblock Coding Structure

    [Figure: encoder block diagram with the labels Input Video Signal, Split into Macroblocks (16x16 pixels), Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Motion Estimation, Intra/Inter, Decoder, Entropy Coding, Control Data, Quant. Transf. coeffs, Motion Data, Output Video Signal]


    Variable block size

    The fixed block size may not be suitable for all moving objects

    Improve the flexibility of comparison

    Reduce the error of comparison

    7 types of blocks for selection

    16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4


    Motion Compensation

    [Figure: the macroblock coding structure block diagram, repeated]

    Various block sizes and shapes

    MB types: 16x16 (partition 0), 16x8 (partitions 0-1), 8x16 (partitions 0-1), 8x8 (partitions 0-3)

    8x8 types: 8x8 (partition 0), 8x4 (partitions 0-1), 4x8 (partitions 0-1), 4x4 (partitions 0-3)


    B-frame Prediction Weighting

    Playback order: I0 B1 B2 B3 P4 B5 B6 ...

    Bitstream order: I0 P4 B1 B3 B2 P8 B5 ...

    [Figure: frames I0 B1 B2 B3 P4 B5 B6 along a time axis]


    H.264 Over IP

    Network Abstraction Layer Unit (NALU)

    A byte stream of variable length

    1-byte header

    NALU type (T)

    NALU importance (R)

    Error indication (F)

    RTP packetization

    Simple packetization

    One NALU in one RTP packet

    NALU header as RTP header

    NALU fragmentation

    NALU aggregation

    OSI/RM layers and the protocols and specifications for H.264

    Application Layer:

    RTP (Real-Time Transport Protocol)

    Header size: IP/UDP/RTP = 20+8+12 = 40 bytes

    Media-unaware RTP payload specifications to reduce the loss rates observed by the decoder: packet duplication, packet-based FEC, audio redundancy coding

    Control protocols: H.245, SIP (Session Initiation Protocol), SDP (Session Description Protocol), RTSP (Real-Time Streaming Protocol)

    Presentation Layer

    Session Layer

    Transport Layer: UDP (User Datagram Protocol)

    Network Layer: IP, best effort service

    Data Link Layer

    Physical Layer
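Two details from this slide are easy to make concrete: the 1-byte NALU header and the 40-byte IP/UDP/RTP header overhead. The sketch assumes the usual bit layout of the header byte (1 bit F, 2 bits R, 5 bits T, most significant bit first); the function names are illustrative.

```python
def parse_nalu_header(first_byte):
    """Split the 1-byte NALU header into its three fields (assumed layout: F | R | T)."""
    return {
        "F": (first_byte >> 7) & 0x01,  # error indication
        "R": (first_byte >> 5) & 0x03,  # importance of the NALU
        "T": first_byte & 0x1F,         # NALU type
    }

def header_overhead(payload_bytes, header_bytes=20 + 8 + 12):
    """Fraction of each packet taken up by the IP/UDP/RTP headers."""
    return header_bytes / (header_bytes + payload_bytes)

print(parse_nalu_header(0x67))
print(round(header_overhead(1000), 3), round(header_overhead(100), 3))
```

The overhead figures show why aggregating several small NALUs into one RTP packet pays off at low bit rates, while large NALUs have to be fragmented to stay below the path MTU.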


    Advanced Video Coding / ITU-T Recommendation H.264 / ISO/IEC MPEG-4 (Part 10)

    H.264 structure

    Video coding layer (VCL)

    Network abstraction layer (NAL)

    Possible applications of H.264

    Conversational services operated below 1 Mbps with low latency.

    Entertainment services operated between 1-8+ Mbps with moderate latency such as 0.5-2 s in modified MPEG-2/H.222.0 systems.

    Broadcast via satellite, cable, terrestrial or DSL

    DVD for standard and high-definition video

    Video-on-demand via various channels

    Streaming services operated at 50-1500 kbps with 2 s or more of latency.


    Video Quality Evaluation

    Similarities to Image Quality Evaluation

    Eye more sensitive to luminance than to chrominance

    Some areas attract more attention than others

    Blocking/ringing artifacts are annoying

    Differences

    Temporal dimension

    Moving versus stationary objects

    Frame rate plays a significant role

    Any less than 24 Hz is not perceived as moving images but as a slideshow


    Video Quality Evaluation

    Subjective

    A human subject rates the video on a scale

    Double Stimulus Continuous Quality Scale Method

    Hidden scale of 0-100

    Difference is calculated as the actual rating

    Video Quality Evaluation

    Objective

    A computer algorithm judges the distortion between videos

    Attempts to model a human observer

    There is currently no standard method

    Objective Metrics

    Peak Signal-To-Noise Ratio (PSNR)

    Used widely in evaluating coding performance

    Purely mathematical difference
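PSNR is derived from the mean squared error between the original and the decoded samples. A minimal sketch for 8-bit samples held in flat lists follows; the function name and the sample data are illustrative.

```python
import math

def psnr(original, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equally long sample sequences."""
    mse = sum((o - d) ** 2 for o, d in zip(original, distorted)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

reference = [100, 100, 100, 100]
decoded = [101, 99, 100, 100]
print(round(psnr(reference, decoded), 2), "dB")
```

Because it only measures sample-wise differences, two decoded videos with the same PSNR can still look quite different to a viewer, which is why subjective tests remain necessary.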


    Video Quality Studio


    Open Issues

    Future coding/compression standards:

    MPEG-21

    H.265

    Improved mechanisms, algorithms for new standards

    motion estimation, transform/quantization, etc.

    rate/distortion control

    Source coding and channel coding

    Hardware-based codec implementation

    Error resilience techniques

    Open issues

    Packet scheduling

    resource-constraint/exploitation

    scalable multimedia streaming

    quality-resource optimization

    Receiver enhancement

    buffer management, feedback reporting

    error concealment techniques

    Multimedia Services: MobileTV, VoD ...