
  • Simula Research Laboratory AS

    2010-04-29

    Pengpeng Ni

    Multimedia Quality Assessment

    Context

  • Content

    ✴What is multimedia “quality”?

    ✴How to measure multimedia “quality”?

    ✴What are the difficulties and challenges?

  • Quality of Service requirements in distributed multimedia systems

    Network quality: capacity, coverage, handoff

    Link quality: bitrate, frame/bit/packet loss, delay

    User-perceived quality: speech fidelity, audio fidelity, image fidelity, video fidelity

    Quality of service (QoS) is defined in [ITU-T E.800] as the collective effect of performance which determines the degree of satisfaction of a user of the service.

  • Quality of Experience (QoE) :

    The overall acceptability of an application or service, as perceived subjectively by the end-user.

    ✴ QoE includes the complete end-to-end system effects.

    ✴ Overall acceptability may be influenced by user expectations and context.

    Context

  • Quality of Service vs Quality of Experience

    ✴QoS: the value of the average user's experience richness, as estimated by a service/product/content provider

    - Service provider-centric concept

    ✴QoE: the value (estimated or actually measured) of a specific user's experience richness

    - User centric concept

    ✴QoE is a dual and extended view of the QoS problem

  • Human-centered multimedia services need to measure and predict QoE accurately.

  • ✴ Subjective evaluation - Mean Opinion Score (MOS) rating tests

    ✴ Objective quality metrics, estimating subjective MOS.

    - MOS predictors based on fidelity measures; examples include PSNR, SSIM, etc.

    How is the quality of multimedia experience measured today?

  • ✴International recommendation for subjective test methodologies:

    - Recommendation ITU-R BT.500-11, “Methodology for the subjective assessment of the quality of television pictures” (1974-2002).

    - Recommendation ITU-T P.910, “Subjective video quality assessment methods for multimedia applications” (1999).

    - Recommendation ITU-R BT.1788, “Methodology for the subjective assessment of video quality in multimedia applications” (2007).

    ✴These cover viewing conditions, test material, test procedures, measurement scales, and basic statistics for data analysis.

    Subjective evaluation tests

  • Subjective Test Methodology I: Double Stimulus (DS) method

    Impairment scale:
    5 Imperceptible
    4 Perceptible but not annoying
    3 Slightly annoying
    2 Annoying
    1 Very annoying

    [Diagram: pairs of sequences A and B are presented with either a hidden or an explicit reference and rated on a categorical or numerical scale from “Excellent” to “Bad”.]

  • Subjective Test Methodology II: Single Stimulus (SS) methods

    Impairment scale:
    5 Imperceptible
    4 Perceptible but not annoying
    3 Slightly annoying
    2 Annoying
    1 Very annoying

    Categorical quality scale:
    5 Excellent
    4 Good
    3 Fair
    2 Poor
    1 Bad

    An 11-grade numerical categorical scale gives the best results with the SS method.

  • Subjective Test Methodology III: Paired Comparison (PC) method

    The viewer judges the relation between two test pictures or sequences.

    Comparison scale:
    -3 much worse
    -2 worse
    -1 slightly worse
     0 the same
    +1 slightly better
    +2 better
    +3 much better
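
    To make the PC procedure concrete, here is a minimal Python sketch (not from the presentation; the sequence names are hypothetical) of how the pairs of test sequences and the seven-grade comparison scale could be represented:

```python
# Minimal sketch of Paired Comparison (PC) trial generation.
# The sequence names below are hypothetical illustrations.
from itertools import combinations

COMPARISON_SCALE = {
    -3: "much worse", -2: "worse", -1: "slightly worse",
    0: "the same",
    1: "slightly better", 2: "better", 3: "much better",
}

def generate_pairs(sequences):
    """Return every unordered pair (A, B) of test sequences to be judged."""
    return list(combinations(sequences, 2))

# Three hypothetical encodings of the same source clip.
sequences = ["clip_500kbps", "clip_1000kbps", "clip_2000kbps"]
for a, b in generate_pairs(sequences):
    # In a real test the viewer watches A and B and answers with a value
    # in -3..+3, interpreted as "B compared to A".
    print(f"How does {b} compare to {a}? scale: {COMPARISON_SCALE}")
```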

  • Data analysis

    ✴The distribution of scores across subjects is assumed to be close to a normal distribution

    ✴Outlier detection and removal

    ✴Mean Opinion Scores (MOS) and 95% confidence intervals

    MOS_j = (1/N) Σ_{i=1}^{N} s_ij

    s_ij: opinion score given by subject i for test condition j
    N: number of subjects (after outlier removal)

    95% confidence interval: [MOS_j − δ_j, MOS_j + δ_j], with δ_j = t · σ_j / √N, where t is the value corresponding to a two-tailed Student's t-distribution at the 95% confidence level and σ_j is the standard deviation of the scores for condition j.
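
    As an illustration of the formulas above, here is a short Python sketch (the score matrix is made-up data, not taken from the study) that computes per-condition MOS and 95% confidence intervals:

```python
# Minimal sketch of MOS and 95% confidence interval computation.
# The score matrix is made-up illustration data.
import numpy as np
from scipy import stats

# scores[i, j] = opinion score given by subject i to test condition j
scores = np.array([
    [5, 4, 2],
    [4, 4, 3],
    [5, 3, 2],
    [4, 5, 2],
    [5, 4, 1],
])
N = scores.shape[0]                 # number of subjects

mos = scores.mean(axis=0)           # MOS_j = (1/N) * sum_i s_ij
sigma = scores.std(axis=0, ddof=1)  # sample standard deviation per condition

# Half-width of the 95% CI using a two-tailed Student's t distribution.
delta = stats.t.ppf(0.975, df=N - 1) * sigma / np.sqrt(N)

for j, (m, d) in enumerate(zip(mos, delta)):
    print(f"condition {j}: MOS = {m:.2f} +/- {d:.2f}")
```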

  • Comparison between mean scores of different test conditions

    ✴Hypothesis test to find out whether the difference between two MOS values is statistically significant

    t statistic: t = (MOS_1 − MOS_2) / √(σ_1²/N_1 + σ_2²/N_2)

    Decision rule to reject H0 (equal means): |t| exceeds the critical value of a two-tailed t-distribution at the 5% significance level.
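
    A minimal Python sketch of this hypothesis test, using Welch's two-sample t-test from SciPy (the rating vectors are made-up data):

```python
# Minimal sketch of testing whether two MOS values differ significantly.
# The rating vectors are made-up illustration data.
import numpy as np
from scipy import stats

scores_a = np.array([5, 4, 5, 4, 5, 4, 3, 5])   # ratings for condition A
scores_b = np.array([3, 4, 3, 2, 4, 3, 3, 2])   # ratings for condition B

# H0: both conditions have the same mean opinion score.
t_stat, p_value = stats.ttest_ind(scores_a, scores_b, equal_var=False)

alpha = 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: the MOS difference is statistically significant.")
else:
    print("Cannot reject H0: the difference may be due to chance.")
```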

  • Objective Quality Metrics

    Obj. Metric    Pearson product-moment    Spearman rank-order
    PSNR           -0.534                    -0.555
    SSIM [1]       -0.443                    -0.451
    NTIA GM [2]     0.288                     0.365
    SVQM [3]       -0.661                    -0.684

    [1] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600–612, April 2004.
    [2] M. Pinson and S. Wolf, “A New Standardized Method for Objectively Measuring Video Quality,” IEEE Trans. on Broadcasting, vol. 50, no. 3, pp. 312–322, Sept. 2004.
    [3] C. S. Kim, S. H. Jin, D. J. Seo, and Y. M. Ro, “Measuring Video Quality on Full Scalability of H.264/AVC Scalable Video Coding,” IEICE Trans. on Communications, vol. E91-B, no. 5, pp. 1269–1278, 2008.

    None of these objective metrics achieves a significant correlation with subjective MOS evaluations.
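
    For reference, correlations like those in the table can be computed as in this Python sketch (the per-sequence metric and MOS values here are made up, not the ones behind the table):

```python
# Minimal sketch of computing Pearson and Spearman correlations between an
# objective metric and subjective MOS. All values are made-up illustrations.
import numpy as np
from scipy import stats

metric = np.array([28.1, 31.5, 25.7, 33.2, 29.9])   # e.g. PSNR per sequence
mos = np.array([2.8, 3.6, 2.1, 4.2, 3.3])           # subjective MOS per sequence

pearson_r, _ = stats.pearsonr(metric, mos)      # linear correlation
spearman_rho, _ = stats.spearmanr(metric, mos)  # rank-order correlation

print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```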

  • Objective Quality Metrics: PSNR vs SSIM

    [Example images: PSNR = 24.9 dB for all three, yet their MSSIM values differ: 0.7052, 0.9168, and 0.6949.]

  • All video clips have similar mean PSNR and SSIM values.
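
    To reproduce such comparisons, PSNR and SSIM can be computed with scikit-image as in the sketch below (assuming scikit-image is installed; the image file names are hypothetical):

```python
# Minimal sketch of computing PSNR and SSIM for a distorted image against its
# reference. The image file names are hypothetical.
import numpy as np
from skimage import io
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

reference = io.imread("reference.png", as_gray=True).astype(np.float64)
distorted = io.imread("distorted.png", as_gray=True).astype(np.float64)
data_range = reference.max() - reference.min()

# PSNR: log ratio of peak signal power to mean squared error (pixel fidelity).
psnr = peak_signal_noise_ratio(reference, distorted, data_range=data_range)

# SSIM: compares local luminance, contrast and structure rather than raw error.
ssim = structural_similarity(reference, distorted, data_range=data_range)

print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```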

  • Useful for field studies in addition to laboratory experiments - our first prototype runs on the iPhone.

    Extended from the conventional PC methodology, but much more flexible and easier to use:
    - Test duration is no more than 15 minutes per participant.
    - Each participant gets a unique test session containing a small, randomly sampled subset of the test stimuli (see the sketch after this slide).

    Randomised Pair Comparison: a practical method for subjective quality assessment

    NOSSDAVʼ10
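
    The core idea of random sampling can be sketched as follows in Python (the subset size, seeds, and clip names are hypothetical, not the parameters used in the NOSSDAV'10 study):

```python
# Minimal sketch of the randomised sampling behind Randomised Pair Comparison:
# each participant judges only a small random subset of all stimulus pairs.
import random
from itertools import combinations

def session_for_participant(stimuli, pairs_per_session, seed):
    """Draw a random subset of stimulus pairs for one participant."""
    rng = random.Random(seed)
    all_pairs = list(combinations(stimuli, 2))
    return rng.sample(all_pairs, k=min(pairs_per_session, len(all_pairs)))

stimuli = [f"clip_{i}" for i in range(12)]    # 12 test conditions -> 66 pairs
for participant_id in range(3):
    session = session_for_participant(stimuli, pairs_per_session=10,
                                      seed=participant_id)
    # Each participant sees a different, short session of 10 pair comparisons.
    print(participant_id, session[:3], "...")
```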

  • ✴A strong need for a technique that can measure QoE efficiently and reliably.

    ✴Increased interest in the multimedia community around the notion of quality assessment and metrics:
    - Content-dependent
    - Context-dependent
    - Multi-modal quality
    - 3D video
    - ...

    In summary,

  • Questions?