Simula Research Laboratory AS
2010-04-29
Pengpeng Ni
Multimedia Quality Assessment
Context
Content
✴What is multimedia “quality”?
✴How to measure multimedia “quality”?
✴What are the difficulties and challenges?
Quality of Service requirements in distributed multimedia systems
Network Quality: Capacity, Coverage, Handoff
Link Quality: Bitrate, Frame/Bit/Packet loss, Delay
User Perceived Quality: Speech fidelity, Audio fidelity, Image fidelity, Video fidelity
Quality of service (QoS) is defined in [ITU-T E.800] as the collective effect of performance which determines the degree of satisfaction of a user of the service.
Quality of Experience (QoE) :
The overall acceptability of an application or service, as perceived subjectively by the end-user.
✴ QoE includes the complete end-to-end system effects.
✴ Overall acceptability may be influenced by user expectations and context.
Quality of Service vs Quality of Experience
✴QoS: Value of the average userʼs experience richness estimated by a service/product/content provider
- Service provider-centric concept
✴QoE: Value (estimated or actually measured) of a specific userʼs experience richness.
- User-centric concept
✴QoE is the dual and extended view of the QoS problem
Human-centered multimedia services need to measure or predict QoE accurately.
How is the quality of multimedia experience measured today?
✴ Subjective evaluation: Mean Opinion Score (MOS) rating test
✴ Objective quality metrics, estimating subjective MOS.
- MOS predictors based on fidelity measures; examples include PSNR, SSIM, etc.
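As an illustration of a fidelity-based measure such as PSNR, here is a minimal sketch assuming 8-bit pixel values (peak 255) and images flattened to plain lists. The function name `psnr` and the sample pixel values are made up for the example, not taken from the slides.

```python
import math

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between reference and distorted samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical pixel rows, for illustration only.
ref = [52, 55, 61, 59, 79, 61, 76, 61]
dist = [54, 55, 60, 62, 77, 61, 75, 63]
print(round(psnr(ref, dist), 2))
```

PSNR depends only on the pixel-wise error, which is why two images with the same PSNR can look very different to a viewer.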
Subjective evaluation tests
✴International recommendations for subjective test methodologies:
- Recommendation ITU-R BT.500-11 “Methodology for the subjective assessment of the quality of television pictures” (1974-2002).
- Recommendation ITU-T P.910 “Subjective video quality assessment methods for multimedia applications” (1999).
- Recommendation ITU-R BT.1788 “Methodology for the subjective assessment of video quality in multimedia applications” (2007).
✴They cover viewing conditions, test material, test procedure, measurement scales, and basic statistics for data analysis.
Subjective Test Methodology I: Double Stimulus (DS) method
Impairment scale:
5 Imperceptible
4 Perceptible but not annoying
3 Slightly annoying
2 Annoying
1 Very annoying
Hidden reference: A, B
Explicit reference: A, B
Categorical or Numerical Scale (from Excellent to Bad)
Subjective Test Methodology II: Single Stimulus (SS) methods
Impairment scale:
5 Imperceptible
4 Perceptible but not annoying
3 Slightly annoying
2 Annoying
1 Very annoying
Categorical scale:
5 Excellent
4 Good
3 Fair
2 Poor
1 Bad
An 11-grade numerical categorical scale shows the best results in the SS method
Subjective Test Methodology III: Paired Comparison (PC) method
The viewer judges the relation between two test pictures or sequences.
-3 much worse
-2 worse
-1 slightly worse
0 the same
+1 slightly better
+2 better
+3 much better
Data analysis
✴Score distributions across subjects are assumed to be close to a normal distribution
✴Outlier detection and removal
✴Mean Opinion Scores (MOS) and 95% confidence intervals
Notation for the MOS and confidence interval formulas:
- score by a given subject for a given test condition
- number of subjects after outlier removal
- t-value corresponding to a two-tailed t-…
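The MOS and its 95% confidence interval can be sketched as follows. This is a minimal illustration, assuming the usual form (mean of the retained scores, half-width equal to the t-value times the sample standard deviation divided by the square root of the number of subjects). The function name `mos_with_ci` and the sample ratings are hypothetical, and the table of two-sided 95% Student t critical values only covers 1 to 30 degrees of freedom.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for 1..30 degrees of freedom.
T_95 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
        2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
        2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042]

def mos_with_ci(scores):
    """Return (MOS, half-width of the 95% CI) for one test condition.

    `scores` holds one rating per subject, after outlier removal.
    """
    n = len(scores)
    if not 2 <= n <= len(T_95) + 1:
        raise ValueError("this sketch supports 2 to 31 subjects")
    mos = statistics.fmean(scores)
    std = statistics.stdev(scores)   # sample standard deviation (n - 1)
    t_value = T_95[n - 2]            # n - 1 degrees of freedom
    return mos, t_value * std / math.sqrt(n)

# Hypothetical ratings from five subjects on a 5-point scale.
mos, half_width = mos_with_ci([4, 5, 3, 4, 4])
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```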
Comparison between mean scores of different test conditions
✴Hypothesis test to find out whether the difference between two MOS values is statistically significant
t statistic:
Decision rule to reject H0:
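The exact t statistic and decision rule from the slide are not reproduced in the text. As a hedged sketch, the comparison could use Welch's unequal-variance t statistic, which is an assumption here. When the same subjects rate both conditions, a paired test may be the better choice. The names `welch_t` and `reject_h0` and the ratings are hypothetical, and the critical value must be supplied by the caller.

```python
import math
import statistics

def welch_t(scores_a, scores_b):
    """Welch's t statistic and approximate degrees of freedom for two score sets."""
    na, nb = len(scores_a), len(scores_b)
    # Squared standard errors of the two sample means.
    va = statistics.variance(scores_a) / na
    vb = statistics.variance(scores_b) / nb
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t = (statistics.fmean(scores_a) - statistics.fmean(scores_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

def reject_h0(t, t_critical):
    """Two-tailed rule: reject H0 (equal means) when |t| exceeds the critical value."""
    return abs(t) > t_critical

# Hypothetical ratings for two test conditions.
t, df = welch_t([4, 5, 4, 5, 4, 5], [2, 3, 2, 3, 2, 3])
# 2.228 is the two-sided 95% critical value for 10 degrees of freedom.
print(round(t, 3), round(df, 1), reject_h0(t, 2.228))
```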
Objective Quality Metrics
Obj. Metric    Pearson product-moment    Spearman rank-order
PSNR           -0.534                    -0.555
SSIM [1]       -0.443                    -0.451
NTIA GM [2]     0.288                     0.365
SVQM [3]       -0.661                    -0.684
[1]: Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600–612, April 2004
[2]: M. Pinson and S. Wolf, “A New Standardized Method for Objectively Measuring Video Quality,” IEEE Trans. on Broadcasting, vol. 50, no. 3, pp. 312–322, Sept. 2004
[3]: C. S. Kim, S. H. Jin, D. J. Seo, and Y. M. Ro, “Measuring Video Quality on Full Scalability of H.264/AVC Scalable Video Coding,” IEICE Trans. on Communications, vol. E91-B, no. 5, pp. 1269–1278, 2008
The correlations between these objective metrics and the subjective MOS are not significant.
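For reference, the two correlation coefficients in the table can be computed as in the sketch below. It is a plain-Python illustration: Spearman's coefficient is taken as the Pearson correlation of the ranks, with tied values given their average rank. The metric values and MOS values are invented for the example and are not the data behind the table.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical objective metric values and subjective MOS for five clips.
metric = [30.1, 32.5, 35.0, 36.2, 40.3]
mos = [2.0, 2.1, 3.9, 4.0, 4.2]
print(round(pearson(metric, mos), 3), round(spearman(metric, mos), 3))
```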
Objective Quality Metrics: PSNR vs SSIM
MSSIM=0.7052 MSSIM=0.9168 MSSIM=0.6949
PSNR = 24.9 dB for all the images
All video clips have similar mean PSNR and SSIM values
Randomised Pair Comparison
A practical method for subjective quality assessment
NOSSDAVʼ10
Extended from conventional PC methodology, but much more flexible and easy to use.
- Test duration is no more than 15 minutes for each test participant.
- Each participant gets a unique test session containing a small random sample of the test stimuli.
Useful for observed field studies in addition to laboratory experiments
- Our first prototype is running on iPhone.
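The random sampling idea behind Randomised Pair Comparison can be sketched as follows. This is not the authors' implementation. It assumes each session draws pairs uniformly without replacement from all possible pairs until the time budget is used up. The function name `sample_session`, the 30 seconds per pair, and the clip names are hypothetical. Only the 15-minute limit comes from the slides.

```python
import itertools
import random

def sample_session(stimuli, session_seconds=15 * 60, seconds_per_pair=30, rng=None):
    """Draw a random subset of stimulus pairs that fits one participant's session."""
    rng = rng or random.Random()
    all_pairs = list(itertools.combinations(stimuli, 2))
    # Number of pairs that fit in the time budget (assumed fixed time per pair).
    budget = min(len(all_pairs), session_seconds // seconds_per_pair)
    session = rng.sample(all_pairs, budget)
    # Randomise which stimulus of each pair is presented first.
    return [pair if rng.random() < 0.5 else pair[::-1] for pair in session]

# Hypothetical stimulus names; a seeded generator keeps the example repeatable.
clips = [f"clip{i}" for i in range(12)]
session = sample_session(clips, rng=random.Random(1))
print(len(session), session[:3])
```

Because every participant receives a different random subset, the full set of pairs is covered only across participants, which keeps each individual session short.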
In summary,
✴A strong need for a technique that can measure QoE efficiently and reliably.
✴Increased interest in the multimedia community around the notion of quality assessment and metrics.
- Content-dependent
- Context-dependent
- Multi-modal quality
- 3D video
- ...
Questions?