11
White Paper Follow us on Designing the ideal video streaming QoE analysis tool Contents 1. Introduction .......................................................................................................................................................................... 1 2. Factors that impact QoE ................................................................................................................................................... 2 Encoding Profile ................................................................................................................................................................. 2 Network Conditions ........................................................................................................................................................... 3 Devices and Players ........................................................................................................................................................... 4 Combinations of Factors .................................................................................................................................................. 5 3. Measuring QoE – what are the output metrics? ........................................................................................................ 6 4. What does the ideal tool need to do? ........................................................................................................................... 7 5. Framework design ............................................................................................................................................................... 7 6. Video capture and analysis.......................................................................................................................................... 9 7. Summary ........................................................................................................................................................................ 10 1. Introduction Eurofins Digital Testing has built a purpose designed framework for conducting video streaming Quality of Experience (QoE) analysis. In this white-paper we describe the problem analysis, requirements definition and design process that resulted in the resulting tool that we have since used on multiple projects. Our basic goal was to play back a range of video streams on a range of devices and measure the resulting QoE. However, this is not as simple as it sounds. In this article we start by examining all the factors (or inputs) that can ultimately affect the end user’s QoE when watching streamed video, and how these inputs can be repeatedly controlled. We then consider what QoE metrics (or outputs) should be measured and put this altogether to deduce a set of requirements that should be met by an ideal QoE analysis framework. Finally, we describe how we engineered a framework to meet these requirements, resulting in a tool that can measure QoE for different content, network conditions and devices/apps in a way that is automated and easily visualised while providing deep offline analysis capabilities.

Designing the Ideal Video Streaming QoE Analysis Tool 140618

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

Designing the ideal video streaming QoE analysis tool

Contents 1. Introduction .......................................................................................................................................................................... 1

2. Factors that impact QoE ................................................................................................................................................... 2

Encoding Profile ................................................................................................................................................................. 2

Network Conditions ........................................................................................................................................................... 3

Devices and Players ........................................................................................................................................................... 4

Combinations of Factors .................................................................................................................................................. 5

3. Measuring QoE – what are the output metrics? ........................................................................................................ 6

4. What does the ideal tool need to do? ........................................................................................................................... 7

5. Framework design ............................................................................................................................................................... 7

6. Video capture and analysis .......................................................................................................................................... 9

7. Summary ........................................................................................................................................................................ 10

1. Introduction EurofinsDigitalTestinghasbuiltapurposedesignedframeworkforconductingvideostreamingQualityofExperience(QoE)analysis.Inthiswhite-paperwedescribetheproblemanalysis,requirementsdefinitionanddesignprocessthatresultedintheresultingtoolthatwehavesinceusedonmultipleprojects.

OurbasicgoalwastoplaybackarangeofvideostreamsonarangeofdevicesandmeasuretheresultingQoE.However,thisisnotassimpleasitsounds.Inthisarticlewestartbyexaminingallthefactors(orinputs)thatcanultimatelyaffecttheenduser’sQoEwhenwatchingstreamedvideo,andhowtheseinputscanberepeatedlycontrolled.

WethenconsiderwhatQoEmetrics(oroutputs)shouldbemeasuredandputthisaltogethertodeduceasetofrequirementsthatshouldbemetbyanidealQoEanalysisframework.

Finally,wedescribehowweengineeredaframeworktomeettheserequirements,resultinginatoolthatcanmeasureQoEfordifferentcontent,networkconditionsanddevices/appsinawaythatisautomatedandeasilyvisualisedwhileprovidingdeepofflineanalysiscapabilities.

Page 2: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

2. Factors that impact QoE ThefactorsthataffectQoEcanbesummarisedinthefollowingpicture:

Figure1:OverviewoffactorsthatimpactQoE

Broadlyspeaking,thesefactorscanbeputintothreecategories:encodingprofile,end-to-endnetworkconditions,andthedevice/applicationthecontentisplayedbackon.Let’sconsidereachoftheseinturn.

Encoding Profile

Forourpurposesan“encodingprofile”referstotheentiresetofaudiovideodataandparametersdescribingasinglecontentasset(containingaudioandvideo,inmultipleadaptationsetsinthecaseofABRcontent).ThisistypicallystoredasasinglefragmentedMP4filebutdoesn’thavetobe.Anencodingprofilecanvaryinmanyways:

a) Contentcharacteristics–e.g.sport/movie/news/cartoonb) Codec–e.g.H.264/H.265/VP9/AV1c) Encodervendorchoice–identicalcodecscanhaveverydifferentperformanceaccordingto

theencoderusedd) ABRformat:

• HLSvsMPEG-DASH• Segmentlength

e) Numberandbitrateofrepresentationsorvariants–i.e.thedesignofthebitrateladderf) Videoresolutiong) Videoframe-rateh) Otherelementarystreamencodingparameters,including:

• Profile/Level• GOPstructure• CBR/VBR

Typically,higherbitratesareusedforhigherpictureresolutionsandframeratesandareimplicitlyassociatedwithhigherQoE.However,it’salsowidelyunderstoodthatthereisanon-linear

Page 3: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

relationshipbetweenbitrateandvideoqualityforagivenvideoresolutionandthat,aboveacertainthreshold,

simplyincreasingthebitrateavailabletoencodethevideoresultsininsignificantincreasesinpicturequality.Thismeanstherearesituationswhere,ifthereismorebandwidthavailable,simplyallocatingmorebitratebaseduponthecurrentprofile/resolution/frame-rateusedintheencodingprofilemaynotresultinnoticeablyimprovedQoEand,instead,perhapsahigherresolutionpictureshouldbeoffered.Ofcourse,mostoftheabovevideofactorshaveanaudioequivalent.FormostAVcontent,thevideobitrateisanorderofmagnitudegreaterthantheaudiobitratesohasgreaterimpactonQoE,whichiswhywefocussedonvaryingandcontrollingvideocharacteristic

Network Conditions

InanABRstreamingscenario,theselectionofvideostreamtoconsumefromasetofvideostreamsofvaryingbitratesisbasedontheprevailingnetworkconditionsatthetimeofviewing.Fluctuationsinthenetworkconditionscancausethequalityoftheviewingexperiencetodegradeorimprove.Thereisnosimplemodeloftheeffectdifferentnetworkimpairmentshaveonanygivenviewingeventsincethealgorithmusedbydifferentplayerswillresponddifferently,andtheimpactonqualityisoftenhighlynon-linear.

Thesignificantfactorsthatimpactnetworkconditionsforoneclientplayingonestreamincludethefollowing:

a) Effectivebandwidth/throughput(Mb/s)b) Packetlatency,jitter,reorderingandlossc) HTTPconnectionerrorrate

Allstagesintheend-to-endnetworkconnectionarerelevant:theperformanceoftheoriginserver,theCDN,theISP’slastmileconnection,andtheconsumer’shomenetworkallcombinetodeterminetheoverallnetworkconditions.Ourgoalwastobeabletomodelandcontroltheend-to-endnetworkcharacteristics,notanyindividualstage.

Thefactorslistedaboveallvaryovertimeandthenatureofthisvariationisacriticalfactorinmodellingreal-worldbehaviour.Weareabletosimulateawiderangeofbehaviourthroughaccesstoaverylargeandrichdatasetofreal-worldend-to-endnetworkconditions.

SincethemajorityofABRvideoistodaycarriedviaHTTPoverTCP,weassumedthatpacketlevelbehaviour(i.e.thesecondbulletpointabove)ishiddenbytheTCPlayerandweconstrainedjusttheeffectiveavailablebandwidth,ratherthanindividuallycontrolledfactorssuchaslatencyandjitter.InthecaseofmanyCDNstheHTTPconnectionerrorrate(i.e.thenumberoftimesaclientreceivesanHTTP404responseandhastore-requestasegment)issosmallthatitdoesnothaveamaterialimpactonQoE.

Thetimevaryingbitratethroughputavailabletotheclientwecallthe“networkprofile”

Page 4: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

Devices and Players

ThedeviceandviewingenvironmentusedtowatchthecontenthasaclearimpactontheQoEduetothefollowingfactors.

a) Decodercapabilities–formatsupportb) Deviceresourceload(CPU/memory)c) Displaysized) Displayresolutione) Otherdisplayproperties:

• Pixelrefreshspeed• Displayfilteringand“pixelimprovement”algorithms• Brightness• Viewingangle

f) ABRimplementationandbufferingalgorithmg) Networkstackcharacteristicsh) Viewingdistance&lightingconditions

Factorsa)andb)arecharacteristicsofthedevicesprocessinghardware,whilec),d)ande)arecharacteristicsofthedevice’sdisplayhardware.

Sometimethecontentmightnotbesupportedbythedeviceatallbecausethemediacodeclevelofthecontentisnotsupportedbythehardwarecomponentswithinthedevice(especiallywhenconsideringmobiledevicessuchasphonesandtablets).ThisisaparticularproblemonAndroiddeviceswherethemandatedlevelofcodecsupportisquitelowrelativetootherdevices,butitisknownthatmanydeviceshavehigherlevelsofcodecsupport.

Whiletheend-userdevicehardwareanddisplayitselfisobviouslyassociatedwithaviewer’sQoE,theimpactofthevideoplayerclientsoftwareusedtowatchthecontentisoftennotconsideredexplicitly.Factorsf)andg)abovearebothdeterminedbythedevice’ssoftware.Theplayersoftware,whetherattheapplicationlayerorpartofthebuilt-inmediaplayer,hasamaterialimpactontheQoEbecauseinABRvideoconsumptiontheplayerisresponsibleforchoosingwhichvariantbitratetostream,whentoswitch,andwhatbufferingstrategytoadopt.ABRalgorithmscanvaryveryconsiderablybetweenplayerapplicationsrunningonthesamedevice.

UnderstandinghowthesoftwareimpactstheQoEandaffectsthechoiceofencodingprofilesiscomplicatedbythefactthattherearedifferenttypesofplayeragentsoftwareandtheabilityofanOTTserviceprovidertoaffectthebehaviouroftheplayerdependsonthetypeofplayerinuse.Broadlyspeakingtherearetwoprimarycategoriesofplayer:

Embedded.Thisiswherethechoiceofbitratetostreamandpresentisdeterminedbyanunderlyingcapabilityoftheplatformtheplayerapplicationisrunningon.InthecaseofAppleiOS,HLSplaybackandbitrateselectionarecontrolledbyiOSitselfandtheappauthorcan’taffecttheadaptationbehaviourdirectly(ignoringoneortwoexceptionalcases,suchasNetflix).InthecaseofAndroid,thereisnativeHLSplaybackavailableviatheMediaPlayerAPI,butitislimitedbasedonthespecificversionofAndroid.Anotherexamplewouldbefor

Page 5: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

HbbTVdeviceswhichsupportMPEG-DASHviaprovidingthemanifest(MPD)directlytoanHTML5videoelementandtheunderlyingmiddlewarecontrolstheadaptationalgorithm;inotherwords,althoughit'sanHTMLapplication,HTMLMediaStreamingExtensions(MSE)arenotused.

Applicationcontrolled.Thisiswheretheserviceprovider’sapplicationitselfcontrolstheselectionofthebitrateandhasanalgorithmthatdetermineswhentoswitchprofile.Examplesinclude:HTMLbasedapplicationsusingMSE,e.g.theopensourcedash.jsandHLS.jsplayers:nativemediaplayerlibrariesthatcanbeembeddedintheapplication,e.g.theopen-sourceExoPlayerapplicationforAndroid;andcommerciallyavailablethird-partyplayerapplicationsthattheserviceprovidercancustomisetheUIfor.

InthisproposalthecombinationofdeviceandplayersoftwareisconsideredasasinglevariabledimensionwhenconsideringimpactonQoE.

Combinations of Factors

AllofthefactorsdescribedwithinthesethreecategoriescometogethertoaffecttheQoE.TheycanbevisualisedasdimensionsofsearchspacewhereeachpointinthatspacecorrespondstodifferentinputconditionsthatwillresultindifferentoutputQoE.

Inputconditions ResultingoutputQoEmetricsDevice/AppCombinations

EncodingProfiles

NetworkProfiles

QoEmetric1

QoEmetric2

DA1 EP1NP1 …

NPo

…NP1 … NPo

EPnNP1 … NPn

EP1NP1 … NPo

…NP1 … NPo

EPnNP1 … NPn

DAm

EP1NP1 … NPo

…NP1 … NPo

Figure2:AsearchspacerepresentationofthefactorsthatimpactQoE.Thetablerowsindicatethetestrunresultsfor‘m’device/appcombinations,‘n’encodingprofilesand‘o’networkprofiles.

NetworkProfiles

Device/Apps

Encoding Profiles

Page 6: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

Abriefconsiderationofallthefactorslistedabovequicklysuggeststhattestingallpossiblecombinationsleadstoavastsearchspaceduetothecombinatorialexplosion.Forexample,thefollowingexperimentwouldresultin25,000testruns!

• 50encodingprofiles,from,say,5contenttypes,2codectypes,5bitrateladderdesigns.

• 10networkprofiles• 50device/playercombinations,from,say,10deviceswith5differentweb-player

applications

Evenarelativelynarrowexperiment–e.g.examiningwhichsoftwareplayerondeviceXperformsbestforaspecifictypeofcontentandcodec–canrequirehundredsoftestrunstoenablestatisticallyrobustevaluation.Forthisreason,enablingefficientandrepeatablerunningofalargenumberoftestrunswasafundamentalrequirementofourtool.

3. Measuring QoE – what are the output metrics? ThefollowingmeasuresareallrelevanttoassessingtheQoEofstreamingvideo.

• Videostart-uptime:timefrominitiationofplaybacktofirstvideoframesbeingdisplayed.• Bufferingevents:thenumber,durationandlocationofeventsduringplaybackwherethe

videoispausedornotshowingwhilefurthersegmentsareacquired.• Switchingevents:thenumberandlocationofeventswheretheplayeradaptsfromone

bitratetoanother.• Frame-by-frameobjectivevideoquality.Tocalculatethis,weusedtheSSIMPLUSalgorithm–

afull-reference,objectivevideoqualitymetric.Wemadethischoiceafterevaluatinganumberofalternatives,includingVMAF,andthereasonforthechoiceofSSIMPLUShasbeenexplainedinaseparatearticle.

• Audioquality,includingaudio-videosync.

MeasuringABRQoEisanactiveandrelativelyimmatureresearchtopicandthereisnoagreedwayofcombiningthesemeasurementstoasimpleoverallQoEmetric.Worse,forsomeofthemetricsresearchexperimentshavecontradictoryconclusionashowevenasinglemeasureimpactsQoE.Forexample,considerthefrequencyandvisibilityofswitchingevents.Someresearchershaveconcludedthatswitchingeventsshouldbeavoidedasfaraspossibleleadingto“widelyspaced”encodingladderswithfewerrepresentations.OtherresearchershaveshownthatbarelyperceptibleswitchingeventshavelittleimpactonQoE,leadingto“tightlypacked”encodingladderswithahighernumberofrepresentations,whereeachadjacentrepresentationhasaqualitydifferencethatisonlyjustdistinguishable.Inpracticeitseemsthatencodingandstoragecostsarethemoreimportantfactorindeterminingthischoice:OTTproviderswithhighvalue,relativelystaticcatalogues(e.g.Netflix)willprefer“tightlypacked”ladders,whereasthosewithfastturnoverorlivecontent(e.g.broadcastercatch-upservices)arelikelytoprefer“widelyspaced”ladders.

Page 7: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

TheimpactonQoEofthenumberanddurationofbufferingeventsisequallypoorlyunderstood.Whilelessbufferingisobviouslydesirable,thereisnoconclusiveresearchonwhether,say,5x6sbufferingeventsisbetterorworsethanasingle60sevent.

4. What does the ideal tool need to do? So,havingevaluatedtheimpactfactorsthatneedtobevaried(networkconditions,encodingprofiles,device/players),andtheoutputmetricsthatneedtobemeasured,wearrivedatthefollowingsetofrequirementsforouranalysisframework.

• Automated.Thisistheonlywaytocostefficientlyrunhundredsofexperiments:itissimplynotviabletousemanualexecution.Additionally,measuringQoEaccuratelyandinarepeatablefashionmakestestautomationthebestchoice.

• Deviceagnostic.Theframeworkmustbeusableonanytypeofdevice(set-top-box,HDMIstick,mobilephone,tablet,computer,smartTV)andawidevarietyofscreentypeswithoutrelyingonframecaptureandanalysistechniquesthatworkonsomedevicesbutnotonothers.

• Contentandencoderagnostic.Wewantedthesolutiontobeagnosticofthecodec,encodingandpackagingplatformwithoutrelyingonparticulartricksorfeaturesofaparticularencoderpipeline.Thiswastomakesurewecouldevaluatedifferentcodecs,encodersandABRformatswithouthavingtore-architecttheframework.

• Networkreproducibility.Topreciselyandrepeatedlycontrolnetworkconditionsmeanstheframeworkhastorunwithinadedicatedlocalnetwork,whereavarietyoftypicalnetworkconditionscanbeaccuratelyreproduced.

• Powerfulresultsvisualisation.Allthemeasurementsmustbeautomaticallycapturedintoaspreadsheettoallowdetailedandcomprehensiveanalysis.However,wealsowantedtoprovideeasy,visualandintuitivebrowsingofindividualtestruns,allowingtheusertoseehowmetricsvaryovertimewhilealsobeingabletoseethecorrespondingvideothatwasbeingdisplayedontheclientdevice.Itwasalsoimportanttoenabledeepanalysisoftheresultswithouthavingtogobackandre-runaparticulartest–meaningthatallrelevanttestdate,suchasnetworktrafficlogs,shouldbestoredforlateruse.

• Accurate.Theframeworkmustofferframeaccurateanalysisoftheresultswhendeducingmetricslikeamountofbufferingtime,switchingeventsandstart-uptimes.Gatheringthesemetricswithsuchhighprecisionisnottrivial.

5. Framework design Theaboverequirementsresultedinthefollowingframeworkdesign,showninthediagramonthenextpage.

Page 8: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

Figure3:Schematicrepresentationoftestframework

Eachtestrunconsistsofplayingbackasinglepieceofvideocontentonasingletestdevice,whilemodifyingpropertiesofthenetworklinkbetweenthevideoserverandthedevice-under-test.Atestruncanbebrokenintothefollowingsteps:

• UsingAppium,SeleniumorIR,thedeviceisdriventoruntheappandplayavideofile.Theprecisesequenceofcontrolstepsdependsuponthedeviceinquestionandtheappbeingused.Carefultimesynchronisationismaintainedbetweenthedevicecontrolscript,thenetworkemulator,andthevideocaptureunit.

• Thenetworkprofileshapingscriptisexecutedwhichvariesthebandwidthforthedurationofthetestasrequired.Thenetworkshapingunitisalsoresponsibleforloggingallnetworktrafficduringthetests.

• ThecameraorHDMIcapturerecordsthedisplayscreenofthedevice-under-testfortwicetheknowndurationofthecontent(toallowforanybufferingevents).Therecordedvideoisstoredforeverytestrunforsubsequentanalysisandforusewithintheresultsvisualisation.

• Thedevicestartstoplaybackthelocalvideocontentandcontinuesuntilplaybackiscomplete.• Therecordedvideoisstoredandanalysedoffline,givingalistofalltheframe/representations

playedateverypointoftimeduringthecontentplayback.• Thisdataisusedtocalculateasetofqualitymetrics(listedabove)andstorethemina

spreadsheet.AtthesametimeanHTML/JavaScriptvisualisationofalltheresultsiscreatedandhostedinthecloudtoenablevisualisationandanalysisoftheresultsfromanylocation.

TestRunDescription

Device/appscript

Encodingprofile

Networkprofile

NetworkShaping

VideoCapture

VideoServer

DeviceControl

Testcontrol&sync

ResultsAnalysis

ViaIR,AppiumorSelenium

LAN

Devicesundertest

Camera

HLSsegmentsoverHTTP

Resultsandvisualisation

HDMICapture

Page 9: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

Thesestepsarethenrepeatedforeachtestrun,withtheURLofthevideocontentandthenetworkprofileadjustedtowhatisrequiredforthattestrun.Betweeneachtestrunthedeviceisrebootedandtheapplication

isrestartedtoensurethatallcachesarecleared.

6. Video capture and analysis EachsourcevideotypehastwosmallQRcodeembeddedinthesourcecontentthatvariesforeveryframeandforeveryrepresentation/variant.UsingacamerarecordingorHDMICapturefromthedevice/appplayingthestreamedvideo,werobustlyandautomaticallydetecteveryframethatwasplayedoutonthedevice/appundertestanddeterminewhichrepresentationwithintheABRformattheframewastakenfrom.Thisallowsfordetectionofdroppedframes,blankframesorbuffering,repeatedframes,anddetectionofswitchingevents.

ThecapturedvideoandcontainedQRcodesareonlyusedtocalculatetheexactframethatisbeingdisplayedatagivenmomentintime.Thequalityofthevideodisplayedbythedeviceisnotdeterminedusingtherecordedvideo.Instead,eachencodingofthesourcecontentispre-processedtodeterminetheSSIMPLUSscoreofeveryframewithineveryrepresentation/variant.WhenanalysingthetestrunthisSSIMPLUSscoreisthenlookedupforeverycapturedframe,meaningthatthequalityoftherecordedvideohasnoimpactontheSSIMPLUSscores.

Theresultsetforeachtestrunisveryrich,containinginformationabouthowthedisplayedframe,SSIMPLUSscore,availablebandwidthandutilisedbandwidthvaryovertime.Tofacilitateinterpretationofthisinformation,displayedalongsidethevideoandtheinferredQoEmetricstheresultscanbepresentedinaninteractivemannerinanywebbrowser.Byhoveringthemouseoverthetimeline,theusercanseethevalueofvariouspropertiesatthatmomentintime,aswellastheprecisevideoframethatwascapturedatthattime.

Figure4onthenextpageshowsatypicalresultsvisualisationforasingletestrun.

Page 10: Designing the Ideal Video Streaming QoE Analysis Tool 140618

WhitePaper

Follow us on

Figure4:Atypicalresultsvisualisationforasingletestrun

7. Summary OurQoEanalysistoolhassucceededinfulfillingourdesignrequirementsandhasbeensuccessfullyusedtoevaluatevideostreamingQoEperformanceacrossmultipleprojects.Thishasinvolvedmorethan20device/appcombinations,10encodingprofilesandover10networkprofiles;leadingtohundredsofindividualtestruns.AlreadythishasledourcustomerstohaveagreaterdeeperinsightintohowtheirsystemchoicesimpactQoE–inmanycasesreachingconclusionsthatarenon-obviousandcounter-intuitive.

Ifyouwouldliketoseeademonstrationofthesystem,[email protected],youcanexploreasubsetoftheresultsbyvisitinghttps://ott-qoe.eurofins-digitaltesting.com

Page 11: Designing the Ideal Video Streaming QoE Analysis Tool 140618

Follow us on