Weighing the world one click at a time George Michaelson [email protected]


“Weighing” the world one pixel at a time

How Science works
•  Sometimes science is doing experiments and seeing what happens
“An experiment on a bird in an air pump” by Joseph Wright of Derby
Next week we can put your little brother in the vacuum chamber
•  We don’t put tiny birds (or brothers) in vacuum chambers
  –  We put 1x1 pixels out into the browser and see who can fetch them
  –  It’s not as much fun but it’s a lot less messy
  –  We’re doing around 10,000,000 experiments/day

How Scientists work
•  Sometimes Scientists work by thinking.
“Et plurima mortis imago” by William Hogarth 1736
Jeez, I wish one of us had a clue what was going on

Science is talking
•  So it’s important we talk about how we measure, get some sense of the caucus around what measurement is being done, how it works.
•  So this meeting? This is Scienting!

A typical APNIC Labs talk

Another typical APNIC Labs talk
H-Bomb this way

A usual APNIC Labs talk
What we’re aiming for
I think the BBQ is ready now

After the talk…
We didn’t get cake like this at the cocktails last night

The usual APNIC Labs Talk
•  We’re ‘exploding bombs’ about worldwide IPv6 uptake, or exploring DNSSEC, putting stories out there, communicating about the results.
•  BOOM! That was fun. Let’s do another.
•  BOOM!!!!!!!

This APNIC Labs talk
•  ..Is about how we reflected on what other people are doing measuring IPv6, and other qualities in the Internet, and what we changed as a result, and why.
•  It’s a look ‘under the covers’ at how we’re doing things. No explosions alas.
ME
“When did you last see your father” 1878 William Frederick Yeames
Please, don’t ask…

Why don’t your figures agree with Google
Worst case: 30 minutes of my life I will never get back

Where was I…

APNIC 1x1 tests
•  Embedded ads primarily shown in youtube
  –  Written in HTML5
  –  Shown worldwide continuously
•  The HTML5 invokes javascript “getURL()” calls
  –  Tests DNS, Web fetch
  –  DNS determines if v4 only, v6 only, dual stack
•  Packet capture at head for DNS, web traffic
  –  120Gb/day of logs
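One way to picture how the fetch outcomes might be turned into a per-client result is sketched below. The record layout, field names and the exact definitions of "capable" and "preferred" are assumptions made for illustration; the slides only say that the DNS name of each test decides whether it can be reached over v4 only, v6 only, or either.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientResult:
    """Outcome of the 1x1 fetches for one client (hypothetical layout)."""
    fetched_v4_only: bool              # object whose name has only an A record
    fetched_v6_only: bool              # object whose name has only an AAAA record
    dual_stack_family: Optional[str]   # "v4", "v6", or None if that fetch failed


def classify(result: ClientResult) -> dict:
    """Assumed definitions: 'capable' means some fetch succeeded over IPv6,
    'preferred' means the dual-stack object was fetched over IPv6."""
    preferred = result.dual_stack_family == "v6"
    capable = result.fetched_v6_only or preferred
    return {"v6_capable": capable, "v6_preferred": preferred}


# A client that can reach the v6-only object but picks v4 when given a choice.
example = classify(ClientResult(True, True, "v4"))
```

In this sketch a client can be counted as capable without being preferred, which is the distinction the later per-ASN table draws between its two IPv6 columns.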

Paying dues
•  APNIC 1x1 measurement system 2010 -> present
•  We got our basic techniques from a range of sources
  –  Emile Aben RIPE NCC (javascript, ideas)
  –  Tore Anderson
  –  Jason Fesler
  –  ???
•  Visualisation model shamelessly copied from Cisco!
  –  Eric Vyncke
  –  Google chart API for timeline, growable model, RESTful API

Eric.. Belgium..
Hergé…
Those clever APNIC Labs people have good IPv6 data! It’s JSON
If I could get enough points of measurement maybe they’d agree!
CGN/NAT Device
Moar IPv6!!!
How big is a CGN these days? Maybe it needs a Cisco Nuclear Reactor to run?

APNIC Measurement system reprise
•  Paid adverts (previously flash, now HTML5)
  –  Placement is “mostly random” but consistently good unique IPs per sample day
  –  With HTML5 gets mobile devices, cellular
•  Tests qualities in DNS, TCP, IPv4/IPv6
  –  Making inferences about pMTU, NAT/CGN

Paying Dues #2
•  Original advert feed was 2 buckets
  –  Huge peaks in advert serve
•  Berlin IETF, Cisco 6labs interns present
  –  Pierre-Alain Dupont & Nicholas Looss
•  Unmask time-bias in APNIC measurement on two ‘buckets’
•  Unmasks systematic population over-sample issues
  –  “do more feeds” is simple

This is how it used to behave..
[Figure: time-of-day plot, 00:00 to 00:00 in 2-hour ticks, vertical axis 0 to 5000, one series per day for 22/Mar to 25/Mar]

Paying Dues #2
•  We deploy 12 buckets, with overlap
  –  Smoother data, more attuned to world behaviour
•  Catch-up feed to fill in the USA (more possible)

Thanks for the clue stick hit
•  Science advances by standing on each other’s shoulders (shoulders of giants: Newton)
  –  Computer science advances by standing on each other’s toes
•  Many thanks for the ‘feedback loop’ information from the Berlin IETF.
  –  We listened.

This is not rocket surgery
•  I very much hope at some level you are asking yourself
  –  Why am I here? Isn’t this obvious?
  –  It is reasonably obvious
    •  Except, nobody else seems to be doing it…

Our basic model
•  Assume random selection of users
•  Assume low repeat presentation rate
•  Adjust google advert placement model
  –  Preference views over clicks
  –  Avoid keyword selection bias (generic terms)

Flaws in the basic model
•  Random selection of users?
  –  Biased to youtube because bid-rate so low
  –  Artificial browser/device selection
•  Low Repeat rate?
  –  Emerging evidence of high repeat rates in some circumstances
    •  Is this NAT/CGN/Proxy?
    •  No cookie permitted in google ad framework

Flaws in the basic model
•  We have no fix for these at this time
  –  We need an independent placement regime outside google ads, to understand the qualities of bias google ads brings to the measurement
•  It’s possible we could bring a model of the distribution of NAT/CGN to the table if we could understand how many are out there
  –  We had a WebRTC and RTMFP measure but it was too intrusive
  –  We’re beginning to see signs of short NAT binding lifetime, which may be a useful signal

Flaws in the basic model
•  How’s that worldwide distribution working out for us?
•  Well.. We do get signal of different time-of-day outcomes by economy:
Uh-oh…
DZ == Algeria
This should NOT be second-top by sample count

Ok. Flaws noted. So..
•  Between economies, it’s heavily biased to “cheap” placements
  –  Volume of traffic by economy does not match any model we understand of Internet “dimension”
  –  Massive over-count in some smaller economies
  –  Massive under-count in some significant big ones
•  How to fix? Apply external world model

Our world model sources
•  ITU/UN population statistics
  –  Infrequent headcount of Internet ‘subscribers’
•  GDP models (OECD, other)
•  ‘Allow for growth’
  –  (Japan shrank. But we grow at a linear fixed birthrate worldwide)
  –  China found 150m extra users one year == 20%!!
  –  Ok. So it’s a crude model. Got a better one?

Weight the data
•  For each economy, assign a weight based on the model-defined relative volume expected in a truly random sample
  –  “more china than india, both more than USA”
  –  “less africa: they have people but low uptake”
•  This model is not a constant: it requires policing over time

Apply the weights
•  If we see 1/3 from Algeria, 2/3 from China
  –  …and the model predicted 1/3 from China, 2/3 from Algeria...
  –  Then re-factor by the ratio of seen to weighted
    •  ½ as much data from the chinese source
    •  2x as much data from the algerian source
  –  (in this example. It varies each day depending on the samples)
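The arithmetic of this example can be sketched as follows. The function name and the dictionary layout are hypothetical; the shares are the ones quoted on the slide, and the factor is taken to be the model share divided by the seen share.

```python
from fractions import Fraction


def reweight_factors(seen_share, model_share):
    """Per-economy scaling factor: what the model expects divided by what
    was actually seen. Economies with no samples are skipped."""
    return {
        cc: model_share[cc] / seen_share[cc]
        for cc in seen_share
        if seen_share[cc] != 0
    }


# Shares from the slide: Algeria (DZ) and China (CN).
seen = {"DZ": Fraction(1, 3), "CN": Fraction(2, 3)}
model = {"DZ": Fraction(2, 3), "CN": Fraction(1, 3)}

factors = reweight_factors(seen, model)
```

Here `factors["CN"]` comes out as one half and `factors["DZ"]` as two, matching the "½ as much" and "2x as much" on the slide. Since the seen shares change daily, the factors would be recomputed for each day's samples.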

Apply to % values
•  If in aggregate China was 10% v6 capable
•  Then carry 10% forward as the china contribution but at the correct weighted ‘intensity’ against all samples seen
•  This technique works for any subset of the world because the relative ratios are constant
  –  I.e. we can pre-calculate the weights for the world and re-use for any set of economies, region
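A minimal sketch of carrying per-economy percentages forward at their weighted intensity, next to the plain total(capable)/total(seen) ratio. The function names are hypothetical, and the sample counts and the 0% figure for Algeria are invented for illustration; only the 10% for China and the 1/3 : 2/3 model shares come from the slides.

```python
def raw_capable_pct(per_economy):
    """Unweighted: every sample counts equally, whatever economy it came from."""
    seen = sum(n for n, _ in per_economy.values())
    return sum(n * pct for n, pct in per_economy.values()) / seen


def weighted_capable_pct(per_economy, model_share):
    """Weighted: each economy contributes its own percentage, scaled by its
    model share. Dividing by the weights of the economies present is what
    lets the same world weights be reused for any subset."""
    total_weight = sum(model_share[cc] for cc in per_economy)
    return sum(model_share[cc] * pct
               for cc, (_, pct) in per_economy.items()) / total_weight


# {economy: (samples seen, % v6 capable)}; illustrative numbers only.
per_economy = {"DZ": (1000, 0.0), "CN": (2000, 10.0)}
model_share = {"DZ": 2 / 3, "CN": 1 / 3}

raw = raw_capable_pct(per_economy)
weighted = weighted_capable_pct(per_economy, model_share)
```

With these made-up inputs the raw figure is about 6.7% and the weighted figure about 3.3%, so the same samples can yield quite different headline numbers depending on whether they are weighted.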

An example re-weighting
This will reduce the massive over-count in Algeria

Weighty matters
Why don’t your figures agree with Google
Does nobody else weight their data?

Aha! moment
•  Google says 10%
•  We say 5%
  –  Because we adjust for a huge over-count and under-count seen in some economies, using a single consistent weight
  –  But (we think) google is comparing un-adjusted numbers to get a simple total(capable)/total(seen) ratio

Those browser, OS distortions
•  Google appear to have models of the ‘expected’ ratio of mobile device, desktop-device, android/iOS/Windows to be seen worldwide
  –  But we know this is probably not globally true
    •  South America have strong tax/currency barriers in place against Apple, and we should see significantly more Android than we do
    •  South Asia, China should see significantly less Apple than we do
  –  So what can we do?
    •  Nothing! We don’t have a model for expected ratio of devices

Weight the economy
•  Assuming Random placement, eyeball share inside a given economy is looking like a good approximation for market(ish) share of eyeballs
  –  Works in US, GB, FR, DE…
  –  Doesn’t work in KR. Why?

KR: top-10 by sample count

ASN      AS Name                                    IPv6 Capable  IPv6 Preferred  # Samples
AS4766   KIXS-AS-KR Korea Telecom                   0.02%         0.00%           1147955
AS9318   HANARO-AS Hanaro Telecom Inc.              0.02%         0.01%           357278
AS17858  KRNIC-ASBLOCK-AP KRNIC                     0.01%         0.00%           255188
AS3786   LGDACOM LG DACOM Corporation               0.01%         0.00%           118949
AS9644   SKTELECOM-NET-AS SK Telecom                23.66%        23.47%          114849
AS17853  LGTELECOM-AS-KR LG Telecom                 0.00%         0.00%           70558
AS16509  AMAZON-02 - Amazon.com, Inc.               0.00%         0.00%           43708
AS10036  CNM-AS-KR CM Communication Co., Ltd.       0.10%         0.10%           34424
AS17864  HANVITIAB-AS-KR Hanvit IB                  0.00%         0.00%           13124
AS17839  DREAMPLUS-AS-KR Dreamcity Media            0.01%         0.00%           9416

…Because google is seeking devices?
•  Maybe the KR market cannot deliver iOS, Android at levels expected
  –  50%+ of the Internet is on mobile/cellular
  –  Significantly less than expected is on home cable
  –  Huge deployment of CPE in home behind old NAT model
•  For whatever reason, the KR eyeball share is severely distorted, on google’s advert feed
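To see how sensitive the KR figure is to the spread of samples across networks, the top-10 table can be collapsed into a sample-count-weighted aggregate. This is only a sketch over the ten rows shown, not the full KR data set, and the variable names are hypothetical.

```python
# (ASN, IPv6 capable %, samples) copied from the KR top-10 table.
KR_TOP10 = [
    ("AS4766", 0.02, 1147955),
    ("AS9318", 0.02, 357278),
    ("AS17858", 0.01, 255188),
    ("AS3786", 0.01, 118949),
    ("AS9644", 23.66, 114849),
    ("AS17853", 0.00, 70558),
    ("AS16509", 0.00, 43708),
    ("AS10036", 0.10, 34424),
    ("AS17864", 0.00, 13124),
    ("AS17839", 0.01, 9416),
]

total_samples = sum(n for _, _, n in KR_TOP10)

# Aggregate capability if every sample counts equally.
aggregate_pct = sum(pct * n for _, pct, n in KR_TOP10) / total_samples

# Share of samples contributed by each ASN (the "eyeball share" as sampled).
sample_share = {asn: n / total_samples for asn, _, n in KR_TOP10}
```

Over these rows SK Telecom (AS9644) supplies only about 5% of the samples, so the aggregate sits a little over 1% even though that one network is above 23%. If its true share of KR eyeballs is larger than its share of the advert feed, the aggregate is an under-estimate, which is the distortion the slide describes.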

..so find independent model
•  IF we had a model of relative weight per ASN
  –  We could re-weight the data per economy by ASN relative weight, as we do for Economies to Regions/World
•  IF we had a model of relative weight per device/OS per economy
  –  We could re-weight the data per economy yadda yadda
•  So it looks like what we want is good weighting models. Parametric weighting models we can agree on

TM Net Malaysia
•  No technology bias: whatever IPv6 mechanism is being used, is neutral to device class and OS.

SK Telecom Korea
•  Strong bias against iOS, in favour of mobile devices
•  Provider is using 464XLAT. iOS doesn’t play (so runs on v4 only APN)
•  Reasonably constant long-term ratio of iOS to Android
•  But how do we know this reflects actual handset use in this telco?

If models, what parameters?
•  By Government Fiat or Regulation?
  –  Painful, but a statutory reporting schedule might help
  –  E.g. G20/OECD quarterly reports on subscribers?
•  By indirect methods
  –  Packet flows which show relationships to subscribers
  –  Volumes seen at IX sensitive to subscriber volume
    •  This is potentially tractable but invasive
    •  Off-exchange traffic would distort severely (private peering)
•  Or.. Find other sources of data and try and account for the distortions. Which.. Is kind-of what Eric is doing.

Telescopes on the size of IPv6
If I could get enough KINDS of measurement maybe they’d CONVERGE!

Telescopes often have very big bills….
Our telescope is pretty cheap: order 1m measurements for $20 per day

Maybe we need new goals?
•  Find independent sources for the qualities of the global Internet we can agree on in parametric terms:
  –  Spans economies
  –  Spans device types
  –  Can supply volume, random eyes at scale
•  Akamai, Cloudflare backed services?
•  Wikipedia? What are the ubiquitous services

Maybe we need new questions?
•  How many devices per person is ‘normal’ now?
  –  I’m carrying 5. I left another 2-3 at home.
    •  They’re switched on
  –  Japan, Korea, China is now 2.x per person
•  Web services mean web measures may be measuring applications which never see a human

Maybe we need new questions?
•  How many NATs and CGNs?
  –  Unpublished measurements in webRTC() suggest that over 95% of all end users lie behind some kind of NAT
  –  Emerging volumes of ICMP which may relate to the ubiquity of CGN and NAT with short binding lifetime
•  Number of devices per IP seen on the Internet
  –  The {src, dst, port…..} 5-Tuple is now very important

So much easier with IPv6
If we put the pigeon inside the CGN do we get pot au feu?

IPv6 address dynamics
•  Dave Plonka, Akamai, is doing novel work in the distribution functions of the IPv6 addresses seen
  –  Heuristics of address assignments
  –  Detection of functional boundaries of IPv6 address deployment
•  Mapping IPv6 to IPv4 address convergence?

You (EP) guys have skills here
•  There is a lot of competency here, that can explore this space
  –  You’ve already made contributions in this space
•  Science is like zombies
  –  It needs fresh minds to consume
  –  “new blood” is welcome (and tastier)

BIPM Bis!!

Measurement continues in Paris….

Thank You!