3rd Assignment Remarks

  • 8/12/2019 3rd Assignment Remarks

    Remarks Mark /15

    20083695 5

    20120786 9

    20120959 8

    20120969 9

    20121109 8

    20121287 13

    matricule

    The number of correct decimal points is of minor importance in this project, since Monte Carlo is not an efficient method for calculating Pi. The focus of this project was the expected/measured speed-up of a concurrent program vs a serialized one and the factors that affect this speed-up. Prefer System.nanoTime() over System.currentTimeMillis(). You should preferably report the speed-up over the sequential execution rather than the raw results. Effect of rounds is missing. Optimal helpers-per-bucket ratio study is missing. Very poor analysis.

    Comparison to the single thread, lockless benchmark scenario missing. "But it becomes slower if there are more buckets than there are helpers, because when we want to count the number of hits and the number of samples, there are more buckets to access." No, if you have more buckets than helpers, the additional buckets are simply not used. No multiple rounds scenario studied. Prefer System.nanoTime() over System.currentTimeMillis(). Extremely poor analysis.

    According to your measurements, calculating 10000 points in a single-helper run with batch enabled is faster than calculating 100 points with 4 helpers and batching enabled. How would you explain that...]: it is not obvious and could easily be demonstrated. Last figure is out of bounds.

    Comparison to the single thread, lockless benchmark scenario is missing. Why does batching lead to faster execution time? Insufficient explanation. Testing only for 1-4 threads is a poor test space. What happens when the number of threads >> number of cores? In the number of buckets section, what is the number of helper threads? Poor analysis. The number of rounds used is not clear in your report. The number of correct decimal points is of minor importance in this project, since Monte Carlo is not an efficient method for calculating Pi. The focus of this project was the expected/measured speed-up of a concurrent program vs a serialized one and the factors that affect this speed-up. Prefer System.nanoTime() over currentTimeMillis() for time measurements.

    Comparison to the single thread, lockless benchmark scenario missing. The number of correct decimal points is of minor importance in this project, since Monte Carlo is not an efficient method for calculating Pi. The focus of this project was the expected/measured speed-up of a concurrent program vs a serialized one and the factors that affect this speed-up. The correlation of the number of threads and critical sections is not clear in your report. Batch: a series of jobs that a computer does as a set. Bach: German composer. "The number of buckets represents the number of separate memory blocks.": no. "The strength of the number of buckets is greater when...": incomprehensible sentence. The section "The influence of the bach parameter" is incomprehensible and so is table 3. No multiple rounds scenario studied. Prefer System.nanoTime() over System.currentTimeMillis().

    Comparison with the benchmark case of single threaded, lockless execution is missing. The effect of multiple rounds is not included in your report. What is the program behaviour as the number of threads (points allocated to each worker) increases (decreases)? How does the measured throughput increase? Linearly, linearithmically, exponentially? Is the 16-thread scenario an outlier or the beginning of a pattern? More threads were necessary to prove your point. Including the set-up phase in the time measurements spoils your results.

    20122772 7

    20126471 3

    20071937 2

    20126028 2

    20120864 9

    The granularity of the time measurements is of importance since it affects the accuracy of the measurements, especially for a small input. There is no such thing as a "more distant variable". When batching is set to false, the monitor of the object is acquired and released more frequently and, as a result, the additional cost becomes more evident. Comparison with the benchmark case of single threaded, lockless execution missing. In your report you actually have two threads in the benchmark case: the main thread and a worker thread. In the second part of your report, you should have directly reported the speed-up of the concurrent executions. In your second table, columns 3 and 4 exhibit similar behaviour to columns 1 and 2 respectively, because you cannot assign a worker thread to more than one bucket and thus, if the number of buckets is greater than the number of workers, these buckets are not used. In your report it is not clear whether a fixed number of points is calculated per round, or what this number is. The effect of many workers per bucket is not examined in your report.
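The remark above on how disabling batching makes the acquire/release of the monitor more frequent can be illustrated with a small Java sketch. This is not the assignment's code: the class BatchingSketch, its Bucket, the entries counter and the seed parameter are hypothetical names chosen for illustration, and a single thread is used so that the counts are deterministic.

```java
import java.util.Random;

final class BatchingSketch {
    /** Hypothetical shared bucket; every call to add() enters its monitor once. */
    static final class Bucket {
        private long hits;
        private long samples;
        private long entries; // how many times the critical section was entered

        synchronized void add(long hitCount, long sampleCount) {
            hits += hitCount;
            samples += sampleCount;
            entries++;
        }

        synchronized long hits() { return hits; }
        synchronized long samples() { return samples; }
        synchronized long entries() { return entries; }
    }

    /** Draws n points in the unit square and records those inside the quarter circle. */
    static Bucket sample(long n, boolean batch, long seed) {
        Random rng = new Random(seed);
        Bucket bucket = new Bucket();
        long localHits = 0;
        for (long i = 0; i < n; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            long hit = (x * x + y * y <= 1.0) ? 1 : 0;
            if (batch) {
                localHits += hit;      // no lock taken inside the loop
            } else {
                bucket.add(hit, 1);    // one monitor acquire/release per sample
            }
        }
        if (batch) {
            bucket.add(localHits, n);  // a single monitor acquire/release at the end
        }
        return bucket;
    }

    public static void main(String[] args) {
        Bucket unbatched = sample(100_000, false, 42L);
        Bucket batched = sample(100_000, true, 42L);
        System.out.println("unbatched entries: " + unbatched.entries());
        System.out.println("batched entries:   " + batched.entries());
        System.out.println("estimate of Pi:    " + 4.0 * batched.hits() / batched.samples());
    }
}
```

With the same seed both modes count the same hits; only the number of times the critical section is entered differs, which is the cost the remark refers to.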

    Requested: a 2 page report. Submitted: 234 pages in total! "I decided not to make any graphics because they didn't look clear enough to me, as the differences of time are pretty big": use a logarithmic scale. Comparison with the benchmark case of single threaded, lockless execution missing. Numbers provide hard evidence to support your claims, adverbs do not. Incomprehensible report as to the number of helpers per bucket and the number of rounds. Complete absence of qualitative and quantitative measurements. Using more buckets than helpers is meaningless since every helper thread can be attributed to one bucket only.

    Comparison to the single thread, lockless benchmark scenario is missing. Incomprehensible, extremely poor analysis. What factors affect the optimal number of threads? The effect of the number of rounds was not studied.

    The results of your test should have been part of your report. Placing them in a separate spreadsheet makes it impossible to link them to the corresponding section. "When batch = false: it is better to use a number of threads = number of samples": so if we use 1025samples we would create 1025threads and each thread would calculate one point. This would thrash the system. "When batch = true: it is better to use a number of threads easy to synchronize without using too much buckets": this sentence is both grammatically and logically inaccurate. "calling more threads than the number of sample [...]": why would anyone do such a thing? "The number of buckets is at least equal to the number of threads": having more buckets than helpers is redundant since you cannot allocate a helper to more than one bucket. Incomprehensible report. Prefer System.nanoTime() over System.currentTimeMillis().

    Comparison to the single thread, lockless benchmark scenario missing. In the first graph, the speed-up is measured over what? Unclear point. The number of rounds per scenario is not mentioned in your report. The batch option reduces the contention for a bucket's monitor only if there is more than 1 worker allocated per bucket. Since you claim that you allocated one worker per bucket, it is not the contention for the monitor that is responsible for the additional execution time. Insufficient explanation of the surge observed in the second graph. The analysis of the performance of the bucket option is very poor. There is no point reporting more buckets than workers. You should have reported various helpers/buckets ratios. Prefer System.nanoTime() over currentTimeMillis() for time measurements. Including the set-up phase in the time measurements affects your results.
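On the last point, one way to keep set-up out of the measurement is to finish all preparation before reading the clock. The Java sketch below is only an illustration under an assumption: the remarks do not say what the set-up phase of a given submission contains, so here it is taken to be the construction of the helper Thread objects. TimedRegionSketch and timeHelpers are hypothetical names.

```java
final class TimedRegionSketch {
    /** Times only the start-to-join span of the helpers; returns elapsed nanoseconds. */
    static long timeHelpers(int helpers, Runnable work) {
        // Set-up phase (assumed): build every helper before the clock is read.
        Thread[] threads = new Thread[helpers];
        for (int i = 0; i < helpers; i++) {
            threads[i] = new Thread(work);
        }

        // Timed region: launching the helpers and waiting for all of them.
        long start = System.nanoTime();
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            joinQuietly(t);
        }
        return System.nanoTime() - start;
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for a helper", e);
        }
    }

    public static void main(String[] args) {
        long elapsed = timeHelpers(4, () -> Math.sqrt(12345.678));
        System.out.println("timed region took " + elapsed + " ns");
    }
}
```

Whether thread start-up itself belongs inside the timed region is a design choice; what matters for the report is stating which parts are included.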

    20131427 5

    20133008 11

    20135010 2

    20071290 9

    20073674 9

    The course is Programming Techniques, not Introduction to Networking. "These tests are performed on a 4 cores AMD processor and without others processes running on the computer": there are a dozen daemon processes launched by the OS running in the background. In section 2 your formula suggests that the footprint of the sequential part of the code is zero, which is an incorrect assumption. You do not provide a thorough description of the test case (total number of points, number of rounds, helpers per bucket). Finally, you do not explain adequately the behaviour of the curve when batch is disabled. The effect of buckets with batch enabled is not studied. Why does time reduce exponentially and not linearly? Insufficient analysis of the effect of helpers/bucket ratio. How does the total number of points affect the calculation? Comparison to the single thread, lockless benchmark scenario is not included in your report. Do not meddle with thread priority.
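The objection to a zero sequential footprint can be made concrete with Amdahl's law, a standard model that is used here as an illustration and is not necessarily the formula from the report in question. In the Java sketch below, AmdahlSketch, parallelFraction and workers are hypothetical names.

```java
final class AmdahlSketch {
    /**
     * Ideal speed-up predicted by Amdahl's law for a program whose parallelisable
     * share of the run time is parallelFraction, executed by the given number of workers.
     */
    static double speedUp(double parallelFraction, int workers) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || workers < 1) {
            throw new IllegalArgumentException("need 0 <= parallelFraction <= 1 and workers >= 1");
        }
        double sequentialFraction = 1.0 - parallelFraction;
        return 1.0 / (sequentialFraction + parallelFraction / workers);
    }

    public static void main(String[] args) {
        // A zero sequential part predicts perfectly linear speed-up.
        System.out.println("p = 1.0, 4 workers: " + speedUp(1.0, 4));
        // Even a small sequential part caps the achievable speed-up.
        System.out.println("p = 0.9, 4 workers: " + speedUp(0.9, 4));
    }
}
```

Setting the sequential share to zero makes the predicted speed-up equal to the number of workers, which is why that assumption cannot account for a measured curve that flattens.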

    Is the 217threads case always slower than a single thread execution, or does this boundary also depend on the amount of work allocated to each thread? Further analysis required. In your report you should also include that the execution time increases with the number of rounds because the parent thread operates in lockstep, i.e. it waits for all the helper threads to finish before it starts creating and launching new ones. You do not explain sufficiently the findings in your second graph. If you have 2 helpers per bucket and batching is disabled, it is faster than a dedicated helper per bucket. Why? You claim that the concurrent execution with batching disabled is always slower than a single-helper execution. Does this statement still stand with a greater number of samples? You could merge figures 3 and 4 and save some space. You do not save the returned value of pi. How does the number of points affect the execution time of each scenario? Do not meddle with thread priority.

    Comparison to the single thread, lockless benchmark scenario missing. Execution time under a varying number of threads and workloads is missing. Extremely poor analysis. Study case of helpers-to-bucket ratio missing. The effect of number of rounds is not studied. The number of correct decimal points is of minor importance in this project, since Monte Carlo is not an efficient method for calculating Pi. The focus of this project was the expected/measured speed-up of a concurrent program vs a serialized one and the factors that affect this speed-up. Prefer System.nanoTime() over currentTimeMillis() for time measurements.

    The number of correct decimal points is of minor importance in this project, since Monte Carlo is not an efficient method for calculating Pi. The focus of this project was the expected/measured speed-up of a concurrent program vs a serialized one and the factors that affect this speed-up. Prefer System.nanoTime() over currentTimeMillis() for time measurements. "If we increase the number of helpers and buckets (helpers = buckets), the linearity is always observed but the time isn't the same": claim not supported by measurements. Why does the time reach a plateau after 4 threads? Inadequate explanation. The number of rounds is not included in your analysis.

    Comparison to the single thread, lockless benchmark scenario is missing. "A last observation is that in the case of disable batch mode the best time are obtained with more threads than we have cores.": explanation missing. There is no point in having more buckets than helpers since we cannot assign a helper to more than one bucket. The effect of the number of rounds is not examined. Report is dominated by assumptions, mainly due to poor experiments. Prefer System.nanoTime() over currentTimeMillis() for time measurements.
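Several of the remarks above ask for the effect of the number of rounds. As one of them notes, the parent thread operates in lockstep: it waits for every helper of a round before creating and launching the next set. A minimal Java sketch of that pattern follows; it is an illustration with hypothetical names (LockstepRoundsSketch, runRounds), not the assignment's code.

```java
final class LockstepRoundsSketch {
    /**
     * Runs the given number of rounds, each with a fresh set of helper threads.
     * Returns how many threads were created in total.
     */
    static int runRounds(int rounds, int helpers, Runnable work) {
        int created = 0;
        for (int r = 0; r < rounds; r++) {
            Thread[] round = new Thread[helpers];
            for (int i = 0; i < helpers; i++) {
                round[i] = new Thread(work);
                round[i].start();
                created++;
            }
            // Lockstep: the parent blocks here until the whole round has finished,
            // so no helper of round r + 1 can overlap with round r.
            for (Thread t : round) {
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for a round", e);
                }
            }
        }
        return created;
    }

    public static void main(String[] args) {
        int created = runRounds(3, 4, () -> Math.sqrt(2.0));
        System.out.println("threads created over all rounds: " + created);
    }
}
```

Each extra round adds another batch of thread creations and another barrier at the joins, which is one reason the execution time grows with the number of rounds.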

    20082091 10

    20091017 9

    20091510 14

    20091918 8

    20092241 6

    20094483 9

    Comparison to the single thread, lockless benchmark scenario missing. Time elapsed for a varying sample space over a range of helper threads is missing. Measurements of the effect of multiple rounds are missing. "If helpers-to-buckets ratio equals 1 then there will be no locks blocking a helper": you still call a synchronized method and acquire/release the monitor of the bucket. There is no contention for the monitor but you still use locks. Why is the speed-up of a high count of threads in the batch disabled scenario greater than the corresponding batched scenario? Insufficient analysis. "In this project, we could see how I/O could severely hinder the performances of a parallel algorithm": this assignment did not include any I/O.
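The point about a helpers-to-buckets ratio of 1 can be checked directly: even with a single helper and no contention, a synchronized method still acquires and releases the bucket's monitor. The Java sketch below is an illustration with hypothetical names (UncontendedMonitorSketch, Bucket, addHit); it relies on the standard Thread.holdsLock method to observe the monitor from inside the method.

```java
final class UncontendedMonitorSketch {
    /** Hypothetical bucket with one dedicated helper, so its monitor is never contended. */
    static final class Bucket {
        private long hits;
        private boolean monitorHeldDuringAdd;

        synchronized void addHit() {
            // True whenever the calling thread owns this object's monitor.
            monitorHeldDuringAdd = Thread.holdsLock(this);
            hits++;
        }

        synchronized long hits() { return hits; }
        synchronized boolean monitorWasHeld() { return monitorHeldDuringAdd; }
    }

    /** Runs one helper against one bucket and reports whether the monitor was taken. */
    static boolean singleHelperStillLocks() {
        Bucket bucket = new Bucket();
        Thread helper = new Thread(bucket::addHit);
        helper.start();
        try {
            helper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the helper", e);
        }
        return bucket.monitorWasHeld() && bucket.hits() == 1;
    }

    public static void main(String[] args) {
        System.out.println("monitor acquired without contention: " + singleHelperStillLocks());
    }
}
```

An uncontended acquire is cheaper than a contended one, but it is not free, which is why the distinction between "no locks" and "no contention" matters for the analysis.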

    Giving a method a public scope is an API violation. The default amount of samples is too small to show the effect of the other variables. Comparison to the single thread, lockless benchmark scenario is missing. When multiple threads try to access a critical section protected by the same mutex, there is not a collision but rather a contention for the mutex. A wider variety of helpers per bucket is necessary to confirm the ratio you claim as optimal. Effect of number of rounds is not examined.

    Comparison to the single thread, lockless benchmark scenario missing. "Although it should take the same time for all threads to complete if they were all running equally, there is a possibility that one thread might be scheduled less often. This late thread will put back the completion of the round, even if all other threads have finished and the CPU is mostly idle. This is due to the fact that we have a fixed set of jobs for each thread instead of pulling them from a thread pool. This would be a major flaw of the implementation for a real application.": I beg to differ. The thread pool requires internal synchronization, which comes at a price, and all the threads are given the same time slice. So even if a set of helpers is delayed for some reason, which is indeed possible, this does not prove that a task pool would lead to better results. Figures contain too many measurements, making it impossible to discriminate the behaviour of each scenario. You should isolate and include only the cases that support your argument.

    Comparison to the single thread, lockless benchmark scenario missing. Prefer System.nanoTime() over System.currentTimeMillis(). The test size of batch enabled vs batch disabled is very disproportionate. "Increasing the number of buckets to the number of threads": the findings of comparing tests 5 and 6 seem to be an outlier compared to all the other tests, which show the opposite of your claims. It is not the ratio of helpers/buckets but the large number of buckets in general and the disabled batching that lead to a high number of cache misses and an increase in execution time. A test that shows the optimal number of threads under the specific testbed is missing.

    Comparison to the single thread, lockless benchmark scenario missing. In figures 2 and 3 the x-axis is a log scale of threads/buckets respectively and contains the points 0 and 1. "As expected, with batching, the execution time is high at first, [...]": such behaviour was not expected. "Moreover, with batching, adding threads is useful up to a certain point [...]": what is that point? Figure 3 does not agree with the analysis. Arguments made in the report are not supported by numerical evidence. Your report has no reference to the effect of the rounds. Poor analysis. Prefer System.nanoTime() over currentTimeMillis() for time measurements.

    Comparison to the single thread, lockless benchmark scenario is missing. In figure 1 you do not state the number of buckets used, which is an important factor in the amount of parallelism that can be achieved. The critical section when batching is disabled is entered very frequently, and that is the reason that the need for a low helper-to-bucket ratio becomes apparent. The fact that the critical section is very short leads to less idle time per helper. Using more buckets than helpers is redundant since a helper cannot be allocated to more than one bucket.

    20092540 13

    Comparison to the single thread, lockless benchmark scenario missing. The join method does not affect the execution time as the number of helpers allocated to a bucket varies. It is the contention for the monitor of the bucket that creates additional delays and sequential execution among the threads of the bucket. The number of correct decimal points is of minor importance in this project, since Monte Carlo is not an efficient method for calculating Pi. The focus of this project was the expected/measured speed-up of a concurrent program vs a serialized one and the factors that affect this speed-up. Prefer System.nanoTime() over currentTimeMillis() for time measurements.
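Two recurring requests in these remarks, timing with System.nanoTime() and reporting the speed-up over the sequential run instead of raw times, can be sketched in a few lines of Java. The names SpeedUpSketch, timeNanos, speedUp and spin are hypothetical, and the workloads in main are placeholders standing in for the single-thread, lockless benchmark and one concurrent configuration.

```java
final class SpeedUpSketch {
    private static volatile long sink; // keeps the placeholder loop from being optimised away

    /** Runs the task once and returns the elapsed time in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime(); // monotonic clock, suited to measuring intervals
        task.run();
        return System.nanoTime() - start;
    }

    /** Speed-up of a concurrent run relative to the sequential baseline. */
    static double speedUp(long sequentialNanos, long concurrentNanos) {
        if (sequentialNanos <= 0 || concurrentNanos <= 0) {
            throw new IllegalArgumentException("elapsed times must be positive");
        }
        return (double) sequentialNanos / concurrentNanos;
    }

    /** Placeholder workload, not the Monte Carlo computation itself. */
    private static void spin(long iterations) {
        long acc = 0;
        for (long i = 0; i < iterations; i++) {
            acc += i;
        }
        sink = acc;
    }

    public static void main(String[] args) {
        long sequential = timeNanos(() -> spin(20_000_000L));
        long concurrent = timeNanos(() -> spin(10_000_000L));
        System.out.printf("speed-up over the baseline: %.2f%n", speedUp(sequential, concurrent));
    }
}
```

A value above 1 means the concurrent configuration beat the baseline; a single run is noisy, so repeating each measurement several times gives a fairer figure.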