cmerow.github.io -- 03_Plotting...


Slides adapted from presentation by Mika Aldaba, Graphic Designer

Thinking Graphically: Lessons From Edward Tufte

Illustrated by Merchant for the Brunswick Review

Who Is Edward Tufte?

An analytical design theorist, educator, and landscape sculptor best known for his self-published books on analytical design.

Anscombe’s Quartet

     I              II             III             IV
   x      y       x      y       x      y       x      y
  10   8.04      10   9.14      10   7.46       8   6.58
   8   6.95       8   8.14       8   6.77       8   5.76
  13   7.58      13   8.74      13   12.7       8   7.71
   9   8.81       9   8.77       9   7.11       8   8.84
  11   8.33      11   9.26      11   7.81       8   8.47
  14   9.96      14    8.1      14   8.84       8   7.04
   6   7.24       6   6.13       6   6.08       8   5.25
   4   4.26       4    3.1       4   5.39      19   12.5
  12   10.8      12   9.13      12   8.15       8   5.56
   7   4.82       7   7.26       7   6.42       8   7.91
   5   5.68       5   4.74       5   5.73       8   6.89

How would you describe the differences in these datasets?

Anscombe’s Quartet

Metric                                                             Value
mean of the y values                                               7.5
equation of the least-squares regression line                      y = 3 + 0.5x
sum of squared deviations of x (about the mean)                    110
regression sum of squared errors (variance accounted for by x)     27.5
residual sum of squared errors (about the regression line)         13.75
correlation coefficient                                            0.82
coefficient of determination                                       0.67
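The shared summary statistics can be checked directly. The following is a minimal Python sketch, not part of the original slides: it recomputes the mean of y, the least-squares line, and the correlation coefficient for each dataset, using the values as tabulated in the slides. Some of those values are rounded, so the results agree with the summary only approximately. The helper name `summarize` is a hypothetical choice.

```python
from math import sqrt

# Anscombe's quartet as tabulated in the slides (some y values are rounded there).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.8, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.7, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return (mean of y, intercept, slope, Pearson r) for one dataset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return mean_y, intercept, slope, r


for name, (xs, ys) in QUARTET.items():
    mean_y, intercept, slope, r = summarize(xs, ys)
    print(f"{name:>3}: mean(y)={mean_y:.2f}  y = {intercept:.2f} + {slope:.2f}x  "
          f"r={r:.2f}  r^2={r * r:.2f}")
```

All four lines of output should be close to one another, even though the datasets look entirely different when plotted.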

Anscombe's Quartet

Graphics reveal data

Another example: Pearson Correlation

Other reasons?

Visualization is the highest bandwidth channel into the human brain

As data volumes grow, visualization becomes a necessity rather than a luxury.
◦ “A picture is worth a thousand words”

-- Grolemund & Wickham, R for Data Science, O'Reilly 2016

Visualization is important…

Graphical Excellence

1. Show the data

2. Induce the viewer to think about substance rather than methodology

3. Avoid distortion

4. Present many numbers in a small space

5. Make large datasets coherent

6. Encourage comparisons

7. Reveal the data at several levels from broad overview to fine structure

8. Serve a purpose

9. Be integrated with statistical and verbal descriptions of data

-- Principles of Graphical Excellence, Edward Tufte

Graphical excellence gives the viewer:
◦ the greatest number of ideas
◦ in the shortest time
◦ with the least ink
◦ in the smallest space.

One of the oldest graphical displays: Yu Chi Thu (Map of the tracks of Yu the Great)

Carved in stone in 1137 A.D. (idea probably older)

Nothing similar in Europe until ~1550

One of the earliest time-series charts

Life Cycle of the Japanese Beetle

L. Hugh Newman, Man and Insects (London, 1965), pp. 104-105

Strive For Graphical Integrity

Visual representations of data must tell the truth.

According to Tufte, the Lie Factor of this graph is 14.8. A numerical change of 53% is represented by a graphical change (size of horizontal lines) of 783%.

The Lie Factor

Lie Factor = (Size of effect in the visualization) / (Size of effect in the data)
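The Lie Factor is the size of the effect shown in the graphic divided by the size of the effect in the data. As a minimal Python sketch (not from the slides; the function name `lie_factor` is hypothetical), here it is applied to the figures quoted for this graph, a 53% numerical change drawn as a 783% graphical change:

```python
def lie_factor(graphic_change_pct: float, data_change_pct: float) -> float:
    """Size of the effect shown in the graphic divided by the size of the effect in the data."""
    if data_change_pct == 0:
        raise ValueError("the data show no change, so the ratio is undefined")
    return graphic_change_pct / data_change_pct


# Figures quoted on the slide: a 53% numerical change drawn as a 783% graphical change.
print(round(lie_factor(783, 53), 1))  # 14.8
```

A value of 1 means the graphic shows the effect at its true size; values well above 1 overstate it and values well below 1 understate it.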

Graphical Integrity

The representation of numbers, as physically measured on the surface of the graph itself, should be directly proportional to the numerical quantities represented.

The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.

Graphical Integrity

Graphics must not quote data out of context.


Maximize Data Ink

Example: High or Low Data Ink ratio?

Trim the fluff from your graphics as much as possible!

Avoid Chart Junk

The excessive and unnecessary use of graphical effects in graphs, used to demonstrate the graphic ability of the designer rather than display the data.

Possibly the worst graph ever made

CHARTJUNK: 5 colors, 2 curvy lines, 2 polygons, 2 broken axes, 3 dimensions

To display: 5 numbers

Aim for high data density

The proportion of the total size of the graph that is dedicated to displaying data.

Shrink Principle

Many graphs can be shrunk way down without losing legibility or information

Small Multiples

Series of the same small graph repeated in one visual
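The idea of small multiples can be sketched even in plain text. The following Python example is an illustration only, not from the slides: it draws each series as a one-line text sparkline and puts every panel on the same scale, so the rows can be compared at a glance. The region names and values are hypothetical, as are the helper names `sparkline` and `small_multiples`.

```python
LEVELS = ".:-=+*#%"  # eight glyphs, lowest to highest


def sparkline(values, lo, hi):
    """Render one series as a row of glyphs scaled to the shared range [lo, hi]."""
    span = (hi - lo) or 1.0
    top = len(LEVELS) - 1
    return "".join(LEVELS[min(top, int((v - lo) / span * len(LEVELS)))] for v in values)


def small_multiples(panels):
    """Return one labelled sparkline per series, all drawn on the same scale."""
    lo = min(min(vals) for vals in panels.values())
    hi = max(max(vals) for vals in panels.values())
    width = max(len(name) for name in panels)
    return [f"{name:<{width}}  {sparkline(vals, lo, hi)}" for name, vals in panels.items()]


# Hypothetical monthly values for three regions (illustrative only).
panels = {
    "North": [3, 4, 6, 8, 9, 7, 5, 4],
    "South": [1, 1, 2, 3, 5, 8, 9, 9],
    "West": [5, 5, 5, 4, 5, 5, 6, 5],
}
print("\n".join(small_multiples(panels)))
```

Using one shared range for every panel is the design choice that matters here: if each panel were scaled to its own minimum and maximum, the flat series would look as dramatic as the rising one.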

Clutter

Visualization too cluttered?

Don't remove data, change the design.

Credibility comes from detail and in many cases one can clarify a design by adding detail.

Hyperakt and Ekene Ijeoma visualized migrations over time and space in The Refugee Project http://www.therefugeeproject.org

Layering & Separation

Use color or other differentiation to separate important classes of information.

Muted colors, subtle shading and thin contour lines allow multiple types of data to be layered together in this 1958 topographic map of Chattanooga, Tennessee.

Effective layering of information is difficult because of interactions

Can create non-information patterns and texture

1 + 1 = 3 (or more)

Moiré effect

Select the appropriate graphic for the data & analysis

Chart by Andrew Abela

Aesthetics & Technique

Carefully chosen graphical format

Design uses words, numbers, and drawings

Displays an accessible complexity of detail

Has a narrative quality, a story to tell about the data

Technical details of production done with care

Avoid content-free decoration, including chartjunk

Juan Velasco. Cornell ornithologist Edwin Scholes and biologist/photographer Tim Laman. Senior Graphics Editor Fernando Baptista, Graphics Specialist Maggie Smith and freelance researcher Fanna Gebreyesus. National Geographic

The final word…

The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper

-- Tufte, Visual Display of Quantitative Information, page 191
