ALESSIO CIMARELLI Data scientist at Dataninja [email protected] | @ jenkin27 dtnj.it/erice14 International School of Science Journalism The Digital World (Erice, June 10th, 2014)

When data journalism meets science | Erice, June 10th, 2014


Introductory lesson about data journalism within science journalism and science communication during the International School of Science Journalism 2014 in Erice (June 10th, 2014).


Page 1: When data journalism meets science | Erice, June 10th, 2014

ALESSIO CIMARELLI Data scientist at Dataninja

[email protected] | @jenkin27

dtnj.it/erice14

International School of Science Journalism The Digital World (Erice, June 10th, 2014)

Page 2: When data journalism meets science | Erice, June 10th, 2014

aka jenkin

PAST
Master's Degree in Physics at the University of Rome "La Sapienza"

Master in Science Communication at the International School for Advanced Studies (SISSA-ISAS) in Trieste

Press officer at the European Laboratory for Non-Linear Spectroscopy (LENS) in Florence

PRESENT
Freelance data journalist, web developer, open data activist, citizen scientist, ...

Page 3: When data journalism meets science | Erice, June 10th, 2014

Data journalism & data visualization made in Italy

Page 4: When data journalism meets science | Erice, June 10th, 2014
Page 5: When data journalism meets science | Erice, June 10th, 2014

You know very well how it works... :)

Page 6: When data journalism meets science | Erice, June 10th, 2014

As topic

Stories about the edge of scientific research and human knowledge.

Key role in the relationship between science and society.

A science journalist can be a watchdog against false science and scientific fraud.

Page 7: When data journalism meets science | Erice, June 10th, 2014

As method

It would be evident in [...], because the workflow is similar to police inquiries or scientific research.

A lot of information from different sources, accountability problems, hypotheses and proofs, trial-and-error cycles, and so on.

Not only a story, but also a discovery itself...

Page 8: When data journalism meets science | Erice, June 10th, 2014

A word in a buzzwords era

[...] when his investigation is ultimately based on (or driven by) digital data, he acquires such a prefix.

If a journalist wants to tell the story of the world, and the world is now made of digital and quantitative information, he has to acquire skills in the management and interpretation of data, or he will miss an opportunity.

Page 9: When data journalism meets science | Erice, June 10th, 2014

Teamwork and multidisciplinarity

Nose for news, public interest, intuition based on knowledge of the context

Analytical mind, mathematical and statistical skills, intuition based on the science of numbers

Page 10: When data journalism meets science | Erice, June 10th, 2014

Teamwork and multidisciplinarity

Problem solving, hi-tech knowledge of hardware and software, nerd (or geek, if you prefer) attitude

Artistic sensibility and intuition, knowledge of User Experience theory and techniques

Page 11: When data journalism meets science | Erice, June 10th, 2014

Miners, dustmen, researchers, and storytellers

Public search engines or deep web? Official 5-star open data or web spiders and screen scrapers? Monitor and keyboard, smartphone and touch, or boots and mud?

Data should be read by machines and not by humans! Datasets could hide errors, inconsistencies, lies... or show only a part of a story.

Page 12: When data journalism meets science | Erice, June 10th, 2014

Miners, dustmen, researchers, and storytellers

Normalizations and comparisons, filtering, grouping, aggregation, correlations, ...

How to represent numbers and relations among numbers? Yes, with Arabic numerals, but pictures are worth a thousand words... as long as you keep in mind that there are facts behind the numbers, and [...] (copyright of The Guardian).
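
A minimal Python sketch of three of the operations listed on this slide (filtering, grouping with aggregation, normalization). The records, the region and city names and the 5,000-inhabitant threshold are all invented for illustration; they do not come from any real dataset.

```python
from collections import defaultdict

# Hypothetical records: (region, city, cases, population). All values are invented.
records = [
    ("North", "A", 120, 60000),
    ("North", "B", 30, 10000),
    ("South", "C", 200, 250000),
    ("South", "D", 15, 3000),
]

# Filtering: keep only cities with at least 5,000 inhabitants (arbitrary threshold).
filtered = [r for r in records if r[3] >= 5000]

# Grouping and aggregation: sum cases and population per region.
totals = defaultdict(lambda: [0, 0])
for region, _city, cases, population in filtered:
    totals[region][0] += cases
    totals[region][1] += population

# Normalization: cases per 1,000 inhabitants, so that regions of
# different size can be compared.
rates = {region: 1000 * cases / pop for region, (cases, pop) in totals.items()}

for region, rate in sorted(rates.items()):
    print(f"{region}: {rate:.2f} cases per 1,000 inhabitants")
```

In absolute numbers the invented "South" region has more cases, but after normalization it has the lower rate: the kind of reversal a comparison of raw totals would hide.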

Page 13: When data journalism meets science | Erice, June 10th, 2014
Page 14: When data journalism meets science | Erice, June 10th, 2014

In method

You run into a dataset and feel the presence of a possible news story... OR

...you have an interest, an idea, a thesis, so you are looking for data.

Having quantitative data about a phenomenon means that somewhere there is a [...] you have to understand, test, verify... and interpret!

Data themselves can suggest new ways for your investigation or even falsify some hypotheses or assumptions.

Common sense, intellectual honesty, professional ethics

Page 15: When data journalism meets science | Erice, June 10th, 2014

Some random examples

New Scientist Apps: tornadoes, warming world, exoplanets, Planck, sea level

The Telegraph: map of wind farm

Sorting algorithms
Meteorites

Earth Journalism Network

Page 16: When data journalism meets science | Erice, June 10th, 2014

[...] by Global Editors Network

Health
American Way of Birth, Costliest in the World (NYT)
Inside the Government's Drug Data (ProPublica)
Which Emergency Room Will See You the Fastest? (ProPublica)

Environment
New York floods (ProPublica)
Breathless and Burdened (Center for Public Integrity)
When Italy is shaking (La Stampa)
Italy, a delicate land (La Stampa)

Astronomy
Kepler’s Tally of Planets (NYT)

Energy
Biomassa (Planbureau voor de Leefomgeving)

Page 17: When data journalism meets science | Erice, June 10th, 2014

Research data, science world, citizen science

Page 18: When data journalism meets science | Erice, June 10th, 2014

Hard sciences and social sciences

OK, neither LHC petabytes nor statistical data from epidemiological surveys are for journalists.

But [...], or (open) [...], why not?

If you are not specialized in a specific topic or if you lack knowledge of the framework, you can ask an expert you trust.

You can also use numbers not in an investigation, but to tell a complex story using infographics and interactive visualizations.

Page 19: When data journalism meets science | Erice, June 10th, 2014

Bibliographies, social networks of scientists, infrastructures

Science is a human activity and an industry (almost) like any other.

How are European funds invested in scientific research? Where are the centers specialized in the treatment of specific diseases? Why are some well-known monitoring technologies not used in some countries?

Page 20: When data journalism meets science | Erice, June 10th, 2014

Sensor-based journalism

Cheap electronics and sensors +

open hardware +

free information sharing =

data from stakeholders other than scientists

It's early, but promising: Swiss Make Open Data Camps, Japan Geiger map at-a-glance, Citizen Science & Sensors

Page 21: When data journalism meets science | Erice, June 10th, 2014

If you have data, it's better if you know how to deal with them.

If you think you may find some data, it's better if you use them.

If someone uses data, it's better if you can check his claims.

Playing with data is fun!

Page 22: When data journalism meets science | Erice, June 10th, 2014

Welcome to the jungle!

Page 23: When data journalism meets science | Erice, June 10th, 2014

Some examples

Public administration, International organizations, NGOs, Civic activists, Press offices, Leaks, Social networks, Journalistic sources, Single journalists, Ourselves...

Page 24: When data journalism meets science | Erice, June 10th, 2014

Data made public and reusable

Data.gov (USA)
Data.gov.uk (UK)
OpenDataHub (Italy)
OpenIR (Indonesia)
...

Page 25: When data journalism meets science | Erice, June 10th, 2014

Remember the buzzword era?

Data from big science experiments (Atlas, Human Brain Project, ...)

Social networks (Facebook, Twitter, but also eBay, Amazon, ...)

Maybe it's not for journalists, but it's a hot topic...

Google Earth Engine

Page 26: When data journalism meets science | Erice, June 10th, 2014

For machine, not for human

The keyword is [...]!

A well-formed table represents a structured dataset. A list of Facebook comments, the articles of a newspaper, a recorded speech are not structured data (and so are not machine-readable).

Page 27: When data journalism meets science | Erice, June 10th, 2014

It all depends on the format

If we have Gladstone Gander as best friend: spreadsheets (xls, xlsx, ods, csv, tsv); not-so-common good formats (xml, sql, json, shp, kml, ...).

If we are not so lucky: tables or lists in web pages (html); simple tables in well-made PDFs (pdf).

If we have Murphy as worst enemy: scanned images, even if in a PDF wrapper (png, jpg, pdf); digital data behind complex search engines.

And if we have the best data ever, but under a closed license?
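
As a rough illustration of the "lucky" case, the sketch below reads the same tiny table from CSV and from JSON using only the Python standard library. The two rows and their population figures are placeholders invented for this example, not official statistics.

```python
import csv
import io
import json

# The same hypothetical table in two machine-readable formats.
csv_text = "city,population\nRome,2870000\nErice,28000\n"
json_text = (
    '[{"city": "Rome", "population": 2870000},'
    ' {"city": "Erice", "population": 28000}]'
)

# io.StringIO lets csv read from a string as if it were a file.
rows_from_csv = list(csv.DictReader(io.StringIO(csv_text)))
rows_from_json = json.loads(json_text)

# CSV carries no type information: every field arrives as a string.
print(type(rows_from_csv[0]["population"]))
# JSON keeps the distinction between numbers and strings.
print(type(rows_from_json[0]["population"]))

# Converting explicitly makes the two representations comparable.
populations = [int(row["population"]) for row in rows_from_csv]
```

Both formats load in a couple of lines, which is the point of the slide; a scanned image of the same table would need OCR and manual checking before any of this could start.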

Page 28: When data journalism meets science | Erice, June 10th, 2014

Well-formed datasets

Numbers are numbers, strings are strings and not numbers, datetimes must always have a single format (e.g. yyyy/mm/dd), localization is important, no gender values in the names column or similar mix-ups, every element should be named with a Unique Identifier (ID).

Data types a computer understands: integers (with sign, zero included), floating-point numbers (with sign), datetime, characters and strings (case sensitive), null value (the strange case of a value that states "I'm not a value").

And simple comparisons are strict equalities, also in strings!
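
The points on this slide can be checked directly in Python. The snippet below is only a sketch: the helper names `normalize` and `parse_date` are made up for this example, and the yyyy/mm/dd format is the one suggested on the slide.

```python
from datetime import datetime

# Strict equality: these pairs look alike to a human, not to a machine.
same_case = "Rome" == "rome"        # False: strings are case sensitive
same_space = "Rome" == "Rome "      # False: a trailing space is a character
same_type = "10" == 10              # False: a string of digits is not a number
null_is_zero = None == 0            # False: null means "no value", not zero

def normalize(text):
    """Hypothetical helper: trim spaces and ignore case before comparing."""
    return text.strip().casefold()

def parse_date(text):
    """Hypothetical helper: accept a single date format, yyyy/mm/dd."""
    return datetime.strptime(text, "%Y/%m/%d").date()

print(normalize(" Rome ") == normalize("rome"))
print(parse_date("2014/06/10"))
```

A date written in another layout, such as 10/06/2014, makes `parse_date` raise a `ValueError`, which is usually preferable to silently mixing formats in one column.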

Page 29: When data journalism meets science | Erice, June 10th, 2014

Aggregation, average, normalization, relative difference, distribution, ...

A single rule: correlation does not imply causation!

Spurious correlations: http://www.tylervigen.com/
Correlated: http://www.correlated.org/
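
A small sketch of three of these quantities in plain Python: average, relative difference and the Pearson correlation coefficient (chosen here as one common way to measure correlation; the slide does not name a specific coefficient). The two series are invented so that they are perfectly aligned.

```python
from math import sqrt

def mean(values):
    """Arithmetic average."""
    return sum(values) / len(values)

def relative_difference(new, old):
    """Change relative to the starting value (0.2 means +20%)."""
    return (new - old) / old

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented monthly figures: both grow in summer.
ice_cream_sales = [10, 20, 30, 40, 50]
sunburn_cases = [3, 5, 7, 9, 11]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation: {r:.3f}")
```

The coefficient comes out at essentially 1, yet nobody would claim that ice cream causes sunburn: both series follow a third factor, the season. The number measures how closely two series move together and says nothing about why.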

Page 30: When data journalism meets science | Erice, June 10th, 2014

At a glance

Page 31: When data journalism meets science | Erice, June 10th, 2014

With great power comes great responsibility

The basic idea is quite simple: you have quantities expressed in numbers and geometric objects defined by dimensions (e.g. the radius of a circle), so you just have to decide how to connect your quantities to visual dimensions.

There are several (un)common charts and endless combinations: scatterplots, lines, bars, areas, pies, donuts, bubble charts, treemaps, word clouds, alluvial diagrams, dendrograms, networks, streamgraphs, gauges, chord diagrams, motion charts, parallel coordinates, sankey diagrams, maps, choropleths, ...

On the d3js.org gallery there is an endless list of examples!
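
One concrete case of connecting a quantity to a visual dimension is the bubble chart. A common convention (an assumption here, not something stated on the slide) is to make the circle's area, not its radius, proportional to the value, because readers tend to compare areas. The function name `radius_for` and the numbers are illustrative.

```python
from math import pi, sqrt

def radius_for(value, max_value, max_radius):
    """Radius such that the circle's AREA is proportional to value."""
    return max_radius * sqrt(value / max_value)

values = [100, 400]  # invented quantities: the second is 4 times the first
radii = [radius_for(v, max(values), 40.0) for v in values]
areas = [pi * r ** 2 for r in radii]

print(radii)                # radius only doubles...
print(areas[1] / areas[0])  # ...while the area grows 4 times, like the value
```

Mapping the value straight to the radius would make the larger bubble 16 times bigger in area for a value only 4 times larger, which is where the responsibility in the slide title comes in.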

Page 32: When data journalism meets science | Erice, June 10th, 2014

Building a simple dataset or a large and complex database focused on a topic of public interest leads to a valuable product: the database itself, intended as a collection of (linked) data plus metadata.

Can a public frontend to such a database, designed for citizens, journalists, stakeholders, be considered a journalistic outcome? If journalism is a public good, it can be a service, not only a product...

Page 33: When data journalism meets science | Erice, June 10th, 2014

Scraping
"Copy & Paste" combo
DataMiner for Chrome browser
IMPORTXML() Google Spreadsheet function
Tabula for simple pdfs
Python (or other languages) scripts and libraries

Cleaning
Filters and "Find & Replace" tools in spreadsheets
OpenRefine

Analysis
Pivot tables and simple charts in spreadsheets
Dedicated software (e.g. open-source QtiPlot or QGIS)

Viz
Datawrapper, RAW, Google Fusion Tables, Tableau, CartoDB, infogr.am, easel.ly, Timelinejs, Timemapper, StoryMap, d3js, ...
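
To give an idea of what a Python scraping script can look like, here is a minimal sketch that extracts the rows of an HTML table using only the standard library's `html.parser`. The class name `TableParser` and the sample table (countries and counts) are invented for this example; real pages are messier and often call for a dedicated library.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collects the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. line breaks between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Invented sample page fragment.
html = """
<table>
  <tr><th>Country</th><th>Notifications</th></tr>
  <tr><td>Italy</td><td>12</td></tr>
  <tr><td>Spain</td><td>7</td></tr>
</table>
"""

parser = TableParser()
parser.feed(html)
rows = parser.rows
header, body = rows[0], rows[1:]
print(header)
print(body)
```

The resulting list of rows can be written to a CSV file and then cleaned and analysed with the spreadsheet tools listed above.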

Page 34: When data journalism meets science | Erice, June 10th, 2014

Tina Casagrand, "Data journalism for science journalists", The Open Notebook (2014)
Paul Bradshaw, "Scraping for Journalists", Leanpub (2014)
John Mair, Richard Lance Keeble, "Data Journalism", abramis (2014)
Paul Bradshaw, "Data Journalism Heist"
Claire Miller, "Getting Started with Data Journalism", Leanpub (2013)
Nathan Yau, "Data Points", Wiley (2013)
Simon Rogers, "Facts are Sacred", Faber & Faber (2013)
Jonathan Gray, "The Data Journalism Handbook", O'Reilly (2012)
Nathan Yau, "Visualize This", Wiley (2011)

Page 35: When data journalism meets science | Erice, June 10th, 2014

Alessio "jenkin" [email protected]

@

Dataninja

Q&A

SWIM

jenkin27

www.dataninja.it
school.dataninja.it

dataninja.it/newsletter

school.dataninja.it/qa

sciencewritersinitaly.wordpress.com

Page 36: When data journalism meets science | Erice, June 10th, 2014

Hacking + Marathon = Hackathon

ESPAD (European students and drugs): http://www.espad.org/en/

RASFF (EU food safety): http://ec.europa.eu/food/food/rapidalert/

Page 37: When data journalism meets science | Erice, June 10th, 2014

http://ec.europa.eu/food/food/rapidalert/

The Rapid Alert System for Food and Feed (RASFF) was put in place to provide food and feed control authorities with an effective tool to exchange information about measures taken responding to serious risks detected in relation to food or feed. This exchange of information helps Member States to act more rapidly and in a coordinated manner in response to a health threat caused by food or feed.

dtnj.it/rasff2013

Page 38: When data journalism meets science | Erice, June 10th, 2014

http://www.espad.org/en/

This is the report from the fifth data-collection wave of the European School Survey Project on Alcohol and Other Drugs (ESPAD). It is based on data from more than 100,000 European students. Over the years about 500,000 European students have answered the ESPAD questionnaire. A total of 36 countries and regions have contributed data to the 2011 ESPAD Database. The list of drugs includes cigarettes, alcohol, cannabis, other illicit drugs, tranquillisers and sedatives without prescription.

dtnj.it/espad2011