IRIDA: Canada's federated platform for genomic epidemiology
William Hsiao, Ph.D.
[email protected] @wlhsiao
BC Centre for Disease Control Public Health Laboratory and University of British Columbia
IRIDA Platform Overview
• IRIDA = Integrated Rapid Infectious Disease Analysis
• A free, open source, standards compliant, high quality genomic epidemiology analysis platform to support real-time disease outbreak investigations
Core Functions:
• Management of strain and genomic sequence data
• Rapid processing and analysis of genomic data
• Informative display of genomic results
• Sample, case, and aggregate data (“metadata”) management
Target audience:
• Public health agencies who need a platform to manage and process genomic data
• Public health agencies who need a platform to use genomics for outbreak investigations
[Diagram: IRIDA, with components labelled Sequencing Instruments, Web Application, Data Management, Built-in Analytical Tools, External Galaxy, and Command-line Tools]
10 simple rules (wish list) to build a better public health microbiology genomic epidemiology analysis system
Download latest version at https://github.com/phac-nml/irida
1: Engage the Users Through the Entire Software Development Cycle
National Public Health Agency
Provincial Public Health Agency
Academic/Public
- Project Team has direct access to state-of-the-art research in academia
- Project Team is directly embedded in user organization
2: Have A Simple User Interface
Line List View (under testing)
Timeline View (conceptualization)
Selectable fields
Travel
Symptoms and Onset
Exposure Types
Hospitalization
Launch a pipeline
Be Like
3: Build a Robust, Extensible Platform
• IRIDA uses Galaxy to manage workflows
• Adding additional pipelines is relatively easy
• Using a standard API to allow 3rd party tools to obtain data from IRIDA (e.g. IslandViewer and GenGIS)
[Architecture diagram: IRIDA Servlet Container (REST API, Web Interface, Application Logic), Central File Storage, Compute Cluster, Galaxy, command-line access]
http://www.pathogenomics.sfu.ca/islandviewer/
http://kiwi.cs.dal.ca/GenGIS/Main_Page
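As a rough illustration of how a third-party tool might pull data through a standard REST API, here is a minimal Python sketch. The endpoint path (`api/projects/<id>/samples`), the bearer-token header, the JSON layout, and the function names are all assumptions made for this example. None of them are taken from the IRIDA documentation, so check the real API before relying on any of it.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def http_get_json(url, token):
    """Fetch a URL with a bearer token and decode the JSON response body."""
    request = Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    })
    with urlopen(request) as response:
        return json.load(response)


def list_sample_names(base_url, project_id, token, fetch=http_get_json):
    """Return the sample names of one project.

    The endpoint path and the {"samples": [{"name": ...}]} payload shape
    are hypothetical; they stand in for whatever the real API defines.
    """
    url = urljoin(base_url, f"api/projects/{project_id}/samples")
    payload = fetch(url, token)
    return [sample["name"] for sample in payload.get("samples", [])]
```

Passing the fetch function as a parameter keeps the client logic testable without a running server, which also makes it easier to point the same tool at more than one instance.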
4: Have Extensive Documentation
• Documentation should be available for
  • Users – step by step tutorial with screenshots / FAQ
  • System Administrators – installation instructions / issue trackers
  • Developers – open source, collaborative development / IRC Channel
• Easily accessible at https://irida.corefacility.ca/documentation/
5: Implement QC Throughout the Whole Application
• Genomics is sensitive and sequence data are inherently noisy
• Genomics is a rapidly advancing technology
  • Standardizing pipelines is difficult and can stifle innovation
  • Better to standardize the performance and reporting metrics and ensure any validated pipelines meet the testing criteria
• Developing a general QC testing module (RCQC) that uses ontology to standardize QC metrics (https://github.com/Public-Health-Bioinformatics/rcqc)
• Data provenance and version control (data + pipelines) are musts for diagnostic labs
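The idea of standardizing the reporting metrics and testing criteria, rather than the pipelines themselves, can be sketched in Python as below. This is not how RCQC works. The metric names, thresholds, and the `Criterion` and `evaluate_qc` names are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Criterion:
    """One acceptance rule for a named QC metric (bounds are inclusive)."""
    metric: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def evaluate_qc(metrics: Dict[str, float], criteria: List[Criterion]) -> Dict[str, str]:
    """Compare metrics reported by any pipeline against shared criteria."""
    report = {}
    for criterion in criteria:
        value = metrics.get(criterion.metric)
        if value is None:
            report[criterion.metric] = "MISSING"
        elif criterion.minimum is not None and value < criterion.minimum:
            report[criterion.metric] = "FAIL"
        elif criterion.maximum is not None and value > criterion.maximum:
            report[criterion.metric] = "FAIL"
        else:
            report[criterion.metric] = "PASS"
    return report


# Illustrative thresholds only; real criteria would come from validation work.
SHARED_CRITERIA = [
    Criterion("mean_coverage", minimum=30.0),
    Criterion("contig_count", maximum=500),
    Criterion("n50", minimum=50000),
]
```

Because the criteria are data rather than code, two different assembly pipelines can be held to the same list, and a metric that a pipeline fails to report shows up as `MISSING` instead of passing silently.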
6: Build to Enable Collaboration
• Be able to compare pipelines
  • Pipeline implemented using Galaxy – transparent and shareable
• Define QC criteria using ontology to compare the different pipelines of the same purpose
• Be able to share data in standard formats to minimize data re-entry from one platform to another
• Federation of platforms using standard API to share data and analysis results
7: Use Compatible Data Standards
• Sequence data are more compatible/shareable, but metadata are currently siloed and incompatible
• Collaboration and sharing are difficult when data are incompatible
• Compatibility != Sameness
• Use ontology to allow customization of term lists, but all terms with the same meaning (semantics) should have the same universal ID (e.g. a URL) to facilitate mapping of terms
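A small Python sketch of "Compatibility != Sameness": each agency keeps its own term list, but every label resolves to a shared identifier, so records can be mapped without forcing one vocabulary on everyone. The `example.org` identifiers and both term lists are invented for illustration and do not come from any real ontology.

```python
from typing import Dict, Optional

# Hypothetical universal identifiers; a real system would use ontology IRIs.
FEVER = "http://example.org/terms/0001"
DIARRHEA = "http://example.org/terms/0002"

# Each agency customizes its labels but maps them to the shared IDs.
AGENCY_A_TERMS = {"fever": FEVER, "pyrexia": FEVER, "diarrhea": DIARRHEA}
AGENCY_B_TERMS = {"elevated temperature": FEVER, "diarrhoea": DIARRHEA}


def translate(label: str, source: Dict[str, str], target: Dict[str, str]) -> Optional[str]:
    """Map a label from one term list to another via the shared ID."""
    term_id = source.get(label)
    if term_id is None:
        return None  # label unknown to the source agency
    for target_label, target_id in target.items():
        if target_id == term_id:
            return target_label
    return None  # target agency has no label for this concept


def same_meaning(label_a: str, label_b: str) -> bool:
    """True when two differently worded labels share one universal ID."""
    id_a = AGENCY_A_TERMS.get(label_a)
    return id_a is not None and id_a == AGENCY_B_TERMS.get(label_b)
```

Here `translate("pyrexia", AGENCY_A_TERMS, AGENCY_B_TERMS)` yields `"elevated temperature"`: the wording differs, but the identifier carries the meaning.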
8: Implement Fine Grained Access Control
Detailed View / Restricted View
E.g. user role permissions control visibility and editing of content
Authorization
• Industry-standard authentication and authorization mechanisms
• Local authorization per instance.
• Method-level authorization.
• Object-level authorization.
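The difference between method-level and object-level authorization can be sketched in Python as follows. This is a generic illustration, not IRIDA's implementation. The role name `manager`, the classes, and the `rename_project` operation are hypothetical.

```python
from dataclasses import dataclass, field
from functools import wraps
from typing import Set


class AccessDenied(Exception):
    """Raised when a user may not perform the requested action."""


@dataclass
class User:
    name: str
    roles: Set[str]


@dataclass
class Project:
    name: str
    members: Set[str] = field(default_factory=set)


def requires_role(role):
    """Method-level check: may this user call the operation at all?"""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            if role not in user.roles:
                raise AccessDenied(f"{user.name} lacks role {role!r}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@requires_role("manager")
def rename_project(user, project, new_name):
    """Object-level check: may this user act on this particular project?"""
    if user.name not in project.members:
        raise AccessDenied(f"{user.name} is not a member of {project.name}")
    project.name = new_name
    return project
```

A user with the `manager` role passes the first check for every project but still fails the second for projects they do not belong to, which is what makes a detailed view for some users and a restricted view for others possible.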
9: Use Technology to Safeguard Patient Privacy
It’s easy to lose control of the Excel line list: someone can make a copy of the content and pass it around without your knowledge; typos are common and cumulative!
Technology can control who sees what and when
Separate out sensitive patient data from pathogen sequence data, but be able to bring them together when necessary without resorting to emailing of line lists!
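One way to picture this separation is two stores keyed by the same sample identifier, joined only for viewers who are entitled to see both. The Python sketch below uses made-up records, a hypothetical role name (`case_investigator`), and a hypothetical `build_view` function. It illustrates the principle and does not describe the platform's data model.

```python
from typing import Dict, Set

# Made-up illustrative records; the two stores share only the sample ID.
SEQUENCE_DATA = {
    "SAMPLE-001": {"organism": "Salmonella enterica", "sequencing_run": "RUN-42"},
}
PATIENT_DATA = {
    "SAMPLE-001": {"age_group": "30-39", "hospitalized": True},
}


def build_view(sample_id: str, roles: Set[str],
               sequence_data: Dict[str, dict],
               patient_data: Dict[str, dict]) -> dict:
    """Return the fields one viewer may see for one sample.

    Everyone with access gets the pathogen data; patient fields are
    joined in only for the (hypothetical) case_investigator role.
    """
    view = dict(sequence_data[sample_id])  # copy, so the store is not mutated
    if "case_investigator" in roles:
        view.update(patient_data.get(sample_id, {}))
    return view
```

The join happens at request time and per viewer, so no combined line list ever has to be exported, copied, or emailed.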
10: Have Multiple, Flexible Access Options
• No one-size-fits-all solution; having many platforms to choose from is a good thing (but data should be portable across platforms!)
• IRIDA is available in several different flavours:
Local Install
  Advantages: Full control of the system; your data never leave your centre
  Disadvantages: Computing infrastructure and IT support needed to maintain the resource
Virtual Machine
  Advantages: Full control of the system; easy to set up
  Disadvantages: Not really scalable if run on your own desktop; some performance loss
Cloud Instance
  Advantages: Full control of the system; does not require local computing infrastructure
  Disadvantages: Data go into a cloud environment; uploading to cloud environment can be slow
Public Version
  Advantages: No setup required; upload your data and have it processed using Compute Canada resources
  Disadvantages: Data go into a public instance (data remain private to your account); upload can be slow
Acknowledgements
Project Leaders: Fiona Brinkman – SFU; Will Hsiao – PHMRL; Gary Van Domselaar – NML
University of Lisbon: João Carriço
National Microbiology Laboratory (NML): Franklin Bristow, Aaron Petkau, Thomas Matthews, Josh Adam, Adam Olson, Tarah Lynch, Shaun Tyler, Philip Mabon, Philip Au, Celine Nadon, Matthew Stuart-Edwards, Morag Graham, Chrystal Berry, Lorelee Tschetter, Aleisha Reimer
Laboratory for Foodborne Zoonoses (LFZ): Eduardo Taboada, Peter Kruczkiewicz, Chad Laing, Vic Gannon, Matthew Whiteside, Ross Duncan, Steven Mutschall
Simon Fraser University (SFU): Melanie Courtot, Emma Griffiths, Geoff Winsor, Julie Shay, Matthew Laird, Bhav Dhillon, Raymond Lo
BC Public Health Microbiology & Reference Laboratory (PHMRL) and BC Centre for Disease Control (BCCDC): Judy Isaac-Renton, Patrick Tang, Natalie Prystajecky, Jennifer Gardy, Damion Dooley, Linda Hoang, Kim MacDonald, Yin Chang, Eleni Galanis, Marsha Taylor, Cletus D’Souza, Ana Paccagnella
University of Maryland: Lynn Schriml
Canadian Food Inspection Agency (CFIA): Burton Blais, Catherine Carrillo, Dominic Lambert
Dalhousie University: Rob Beiko, Alex Keddy
McMaster University: Andrew McArthur, Daim Sardar
European Nucleotide Archive: Guy Cochrane, Petra ten Hoopen, Clara Amid
European Food Safety Agency: Leibana Criado Ernesto, Vernazza Francesco, Rizzi Valentina
IRIDA Annual General Meeting, Winnipeg, April 8-9, 2015