CaRE Center Informatics NHLBI CaRE Center Meeting Bethesda, MD July 25, 2006 Marcia Nizzari

CaRE Center Informatics

NHLBI CaRE Center Meeting, Bethesda, MD

July 25, 2006

Marcia Nizzari

CaRE Center Informatics

• Builds on existing Genetic Analysis Platform
– Operational for 2+ years
– Genotyping and Resequencing
– Code base successfully reused

• CaRE Center enhancements:
– Data sharing strategy
– Phenotype/Trait thesaurus, meta thesaurus
– Customizable analytic pipelines

User Experience – Production

Three “portals” or dashboards:

• Sample Management
– Register and fingerprint samples, manage storage and aliquots for experiments
– Record phenotypes for Individuals and Samples

• Project Management
– Manage Groups, Projects, plan your experiments
– Shunt filtered results into analysis pipelines

• Process/LIMS Management
– Design and execute experiments per platform, curate results
  • Affy, Illumina, Sequenom or resequencing

High Level Workflow – for CaRE

Diagram labels:

• Production: BSP/GAP + CaRE enhancements
– Upload Samples, Peds, Individuals, Phenotypes
– Create Experiments (Samples x Features)
– Design and Execute Experiments
– QC/Curate Results
– Data Compile

• Analysis: GenePattern + CaRE analysis tools
– Summarize/Filter (PLINK)
– Association & Statistics Viewers
– Cohort’s Custom Algorithms, Viewers

• Data stores and services: Project DB, LIMS DBs, BSP DB, Feature DB, Data Vault, Web Services
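The “Create Experiments (Samples x Features)” step in the workflow can be read as a cross product: every sample in the experiment is assayed for every feature. The sketch below illustrates that reading only. The function name, record layout, and identifiers are hypothetical and are not taken from the platform.

```python
from itertools import product


def create_experiment(name, samples, features):
    """Build a hypothetical experiment record: one pending assay per
    (sample, feature) pair, i.e. the Samples x Features cross product."""
    assays = [
        {"sample": s, "feature": f, "result": None}  # result filled in after the lab run
        for s, f in product(samples, features)
    ]
    return {"name": name, "assays": assays}


# Made-up identifiers, for illustration only.
experiment = create_experiment(
    "demo-experiment",
    samples=["SM-0001", "SM-0002"],
    features=["marker_A", "marker_B"],
)
print(len(experiment["assays"]))  # 2 samples x 2 features = 4 assays
```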

Production Screenshots

Upload Phenotypes, Create Experiments, Curate Results, Filter by Phenotype for Analysis

Project Management dashboard

Showing Phenotype Upload

Anticipate significant enhancements to handle CaRE Center requirements.

Project Management dashboard

Showing Experiment Definition

Experiments flow through the Process Dashboard for execution; they provide the unit of logical reporting on progress.

Process Dashboard

Showing QC Report on Affy chemistry plates – Fingerprints to the right!

Lab techs and coordinators can view and curate plates; set up re-hyb and redo pipelines.

Project Management dashboard

Showing QC Statistics and Pheno Query

The production analysis workflow is executed before data is exported to the GenePattern pipeline for association study analyses.

Project Management dashboard

Search phenotypes to slice and dice results for analysis

The resulting subset is piped into the GenePattern pipeline for analysis on the derived, curated dataset.
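As a rough illustration of filtering by phenotype before analysis, the sketch below selects the individuals whose recorded trait value passes a predicate. It is a minimal sketch only: the `Individual` class, the trait names, and the values are hypothetical and do not come from the platform.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    individual_id: str
    # Hypothetical layout: trait name -> recorded value.
    phenotypes: dict = field(default_factory=dict)


def filter_by_phenotype(individuals, trait, predicate):
    """Return individuals whose value for `trait` satisfies `predicate`.

    Individuals with no value recorded for the trait are excluded.
    """
    return [
        ind for ind in individuals
        if trait in ind.phenotypes and predicate(ind.phenotypes[trait])
    ]


# Made-up records, for illustration only.
cohort = [
    Individual("IND001", {"ldl_mg_dl": 162.0, "sex": "F"}),
    Individual("IND002", {"ldl_mg_dl": 98.0, "sex": "M"}),
    Individual("IND003", {"sex": "F"}),  # trait not recorded
]

high_ldl = filter_by_phenotype(cohort, "ldl_mg_dl", lambda v: v >= 130.0)
print([ind.individual_id for ind in high_ldl])  # ['IND001']
```

The subset returned here stands in for the derived dataset that would be handed to the analysis pipeline.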

User Experience – Analysis

• GenePattern framework
– Provides “pluggable” backplane
– Can string together tools in a pipeline
– Tracks everything for ‘reproducible research’

• For CaRE Center
– We create templates for our standard analysis methods
– Cohort teams can customize
– Streamlines publication!
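The idea of stringing tools into a pipeline while tracking every step can be sketched generically. The code below is not the GenePattern API. It is a minimal stand-in, under the assumption that a pipeline template is an ordered list of (name, function, parameters) steps and that tracking means logging each step's parameters and a hash of its output. All names are hypothetical.

```python
import hashlib
import json


def run_pipeline(steps, data):
    """Run (name, function, params) steps in order and log each one."""
    log = []
    for name, func, params in steps:
        data = func(data, **params)
        # Hash the step output so a rerun can be compared against the log.
        digest = hashlib.sha256(
            json.dumps(data, sort_keys=True).encode()
        ).hexdigest()
        log.append({"step": name, "params": params, "output_sha256": digest})
    return data, log


def drop_low_call_rate(rows, min_call_rate):
    """Keep rows whose call rate meets the threshold."""
    return [r for r in rows if r["call_rate"] >= min_call_rate]


def keep_fields(rows, fields):
    """Project each row onto the named fields."""
    return [{k: r[k] for k in fields} for r in rows]


# A hypothetical template; a cohort team could customize the parameters.
template = [
    ("qc_filter", drop_low_call_rate, {"min_call_rate": 0.95}),
    ("project", keep_fields, {"fields": ["sample"]}),
]

rows = [
    {"sample": "S1", "call_rate": 0.99},
    {"sample": "S2", "call_rate": 0.90},
]
result, log = run_pipeline(template, rows)
print(result)  # [{'sample': 'S1'}]
```

Keeping the template separate from the runner is what lets a standard method be reused with per-cohort parameter changes, while the log records exactly what was run.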

Screenshots for Analysis

GenePattern framework with PLINK and custom reporting

High Level Workflow – for CaRE

(Workflow diagram repeated, with the same Production, Analysis, and data store labels as on the earlier workflow slide.)

Compiled Files for PLINK

QC Report (in browser)
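The slides do not say which PLINK input format the compiled files use. Assuming the classic PLINK text format (a PED file of pedigree and genotype rows plus a MAP file of marker rows), a minimal sketch of formatting one row of each follows. The helper names and all identifiers are hypothetical.

```python
def ped_line(family_id, individual_id, father_id, mother_id, sex, phenotype, genotypes):
    """Format one PED row.

    Columns: family ID, individual ID, paternal ID, maternal ID,
    sex (1 = male, 2 = female), phenotype, then two alleles per marker.
    `genotypes` is a list of (allele1, allele2) pairs in MAP-file order;
    by PLINK convention "0" marks a missing parent or a missing allele.
    """
    fields = [family_id, individual_id, father_id, mother_id, str(sex), str(phenotype)]
    for allele1, allele2 in genotypes:
        fields.extend([allele1, allele2])
    return " ".join(fields)


def map_line(chromosome, marker_id, genetic_distance, bp_position):
    """Format one MAP row: chromosome, marker ID, genetic distance, base-pair position."""
    return "\t".join([str(chromosome), marker_id, str(genetic_distance), str(bp_position)])


# Made-up individual and markers, for illustration only.
ped = ped_line("FAM1", "IND1", "0", "0", 2, 2, [("A", "G"), ("0", "0")])
mp = map_line(1, "marker_A", 0, 1000)
print(ped)  # FAM1 IND1 0 0 2 2 A G 0 0
```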

Issues/Questions

• Scope of phenotype-related enhancements

• Group/Project structure for CaRE Center

• CaRE user visibility into Process Dashboard/LIMS

• Data release model decision
– Data Enclave scenarios and security

• User training and documentation
– Analysis methodology
– System and security training

Security for Production & Analysis

Diagram labels (users in JAAS domain):

• Project Management
– Role: CaRE Cohort Technician
– Proj Mgt Security Context (Project)
– Objects: Groups, Projects, Grants, Panels, Feature Sets, Sample Sets

• Biological Samples Platform
– Role: BSP Lab Technician
– BSP Security Context (Sample Collection)

• Analysis Pipelines
– Role: CaRE Scientist
– CaRE Analysis Security Context (scope based on rules of Data Enclave; could cover multiple Projects)

• Process/LIMS
– Roles: Broad Lab Technician, Coordinator
– Lab Security Context (X-Project)

• Shareable Objects: Peds, Individuals, Phenotypes, Samples, Features

• Also shown: PIPS DB, LSIDs, Feature DB
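The security contexts in the diagram differ mainly in scope: one is tied to a single Project, one spans projects (X-Project), and the analysis context could cover multiple Projects. A minimal sketch of such a scoped role check follows. This is not the platform's JAAS configuration. The class, the function, and the project names are hypothetical, and treating an empty project set as cross-project is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityContext:
    name: str
    roles: frozenset     # roles admitted to this context
    projects: frozenset  # projects in scope; empty = cross-project (assumption)


def can_access(ctx, role, project):
    """Allow access only if the role belongs to the context and the
    project falls inside the context's scope."""
    if role not in ctx.roles:
        return False
    return not ctx.projects or project in ctx.projects


# Role names follow the diagram labels; project names are made up.
proj_mgt = SecurityContext(
    "Proj Mgt Security Context",
    frozenset({"CaRE Cohort Technician"}),
    frozenset({"ProjectA"}),
)
lab = SecurityContext(
    "Lab Security Context",
    frozenset({"Broad Lab Technician", "Coordinator"}),
    frozenset(),  # X-Project
)
analysis = SecurityContext(
    "CaRE Analysis Security Context",
    frozenset({"CaRE Scientist"}),
    frozenset({"ProjectA", "ProjectB"}),
)

print(can_access(proj_mgt, "CaRE Cohort Technician", "ProjectB"))  # False
print(can_access(lab, "Coordinator", "ProjectB"))                  # True
```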

Network diagram labels:

• Zones: The World (Internet “Cloud”), MIT, The Broad Institute
• Firewalls: two Cisco PIX units
• Core Router
• Radius DB: used for authentication for VPN access
• Hosts: Host A, Host B; labels “On LIMS” and “Host on server”
• Access Rules for Subnets: explicit allows, e.g., allow host on LIMS to talk to host on server; a host must be in the list to permit access
• Allow Rules: explicit allows
– http = 80 -> host
– ssh = 22 -> host
– https = 443 (SSL)
• Wireless
• Open jack: unregistered 10.10 domain
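The access rules describe a default-deny policy: traffic passes only if an explicit allow rule lists it. The sketch below shows that logic in plain Python. It is not Cisco PIX syntax, and the host names are hypothetical placeholders for “host on LIMS” and “host on server”.

```python
# Explicit allow list of (source host, destination host, destination port).
ALLOW_RULES = {
    ("lims-host", "server-host", 80),   # http
    ("lims-host", "server-host", 22),   # ssh
    ("lims-host", "server-host", 443),  # https (SSL)
}


def is_permitted(src, dst, port, rules=ALLOW_RULES):
    """Default deny: permit only traffic that matches an explicit allow rule."""
    return (src, dst, port) in rules


print(is_permitted("lims-host", "server-host", 443))     # True
print(is_permitted("wireless-host", "server-host", 22))  # False: not in the list
```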

Acknowledgements

• Genetic Analysis Platform team
• Biological Sample Platform team
• GenePattern team
• Stacey Gabriel, David Altshuler, Mark Daly

• URLs:
– GenePattern: http://www.broad.mit.edu/cancer/software/genepattern/
– PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/
– Haploview: http://www.broad.mit.edu/mpg/haploview/
– Center for Genotyping and Analysis: http://www.broad.mit.edu/gen_analysis/genotyping/