Creating a …
Community Database
Organism-Specific Database
Model-Organism Database

SRI International Bioinformatics

Why Create a PGDB?
Perform pathway analyses as part of a genome project
Analyze omics data
Create a central information resource for the organism
Create an FBA model
Perform comparative analyses

Model Organism Databases
DBs that describe the genome and other information about an organism
Curated by experts for that organism
  No one group can curate all the world’s genomes
  Distribute workload across a community of experts to create a community resource
Every sequenced organism with an active experimental community requires a MOD
Integrate genome data with information about the biochemical and genetic network of the organism
Integrate literature-based information with computational predictions

Rationale for MODs
Each “complete” genome is incomplete in several respects:
  40%-60% of genes have no assigned function
  Roughly 7% of those assigned functions are incorrect
  Many assigned functions are non-specific
MODs are platforms for global analyses of an organism
  Interpret omics data in a pathway context
  In silico prediction of essential genes
  Characterize systems properties of metabolic and genetic networks

What is Curation?
Ongoing updating and refinement of a PGDB
Correct false-positive and false-negative predictions
Incorporate information from experimental literature
  Update genome sequence
  Update gene functions, gene positions, gene names
  Author comments and citations
  Add new pathways, modify existing pathways
  Enter information about regulatory networks

Issues in Creating Public MODs
Obtain funding
Scope the project
Identify the user community
Obtain buy-in and help from the scientific community
IT: Set up database server, Web server
Hire and train curators

Questions
Do you intend to make your PGDB public and to update it on an ongoing basis?
To create a Model Organism Database?
Administering Pathway Tools

Obtaining Pathway Tools
Free to non-commercial organizations
To obtain a license agreement, go to BioCyc.org and click Software/Database Download
Follow Installation Guide
ptools-local directory
  Locate in common directory
  PGDBs created by all users who use this ptools installation
  PGDBs downloaded via the registry
  ptools-init.dat for this ptools installation

New Pathway Tools Releases
Major releases = External software releases
  Twice per year
  Announced on ptools-users mailing list
Minor releases twice per year affect only our BioCyc.org Web site and flatfile distributions
We support one prior release only
Releases announced on [email protected]
Read release notes at
http://brg.ai.sri.com/ptools/release-notes.html
Install process: Upgrade schema of your DB (software assisted)

PGDB Storage: File or Relational Database
File storage:
  Advantages:
    No RDBMS installation and configuration
  Disadvantages:
    Must be loaded and saved in its entirety
    No transaction history
    No concurrent access for multiple users
Oracle/MySQL storage:
  Advantages:
    Faster read access, faster saves
    Concurrent update access for multiple users
    Stores history of all PGDB updates
  Disadvantages:
    RDBMS must be installed and configured

Multiuser Access to PGDBs
PGDB stored within one Oracle or MySQL server
Each curator installs PTools on their workstation
  Different curators can use different software platforms
Workstations query RDBMS server via internet
  Local disk cache speeds access
For each frame access, PTools queries
  In-memory cache, disk cache, RDBMS server
After curator saves changes, all changes made by other users are loaded into curator’s session
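
The frame lookup described on this slide (in-memory cache, disk cache, RDBMS server) can be pictured as a tiered cache. The following is only a minimal sketch of that general idea, assuming the tiers are consulted in the order listed. It is not Pathway Tools code: the class name, the `fetch_from_server` callback, and the dictionary standing in for the disk cache are all hypothetical.

```python
from typing import Callable, Dict


class TieredFrameStore:
    """Illustrative three-tier lookup: memory, then disk, then server."""

    def __init__(self, fetch_from_server: Callable[[str], dict]) -> None:
        self.memory_cache: Dict[str, dict] = {}
        # A plain dict stands in for the local disk cache (assumption).
        self.disk_cache: Dict[str, dict] = {}
        # Hypothetical callback that would query the RDBMS server.
        self.fetch_from_server = fetch_from_server
        self.server_queries = 0

    def get_frame(self, frame_id: str) -> dict:
        # Tier 1: in-memory cache.
        if frame_id in self.memory_cache:
            return self.memory_cache[frame_id]
        # Tier 2: disk cache; promote the frame into memory on a hit.
        if frame_id in self.disk_cache:
            frame = self.disk_cache[frame_id]
            self.memory_cache[frame_id] = frame
            return frame
        # Tier 3: the server; populate both caches on the way back.
        frame = self.fetch_from_server(frame_id)
        self.server_queries += 1
        self.disk_cache[frame_id] = frame
        self.memory_cache[frame_id] = frame
        return frame


store = TieredFrameStore(lambda frame_id: {"id": frame_id})
store.get_frame("FRAME-1")   # goes to the server
store.get_frame("FRAME-1")   # served from memory
store.memory_cache.clear()   # simulate a fresh session
store.get_frame("FRAME-1")   # served from the disk-cache stand-in
print(store.server_queries)
```

In this sketch only the first access reaches the server, which is the reason a local disk cache speeds access when workstations query the RDBMS over the internet.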

How to Release a PGDB?
Decide on release frequency and schedule
  Don’t wait until it’s perfect to release it!
Freeze curation for 1 week
Quality assurance
  Run consistency checker
    Tools -> Consistency Checker
    Also updates organism-summary statistics
  Update publications, authors in organism frame
    Update via Organism editor
Create new version of PGDB
  ptools-local/pgdbs/yeastcyc/1.0/kb/yeastbase.ocelot
  Edit against the new version, release the old version
Author release notes
Register PGDB in SRI PGDB registry
  Will allow SRI to include it in BioCyc
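
The example path on this slide suggests a layout of the form pgdbs/<organism>/<version>/kb/<file>. The sketch below builds such a path from its parts. It generalizes from that single example, so the layout is an assumption, and the function name `pgdb_kb_path` is hypothetical rather than part of Pathway Tools.

```python
from pathlib import PurePosixPath


def pgdb_kb_path(ptools_local: str, org_id: str, version: str, kb_file: str) -> PurePosixPath:
    """Build a knowledge-base file path following the slide's example.

    Assumed layout: <ptools-local>/pgdbs/<org_id>/<version>/kb/<kb_file>
    """
    return PurePosixPath(ptools_local) / "pgdbs" / org_id / version / "kb" / kb_file


# Reproduces the path shown on the slide.
print(pgdb_kb_path("ptools-local", "yeastcyc", "1.0", "yeastbase.ocelot"))
# prints ptools-local/pgdbs/yeastcyc/1.0/kb/yeastbase.ocelot
```

Keeping the version as its own path component is what lets curators edit against the new version while the old version is released unchanged.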

Pathway Tools Data Import/Export
File->Export
File->Import
Export/import to/from tab-delimited files
Export to Genbank, SBML, BioPAX
Export to attribute-value files
Attribute-value files can be imported into BioWarehouse
  Relational database system for bioinformatics database integration
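
Once data has been exported to a tab-delimited file, it can be processed with ordinary scripting tools. The sketch below uses Python's standard csv module to read such a file and group genes by pathway. The column names (GENE, PRODUCT, PATHWAY) and the sample rows are invented placeholders, not the actual Pathway Tools export format, so adjust them to match the header of a real export.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List, TextIO

# Invented sample standing in for an exported file (assumption).
SAMPLE_EXPORT = (
    "GENE\tPRODUCT\tPATHWAY\n"
    "geneA\tenzyme A\tpathway-1\n"
    "geneB\tenzyme B\tpathway-1\n"
    "geneC\tenzyme C\tpathway-2\n"
)


def read_tab_delimited(handle: TextIO) -> List[dict]:
    """Read a tab-delimited file whose first row is a header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def genes_by_pathway(rows: Iterable[dict]) -> Dict[str, List[str]]:
    """Group gene names under the pathway named in each row."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        grouped[row["PATHWAY"]].append(row["GENE"])
    return dict(grouped)


rows = read_tab_delimited(io.StringIO(SAMPLE_EXPORT))
print(genes_by_pathway(rows))
```

For a real export, replace `io.StringIO(SAMPLE_EXPORT)` with `open(path, newline="")` on the exported file.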

Napster Comes to Bioinformatics
Public sharing of Pathway/Genome Databases
PGDB registry maintained by SRI at URL http://biocyc.org/registry.html
Registry operations
  List contents of registry
  Download PGDBs listed in the registry
  Register PGDBs you have created

Registry Details
Why register your PGDB?
  Declare existence of your PGDB in a central location
  Facilitate its download by other scientists
  Facilitate its inclusion in BioCyc.org
Why download a PGDB?
  Desktop Navigator provides more functionality than Web
  Comparative operations
  Programmatic querying and processing of PGDB
Registration process
  Registered PGDBs have open availability by default
  Authors can provide their own license agreements
  Registered PGDBs reside in authors’ FTP site or HTTP server

Desktop versus Web Mode
Pathway Tools runs in two different modes:
  Desktop mode
  Web mode (e.g., BioCyc.org)
Desktop vs Web functionality in Pathway Tools
http://biocyc.org/desktop-vs-web-mode.shtml
You can run both desktop and web modes at your site
Your PTools web server need not be open to the public