1
SAIF-Effects on Data Service
Specifications
Baris Suzek
Georgetown University
Architecture/VCDE Joint Face-to-Face
June 3, 2010
St. Louis, Missouri
2
Team
ECCF Guidance Team
• Baris Suzek (Lead)
• Lewis Frey
• Raghu Chintalapati
• Brian Davis
• Charlie Mead
caBIO Team
• Juli Klemm
• Sharon Gaheen
• Konrad Rokicki
• Jim Sun
• Liqun Qi
3
SAIF-Effects on Data Service Specifications
• Goals:
  • Contribute to the ECCF Implementation Guide
  • Provide guidance on existing ECCF processes and artifacts for the Molecular Annotation Service specification/development team
• Approach:
  • Attend weekly developer meetings
  • Present current/draft ECCF processes
  • Develop artifact lists
  • Respond to questions
  • Review CIM/PIM/PSM level specifications
  • Identify issues and provide recommendations/solutions
• Wiki Pages:
  • https://wiki.nci.nih.gov/x/h1xyAQ
  • https://wiki.nci.nih.gov/x/RHFyAQ (caBIO ECCF Team Page)
4
SAIF and ECCF Effects on Data Service Specification and Development
• Following new processes:
  • Enterprise Service Specification
  • Life Sciences Domain Analysis Model (LSDAM) Use/Expansion
  • NCI-localized ISO 21090 Data Types’ Use/Expansion
• Going through new reviews:
  • Scope & Description Review
  • ECCF Specification Review
  • Conformance Validation/Certification
• Developing artifacts using new templates/resources for ECCF viewpoints:
  • Enterprise Service Specification Templates
  • LSDAM/LSBAM
  • Artifacts to describe behavior (e.g. sequence/activity diagrams, functional profiles), engineering approach (e.g. deployment diagrams), business needs/goals (e.g. use case diagrams)
5
SAIF and ECCF Effects on Data Service Specification and Development
• Interacting with several new teams/people:
  • LSDAM Team
  • Enterprise Service Specification Team
  • Life Sciences Governance Team
  • Life Sciences Composite Architecture Team
  • ISO 21090 Project Manager
• Considering new ECCF concepts in development/review:
  • Traceability
  • Consistency
  • Conformance
  • Compatibility
  • Localize/Constrain
  • Compliance
6
New Processes - NCI Enterprise Service Specification Development and Review (Draft)
• Developers produce: Scope & Description Document and Computation-Independent (CIM), Platform-Independent (PIM), and Platform-Specific Model (PSM) specifications
• LSCAT, ESST, and GT perform: document and specification reviews
7
New Processes – Specification Development
• Defining scope and describing services using the “Scope and Description Document”
• Specifying services by developing artifacts for 4 viewpoints from RM-ODP at 3 abstraction levels: Computation-Independent, Platform-Independent, and Platform-Specific Model
  • Business
  • Information
  • Computational (Behavior)
  • Engineering (Deployment)
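The 4 viewpoints and 3 abstraction levels above form a 12-cell grid of places where artifacts can live. The following is a minimal sketch of that grid, not part of any ECCF tooling; the function names and the dictionary layout are assumptions made for illustration, and the sample entries are the CIM artifacts named later in this deck.

```python
from itertools import product

# Viewpoints and abstraction levels as named on the slide.
VIEWPOINTS = ["Business", "Information", "Computational", "Engineering"]
LEVELS = ["CIM", "PIM", "PSM"]


def specification_cells():
    """Return every (level, viewpoint) cell of the specification grid."""
    return list(product(LEVELS, VIEWPOINTS))


def missing_cells(artifacts):
    """List the cells that have no artifact recorded yet.

    `artifacts` maps (level, viewpoint) to a list of artifact names
    (a hypothetical layout chosen for this sketch).
    """
    return [cell for cell in specification_cells() if not artifacts.get(cell)]


# Sample entries: CIM-level artifacts named in this deck.
artifacts = {
    ("CIM", "Business"): ["Business Storyboard"],
    ("CIM", "Information"): ["Semantic Profile"],
    ("CIM", "Computational"): ["Capabilities", "Functional Profiles"],
}

print(len(specification_cells()), "cells in total")
for level, viewpoint in missing_cells(artifacts):
    print(f"no artifact yet: {level} / {viewpoint}")
```

A grid like this makes gaps visible early, for example a level that has no Engineering artifact at all.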
8
New Artifacts - Scope and Description
• A document describing:
  • What is the service?
  • How does it support/extend business processes in LSBAM?
  • What is the rationale for creating this specification/service?
  • Who are the stakeholders?
  • What is the scope?
• Reviewed and approved by Governance Team (e.g. LSGT)
9
New Artifacts – Service Specification Documents
• Specification documents are developed using CIM, PIM, PSM templates
  • Templates help consolidate artifacts: they point to and/or bring together various artifacts (e.g. UML diagrams)
• Reviewed and approved by ESST and CAT (e.g. LSCAT)
10
New Artifacts – Service Specification Document: CIM
• Business Storyboard for Business Viewpoint
11
New Artifacts – Service Specification Document: CIM
• Semantic Profiles for Information Viewpoint
  • Project Analysis Model is derived from Life Sciences Domain Analysis Model and BRIDG
[Figure: UML class diagram “Molecular Annotation”]
Classes and attributes:
• domain::MolecularSequenceAnnotation (date: TS)
• domain::Gene (symbol: ST)
• domain::GeneIdentifier (databaseName: CD, identifier: II)
• domain::NucleicAcidSequenceFeature (orientation: ST)
• domain::AdditionalOrganismName (comment: ST, source: CD, value: ST)
• domain::Organism (commonName: ST, ncbiTaxonomyId: CD, scientificName: CD, taxonomyRank: CD)
• domain::MolecularSequence (value: SC)
• domain::NucleicAcidPhysicalLocation (endCoordinate: INT, startCoordinate: INT)
• domain::NucleicAcidSequence
• domain::GeneticVariation
• domain::SingleNucleotidePolymorphism
• domain::SingleNucleotidePolymorphismIdentifier (databaseName: CD, identifier: II)
• BRIDG 2.1 - ISO::TherapeuticAgent (identifier: II, statusCode: CD, statusDateRange: IVL<TS>)
• BRIDG 2.1 - ISO::Material (actualIndicator: BL, description: ST, formCode: CD, identifier: DSET<II>, name: DSET<EN.TN>, statusCode: CD, statusDateRange: IVL<TS>)
• BRIDG 2.1 - ISO::Product (classCode: DSET<CD>, expirationDate: TS, pre1938Indicator: BL, typeCode: CD)
Association labels:
• identifies / is identified by
• is included in / includes
• is designated by / designates
• reports / is reported by
• plays / is played by (+product 1, +therapeuticAgent 0..1)
• has component / used as component (+product 0..1, +productCollection 0..*)
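To make the information model more concrete, here is a minimal sketch of three of the classes from the diagram as Python dataclasses. It is an illustration only: the ISO 21090 types are reduced to plain Python types, the Gene-to-GeneIdentifier multiplicity is assumed, the `length` helper is hypothetical, and all sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List

# Simplified stand-ins for ISO 21090 data types (assumption: the real
# types carry more structure than a plain string or integer).
ST = str   # character string
CD = str   # coded value, reduced here to its code
II = str   # instance identifier, reduced here to a single string
INT = int  # integer


@dataclass
class GeneIdentifier:
    databaseName: CD
    identifier: II


@dataclass
class Gene:
    symbol: ST
    # Assumed multiplicity: a gene has zero or more identifiers.
    identifiers: List[GeneIdentifier] = field(default_factory=list)


@dataclass
class NucleicAcidPhysicalLocation:
    startCoordinate: INT
    endCoordinate: INT

    def length(self) -> int:
        """Positions covered, assuming inclusive coordinates (hypothetical helper)."""
        return self.endCoordinate - self.startCoordinate + 1


# Invented sample values, for illustration only.
gene = Gene(symbol="EXAMPLE1", identifiers=[GeneIdentifier("ExampleDB", "0001")])
location = NucleicAcidPhysicalLocation(startCoordinate=100, endCoordinate=199)
print(gene.symbol, len(gene.identifiers), location.length())
```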
12
New Artifacts – Service Specification Document: CIM
• Capabilities for Computational Viewpoint
13
New Artifacts – Service Specification Document: CIM
• Functional Profiles for Computational Viewpoint
14
New Artifacts – Service Specification Document: CIM
• Conformance Statements for all viewpoints
15
New Reviews - ECCF Specification Review
• Uses a review criteria list developed by the Enterprise Service Specification Team (ESST)
• Conducted by Composite Architecture Teams (CAT) on CIM, PIM, PSM level specifications
• Should check:
  • Traceability
  • Consistency
  • Compliance
  • Localizations
  • Completeness
16
New ECCF Concepts
• Compliance: Artifacts derived from other artifacts by traversal of successive levels of abstraction
• Localization: Custom modifications or other alterations
• Conformance Statements/Profiles: Explicit testable representations of explicit assumptions.
• Conformance Assertions: Assertions against a conformance statement that can be verified as True or False.
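The relationship between conformance statements and conformance assertions can be sketched in a few lines. This is an assumed data layout, not an ECCF-defined format: the class names, the `evaluate` function, the statement "CS-1", and the sample implementation data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ConformanceStatement:
    """An explicit, testable statement taken from a specification."""
    statement_id: str
    text: str


@dataclass(frozen=True)
class ConformanceAssertion:
    """A check tied to exactly one statement; it yields True or False."""
    statement_id: str
    check: Callable[[dict], bool]


def evaluate(statements: List[ConformanceStatement],
             assertions: List[ConformanceAssertion],
             implementation: dict) -> Dict[str, bool]:
    """Run each assertion against one implementation, keyed by statement id."""
    known = {s.statement_id for s in statements}
    results = {}
    for assertion in assertions:
        if assertion.statement_id not in known:
            raise ValueError(f"no statement for {assertion.statement_id}")
        results[assertion.statement_id] = assertion.check(implementation)
    return results


# Hypothetical statement and assertion, for illustration only.
statements = [ConformanceStatement("CS-1", "Every Gene record has a non-empty symbol.")]
assertions = [
    ConformanceAssertion("CS-1", lambda impl: all(g.get("symbol") for g in impl["genes"]))
]
implementation = {"genes": [{"symbol": "EXAMPLE1"}, {"symbol": ""}]}

print(evaluate(statements, assertions, implementation))  # {'CS-1': False}
```

The result is False here because one sample record has an empty symbol; the verdict belongs to this particular implementation, not to the specification.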
17
New ECCF Concepts – CIM to PIM Traceability/Localization
• No localizations at PIM level:
  • Genome
  • Chromosome
• Localization at PIM level (e.g. addition of attributes):
  • Gene
[Figure: UML class diagram “Gene and related classes”]
• domain::Gene (fullName: ST, symbol: ST) «trace» LSDAM_R1_1::Gene (symbol: ST)
• domain::Chromosome (name: ST) «trace» LSDAM_R1_1::Chromosome (name: ST)
• domain::Genome (assemblySource: ST, assemblyVersion: ST) «trace» LSDAM_R1_1::Genome (assemblyVersion: ST, assemblySource: ST)
• Association label: contains / is part of (shown twice, with multiplicities 1 and 1..*)
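The localization in this example (an attribute added to Gene at the PIM level) can be found mechanically by comparing each PIM class with the CIM class it traces to. The sketch below assumes the models are available as plain dictionaries, which is not how they are stored today; the function name and the dictionary layout are hypothetical, while the class and attribute names come from the diagram.

```python
# CIM-level classes (LSDAM_R1_1) and their attributes, from the diagram.
CIM = {
    "Gene": {"symbol": "ST"},
    "Chromosome": {"name": "ST"},
    "Genome": {"assemblyVersion": "ST", "assemblySource": "ST"},
}

# PIM-level classes (domain) and their attributes, from the diagram.
PIM = {
    "Gene": {"fullName": "ST", "symbol": "ST"},
    "Chromosome": {"name": "ST"},
    "Genome": {"assemblySource": "ST", "assemblyVersion": "ST"},
}

# «trace» links: PIM class name -> CIM class name.
TRACE = {"Gene": "Gene", "Chromosome": "Chromosome", "Genome": "Genome"}


def added_attributes(pim, cim, trace):
    """Return, per PIM class, the attributes absent from its traced CIM class."""
    result = {}
    for pim_class, attributes in pim.items():
        source = cim[trace[pim_class]]
        added = sorted(set(attributes) - set(source))
        if added:
            result[pim_class] = added
    return result


print(added_attributes(PIM, CIM, TRACE))  # {'Gene': ['fullName']}
```

Only added attributes are detected here; other kinds of localization, such as constrained types or multiplicities, would need further comparisons.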
18
New Reviews - Conformance Validation/Certification
• Testing conformance assertions (true/false) linked pair-wise to specific conformance statements or profiles
• Conformance is contextualized to an implementation.
• Topic for “SAIF Effects on Interoperability Reviews” talk
19
Example of one issue and resolution – LSDAM and MA Information Model
• Not all classes needed to support Molecular Annotation Service capabilities exist in LSDAM:
  • TherapeuticAgent/Drug
  • Disease
• The missing classes will also be needed by other Life Sciences models in the long term
• Solution:
  • Continue with specification
    • Leverage BRIDG 3.0 for Drug class
    • Add a new Disease class
  • In parallel, exercise the (draft) LSDAM expansion process, in which the LSDAM Analyst works with the ICR Information Representation WG (SMEs)
20
Current Status
• Scope & Description Document
  • Development – Completed
  • Review/Approval – Completed
• CIM Specifications
  • Development – Completed
  • Review/Approval – In Progress (presented to CAT Team)
• PIM Specifications
  • Development – Draft completed (internal review)
  • Review/Approval – Not started
• PSM Specifications
  • Development – Initial draft completed
21
Conclusions
• Data service specification/development will involve new:
  • Processes
  • Reviews
  • Artifacts
  • Teams
• Data service developers or reviewers will consider new ECCF concepts to assess the quality and completeness of specifications/services
• Developing a “good” specification is not easy and may take multiple iterations
  • Removing all implicit assumptions in specifications will need thorough thinking, careful review and tools
22
Recommendations (Tools)
• Current MS Word templates lead to replication of the same content in specification documents (e.g. overlap between scope & description and CIM)
  • Recommendation: Document management tools that support
    • Modular views to avoid duplication of content
    • Versioning
• Ensuring traceability and consistency of specifications is challenging:
  • Recommendation: Tools to verify/check traceability between the specifications (CIM, PIM, PSM) and consistency between viewpoints (e.g. Information -> Computational). This may require more computable document formats.
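A traceability check of the kind recommended above could look like the following once specifications are in a computable form. This is a minimal sketch under that assumption: the function, the dictionary layout, and the PSM element names (`GeneBean`, `GenomeBean`, `CacheEntry`) are hypothetical.

```python
LEVEL_ORDER = ["CIM", "PIM", "PSM"]


def untraced(elements_by_level, traces):
    """Return (level, element) pairs with no valid trace to the level above.

    `elements_by_level` maps a level name to a set of element names.
    `traces` maps (level, element) to the name of the element it traces
    to in the next level up.
    """
    problems = []
    for upper, lower in zip(LEVEL_ORDER, LEVEL_ORDER[1:]):
        for element in sorted(elements_by_level[lower]):
            target = traces.get((lower, element))
            if target not in elements_by_level[upper]:
                problems.append((lower, element))
    return problems


# Hypothetical element names, for illustration only.
elements_by_level = {
    "CIM": {"Gene", "Genome"},
    "PIM": {"Gene", "Genome"},
    "PSM": {"GeneBean", "GenomeBean", "CacheEntry"},
}
traces = {
    ("PIM", "Gene"): "Gene",
    ("PIM", "Genome"): "Genome",
    ("PSM", "GeneBean"): "Gene",
    ("PSM", "GenomeBean"): "Genome",
}

print(untraced(elements_by_level, traces))  # [('PSM', 'CacheEntry')]
```

A flagged element is not necessarily an error; it may be a legitimate localization that needs to be documented and reviewed.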
23
Recommendations (Tools)
• ESST/CAT approval/review has some communication overhead: within CAT, between CAT and developers, between ESST and CAT, etc.
  • Recommendation: Tools to streamline the ESST/CAT approval/review and outcome notification:
    • Notify reviewers of impending actions
    • Track approval status, e.g. provide a dashboard with status information
• The contents of specification documents can be used for service discovery/presentation purposes:
  • Recommendation: Tools that display/search a list of available services and service status along with their mapping to capabilities. The specifications can be used for generating organized “service” metadata.
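As a rough idea of the status dashboard recommended above, the sketch below renders a plain-text table and lists the reviews still open. The status values are those reported on the Current Status slide; the data layout and function names are assumptions, and the PSM review status is left unset because the slide does not state it.

```python
# (specification, development status, review/approval status)
STATUS = [
    ("Scope & Description Document", "Completed", "Completed"),
    ("CIM Specifications", "Completed", "In Progress"),
    ("PIM Specifications", "Draft completed", "Not started"),
    ("PSM Specifications", "Initial draft completed", None),  # not stated on the slide
]


def pending_reviews(rows):
    """Names of specifications whose review/approval is not completed."""
    return [name for name, _development, review in rows if review != "Completed"]


def render(rows):
    """Format the rows as a fixed-width text table."""
    lines = [f"{'Specification':<30}{'Development':<26}{'Review/Approval'}"]
    for name, development, review in rows:
        lines.append(f"{name:<30}{development:<26}{review or 'n/a'}")
    return "\n".join(lines)


print(render(STATUS))
print("Pending:", ", ".join(pending_reviews(STATUS)))
```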
24
Recommendations (Process)
• Specification documents are first reviewed by the ESST Lead and then reviewed by the LS CAT team. The review process takes additional time with multiple review cycles.
  • Recommendation: Having one review with both ESST and LSCAT representation
• Service Scope and Description documents are reviewed by the Life Sciences Governance Team (LS GT)
  • Recommendation: Guidelines for reviewing service scope documents along with evaluation criteria.
• DAM/BAM may change throughout the specification development process
  • Recommendation: Provide a process for maintaining versions within documents and address backward compatibility and overall change management
25
Recommendations (Process)
• The reviews at checkpoints (at the end of CIM, PIM, PSM specifications), although they work, may slow down the development process due to dependencies between them (e.g. a review changes the CIM specifications substantially)
  • Recommendation: Interim reviews (not for approval) by all or subgroups of CATs to prevent substantial changes at the checkpoints.
• The developers will need more interactive guidance around using DAMs and ISO 21090 data types (e.g. which class to use/constrain before considering localization)
  • Recommendation: Teams/people to answer questions or review working copies of related artifacts.
  • Recommendation: User manuals for using DAMs.
  • Recommendation: Rules for “constraining”/“localizing”.
26
Recommendations (Templates)
• Data service developers who use caGrid as a platform will need to specify the “query” method with CQLQuery input and CQLResults output at PSM level:
  • Recommendation: A template that contains the necessary PSM-level localizations with conformance profiles, artifacts and/or pointers to artifacts as generated by Introduce Toolkit.
• Developers should be able to use PSM (even PIM) level specifications for code generation purposes. Current MS Word templates do not provide fully computable forms for all artifacts:
  • Recommendation: More computable template/document formats that can easily be transformed to implementation (or partial implementation)
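The shape of the “query” operation mentioned above can be sketched as follows. This is not caGrid code: `CQLQuery` and `CQLResults` here are heavily simplified, hypothetical Python stand-ins for the caGrid types of the same names, and the in-memory service and its sample records exist only to make the sketch runnable.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class CQLQuery:
    """Hypothetical stand-in: only names the class being queried."""
    target_class: str


@dataclass
class CQLResults:
    """Hypothetical stand-in: the matching objects for one target class."""
    target_class: str
    objects: List[dict] = field(default_factory=list)


class DataService(Protocol):
    """The single operation a data service exposes at PSM level."""

    def query(self, cql_query: CQLQuery) -> CQLResults:
        ...


class InMemoryService:
    """Toy implementation backed by a list of records (illustration only)."""

    def __init__(self, records: List[dict]):
        self._records = records

    def query(self, cql_query: CQLQuery) -> CQLResults:
        matches = [r for r in self._records if r["class"] == cql_query.target_class]
        return CQLResults(cql_query.target_class, matches)


# Invented sample records.
service: DataService = InMemoryService([
    {"class": "Gene", "symbol": "EXAMPLE1"},
    {"class": "Organism", "commonName": "example organism"},
])
results = service.query(CQLQuery("Gene"))
print(results.target_class, len(results.objects))
```

Because every caGrid data service shares this one operation, a PSM template could state it once and leave only the service-specific parts to each team.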
27
Questions