SAIF-Effects on Data Service Specifications, Baris Suzek, Georgetown University, Architecture/VCDE Joint Face-to-Face, June 3, 2010, St. Louis, Missouri


SAIF-Effects on Data Service Specifications

Baris Suzek

Georgetown University

Architecture/VCDE Joint Face-to-Face

June 3, 2010

St. Louis, Missouri


Team

ECCF Guidance Team
• Baris Suzek (Lead)
• Lewis Frey
• Raghu Chintalapati
• Brian Davis
• Charlie Mead

caBIO Team
• Juli Klemm
• Sharon Gaheen
• Konrad Rokicki
• Jim Sun
• Liqun Qi


SAIF-Effects on Data Service Specifications

• Goals:
  • Contribute to the ECCF Implementation Guide
  • Provide guidance on existing ECCF processes and artifacts to the Molecular Annotation Service specification/development team

• Approach:
  • Attend weekly developer meetings
  • Present current/draft ECCF processes
  • Develop artifact lists
  • Respond to questions
  • Review CIM/PIM/PSM level specifications
  • Identify issues and provide recommendations/solutions

• Wiki Pages:
  • https://wiki.nci.nih.gov/x/h1xyAQ
  • https://wiki.nci.nih.gov/x/RHFyAQ (caBIO ECCF Team Page)


SAIF and ECCF Effects on Data Service Specification and Development

• Following new processes:
  • Enterprise Service Specification
  • Life Sciences Domain Analysis Model (LSDAM) Use/Expansion
  • NCI-localized ISO 21090 Data Types’ Use/Expansion

• Going through new reviews:
  • Scope & Description Review
  • ECCF Specification Review
  • Conformance Validation/Certification

• Developing artifacts using new templates/resources for ECCF viewpoints:
  • Enterprise Service Specification Templates
  • LSDAM/LSBAM
  • Artifacts to describe behavior (e.g. sequence/activity diagrams, functional profiles), engineering approach (e.g. deployment diagrams), and business needs/goals (e.g. use case diagrams)


SAIF and ECCF Effects on Data Service Specification and Development

• Interacting with several new teams/people:
  • LSDAM Team
  • Enterprise Service Specification Team
  • Life Sciences Governance Team
  • Life Sciences Composite Architecture Team
  • ISO 21090 Project Manager

• Considering new ECCF concepts in development/review:
  • Traceability
  • Consistency
  • Conformance
  • Compatibility
  • Localize/Constrain
  • Compliance


New Processes - NCI Enterprise Service Specification Development and Review (Draft)

[Figure: draft process diagram. Developers produce the Scope & Description Document and the Computation-Independent (CIM), Platform-Independent (PIM), and Platform-Specific Model (PSM) specifications; LSCAT, ESST, and GT review the document and the specifications.]


New Processes – Specification Development

• Defining scope and describing services using the “Scope and Description Document”

• Specifying services by developing artifacts for 4 viewpoints from RM-ODP at 3 abstraction levels (Computation-Independent, Platform-Independent, and Platform-Specific Model):
  • Business
  • Information
  • Computational (Behavior)
  • Engineering (Deployment)
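
The combination of 4 viewpoints and 3 abstraction levels can be pictured as a 12-cell grid of artifacts. The following is a minimal Python sketch of that grid, not part of any ECCF template: the function names `empty_matrix` and `missing_cells` are hypothetical, and the three artifacts filed under CIM are the ones this deck names for the CIM specification document.

```python
from itertools import product

# The four viewpoints and three abstraction levels named on this slide.
VIEWPOINTS = ("Business", "Information", "Computational", "Engineering")
LEVELS = ("CIM", "PIM", "PSM")


def empty_matrix():
    """Return one empty artifact list per (viewpoint, level) cell."""
    return {(viewpoint, level): [] for viewpoint, level in product(VIEWPOINTS, LEVELS)}


def missing_cells(matrix):
    """List the cells that do not hold any artifact yet."""
    return [cell for cell, artifacts in matrix.items() if not artifacts]


matrix = empty_matrix()
# CIM-level artifacts mentioned in this deck; everything else is left empty.
matrix[("Business", "CIM")].append("Business storyboard")
matrix[("Information", "CIM")].append("Semantic profile")
matrix[("Computational", "CIM")].append("Functional profile")

for cell in missing_cells(matrix):
    print("No artifact yet for", cell)
```

Listing the empty cells is one simple way to see which artifacts are still owed before a specification goes to review.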


New Artifacts - Scope and Description

• A document describing:
  • What is the service?
  • How does it support/extend business processes in LSBAM?
  • What is the rationale for creating this specification/service?
  • Who are the stakeholders?
  • What is the scope?

• Reviewed and approved by the Governance Team (e.g. LSGT)


New Artifacts – Service Specification Documents

• Specification documents developed using CIM, PIM, PSM templates
  • Templates help consolidate artifacts: they point to and/or bring together various artifacts (e.g. UML diagrams)

• Reviewed and approved by ESST and CAT (e.g. LSCAT)


New Artifacts – Service Specification Document: CIM

• Business Storyboard for Business Viewpoint


New Artifacts – Service Specification Document: CIM

• Semantic Profiles for Information Viewpoint
  • Project Analysis Model is derived from the Life Sciences Domain Analysis Model and BRIDG

[Figure: UML class diagram "Molecular Annotation"]

domain::MolecularSequenceAnnotation
- date: TS

domain::Gene
- symbol: ST

domain::GeneIdentifier
- databaseName: CD
- identifier: II

domain::NucleicAcidSequenceFeature
- orientation: ST

domain::AdditionalOrganismName
- comment: ST
- source: CD
- value: ST

domain::Organism
- commonName: ST
- ncbiTaxonomyId: CD
- scientificName: CD
- taxonomyRank: CD

domain::MolecularSequence
- value: SC

domain::NucleicAcidPhysicalLocation
- endCoordinate: INT
- startCoordinate: INT

domain::NucleicAcidSequence

domain::GeneticVariation

domain::SingleNucleotidePolymorphism

domain::SingleNucleotidePolymorphismIdentifier
- databaseName: CD
- identifier: II

BRIDG 2.1 - ISO::TherapeuticAgent
+ identifier: II
+ statusCode: CD
+ statusDateRange: IVL<TS>

BRIDG 2.1 - ISO::Material
- actualIndicator: BL
+ description: ST
+ formCode: CD
+ identifier: DSET<II>
+ name: DSET<EN.TN>
+ statusCode: CD
+ statusDateRange: IVL<TS>

BRIDG 2.1 - ISO::Product
+ classCode: DSET<CD>
+ expirationDate: TS
+ pre1938Indicator: BL
+ typeCode: CD

Association labels in the diagram:
• identifies / is identified by
• is included in / includes
• is designated by / designates
• reports / is reported by
• plays / is played by (+product 1, +therapeuticAgent 0..1)
• has component / used as component (+product 0..1, +productCollection 0..*)
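
To make the information model more concrete, a few of the classes from the Molecular Annotation diagram can be written as plain Python dataclasses. This is only an illustrative sketch under stated assumptions: the `CD` and `II` classes are heavily simplified stand-ins for the ISO 21090 data types (only a code/code system and a root/extension, respectively), `ST` and `INT` are mapped to `str` and `int`, and the example values are made up for illustration. Associations are omitted.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CD:
    """Simplified stand-in for the ISO 21090 coded value type (assumption)."""
    code: str
    code_system: str


@dataclass(frozen=True)
class II:
    """Simplified stand-in for the ISO 21090 instance identifier type (assumption)."""
    root: str
    extension: str


@dataclass
class Gene:
    symbol: str  # ST in the diagram


@dataclass
class GeneIdentifier:
    databaseName: CD
    identifier: II


@dataclass
class NucleicAcidPhysicalLocation:
    startCoordinate: int  # INT in the diagram
    endCoordinate: int    # INT in the diagram


def attribute_names(cls):
    """Return the attribute names declared on a dataclass, in declaration order."""
    return [f.name for f in fields(cls)]


# Illustrative values only; they are not taken from the slides.
gene = Gene(symbol="TP53")
gene_id = GeneIdentifier(
    databaseName=CD(code="example-db", code_system="example-code-system"),
    identifier=II(root="example-root", extension="example-id"),
)
print(attribute_names(GeneIdentifier))
```

Keeping attribute names identical to the diagram makes it easier to compare a model like this against the analysis model it was derived from.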


New Artifacts – Service Specification Document: CIM

• Capabilities for Computational Viewpoint


New Artifacts – Service Specification Document: CIM

• Functional Profiles for Computational Viewpoint


New Artifacts – Service Specification Document: CIM

• Conformance Statements for all viewpoints


New Reviews - ECCF Specification Review

• Uses a review criteria list developed by the Enterprise Service Specifications Team (ESST)

• Conducted by Composite Architecture Teams (CAT) on CIM, PIM, PSM level specifications

• Should check:
  • Traceability
  • Consistency
  • Compliance
  • Localizations
  • Completeness


New ECCF Concepts

Compliance: Artifacts derived from other artifacts by traversal of successive levels of abstraction

Localization: Custom modifications or other alterations

Conformance Statements/Profiles: Explicit testable representations of explicit assumptions.

Conformance Assertions: Assertions against a conformance statement that can be verified as True or False.
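
The relationship between conformance statements and conformance assertions can be sketched in a few lines of Python. This is a hypothetical illustration, not an ECCF tool: the class names, the `evaluate` function, the statement identifiers (`CS-1`, `CS-2`), and the dictionary used to describe an implementation are all assumptions. Each assertion is tied to exactly one statement and evaluates to True or False for a given implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConformanceStatement:
    statement_id: str
    text: str  # the explicit, testable assumption


@dataclass(frozen=True)
class ConformanceAssertion:
    statement_id: str              # the statement this assertion is made against
    check: Callable[[dict], bool]  # returns True or False for an implementation


def evaluate(statements, assertions, implementation):
    """Evaluate each assertion against its statement for one implementation."""
    known = {s.statement_id for s in statements}
    results = {}
    for assertion in assertions:
        if assertion.statement_id not in known:
            raise ValueError(f"No statement for assertion {assertion.statement_id}")
        results[assertion.statement_id] = bool(assertion.check(implementation))
    return results


statements = [
    ConformanceStatement("CS-1", "Gene has a symbol attribute of type ST."),
    ConformanceStatement("CS-2", "Chromosome has a name attribute of type ST."),
]
assertions = [
    ConformanceAssertion("CS-1", lambda impl: impl.get("Gene", {}).get("symbol") == "ST"),
    ConformanceAssertion("CS-2", lambda impl: impl.get("Chromosome", {}).get("name") == "ST"),
]

# A toy description of one implementation: class name -> attribute name -> type.
implementation = {"Gene": {"symbol": "ST", "fullName": "ST"}}
results = evaluate(statements, assertions, implementation)
print(results)
```

In this toy run the implementation satisfies the first statement and fails the second, because it does not describe a Chromosome class at all.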


New ECCF Concepts – CIM to PIM Traceability/Localization

• No localizations at PIM level:
  • Genome
  • Chromosome

• Localization at PIM level (e.g. addition of attributes):
  • Gene

[Figure: UML class diagram "Gene and related classes"]

LSDAM_R1_1::Gene
- symbol: ST

domain::Gene
- fullName: ST
- symbol: ST

LSDAM_R1_1::Chromosome
- name: ST

domain::Chromosome
- name: ST

LSDAM_R1_1::Genome
- assemblyVersion: ST
- assemblySource: ST

domain::Genome
- assemblySource: ST
- assemblyVersion: ST

The diagram shows three «trace» dependencies, one per class pair, and two "contains / is part of" associations with multiplicities 1 and 1..*.
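
The comparison on this slide (Genome and Chromosome unchanged, Gene localized by an added attribute) amounts to a per-class attribute diff between the LSDAM classes and the traced domain classes. A minimal Python sketch follows; the dictionary representation and the `localizations` function are assumptions made for illustration, while the class and attribute names are taken from the diagram.

```python
# Attribute name -> data type, as shown in the "Gene and related classes" diagram.
LSDAM = {
    "Gene": {"symbol": "ST"},
    "Chromosome": {"name": "ST"},
    "Genome": {"assemblyVersion": "ST", "assemblySource": "ST"},
}
DOMAIN = {
    "Gene": {"fullName": "ST", "symbol": "ST"},
    "Chromosome": {"name": "ST"},
    "Genome": {"assemblySource": "ST", "assemblyVersion": "ST"},
}


def localizations(source, target):
    """Report, per class, how the target model differs from the source model."""
    report = {}
    for cls, target_attrs in target.items():
        source_attrs = source.get(cls, {})
        added = sorted(set(target_attrs) - set(source_attrs))
        removed = sorted(set(source_attrs) - set(target_attrs))
        retyped = sorted(
            name
            for name in set(source_attrs) & set(target_attrs)
            if source_attrs[name] != target_attrs[name]
        )
        if added or removed or retyped:
            report[cls] = {"added": added, "removed": removed, "retyped": retyped}
    return report


report = localizations(LSDAM, DOMAIN)
print(report)
```

Classes that do not appear in the report carry no localization; here only Gene is reported, because of the added `fullName` attribute. Attribute order is ignored, so Genome counts as unchanged.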


New Reviews - Conformance Validation/Certification

• Testing conformance assertions (true/false) linked pair-wise to specific conformance statements or profiles
• Conformance is contextualized to an implementation
• Topic for the “SAIF Effects on Interoperability Reviews” talk


Example of one issue and resolution – LSDAM and MA Information Model

• Not all classes needed to support Molecular Annotation Service capabilities exist in LSDAM:
  • TherapeuticAgent/Drug
  • Disease

• The missing classes will be needed for other Life Sciences models in the long term

• Solution:
  • Continue with specification
  • Leverage BRIDG 3.0 for Drug class
  • Add a new Disease class
  • In parallel, exercise the (draft) LSDAM expansion process, in which the LSDAM Analyst works with the ICR Information Representation WG (SMEs)


Current Status

• Scope & Description Document
  • Development – Completed
  • Review/Approval – Completed

• CIM Specifications
  • Development – Completed
  • Review/Approval – In Progress (presented to CAT Team)

• PIM Specifications
  • Development – Draft completed (internal review)
  • Review/Approval – Not started

• PSM Specifications
  • Development – Initial draft completed


Conclusions

• Data service specification/development will involve new:
  • Processes
  • Reviews
  • Artifacts
  • Teams

• Data service developers or reviewers will consider new ECCF concepts to assess the quality and completeness of specifications/services

• Developing a “good” specification is not easy and may take multiple iterations
  • Removing all implicit assumptions in specifications will need thorough thinking, careful review and tools


Recommendations (Tools)

• Current MS Word templates lead to replication of the same content in specification documents (e.g. overlap between Scope & Description and CIM)
  • Recommendation: Document management tools that support:
    • Modular views to avoid duplication of content
    • Versioning

• Ensuring traceability and consistency of specifications is challenging:
  • Recommendation: Tools to verify/check traceability between the specifications (CIM, PIM, PSM) and consistency between viewpoints (e.g. Information -> Computational). This may require more computable document formats.
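
As a rough idea of what a traceability checker could do once specifications are available in a computable form, the Python sketch below walks the CIM, PIM, and PSM levels and flags elements that have no trace link to the level above, or whose link points at something that does not exist. Everything here is hypothetical: the data layout, the `check_traceability` function, and the sample elements are assumptions for illustration, not an existing ECCF tool.

```python
LEVELS = ["CIM", "PIM", "PSM"]


def check_traceability(elements, traces):
    """Return human-readable problems for untraced or badly traced elements.

    elements: level -> set of element names defined at that level
    traces:   (level, name) -> (level, name) of the element it traces to
    """
    problems = []
    for upper, lower in zip(LEVELS, LEVELS[1:]):
        for name in sorted(elements.get(lower, set())):
            target = traces.get((lower, name))
            if target is None:
                problems.append(f"{lower}:{name} has no trace to {upper}")
            elif target[0] != upper or target[1] not in elements.get(upper, set()):
                problems.append(f"{lower}:{name} traces to unknown {target[0]}:{target[1]}")
    return problems


# Illustrative data only.
elements = {
    "CIM": {"Gene", "Chromosome"},
    "PIM": {"Gene", "Chromosome", "Disease"},
    "PSM": {"Gene"},
}
traces = {
    ("PIM", "Gene"): ("CIM", "Gene"),
    ("PIM", "Chromosome"): ("CIM", "Chromosome"),
    ("PSM", "Gene"): ("PIM", "Gene"),
}

problems = check_traceability(elements, traces)
print(problems)
```

An untraced element is not necessarily an error; it may be a deliberate localization. A tool like this would only surface such cases so that a reviewer can decide.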


Recommendations (Tools)

• ESST/CAT approval/review has some communication overhead: within CAT, between CAT and developers, between ESST and CAT, etc.
  • Recommendation: Tools to streamline the ESST/CAT approval/review and outcome notification:
    • Notify reviewers of impending actions
    • Track approval status (e.g. provide a dashboard with status information)

• The contents of specification documents can be used for service discovery/presentation purposes:
  • Recommendation: Tools that display/search a list of available services and service status along with their mapping to capabilities. The specifications can be used for generating organized “service” metadata.
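
A status dashboard of the kind recommended above could start from something as small as the following Python sketch. It is an assumption-laden illustration: the tuple layout and the `pending_reviews` and `render` functions are hypothetical, and the rows simply restate the "Current Status" slide of this deck.

```python
# (document, phase, state) rows, restating the "Current Status" slide.
STATUS = [
    ("Scope & Description Document", "Development", "Completed"),
    ("Scope & Description Document", "Review/Approval", "Completed"),
    ("CIM Specifications", "Development", "Completed"),
    ("CIM Specifications", "Review/Approval", "In Progress"),
    ("PIM Specifications", "Development", "Draft completed"),
    ("PIM Specifications", "Review/Approval", "Not started"),
    ("PSM Specifications", "Development", "Initial draft completed"),
]


def pending_reviews(rows):
    """Documents whose review/approval has not been completed yet."""
    return [
        doc for doc, phase, state in rows
        if phase == "Review/Approval" and state != "Completed"
    ]


def render(rows):
    """Format the rows as a fixed-width text table."""
    width = max(len(doc) for doc, _, _ in rows)
    return "\n".join(f"{doc:<{width}}  {phase:<15}  {state}" for doc, phase, state in rows)


print(render(STATUS))
print("Awaiting review:", pending_reviews(STATUS))
```

The same rows could drive reviewer notifications, for example by alerting the CAT whenever a document enters the "In Progress" review state.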


Recommendations (Process)

• Specification documents are first reviewed by the ESST Lead and then by the LSCAT team. Multiple review cycles add time to the review process.
  • Recommendation: Have one review with both ESST and LSCAT representation

• Service Scope and Description documents are reviewed by the Life Sciences Governance Team (LSGT)
  • Recommendation: Guidelines for reviewing service scope documents, along with evaluation criteria

• DAM/BAM may change throughout the specification development process
  • Recommendation: Provide a process for maintaining versions within documents and for addressing backward compatibility and overall change management


Recommendations (Process)

• The reviews at checkpoints (at the end of CIM, PIM, PSM specifications), although they work, may slow down the development process due to dependencies between them (e.g. a review changes the CIM specifications substantially)
  • Recommendation: Interim reviews (not for approval) by all or subgroups of CATs to prevent substantial changes at the checkpoints

• The developers will need more interactive guidance around using DAMs and ISO 21090 data types (e.g. which class to use/constrain before considering localization)
  • Recommendation: Teams/people to answer questions or review working copies of related artifacts
  • Recommendation: User manuals for using DAMs
  • Recommendation: Rules for “constraining”/“localizing”


Recommendations (Templates)

• Data service developers who will use caGrid as a platform will need to specify the “query” method with CQLQuery input and CQLResults output at PSM level:
  • Recommendation: A template that contains the necessary PSM-level localizations with conformance profiles, artifacts and/or pointers to artifacts as generated by the Introduce Toolkit

• Developers should be able to use PSM (even PIM) level specifications for code generation purposes. Current MS Word templates do not provide fully computable forms for all artifacts:
  • Recommendation: More computable template/document formats that can easily be transformed into an implementation (or partial implementation)
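
To illustrate what "computable" could mean here, the Python sketch below turns a small machine-readable class description into source code and then loads it. It is a toy example under stated assumptions: the dictionary format, the `TYPE_MAP` from ISO 21090 type names to Python types, and the `generate_dataclass` function are all hypothetical, and the Gene attributes are the PIM-level ones shown earlier in this deck. It does not represent the output of the Introduce Toolkit or any caGrid API.

```python
from dataclasses import dataclass

# Assumed, simplified mapping from ISO 21090 type names to Python types.
TYPE_MAP = {"ST": "str", "INT": "int"}


def generate_dataclass(name, attributes):
    """Generate Python source for a dataclass from attribute name -> type name."""
    lines = ["@dataclass", f"class {name}:"]
    for attr, iso_type in attributes.items():
        lines.append(f"    {attr}: {TYPE_MAP[iso_type]}")
    return "\n".join(lines)


# A computable fragment of a specification (illustrative format).
spec = {"Gene": {"fullName": "ST", "symbol": "ST"}}

source = generate_dataclass("Gene", spec["Gene"])
print(source)

# Load the generated source into its own namespace and use the resulting class.
namespace = {"dataclass": dataclass, "__name__": "generated_model"}
exec(source, namespace)
Gene = namespace["Gene"]
gene = Gene(fullName="tumor protein p53", symbol="TP53")
print(gene)
```

A Word document cannot feed a generator like this directly, which is the gap the recommendation points at; a structured format (for example XMI or another model exchange format) could.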


Questions