Scientific Workflows

Scientific workflows describe structured activities arising in scientific problem-solving.

Conducting experiments involves complex, structured computations.

Semantic mismatches among resources require much human intervention.

Because participating services are owned by different organizations, defining compensations is critical to successful recovery from failures.

Semantically Resolving Type Mismatches in Scientific Workflows

Workflows in Bioinformatics

Integrating different tools to solve biological problems usually involves:
Manual data transfer between applications
Understanding data formats
Converting file formats where appropriate

Manual workflows involve a large number of steps. Manual execution is time-consuming and error-prone.

The user must have a deep understanding of disparate application environments.

Using Tools in Bioinformatics

[Figure: specification of an in silico experimental design, a DNA sequence similarity search, carried out through the WSWUBlast service operation blastn(query, database, email) of the Blast service.]

Automated Workflows

Make creating a workflow a simple "drag and drop" process.

Make the resulting workflow diagram self-documenting, showing exactly how to perform a bioinformatics experiment.

Automatic execution of the steps specified in the workflow.

Monitoring of workflow execution to help debugging and intervention.

Reduces complexity for scientific users, supports sharing, and allows repeatability.

Bioinformatics Workflow Systems

Specialized workflow systems designed to develop workflows in bioinformatics.

Different workflow standards and systems:
BPEL: a business workflow standard adapted for scientific workflows
UNICORE: a Grid middleware that provides a GUI for workflow development
Globus: an open source toolkit implementing many Grid-related standards
Kepler: a graph-based modelling language for developing workflows
Taverna Workbench: a choreography tool for bioinformatics Web services
Triana: develops component-based workflows and provides coupling with Grid middleware tools

Windows Workflow Foundation (1)

Part of .NET Framework 3.0.

Workflows are a collection of activities.

Components:
Base Activity Library: out-of-box activities and a base for custom activities.
Runtime Engine: workflow execution and state management.
Runtime Services: e.g. RDBMS, persistence, transactions.
Visual Designer: graphical and code-based construction in Visual Studio or standalone.

Windows Workflow Foundation (2)

Semantic Web Services

The augmentation of Web service descriptions with semantic annotations.

Aims to automate Web service discovery, composition, invocation, and monitoring.

Two different approaches:
Revolutionary: OWL-S and WSMO.
Evolutionary: WSDL-S and SAWSDL.

The SAWSDL approach builds on existing Web service standards and is agnostic to ontology representation.

Semantic Annotations for Web Services Description Language

SAWSDL is an extension of WSDL that uses WSDL's extensibility elements.

Two basic types of annotations:
Model reference: associates selected WSDL components with semantic concepts.
Schema mapping: deals with data heterogeneity by transforming one data representation into another.

Annotations for WSDL 1.1 and WSDL 2.0.

API and tool support including SAWSDL4J, Woden4SAWSDL, Radiant...

SAWSDL Scope

[Figure: WSDL components in the scope of SAWSDL. Legend: components annotated using modelReference; components annotated using modelReference with schemaMapping. Note: all elements may have <documentation> as first child.]

SAWSDL Example

    <wsdl:definitions
        targetNamespace="http://www.w3.org/2002/ws/sawsdl/spec/wsdl/order#"
        xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
      <wsdl:types>
        <xs:element name="purchaseOrderResponse" type="xs:string"
            sawsdl:modelReference="http://www.w3.org/2002/ws/sawsdl/spec/ontology/purchaseorder#PurchaseOrderResponse"
            sawsdl:liftingSchemaMapping="http://www.w3.org/2002/ws/sawsdl/spec/mapping/Response2Ont.xslt">
          ……
        </xs:element>
      </wsdl:types>
      <wsdl:portType name="PurchaseOrder">
        <wsdl:operation name="order">
          <sawsdl:attrExtensions
              sawsdl:modelReference="http://www.w3.org/2002/ws/sawsdl/spec/ontology/purchaseorder#RequestPurchaseOrder"/>
          <wsdl:input messageLabel="OrderRequestMessage" element="purchaseOrderRequest"/>
          <wsdl:output messageLabel="OrderResponseMessage" element="tns:purchaseOrderResponse"/>
        </wsdl:operation>
      </wsdl:portType>
    </wsdl:definitions>
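To illustrate how a tool might read the annotations in an example like the one above, here is a minimal Python sketch using only the standard library. It is not the C# API described in these slides; the function name read_annotations and the cut-down WSDL fragment are assumptions made for illustration. The only SAWSDL facts relied on are the namespace URI and the three attribute names, each of which may hold a whitespace-separated list of URIs.

```python
import xml.etree.ElementTree as ET

SAWSDL_NS = "http://www.w3.org/ns/sawsdl"
XS_NS = "http://www.w3.org/2001/XMLSchema"

# Cut-down, hypothetical WSDL fragment modelled on the slide's example.
WSDL = """
<wsdl:definitions
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <wsdl:types>
    <xs:element name="purchaseOrderResponse" type="xs:string"
        sawsdl:modelReference="http://www.w3.org/2002/ws/sawsdl/spec/ontology/purchaseorder#PurchaseOrderResponse"
        sawsdl:liftingSchemaMapping="http://www.w3.org/2002/ws/sawsdl/spec/mapping/Response2Ont.xslt"/>
  </wsdl:types>
</wsdl:definitions>
""".strip()


def read_annotations(wsdl_text):
    """Return {element name: {annotation name: [URIs]}} for annotated xs:element nodes."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for el in root.iter("{%s}element" % XS_NS):
        found = {}
        for attr in ("modelReference", "liftingSchemaMapping", "loweringSchemaMapping"):
            value = el.get("{%s}%s" % (SAWSDL_NS, attr))
            if value:
                # Each SAWSDL attribute holds a whitespace-separated list of URIs.
                found[attr] = value.split()
        if found:
            result[el.get("name")] = found
    return result


annotations = read_annotations(WSDL)
```

The result maps purchaseOrderResponse to its model reference and lifting mapping; an element without SAWSDL attributes is simply left out.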

Leveraging existing Java for .NET

A C# implementation of the SAWSDL specification.

Support for Model Reference annotations with OWL/RDF definitions.

Support for lifting/lowering schema mappings with XSLT/SPARQL mapping definitions.

Allows the creation of SAWSDL-based applications.

Extends the .NET API for WSDL 1.1.

Support for WSDL 2.0 through XSLT.

Implementation

Development of a custom activity that extends the base Web Service activity shipped with WF.

Enables semi-automatic composition of Semantic Web Services and execution of the resulting workflow.

Can be composed with Web services described using WSDL files.

A C# implementation of the activity.

Semantic capabilities are provided by the Jena library; integration with C# is enabled via IKVM.

Semantic Reasoning

Model reference annotations describe the functionalities a Web service provides.

Ontologies serve as the semantic models for the semantic annotations.

Reasoning capabilities are provided by:
Jena, an open source Semantic Web framework for Java.
Pellet, an open source Java OWL-DL reasoner.

Currently supports schema type and message part annotations to achieve automatic parameter binding.
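The kind of inference needed for parameter binding can be sketched as subclass reasoning over an ontology. The Python below is a toy stand-in, not Jena or Pellet: it only follows subclass links transitively, whereas an OWL-DL reasoner does far more. The concept names and the SUBCLASS_OF pairs are hypothetical.

```python
# Hypothetical toy ontology: (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("DNASequence", "NucleotideSequence"),
    ("NucleotideSequence", "Sequence"),
    ("ProteinSequence", "Sequence"),
}


def superclasses(concept, pairs=SUBCLASS_OF):
    """Return every direct or inferred superclass of concept."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in pairs:
            # Follow each subclass link once; the 'found' check also guards against cycles.
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def is_subclass_of(sub, sup):
    """True if sup is reachable from sub through subclass links."""
    return sup in superclasses(sub)
```

Here DNASequence is never declared a subclass of Sequence directly; the relationship is inferred through NucleotideSequence.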

Schema Type Mapping

Provides mappings between XML and semantic models.

Lifting Schema Mapping specifies the mapping from WSDL type definitions in XML to semantic data. XSLT and XQuery are used as mapping languages.

Lowering Schema Mapping specifies the mapping from semantic data to WSDL type definitions in XML. SPARQL is used to query the ontology, followed by XSLT and XQuery.

Semantic data is queried through SPARQL, which Jena supports through its query engine.
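The two mapping directions can be illustrated with a small round trip. This sketch swaps in plain Python functions for the XSLT, XQuery, and SPARQL mappings named above, and represents semantic data as simple (subject, predicate, object) tuples rather than an RDF model. The ontology namespace, the element names, and the subject identifier are hypothetical.

```python
import xml.etree.ElementTree as ET

ONT = "http://example.org/ontology#"  # hypothetical ontology namespace


def lift(xml_text, subject):
    """Lifting: turn an XML message and its child elements into triples."""
    root = ET.fromstring(xml_text)
    triples = [(subject, "rdf:type", ONT + root.tag)]
    for child in root:
        triples.append((subject, ONT + child.tag, (child.text or "").strip()))
    return triples


def lower(triples, subject):
    """Lowering: rebuild the XML message from the triples about subject."""
    root_tag = next(o for s, p, o in triples if s == subject and p == "rdf:type")
    root = ET.Element(root_tag[len(ONT):])
    for s, p, o in triples:
        if s == subject and p != "rdf:type":
            ET.SubElement(root, p[len(ONT):]).text = o
    return ET.tostring(root, encoding="unicode")


message = "<Entry><accession>X14298</accession><database>embl</database></Entry>"
triples = lift(message, "ex:entry1")
round_trip = lower(triples, "ex:entry1")
```

Lifting followed by lowering reproduces the original message here because the toy mapping is one-to-one; real mappings between differing schemas generally are not.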

IKVM.NET

An implementation of Java for the Microsoft .NET Framework.

It includes the following components:
A Java Virtual Machine implemented in .NET.
A .NET implementation of the Java class libraries.
Tools that enable Java and .NET interoperability.

Used to compile the Jena and Pellet JAR libraries into .NET DLL assemblies; Java bytecode is translated to Common Intermediate Language (CIL).

Allowed Jena's capabilities to be used in the implementation of the Semantic Web service activity.

Semantic Web Service Activity (1)

Activity bindings are the key feature that enables property binding between activities, or on the workflow itself.

This mechanism allows data propagation between composed activities.

WF relies on syntactic approaches when binding properties between activities.

The SWS activity implements a basic semantic matching engine to better support semantically compatible properties.

Semantic Web Service Activity (2)

Automatically binds SWS parameters to composed workflow activities using the semantic approach.

The semantic model annotation of an activity's input must be equivalent to, or a subclass of, the annotation of the composed activity's output.

Values are mapped to the appropriate data representation at design time.

Missing activity bindings can be manually added using the WF visual designer.
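The matching rule above can be sketched as a small binding routine. This is an illustrative Python sketch, not the SWS activity's C# code: the PARENT ontology, the concept names, and the parameter names are hypothetical, and the direction of the subclass test follows the rule exactly as stated on this slide (input concept equal to, or a subclass of, the output concept).

```python
# Hypothetical ontology: concept -> direct parent concept.
PARENT = {"DNASequence": "Sequence", "ProteinSequence": "Sequence"}


def degree_of_match(input_concept, output_concept):
    """Return 'exact', 'subclass', or None for an input/output concept pair."""
    if input_concept == output_concept:
        return "exact"
    # Walk up from the input concept looking for the output concept.
    current = PARENT.get(input_concept)
    while current is not None:
        if current == output_concept:
            return "subclass"
        current = PARENT.get(current)
    return None


def bind(inputs, outputs):
    """Map each input parameter to the first semantically compatible output."""
    bindings = {}
    for in_name, in_concept in inputs.items():
        for out_name, out_concept in outputs.items():
            match = degree_of_match(in_concept, out_concept)
            if match:
                bindings[in_name] = (out_name, match)
                break
    return bindings


# Hypothetical annotations for a preceding activity's output and this activity's inputs.
outputs = {"getFASTA_DDBJEntryReturn": "Sequence"}
inputs = {"sequence": "DNASequence", "email": "EmailAddress"}
bindings = bind(inputs, outputs)
```

Parameters with no compatible output, such as email here, stay unbound, which corresponds to the bindings left for the user to add manually in the designer.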

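Bioinformatics Workflow Example

[Figure: the accession X14298 is passed to the GetEntry operation getFASTA_DDBJEntry, which returns the sequence Atgagtgatggagcagttcaaccagacggtggtcaacctgctgtcagaaatgaaagagctcaggatctgggaacgggtctggaggcggg. That sequence, the database embl, and an email address are passed to the WSWUBlast operation blastn, which returns the jobID M7WEXBN7013.]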

Automatic Binding in Bioinformatics Workflow

[Figure: the SAWSDL-annotated output of GetEntry (getFASTA_DDBJEntry) and the SAWSDL-annotated input of WSWUBlast (blastn) each refer to a semantic concept; the concepts shown are Sequence and DNA Sequence. A semantic reasoner determines the degree of match (exact or subclass), then the parameters are bound and any necessary translations are carried out.]

Conclusion & Future Work

API implementations that enable the development of semantically annotated Web services.

Semantic Web service activity integration into WF, facilitating workflow building and manipulation.

Future Work:
Improve the SWS activity by processing more SAWSDL annotations, e.g. operation and portType.
Semantically annotate bioinformatics Web services, then use WF to build a workflow composed of SWS activities in order to test the implementation.
Implement an approach to semantically guide and verify compensations and exceptions.
