Web Service Oriented Data Warehouses Mariusz Pajer National Institute of Telecommunications m.pajer(at)itl.waw.pl 1 INESC Coimbra - DSTIS 2009, 4-7 September 2009


  • Web Service Oriented Data Warehouses

    Mariusz Pajer
    National Institute of Telecommunications

    m.pajer(at)itl.waw.pl

    INESC Coimbra - DSTIS 2009, 4-7 September 2009

  • Agenda

    • Data flow in Legacy Data Warehouse Architecture
    • DW challenges
    • MPP with SNA as solution for performance
    • SOA and WS as a key for flexibility
    • Web Service Oriented DW characteristics
    • Data flow in WSODW Architecture
    • WSODW implementation
    • BPMN and WS-BPEL
    • WS specifications

  • Data flow in legacy DW

    [Figure: data flow in a legacy data warehouse. Labelled components:]

    • Data Sources: DB Data Source 1 to n, File Sources, OLTP Sources
    • ETL Processes: data extraction modules, data transformation modules, data load modules
    • Data Warehouse Structures: TL modules, Global DW Storage, Operational DW Storage(s), Data Mart(s)
    • DW User Interfaces: OLAP Application 1 to n

  • DW challenges

    • Increasing amount of source data from heterogeneous Data Sources (DS) that change ad hoc
    • New DS types (e.g. streams as data sources, sensor networks)
    • Demand for data marts that are always up to date
    • Ad-hoc and customisable analyses

    Resulting requirements:

    • Active Warehousing
    • Performance
    • Flexibility

  • Massively Parallel Processing as a key for performance (1)

    • A single supercomputer with many networked processing units, each with its own memory and storage and each working separately
    • Multiple parallel data feeds and multiple concurrent query streams for unstructured data
    • Growing importance of virtual DW based on blade solutions (virtual processors, memory, storage, networks, software)
    • Example: Kognitio WX2

  • Massively Parallel Processing as a key for performance (2)

    • Shared Nothing Architecture: a virtual grid of separate, independent and self-sufficient nodes, so there is no single point of contention across the system
    • Data storage and data processing requests are spread across all nodes or a selected subset of them
    • Every node acts as a gateway for the same piece of information
    • Examples: Teradata, Greenplum, Hadoop
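One way to picture the shared-nothing idea is hash partitioning: each row is assigned to exactly one node, every node aggregates only its own slice, and only the small partial results are combined. The Python sketch below is an illustration under stated assumptions, not a description of how Teradata, Greenplum or Hadoop distribute data. The function names, the `customer_id` and `amount` fields and the node count are all hypothetical.

```python
import hashlib
from collections import defaultdict


def node_for_key(key: str, node_count: int) -> int:
    """Map a row key to a node index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def partition_rows(rows, key_field, node_count):
    """Assign every row to exactly one node; nodes share no data."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for_key(str(row[key_field]), node_count)].append(row)
    return nodes


def parallel_sum(nodes, value_field):
    """Each node sums its own slice; only the partial sums are merged."""
    partials = {
        node: sum(row[value_field] for row in rows)
        for node, rows in nodes.items()
    }
    return sum(partials.values()), partials


if __name__ == "__main__":
    # Hypothetical fact rows: 100 customers with one amount each.
    rows = [{"customer_id": i, "amount": i * 10} for i in range(100)]
    nodes = partition_rows(rows, "customer_id", node_count=4)
    total, partials = parallel_sum(nodes, "amount")
    print("rows per node:", {n: len(r) for n, r in sorted(nodes.items())})
    print("total amount:", total)
```

Because the node index depends only on the key, the same key always lands on the same node, so a lookup never has to consult more than one node.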

  • SOA and WS as a key for flexibility

    • Reuse, granularity, modularity, composability, componentization and interoperability between ETL modules on both the DS side and the User Interface side
    • Standards compliance (both common and industry-specific)
    • Automated identification and categorization, provisioning and delivery, monitoring and tracking of available ETL modules

    Web Service oriented DW (WSODW) based on Service Oriented Architecture (SOA)

  • WSODW characteristics (1)

    • Every ETL module, DS and User Interface (UI) acts as a service
    • Service loose coupling: services do not depend on each other, but they must be aware of each other
    • Service contract: describes the terms of use of, and communication with, a particular service

  • WSODW characteristics (2)

    • Service abstraction and encapsulation: a service hides its internal logic from the outside world; ETL modules or DS that were not planned as Web Services (WS) can often be ported
    • Service autonomy: a service controls the logic it encapsulates
    • Service composability: composite services can be formed by coordinating and assembling collected services

  • WSODW characteristics (3)

    • Service optimization: all else being equal, high-quality services are preferable
    • Service discoverability: discovery mechanisms make it possible to find and assess well-described services
    • Service relevance: service granularity has to ensure that the functionality offered is meaningful
    • Every service could act as:
      • Service provider
      • Service requester
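The characteristics above can be sketched in a few lines of Python: a contract describes a service, the logic stays encapsulated behind `invoke`, a registry supports discovery, and a composite service is assembled from existing ones. This is a minimal in-process illustration, not a Web Service implementation. `ServiceContract`, `Service`, `Registry` and `compose` are hypothetical names that do not come from the presentation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ServiceContract:
    """Terms of use: what the service is called, does, accepts and returns."""
    name: str
    description: str
    input_type: str
    output_type: str


class Service:
    """A provider whose internal logic is hidden behind invoke()."""

    def __init__(self, contract: ServiceContract, logic: Callable[[Any], Any]):
        self.contract = contract
        self._logic = logic  # encapsulated; requesters only see the contract

    def invoke(self, payload: Any) -> Any:
        return self._logic(payload)


class Registry:
    """Discovery mechanism: find services through their published contracts."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def publish(self, service: Service) -> None:
        self._services[service.contract.name] = service

    def discover(self, keyword: str) -> List[ServiceContract]:
        needle = keyword.lower()
        return [
            s.contract
            for s in self._services.values()
            if needle in s.contract.description.lower()
        ]

    def lookup(self, name: str) -> Service:
        return self._services[name]


def compose(name: str, description: str, *services: Service) -> Service:
    """Build a composite service that calls the given services in order."""

    def run(payload: Any) -> Any:
        for service in services:  # the composite acts as a requester here
            payload = service.invoke(payload)
        return payload

    contract = ServiceContract(
        name,
        description,
        services[0].contract.input_type,
        services[-1].contract.output_type,
    )
    return Service(contract, run)
```

A composite built this way is itself a `Service`, so it can be published in the registry and acts as a provider to its callers while being a requester towards the services it assembles.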

  • Data flow in Web Service Oriented DW

    [Figure: data flow in a Web Service oriented data warehouse. Labelled components:]

    • Data Sources Federation: DB Data Source 1 to n, File Sources, OLTP Sources
    • DS & DW ETL Processes: Data Subscriber, Data Cleaner, Data Loader, Data Aggregator, Data Transformer, Data Deriver, Data Federation Mapper, DW Metadata Manager, Schema Manager, ETL Auditing, Snapshot Agent, Cube Manager
    • Data Warehouse Federation (Data Warehouse Structures): Operational DW Storage(s), Data Mart(s), DW Metadata Storage(s), Global DW Storage
    • UI Services: DW Objects Discoverer, UI Bridger, Metric/Alert Agent, Security/Session Agent, Metadata Agent
    • DW User Interfaces Federation: OLAP Application 1 to n, OLTP Application i

  • WSODW implementation (1)

    Remote procedure call (RPC)

    • Well-known idea from client-server distributed computing
    • The basic unit of RPC Web services is the WSDL operation
    • Criticized for not being loosely coupled (many implementations mapped services directly to language-specific function or method calls)
    • RPC with SOAP encoding (the rpc/encoded style) is disallowed in the WS-I (Web Services Interoperability) Basic Profile

    (DEAD END)

  • WSODW implementation (2)

    Simple Object Access Protocol (SOAP)

    • Protocol specification for exchanging structured information when implementing Web Services in computer networks
    • Relies on Extensible Markup Language (XML) for its message format
    • Relies on other Application Layer protocols (RPC, HTTP, SMTP) for message negotiation and transmission
    • The operations offered by a service are described in machine-readable form in the Web Services Description Language (WSDL)
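To make the XML message format concrete, the sketch below builds and parses a SOAP 1.1 envelope with Python's standard `xml.etree.ElementTree` module. It only illustrates the envelope structure (Envelope, Header, Body) and performs no transport. The `http://example.com/wsodw` namespace, the `LoadRows` operation and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for a WSODW service; not from the presentation.
DW_NS = "http://example.com/wsodw"

ET.register_namespace("soapenv", SOAP_ENV_NS)


def build_request(operation: str, params: dict) -> str:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    op = ET.SubElement(body, f"{{{DW_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{DW_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text: str):
    """Return (operation name, parameters) from a SOAP envelope."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_ENV_NS}}}Body")
    op = body[0]  # first child of Body is the operation element
    op_name = op.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return op_name, params


if __name__ == "__main__":
    message = build_request("LoadRows", {"table": "sales", "rowCount": 3})
    print(message)
    print(parse_request(message))
```

All values travel as text inside the XML, so `rowCount` comes back as the string `"3"`. In a real service the WSDL and its XML Schema types tell the receiver how to interpret it.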

  • WSODW implementation (3)

    REpresentational State Transfer (REST)

    • Client-server approach to web service implementation that uses HTTP or similar protocols and constrains the interface to a set of well-known, standard operations (such as GET, POST, PUT, DELETE for HTTP)
    • Focuses on interacting with stateful resources rather than with messages or operations
    • An architecture based on REST:
      • can use WSDL to describe SOAP messaging over HTTP, which defines the operations
      • can be created without using SOAP at all

  • WSODW implementation (4)

    RESTful web service

    • Implemented using HTTP and the principles of REST
    • Can be thought of as a collection of resources
    • The definition of such a web service comprises three aspects:
      • The base URI for the web service, such as http://example.com/resources/
      • The MIME type of the data supported by the web service. This is often JSON, XML or YAML, but can be any other valid MIME type
      • The set of operations supported by the web service using HTTP methods (e.g. POST, GET, PUT or DELETE)
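The three aspects listed above (base URI, MIME type, HTTP methods) can be sketched as an in-memory resource collection. No network server is involved: `handle` simply maps a method and URI to a status code, headers and body. The `ResourceCollection` class, the JSON payloads and the numeric identifiers are assumptions made for illustration.

```python
import json
from urllib.parse import urljoin

BASE_URI = "http://example.com/resources/"  # base URI from the slide
MIME_TYPE = "application/json"              # one possible MIME type


class ResourceCollection:
    """In-memory stand-in for a RESTful service (hypothetical)."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, uri, body=None):
        """Return (status, headers, body_text) for one request."""
        headers = {"Content-Type": MIME_TYPE}
        if not uri.startswith(BASE_URI):
            return 404, headers, ""
        item_id = uri[len(BASE_URI):].strip("/")

        if not item_id:  # request addressed to the collection itself
            if method == "GET":
                return 200, headers, json.dumps(self._items)
            if method == "POST":
                new_id = str(self._next_id)
                self._next_id += 1
                self._items[new_id] = json.loads(body)
                location = urljoin(BASE_URI, new_id)
                created = json.dumps(self._items[new_id])
                return 201, {**headers, "Location": location}, created
            return 405, headers, ""

        if method == "PUT":  # create or replace a single resource
            self._items[item_id] = json.loads(body)
            return 200, headers, json.dumps(self._items[item_id])
        if item_id not in self._items:
            return 404, headers, ""
        if method == "GET":
            return 200, headers, json.dumps(self._items[item_id])
        if method == "DELETE":
            del self._items[item_id]
            return 204, headers, ""
        return 405, headers, ""
```

The interface stays fixed at four methods; what varies is the resource a URI identifies, which is the contrast with operation-centred RPC or SOAP interfaces.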

  • Business Process Modeling Notation (BPMN)

    • Enables business users to develop understandable graphical representations of business processes
    • Four basic categories of elements:
      • Flow Objects: Events, Activities, Gateways
      • Connecting Objects: Sequence Flow, Message Flow, Association
      • Swimlanes: Pool, Lane
      • Artifacts: Data Object, Group, Annotation

  • WS-BPEL (1)

    Web Services Business Process Execution Language (WS-BPEL)

    • OASIS (Organization for the Advancement of Structured Information Standards) standard executable language for specifying interactions with Web Services
    • Processes defined with WS-BPEL export and import information by using Web Service interfaces exclusively

  • WS-BPEL (2)

    WS-BPEL is an orchestration language for WS

    • Specifies an executable process that involves message exchanges with other systems, such that the message exchange sequences are controlled by the orchestration designer
    • Orchestration refers to the central control (as the conductor) of the behavior of a distributed system (as the orchestra consisting of many players)
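The conductor metaphor can be sketched in Python as a central process that decides which service is called and in what order, while the services know nothing about each other. This is only an analogy for the control structure: real WS-BPEL processes are written in XML and invoke Web Services, and the `Orchestrator` class and the three step functions here are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Message = Dict[str, Any]
ServiceCall = Callable[[Message], Message]


class Orchestrator:
    """Central controller: the designer fixes the message exchange order."""

    def __init__(self, steps: List[Tuple[str, ServiceCall]]):
        self._steps = steps
        self.log: List[str] = []  # names of the services already invoked

    def run(self, message: Message) -> Message:
        for name, call in self._steps:
            # Each service receives a copy and returns a new message.
            message = call(dict(message))
            self.log.append(name)
        return message


# Hypothetical services standing in for remote ETL Web Services.
def extract_service(msg: Message) -> Message:
    return {**msg, "rows": [3, None, 4]}


def clean_service(msg: Message) -> Message:
    return {**msg, "rows": [r for r in msg["rows"] if r is not None]}


def load_service(msg: Message) -> Message:
    return {**msg, "loaded": len(msg["rows"])}


if __name__ == "__main__":
    process = Orchestrator([
        ("extract", extract_service),
        ("clean", clean_service),
        ("load", load_service),
    ])
    print(process.run({"source": "oltp"}))
    print(process.log)
```

Reordering or replacing a step changes only the list handed to the orchestrator, not the services themselves, which is the flexibility that central control is meant to provide.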

  • WS-BPEL (3)

    • WS-BPEL can adopt WS as its external communication mechanism in a manner suitable for business processes
    • It focuses on modern business processes and builds on the history of WSFL (Web Services Flow Language) and XLANG

  • BPMN and WS-BPEL (1)

    • The OASIS technical committee decided that defining a standard graphical notation for WS-BPEL is out of its scope
    • Some vendors proposed a direct visual representation of BPEL process descriptions in the form of structograms, in a style reminiscent of a Nassi-Shneiderman diagram

  • BPMN and WS-BPEL (2)

    • Others propose using BPMN as a graphical front-end to capture BPEL process descriptions
    • A detailed mapping of BPMN to BPEL has been implemented in a number of tools (e.g. BPMN2BPEL)

  • WS specifications: WS-* (1)

    XML Specifications

    • XML (eXtensible Markup Language)
    • XML Namespaces
    • XML Schema
    • XPath
    • XQuery
    • XML Information Set
    • XInclude
    • XML Pointer

  • WS specifications: WS-* (2)

    Messaging Specifications (1)

    • SOAP (formerly known as Simple Object Access Protocol)
    • SOAP Message Transmission Optimization Mechanism
    • WS-Notification: WS-BaseNotification, WS-Topics, WS-BrokeredNotification
    • WS-SoapOverUDP
    • WS-Addressing
    • WS-Transfer
    • WS-Eventing

  • WS specifications: WS-* (3)

    Messaging Specifications (2)

    • WS-Enumeration
    • WS-MakeConnection

  • WS specifications: WS-* (4)

    Metadata Exchange Specifications (1)

    • WS-Policy
    • WS-PolicyAssertions
    • WS-PolicyAttachment
    • WS-Discovery: WS-Inspection
    • WS-MetadataExchange
    • Universal Description, Discovery, and Integration (UDDI)

  • WS specifications: WS-* (5)

    Metadata Exchange Specifications (2)

    • WSDL 2.0 Core
    • WSDL 2.0 SOAP Binding: Web Services Semantics (WSDL-S)
    • WS-Resource Framework (WSRF)

  • WS specifications: WS-* (6)

    Security Specifications (1)

    • WS-Security
    • XML Signature
    • XML Encryption
    • XML Key Management (XKMS)
    • WS-SecureConversation
    • WS-SecurityPolicy
    • WS-Trust

  • WS specifications: WS-* (7)

    Security Specifications (2)

    • WS-Federation
    • WS-Federation Active Requestor Profile
    • WS-Federation Passive Requestor Profile
    • Web Services Security Kerberos Binding
    • Web Single Sign-On Interoperability Profile
    • Web Single Sign-On Metadata Exchange Protocol
    • Security Assertion Markup Language (SAML)
    • XACML

  • WS specifications: WS-* (8)

    Privacy

    • P3P

    Reliable Messaging Specifications

    • WS-ReliableMessaging
    • WS-Reliability
    • WS-RM Policy Assertion

  • WS specifications: WS-* (9)

    Resource Specifications

    • Web Services Resource Framework
    • WS-BaseFaults
    • WS-ServiceGroup
    • WS-ResourceProperties
    • WS-ResourceLifetime
    • WS-Transfer
    • Resource Representation SOAP Header Block

  • WS specifications: WS-* (10)

    Web Services Interoperability organization (WS-I) specifications, which provide additional information to improve interoperability between vendor implementations

    • WS-I Basic Profile
    • WS-I Basic Security Profile
    • Simple SOAP Binding Profile

  • WS specifications: WS-* (11)

    Business Process Specifications

    • WS-BPEL
    • WS-CDL
    • Web Services Choreography Interface
    • WS-Choreography
    • XML Process Definition Language

  • WS specifications: WS-* (12)

    Transaction Specifications

    • WS-BusinessActivity
    • WS-AtomicTransaction
    • WS-Coordination
    • WS-CAF
    • WS-Transaction
    • WS-Context
    • WS-CF
    • WS-TXM

  • WS specifications: WS-* (13)

    Management Specifications

    • WS-Management
    • WS-Management Catalog
    • WS-ResourceTransfer
    • WSDM

    Presentation Oriented Specification

    • Web Services for Remote Portlets

  • Conclusions

    • Active Warehousing can be achieved, more or less
    • Performance issues can be resolved with Massively Parallel Processing using a Shared Nothing Architecture
    • A DW based on WS and built with SOA gains flexibility and opens up a new level of functionality
    • Further research in the area of WSODW is necessary (e.g. data quality assurance)

  • Thank you for your attention, please do not hesitate to ask or comment
