Web Service Oriented Data Warehouses
Mariusz Pajer
National Institute of Telecommunications
m.pajer(at)itl.waw.pl
INESC Coimbra - DSTIS 2009, 4-7 September 2009
Agenda
  Data flow in Legacy Data Warehouse Architecture
  DW challenges
  MPP with SNA as a solution for performance
  SOA and WS as a key for flexibility
  Web Service Oriented DW characteristics
  Data flow in WSODW Architecture
  WSODW implementation
  BPMN and WS-BPEL
  WS specifications
Data flow in legacy DW
[Figure: Data Sources (DB Data Source 1..n, File Sources, OLTP Sources) -> ETL Processes (data extraction, data transformation and data load modules) -> Data Warehouse Structures (Operational DW Storage(s), Global DW Storage, Data Mart(s), connected by TL modules) -> DW User Interfaces (OLAP Application 1..n)]
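The flow in the figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`customer`, `amount`), the function names and the sample rows are hypothetical and do not come from the slides.

```python
Row = dict  # one source record; the field names used below are hypothetical


def extract(sources: list[list[Row]]) -> list[Row]:
    """Data extraction module: pull rows from every data source."""
    return [row for source in sources for row in source]


def transform(rows: list[Row]) -> list[Row]:
    """Data transformation module: normalise names, drop incomplete rows."""
    return [
        {"customer": r["customer"].strip().title(), "amount": float(r["amount"])}
        for r in rows
        if r.get("customer") and r.get("amount") is not None
    ]


def load(rows: list[Row], storage: list[Row]) -> None:
    """Data load module: append cleaned rows to the global DW storage."""
    storage.extend(rows)


def build_data_mart(storage: list[Row]) -> dict[str, float]:
    """TL module: aggregate the global storage into a data mart."""
    mart: dict[str, float] = {}
    for r in storage:
        mart[r["customer"]] = mart.get(r["customer"], 0.0) + r["amount"]
    return mart


db_source = [{"customer": " alice ", "amount": "10.5"}]
file_source = [{"customer": "BOB", "amount": 4}, {"customer": "", "amount": 1}]

global_dw: list[Row] = []
load(transform(extract([db_source, file_source])), global_dw)
data_mart = build_data_mart(global_dw)
```

Every stage here is hard-wired to the next one, which is the rigidity that the later slides address.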
DW challenges
  Increased amount of source data from heterogeneous Data Sources (DS) that change ad hoc
  New DS types (e.g. streams as data sources, sensor networks)
  Demand for always up-to-date data marts
  Ad-hoc and customisable analyses
Active Warehousing, Performance, Flexibility
Massively Parallel Processing as a key for performance (1)
  A single supercomputer with many networked processing units, each with its own memory and storage and each working separately
  Multiple parallel data feeds and multiple concurrent query streams for unstructured data
  Growing importance of virtual DW based on blade solutions (virtual processors, memory, storage, networks, software)
  Example: Kognitio WX2
Massively Parallel Processing as a key for performance (2)
  Shared Nothing Architecture: a virtual grid of separate, independent and self-sufficient nodes, so there is no single point of contention across the system
  Data storage and data processing requests are spread across all nodes or a selected subset of them
  Every node acts as a gateway for the same piece of information
  Examples: Teradata, Greenplum, Hadoop
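A minimal sketch of the shared-nothing idea, assuming hash partitioning on a distribution key. The `Node` class, the `customer` key and the four-node grid are hypothetical; real MPP products use their own distribution schemes.

```python
import hashlib


class Node:
    """One self-sufficient node: private storage, no shared memory or disk."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: list[dict] = []

    def local_sum(self, field: str) -> float:
        # Each node processes only the rows it owns.
        return sum(row[field] for row in self.rows)


def node_for(key: str, nodes: list[Node]) -> Node:
    """Pick the owning node from a stable hash of the distribution key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


nodes = [Node(f"node-{i}") for i in range(4)]
sales = [{"customer": f"c{i}", "amount": float(i)} for i in range(100)]

for row in sales:  # spread the storage across the nodes
    node_for(row["customer"], nodes).rows.append(row)

# Spread the query as well: each node computes a partial result,
# and only the partial results are combined.
total = sum(node.local_sum("amount") for node in nodes)
```

Because the same key always hashes to the same node, no node needs to consult another one to answer a query on its own partition.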
SOA and WS as a key for flexibility
  Reuse, granularity, modularity, composability, componentization and interoperability between ETL modules on both the DS and User Interface sides
  Standards compliance (both common and industry-specific)
  Automated identification and categorization, provisioning and delivery, monitoring and tracking of available ETL modules
Web Service oriented DW (WSODW) based on Service Oriented Architecture (SOA)
WSODW characteristics (1)
  Every ETL module, DS and User Interface (UI) acts as a service
  Service loose coupling: no dependencies between services, but services must be aware of each other
  Service contract: describes the terms of use of, and communication with, a particular service
WSODW characteristics (2)
  Service abstraction and encapsulation: a service hides its internal logic from the outside world; porting ETL modules or DS that were not planned as Web Services (WS) is often possible
  Service autonomy: a service controls the logic it encapsulates
  Service composability: composite services can be formed by coordinating and assembling collected services
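Contract, encapsulation and composability can be illustrated with a small in-process sketch. It assumes a single hypothetical `invoke` operation as the contract; the class names echo ETL roles but are not taken from any real implementation, and real services would be reached over the network.

```python
from typing import Protocol


class EtlService(Protocol):
    """Service contract: the only thing a requester may rely on."""

    def invoke(self, rows: list[dict]) -> list[dict]: ...


class DataCleaner:
    """Encapsulation: callers see invoke(), not the cleaning rule."""

    def invoke(self, rows: list[dict]) -> list[dict]:
        return [r for r in rows if r.get("amount") is not None]


class DataTransformer:
    def invoke(self, rows: list[dict]) -> list[dict]:
        return [{**r, "amount": round(float(r["amount"]), 2)} for r in rows]


class CompositeService:
    """Composability: a service assembled from other services."""

    def __init__(self, *steps: EtlService) -> None:
        self._steps = steps

    def invoke(self, rows: list[dict]) -> list[dict]:
        for step in self._steps:
            rows = step.invoke(rows)
        return rows


pipeline = CompositeService(DataCleaner(), DataTransformer())
result = pipeline.invoke([{"amount": "3.14159"}, {"amount": None}])
```

The composite satisfies the same contract as its parts, so it can itself be used as a step in a larger composition.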
WSODW characteristics (3)
  Service optimization: all else being equal, high-quality services are preferable
  Service discoverability: discovery mechanisms let requesters find and assess well-described services
  Service relevance: the granularity of services has to ensure that their functionality is meaningful
  Every service could act as:
    Service provider
    Service requester
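The provider and requester roles, together with discoverability and the preference for high-quality services, can be sketched with an in-memory registry. Everything here is an assumption made for illustration: the `Registry` class, the `quality` score and the service names are hypothetical, and a real deployment would use a discovery standard such as UDDI.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ServiceDescription:
    name: str
    category: str                       # e.g. "etl", "ds", "ui"
    quality: float                      # hypothetical score, higher is better
    endpoint: Callable[[list], list]    # stand-in for a network endpoint


@dataclass
class Registry:
    _entries: list[ServiceDescription] = field(default_factory=list)

    def publish(self, description: ServiceDescription) -> None:
        """Called by a service provider."""
        self._entries.append(description)

    def discover(self, category: str) -> ServiceDescription:
        """Called by a service requester; prefers the highest-quality match."""
        matches = [e for e in self._entries if e.category == category]
        if not matches:
            raise LookupError(f"no service published for category {category!r}")
        return max(matches, key=lambda e: e.quality)


registry = Registry()
registry.publish(ServiceDescription("BasicCleaner", "etl", 0.6, lambda rows: rows))
registry.publish(
    ServiceDescription("StrictCleaner", "etl", 0.9, lambda rows: [r for r in rows if r])
)

best = registry.discover("etl")
cleaned = best.endpoint([{"id": 1}, {}])
```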
Data flow in Web Service Oriented DW
[Figure: Data Sources Federation (DB Data Source 1..n, File Sources, OLTP Sources); DS & DW ETL Processes (Data Subscriber, Data Cleaner, Data Loader, Data Aggregator, Data Transformer, Data Deriver, Data Federation Mapper, DW Metadata Manager, Schema Manager, ETL Auditing, Snapshot Agent, Cube Manager); Data Warehouse Federation with Data Warehouse Structures (Operational DW Storage(s), Data Mart(s), DW Metadata Storage(s), Global DW Storage); UI Services (DW Objects Discoverer, UI Bridger, Metric/Alert Agent, Security/Session Agent, Metadata Agent); DW User Interfaces Federation (OLAP Application 1..n, OLTP Application i)]
WSODW implementation (1)
Remote procedure call (RPC)
  Well-known idea from client-server distributed computing
  The basic unit of RPC Web services is the WSDL operation
  Criticized for not being loosely coupled (many implementations map services directly to language-specific function or method calls)
  The RPC/encoded style is disallowed in the WS-I (Web Services Interoperability) Basic Profile, which permits only literal bindings
(DEAD END)
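The tight coupling criticized above can be seen in a short sketch. It uses XML-RPC from the Python standard library as a stand-in for WSDL-based RPC (an assumption made for brevity, since the two are different protocols), and the `DataLoader` class is hypothetical.

```python
import xmlrpc.client


class DataLoader:
    """Hypothetical ETL module exposed RPC-style."""

    def load_rows(self, table: str, count: int) -> str:
        return f"loaded {count} rows into {table}"


# Requester side: marshal a call to a named method with positional arguments.
request = xmlrpc.client.dumps(("sales", 3), methodname="load_rows")

# Provider side: unmarshal and dispatch straight to the method of that name.
params, method_name = xmlrpc.client.loads(request)
result = getattr(DataLoader(), method_name)(*params)
```

The message carries the method name and argument order of the implementation, so renaming `load_rows` or reordering its parameters breaks every requester.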
WSODW implementation (2)
Simple Object Access Protocol (SOAP)
  Protocol specification for exchanging structured information in the implementation of Web Services in computer networks
  Relies on Extensible Markup Language (XML) for its message format
  Relies on other Application Layer protocols (e.g. RPC, HTTP, SMTP) for message negotiation and transmission
  The operations offered by a service are described in machine-readable form in the Web Services Description Language (WSDL)
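A minimal sketch of the XML message format, assuming SOAP 1.2 and the Python standard library only. The service namespace `http://example.com/wsodw`, the `LoadRows` operation and its parameters are hypothetical; a real service would take them from its WSDL description.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
DW_NS = "http://example.com/wsodw"                   # hypothetical service namespace


def build_request(operation: str, **params: str) -> bytes:
    """Wrap an operation and its parameters in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{DW_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{DW_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8")


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.split("}", 1)[1]


def read_request(message: bytes) -> tuple[str, dict[str, str]]:
    """Recover the operation name and parameters from a SOAP message."""
    body = ET.fromstring(message).find(f"{{{SOAP_NS}}}Body")
    op = body[0]
    return local_name(op.tag), {local_name(c.tag): c.text for c in op}


message = build_request("LoadRows", table="sales", source="oltp-1")
operation, arguments = read_request(message)
```

The same bytes could be carried over HTTP or SMTP unchanged, since the envelope is independent of the transport.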
WSODW implementation (3)
Representational State Transfer (REST)
  Client-server approach to web service implementation that uses HTTP or similar protocols by constraining the interface to a set of well-known, standard operations (like GET, POST, PUT, DELETE for HTTP)
  Focuses on interacting with stateful resources, rather than messages or operations
  An architecture based on REST:
    can use WSDL to describe SOAP messaging over HTTP, which defines the operations
    can be created without using SOAP at all
WSODW implementation (4)
RESTful web service
  Implemented using HTTP and the principles of REST
  Can be thought of as a collection of resources
  The definition of such a web service comprises three aspects:
    The base URI for the web service, such as http://example.com/resources/
    The MIME type of the data supported by the web service. This is often JSON, XML or YAML but can be any other valid MIME type
    The set of operations supported by the web service using HTTP methods (e.g. POST, GET, PUT or DELETE)
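The three aspects can be sketched as a small dispatcher over an in-memory resource collection. This is an illustration only: no HTTP server is involved, JSON is assumed as the MIME type, POST is left out for brevity, and the `handle` function and the `mart-1` resource are hypothetical.

```python
import json

BASE_PATH = "/resources/"              # path part of http://example.com/resources/
MIME_TYPE = "application/json"         # assumed representation format
OPERATIONS = ("GET", "PUT", "DELETE")  # supported HTTP methods in this sketch
_store: dict[str, dict] = {}           # in-memory stand-in for the resources


def handle(method: str, path: str, body: str = "") -> tuple[int, str]:
    """Map an HTTP method on a resource URI to a status code and JSON body."""
    if not path.startswith(BASE_PATH):
        return 404, ""
    if method not in OPERATIONS:
        return 405, ""
    key = path[len(BASE_PATH):]
    if method == "GET":
        if key == "":
            return 200, json.dumps(sorted(_store))  # list the collection
        return (200, json.dumps(_store[key])) if key in _store else (404, "")
    if key == "":
        return 405, ""  # PUT and DELETE need a concrete resource
    if method == "PUT":
        created = key not in _store
        _store[key] = json.loads(body)
        return (201 if created else 200), ""
    if key not in _store:  # method is DELETE from here on
        return 404, ""
    del _store[key]
    return 204, ""


put_status, _ = handle("PUT", "/resources/mart-1", '{"rows": 10}')
get_status, get_body = handle("GET", "/resources/mart-1")
```

The interface stays fixed at a handful of methods; everything specific to the warehouse lives in the resource names and their representations.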
Business Process Modeling Notation (BPMN)
  Enables business users to develop readily understandable graphical representations of business processes
  Four basic categories of elements:
    Flow Objects: Events, Activities, Gateways
    Connecting Objects: Sequence Flow, Message Flow, Association
    Swimlanes: Pool, Lane
    Artifacts: Data Object, Group, Annotation
WS-BPEL (1)
Web Services Business Process Execution Language (WS-BPEL)
  OASIS (Organization for the Advancement of Structured Information Standards) standard executable language for specifying interactions with Web Services
  Processes defined with WS-BPEL export and import information by using Web Service interfaces exclusively
WS-BPEL (2)
WS-BPEL is an orchestration language for WS
  It specifies an executable process that involves message exchanges with other systems, such that the message exchange sequences are controlled by the orchestration designer
  Orchestration refers to the central control (the conductor) of the behavior of a distributed system (the orchestra, consisting of many players)
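Orchestration as central control can be sketched in plain Python. This is not WS-BPEL syntax; the `Orchestrator` class and the three service functions are hypothetical stand-ins for Web Service operations invoked by a process engine.

```python
from typing import Callable

Service = Callable[[dict], dict]  # stand-in for a Web Service operation


def subscribe(msg: dict) -> dict:
    return {**msg, "rows": [1, None, 3]}


def clean(msg: dict) -> dict:
    return {**msg, "rows": [r for r in msg["rows"] if r is not None]}


def load(msg: dict) -> dict:
    return {**msg, "loaded": len(msg["rows"])}


class Orchestrator:
    """The conductor: it alone decides the message exchange sequence."""

    def __init__(self, sequence: list[Service]) -> None:
        self._sequence = sequence
        self.trace: list[str] = []

    def run(self, msg: dict) -> dict:
        for service in self._sequence:
            self.trace.append(service.__name__)
            msg = service(msg)  # services never call each other directly
        return msg


process = Orchestrator([subscribe, clean, load])
outcome = process.run({"source": "oltp-1"})
```

Changing the process means changing only the sequence handed to the orchestrator; the services themselves stay untouched.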
WS-BPEL (3)
  WS-BPEL can adopt WS as its external communication mechanism in a manner suitable for business processes
  It focuses on modern business processes and builds on the history of WSFL (Web Services Flow Language) and XLANG
BPMN and WS-BPEL (1)
  The OASIS technical committee decided that the definition of a standard graphical notation for WS-BPEL is out of its scope
  Some vendors proposed a direct visual representation of BPEL process descriptions in the form of structograms, in a style reminiscent of a Nassi-Shneiderman diagram
BPMN and WS-BPEL (2)
  Others propose using BPMN as a graphical front-end to capture BPEL process descriptions
  A detailed mapping of BPMN to BPEL has been implemented in a number of tools (e.g. BPMN2BPEL)
WS specifications: WS-* (1)
XML Specifications
  XML (eXtensible Markup Language)
  XML Namespaces
  XML Schema
  XPath
  XQuery
  XML Information Set
  XInclude
  XML Pointer
WS specifications: WS-* (2)
Messaging Specifications (1)
  SOAP (formerly known as Simple Object Access Protocol)
  SOAP Message Transmission Optimization Mechanism
  WS-Notification: WS-BaseNotification, WS-Topics, WS-BrokeredNotification
  WS-SoapOverUDP
  WS-Addressing
  WS-Transfer
  WS-Eventing
WS specifications: WS-* (3)
Messaging Specifications (2)
  WS-Enumeration
  WS-MakeConnection
WS specifications: WS-* (4)
Metadata Exchange Specifications (1)
  WS-Policy
  WS-PolicyAssertions
  WS-PolicyAttachment
  WS-Discovery: WS-Inspection
  WS-MetadataExchange
  Universal Description, Discovery, and Integration (UDDI)
WS specifications: WS-* (5)
Metadata Exchange Specifications (2)
  WSDL 2.0 Core
  WSDL 2.0 SOAP Binding: Web Services Semantics (WSDL-S)
  WS-Resource Framework (WSRF)
WS specifications: WS-* (6)
Security Specifications (1)
  WS-Security
  XML Signature
  XML Encryption
  XML Key Management (XKMS)
  WS-SecureConversation
  WS-SecurityPolicy
  WS-Trust
WS specifications: WS-* (7)
Security Specifications (2)
  WS-Federation
  WS-Federation Active Requestor Profile
  WS-Federation Passive Requestor Profile
  Web Services Security Kerberos Binding
  Web Single Sign-On Interoperability Profile
  Web Single Sign-On Metadata Exchange Protocol
  Security Assertion Markup Language (SAML)
  XACML
WS specifications: WS-* (8)
Privacy
  P3P
Reliable Messaging Specifications
  WS-ReliableMessaging
  WS-Reliability
  WS-RM Policy Assertion
WS specifications: WS-* (9)
Resource Specifications
  Web Services Resource Framework
  WS-BaseFaults
  WS-ServiceGroup
  WS-ResourceProperties
  WS-ResourceLifetime
  WS-Transfer
  Resource Representation SOAP Header Block
WS specifications: WS-* (10)
Web Services Interoperability Organization (WS-I) specifications, which provide additional information to improve interoperability between vendor implementations
  WS-I Basic Profile
  WS-I Basic Security Profile
  Simple SOAP Binding Profile
WS specifications: WS-* (11)
Business Process Specifications
  WS-BPEL
  WS-CDL
  Web Services Choreography Interface
  WS-Choreography
  XML Process Definition Language
WS specifications: WS-* (12)
Transaction Specifications
  WS-BusinessActivity
  WS-AtomicTransaction
  WS-Coordination
  WS-CAF
  WS-Transaction
  WS-Context
  WS-CF
  WS-TXM
WS specifications: WS-* (13)
Management Specifications
  WS-Management
  WS-Management Catalog
  WS-ResourceTransfer
  WSDM
Presentation-Oriented Specification
  Web Services for Remote Portlets
Conclusions
  Active Warehousing can be achieved, at least in part
  Performance issues can be resolved with Massively Parallel Processing using a Shared Nothing Architecture
  A DW based on WS and built with SOA gains flexibility and opens up a new level of functionality
  Further research in the area of WSODW is necessary (e.g. data quality assurance)
Thank you for your attention. Please do not hesitate to ask questions or comment.