54
Data Platform Vision Vu Tuyet Trinh [email protected] Hanoi University of Technology

Session14_DataPlatform

Embed Size (px)

Citation preview

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 1/54

Data Platform Vision

Vu Tuyet [email protected]

Hanoi University of Technology

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 2/54

Microsoft

MS. SQL Server 2008

Outline

Overview of Microsoft Data Platform Vision

 Analysis Services

Reporting Services Integration Service

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 3/54

Microsoft

MS. SQL Server 2008

DATADATA

PLATFORMPLATFORM

VISIONVISION

Types of Types of 

DataData

Services toServices to

interactinteract

XML

e-mail

time/calendar 

file, document

geospatial

search

query

data analysisdata analysis

reportingreporting

data integrationdata integration

robust

synchronization

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 4/54

Microsoft

MS. SQL Server 2008

Microsoft Data Platform Vision

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 5/54

Microsoft

MS. SQL Server 2008

Improved Productivity

ADO.NET Entity Framework:

Provides a data programming interface that makes it:

Easy to understand the conceptual data model Easy to design and develop applications

Easy to maintain applications

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 6/54

Microsoft

MS. SQL Server 2008

Improved Productivity

LINQ

LINQ to SQL .

LINQ to Entities .

LINQ to DataSet .

LINQ to XML .

LINQ to Object .

Visual Studio

Providing features such as source code control, tracking, and

deployment tools .

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 7/54

MicrosoftMS. SQL Server 2008

Improved Productivity

Microsoft Office 2007

See Part 2-Analysis Services.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 8/54

MicrosoftMS. SQL Server 2008

SERVICES

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 9/54

MicrosoftMS. SQL Server 2008

SERVICES

servicesservices

AnalysisAnalysis

ServicesServices

ReportingReporting

ServicesServices

IntergrationIntergrationServicesServices

Data PlatformData Platform

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 10/54

MicrosoftMS. SQL Server 2008

1. Analysis Services

Microsoft SQL Server 2008 Analysis Services builds on a strong

foundation of analytical tools to provide a truly enterprise scale

solution .

 Analysis Services provides optimized Office interoperability to

provide a familiar interface and an open, embeddable architecture to

allow developers to integrate the data.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 11/54

MicrosoftMS. SQL Server 2008

Analysis Services Build EnterpriseBuild Enterprise--

Scale SolutionsScale Solutions

AnalysisAnalysis

ServicesServices

 Extend Reach

withComprehensive

Analytics

Drive Actionable

Insight through

Familiar Tools

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 12/54

MicrosoftMS. SQL Server 2008

Build Enter prise-Scale Solutions

Microsoft SQL Server 2008 Analysis Services is designed to provide

exceptional performance and scales to support applications with millions of 

records and thousands of users. Innovative, consolidated tools help improve

developer productivity and result in better design and faster implementation.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 13/54

MicrosoftMS. SQL Server 2008

Build Enter prise-Scale Solutions

 BuildBuild

EnterpriseEnterprise--

ScaleScale

SolutionsSolutions

High

Developer 

Productivity

Scalable

Infrastructure

Superior 

Performance

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 14/54

MicrosoftMS. SQL Server 2008

High Developer Productivity

SQL Server 2008 Analysis Services introduces a set of new,

innovative Best Practice Design Alerts that provide automatic

notification of potential design issues early in the development

process, which reduces wasted time caused by design mistakes and

facilitates a faster development process.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 15/54

MicrosoftMS. SQL Server 2008

High Developer Productivity

Figure 1 shows an alert on the Time dimension and Calendar hierarchy.

 Figure 1 Figure 1

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 16/54

MicrosoftMS. SQL Server 2008

High Developer Productivity Figure 2 shows the current alerts on a design.

 Figure 2 Figure 2

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 17/54

MicrosoftMS. SQL Server 2008

High Developer Productivity SQL Server 2008 Analysis Services further increases developer productivity

with new, enhanced cube, dimension, and attribute designers.

 Figure 3

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 18/54

MicrosoftMS. SQL Server 2008

Scalable Infrastructure

 Analysis Services can scale to support databases of many terabytes

in size with many thousands of users.

SQL Server 2008 Analysis Services provides Dynamic ManagementViews similar to those available to the database engine.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 19/54

MicrosoftMS. SQL Server 2008

Superior Performance

 Analysis Services cubes are multidimensional structures and stores

bussiness data in a highly optimized and compressed format called

MultidimensionalOLAP (MOLAP).

 AS can improve query performance by orders of magnitude and

therefore allow a finer granularity of analysis.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 20/54

MicrosoftMS. SQL Server 2008

Superior Performance SQL Server provides attribute-based hierarchies that avoid the need

for any duplication and improve performance and scalability.

SQL Server 2008 Analysis Services allows writeback data to be

stored in MOLAP format resulting in significantly better performance

for query and writeback operations.

 AS prevent users from overloading the relational database by

providing a high performance, transparent, synchronized aggregate

cache.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 21/54

MicrosoftMS. SQL Server 2008

Extend Solutions with Comprehensive Analytics

 Analysis Services takes the analytical platform to a new level offering more

advanced features than those traditionally related to OLAP. This enables

organizations to accommodate multiple analytical needs within one solutionoffering so much more than a traditional OLAP platform. In this effort, the

Unified Dimensional Model  ( U DM) plays a central role, providing extensive

analytical capabilities.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 22/54

MicrosoftMS. SQL Server 2008

 Extend Reach

with

Comprehensive

Analytics

UnifiedDimensional

Model

Predictive

Analysis

Central

Manageability of 

Key Enterprise

Metrics

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 23/54

MicrosoftMS. SQL Server 2008

Unified Dimensional Model

The UDM was a new concept for Analysis Services that was introduced with

the release of SQL Server 2005. The UDM provides an intermediate logical

layer between the physical relational database used as the data source and

the proprietary cube and dimension structures that are used to resolve user 

queries.

In this way, you can think of the UDM as the centerpiece of the OLAP

solution.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 24/54

MicrosoftMS. SQL Server 2008

Central Manageability of Key Enterprise Metrics

In SQL Server 2008 Analysis Services enterprise wide Key Performance

Indicators (KPI¶s) can be centrally stored and managed.

This provides a central repository for users to access key enterprise metricsthrough a variety of applications including Microsoft Office

PerformancePoint Server 2007, Microsoft Office Excel 2007, Microsoft

Office SharePoint Services 2007, and Microsoft SQL Server Reporting

Services.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 25/54

MicrosoftMS. SQL Server 2008

Predictive Analysis

Traditional data analysis looks at historical data and quickly returns

results based on this data.

However, many questions asked by business users cannot be

answered by this sort of analysis as they are not looking for the

results of what has happened, but instead they are looking for 

predictions of what might happen.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 26/54

MicrosoftMS. SQL Server 2008

Predictive AnalysisMicrosoft SQL Server Data Mining Add-Ins for Office 2007:

The Data Mining Add-Ins for Office 2007 empowers end users to performadvanced analysis directly in Microsoft Excel and Microsoft Visio.

There are three individual components:

Data Mining Client for Excel enables you to create and manage an entire Analysis Services data mining project from within Excel 2007.

Table Analysis Tools for Excel enables you to use the powerful AnalysisServices data mining capabilities to analyze data stored in Excel spreadsheets.

Data Mining Templates for Visio enables you to render decision trees,regression trees, cluster diagrams, and dependency nets in Visio diagrams.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 27/54

MicrosoftMS. SQL Server 2008

Drive Actionable Insight through Familiar Tools

Drive

Actionable

Insight

through

Familiar Tools

Optimized

Office

Interoperability

Rich Partner 

Extensibility

Open

Embeddable

Architecture

MSOffice Excel

MS Office Word

MS Office Visio

MS Office Share

Point

MS Office

Performance Point

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 28/54

MicrosoftMS. SQL Server 2008

2. Reporting Service Microsoft SQL Server 2008 Reporting Services provides a complete server-

based platform that is designed to support a wide variety of reporting needs

including managed enterprise reporting, ad-hoc reporting, embedded

reporting, and web based reporting to enable organizations to deliver 

relevant information where needed across the entire enterprise.

Reporting Services 2008 provides the tools and features necessary toauthor a variety of richly formatted reports from a wide range of data

sources and provides a comprehensive set of familiar tools used to manage

and secure an enterprise reporting solution.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 29/54

MicrosoftMS. SQL Server 2008

Reporting

Services

t oring Report anagingReporting Services elivering Reports

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 30/54

MicrosoftMS. SQL Server 2008

Authoring Report

 AuthoringAuthoring

ReportReportAccessing Data

Sources for 

Report Creation

Using ReportDevelopment

Tools

Creating

Compelling

Reports

Tablix

Charts

Interactive

Features

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 31/54

MicrosoftMS. SQL Server 2008

Managing Reporting Services

Managing

Reporting Services

Extending Management

Capabilities

Extending Management

Capabilities

Configuring a ReportingServices InstanceConfiguring a ReportingServices Instance

MS Office SharePoint

Services Integration

MS Office SharePoint

Services Integration

Securing Reporting

Services

Securing Reporting

Services

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 32/54

MicrosoftMS. SQL Server 2008

Delivering Reports

Delivering Reports

High PerformanceReport ProcessingHigh PerformanceReport Processing

CachingCaching

SnapshotsSnapshots

Multiple File

Formats

Multiple File

Formats

Delivering Reports

through Subscriptions

Delivering Reports

through Subscriptions

Embedding Reports into

Business Applications

Embedding Reports into

Business Applications

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 33/54

MicrosoftMS. SQL Server 2008

3. Intergration Services

SQL Server 2008 Integration Services (SSIS) helps Information

Technology departments to meet data integration requirements in

their enterprises.

SQL Server 2008 Integration Services meets the challenges of 

cleansing, transforming, and mapping multiple data sources withlarge volumes into a useful format.

New features improve its ability to scale up and improve

performance while speeding development and lowering the TCO.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 34/54

MicrosoftMS. SQL Server 2008

3. Intergration Services

IntergrationIntergrationServicesServices

 Technology

Challenges

Organizational Challenges

EconomicChallengesEconomicChallenges

SSIS

Architecture

Integration

Scenarios

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 35/54

MicrosoftMS. SQL Server 2008

 Technology

Challenges

As illustrated in Figure , with

increased staging the time taken

to ³close the loop,´ (i.e., to

analyze, and take action on newdata) increases as well. These

traditional ELT architectures (as

opposed to value-added ETL

 processes that occur prior to

loading) impose severe

restrictions on the ability of 

systems to respond to emerging

 business needs.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 36/54

MicrosoftMS. SQL Server 2008

SSIS

Architecture

Task flow and data

flow engine

Pipeline architecture

ADO.NET connectivity

Thread pooling

Persistent lookups

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 37/54

MicrosoftMS. SQL Server 2008

IntegrationScenarios

SSIS for data transfer 

operations

SSIS for data transfer 

operations

SSIS for data warehouse

loading

SSIS for data warehouse

loading

SSIS and Data QualitySSIS and Data Quality

Application of SSIS

Beyond Traditional ETL

Application of SSIS

Beyond Traditional ETL

SSIS, the Integration

Platform

SSIS, the Integration

Platform

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 38/54

MicrosoftMS. SQL Server 2008

SSIS for data transfer operations

SQL Server 2008 Integration Services has an improved wizard that

uses ADO.NET, has an improved user interface,

performs automaticdata type conversions,

and is more scalable

than previous versions.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 39/54

MicrosoftMS. SQL Server 2008

SSIS for data warehouse loading

SQL Server 2008 includes support for Change Data Capture (CDC).

SSIS can consume data from (and load data into) a variety of 

sources including managed (ADO.NET),OLE DB, ODBC, flat file,

MicrosoftOffice Excel®, and XML by using a specialized set of 

components called adapters.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 40/54

MicrosoftMS. SQL Server 2008

Figure 3 shows anexample of such a flow.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 41/54

MicrosoftMS. SQL Server 2008

 Figure 4

shows a page from

the SCD

Wizard.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 42/54

MicrosoftMS. SQL Server 2008

Figure 5 shows

the data flow that

is generated by

this Wizard

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 43/54

MicrosoftMS. SQL Server 2008

SSIS and Data Quality

One of the key features of SSIS, as well as its ability to integrate data, is its abilityto integrate different technologies to manipulate the data. This has allowed SSIS

to include innovative ³fuzzy logic´±based data cleansing components.

SSIS deeply integrates with the data mining functionality in Analysis Services.

Data mining abstracts out the patterns in a dataset and encapsulates them in a

mining model.

Support for complex data routing in SSIS helps you to not only identify

anomalous data, but also to automatically correct it and replace it with better 

values. This enables ³closed loop´ cleansing scenarios

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 44/54

MicrosoftMS. SQL Server 2008

Figure 6 shows

an example of a

closed loop

cleansing data

flow.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 45/54

MicrosoftMS. SQL Server 2008

Application of SSIS Beyond Traditional ETL 

Application of SSIS

Beyond Traditional ETL

Application of SSIS

Beyond Traditional ETL

Service Oriented

Architecture

Data and text

mining

On-demand data

source

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 46/54

MicrosoftMS. SQL Server 2008

On-demand d ata source

Figure 7

shows a

SSIS

 package that

sources data

from RSS

feeds over 

the Internet,

integrateswith data

from a Web

service

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 47/54

MicrosoftMS. SQL Server 2008

Figure 8 shows

the use of the

SSIS package as

a data source in

the Report

Wizard.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 48/54

MicrosoftMS. SQL Server 2008

SSIS, the Integration Platfor m 

SSIS goes beyond being an ETL tool not only in terms of enabling

nontraditional scenarios, but also in being a true platform for data

integration. SSIS is part of the SQL Server Business Intelligence (BI)

platform that enables the development of end-to-end BI applications.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 49/54

MicrosoftMS. SQL Server 2008

SSIS, the Integration

Platform

SSIS, the Integration

Platform

Integrated

development platform

Integrated

development platform

ProgrammabilityProgrammability

ScriptingScripting

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 50/54

MicrosoftMS. SQL Server 2008

I ntegrated development platform

SQL Server Integration Services, Analysis Services, and Reporting

Services all use a common Microsoft Visual Studio® based

development environment called the SQL Server Business

Intelligence (BI) Development Studio. BI Development Studio

provides an integrated development environment (IDE) for BI

application development.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 51/54

MicrosoftMS. SQL Server 2008

I ntegrated development platform

Figure 9

shows a BI

Development

Studio

solution that

consists of 

Integration,

Analysis, andReporting

 projects.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 52/54

MicrosoftMS. SQL Server 2008

I ntegrated development platform

Figure 10

shows an

example of 

geographic

data

visualized

using a

scatter plotand a text

grid.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 53/54

MicrosoftMS. SQL Server 2008

Figure 11 shows

an example of ascript that checks

for the existence

of an Office Excel

file.

8/7/2019 Session14_DataPlatform

http://slidepdf.com/reader/full/session14dataplatform 54/54

MicrosoftMS SQL Server 2008

Making Data Integration Approachable

Figure 12, SSIS

eliminates (or at

least minimizes)

unnecessary

staging