Data Quality Management in SAP NetWeaver BI

Rudolf Hennecke

SAP NetWeaver RIG, SAP AG

Data Quality Management in SAP NetWeaver BI - Webinar Powerpoint

© SAP AG 2006, Data Quality Management in SAP NetWeaver BI / 2

First-Hand Information

In this presentation, we will examine the following topics:

- Definition of Data Quality Management with respect to data warehousing in SAP NetWeaver BI
- Overview of the tools SAP NetWeaver BI provides to support data quality initiatives
- Presentation of new features in the area of Data Quality Management in SAP NetWeaver 2004s

Agenda

Overview Data Quality Management

Data Validation

Error Handling

Error Resolution

Definition of Data Quality

- Integrity: between objects / systems
- Timeliness: is the data up to date?
- Relevance: comprehensibility, meaningfulness
- Completeness: between objects / systems, on record level
- Consistency: plausibility, redundancy
- Accuracy: uniformity, uniqueness, certainty, correctness

Data quality criteria:
- have to be defined by business and IT
- have to be regarded on the technical and the semantic level

Data Quality Process in SAP NetWeaver BI

Data Validation
- Define data quality and define the necessary checks
- Use the automated checks in SAP NetWeaver BI and implement customer checks where necessary
- Consider data quality in your Enterprise Data Warehouse design

Error Resolution
- Resolve data quality issues after data loading
- This can involve deleting, repairing, and reloading data
- This should also include periodic analysis of the data in your BI objects, including repair options

Error Handling
- Define the reaction to errors during data loading
- Retain invalid data for manual or automated correction and subsequent updating to InfoProviders
- Get detailed information on the type of error and the place where it occurred

Data Validation - Scope

Data validation in SAP NetWeaver BI answers the following questions:

What to check?
- Technical quality
- Semantic quality (business rules)

Where and when to check?
- During data loading: in the source system, in data transfer, in transformation, in InfoProvider update
- On persistent data: on PSA data, on InfoProvider data

How to check?
- Built into data transfer (automatic checks)
- Implemented in transformation (business rule based)
- Scheduled

Data Validation – Checks in SAP NetWeaver BI

Checks are classified as technical or semantic, and as built-in or to be implemented.

Field level
- Technical: data type & conversion exits; codepage
- Semantic: plausibility (empty field, plausible value, ...)

Record level
- Semantic: plausibility of the correlation between characteristics and key figures

Table level
- Technical: records sent = records updated (no aggregation allowed); duplicate records in PSA; duplicate master data records
- Semantic: plausibility of aggregation and calculation across multiple data records

Further checks listed on the slide
- Technical consistency with respect to BI technology (RSRV)
- Technical consistency of data contained in BI objects (RSRV)
- Reconciliation
- Referential integrity & master data check (SID)

Data Validation – Built-In Checks

[Figure: built-in checks along the data flow in SAP NetWeaver BI, from the source systems (BI Service API, others) through DataSource, Transformation, InfoSource, and Transformation (SAP BW 3.x: Transfer Rules, InfoSource, Update Rules) to the InfoProvider. Checks shown: codepages, completeness, data type & conversion exits, duplicates in transaction data, referential integrity, duplicates on master data. The figure distinguishes technical checks from semantic checks (based on business rules) and marks some checks as "Enhanced!" or "New!" in SAP NetWeaver 2004s.]

Focus area: Checking Data Types and Conversion Exits

In SAP NetWeaver 2004s, checks on Data Type and Conversion Exits can be enabled per field of the DataSource.

The checks cover:
- plausible date fields
- plausible time fields
- character values in data type NUMC fields
- compliance with the ALPHA conversion routine
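To make the four field-level checks concrete, here is a minimal Python sketch. It is not SAP code: the function names are hypothetical, dates are assumed to be in YYYYMMDD form and times in HHMMSS form, and the ALPHA rule is simplified to "purely numeric values must be padded with leading zeros to the full field length".

```python
from datetime import datetime


def is_plausible_date(value: str) -> bool:
    """True if value is a real calendar date in YYYYMMDD form (assumed format)."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
        return True
    except ValueError:
        return False


def is_plausible_time(value: str) -> bool:
    """True if value is a valid time of day in HHMMSS form (assumed format)."""
    if len(value) != 6 or not (value.isascii() and value.isdigit()):
        return False
    hours, minutes, seconds = int(value[0:2]), int(value[2:4]), int(value[4:6])
    return hours < 24 and minutes < 60 and seconds < 60


def is_valid_numc(value: str, length: int) -> bool:
    """A NUMC-like field holds exactly `length` digits and nothing else."""
    return len(value) == length and value.isascii() and value.isdigit()


def is_alpha_compliant(value: str, length: int) -> bool:
    """Simplified ALPHA rule: numeric values carry leading zeros up to `length`."""
    if len(value) > length:
        return False
    stripped = value.strip()
    if stripped.isascii() and stripped.isdigit():
        return value == stripped.zfill(length)
    return True  # non-numeric values are not padded in this simplified rule
```

For example, `is_alpha_compliant("123", 10)` is false because the value would have to arrive as `0000000123`.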

Data Validation – Checks to be implemented

[Figure: checks to be implemented along the data flow in SAP NetWeaver BI, from the source systems (BI Service API, others) through DataSource and PSA, Transformation, InfoSource, and Transformation (SAP BW 3.x: Transfer Rules, InfoSource, Update Rules) into the Data Warehouse (Data Acquisition Layer), the Data Warehouse (Integration Layer), Operational Reporting, and Data Marts. Checks shown: custom check points, data integrity in PSA, customer-defined checks, reconciliation & audit trace. The figure distinguishes technical checks from semantic checks (based on business rules).]

Focus area: Implement semantic checks I

Implement semantic checks in customer transformations (SAP BW 3.x: in the update or transfer rules). These checks can call the monitor and therefore the Error Handling.

- From update rules (BW 3.x)
  - Append to table MONITOR
  - See “How To… Create monitor entries from an update routine”
- From transfer rules (BW 3.x)
  - Append to table G_T_ERRORLOG
  - Process a single record using field RECORD
- From transformations
  - Append to table MONITOR (for monitoring only) and / or raise an exception (for storage in the error stack)
  - Process a single record using field RECORD
  - To skip a record, raise exception cx_rsrout_skip_record. To abort the whole data package, raise exception cx_rsrout_abort.
- Check on master data completeness
  - Master data completeness (on attributes and / or texts) can be checked using Transformations / DTPs (SAP BW 3.x: Export DataSources)
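The skip-versus-abort behaviour of a transformation routine can be illustrated with a small Python sketch. This is an analogy, not ABAP: `SkipRecord` and `AbortPackage` are hypothetical stand-ins for cx_rsrout_skip_record and cx_rsrout_abort, and the two business rules (non-negative quantity, currency present) are invented for illustration.

```python
class SkipRecord(Exception):
    """Stand-in for cx_rsrout_skip_record: drop this record, keep going."""


class AbortPackage(Exception):
    """Stand-in for cx_rsrout_abort: stop processing the whole data package."""


def check_record(record: dict) -> None:
    """Hypothetical semantic checks on a single record."""
    if not record.get("currency"):
        raise AbortPackage("currency missing")
    if record["quantity"] < 0:
        raise SkipRecord("negative quantity")


def run_package(records: list) -> tuple:
    """Apply the checks to one package; collect monitor entries and invalid records."""
    updated, monitor, error_stack = [], [], []
    for recno, record in enumerate(records, start=1):
        try:
            check_record(record)
        except SkipRecord as exc:
            monitor.append((recno, str(exc)))  # monitor entry with record number
            error_stack.append(record)         # retained for later correction
        else:
            updated.append(record)
    # AbortPackage is deliberately not caught: it ends the whole package.
    return updated, monitor, error_stack
```

The record number is kept with each monitor entry, which mirrors the idea of processing a single record using field RECORD.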

Focus area: Implement semantic checks II

- Use custom check points during extraction
  - Write check point data to a separate table and use it for validation.
  - See “How To... Perform Data Load Consistency Checks in BW”
- Build an Audit Trace in Data Modeling
  - Audit Traces (source, timestamp, …) allow tracing back to the source system. You only use them on objects in the Data Warehouse Layer, for simplified integrity checks against source systems.
- Data Integrity Checks on Data Packages in PSA
  - Can be achieved using the available APIs on PSA
- Build a Reconciliation procedure
  - Depending on the scenario, this can include checking the technical data transfer only, or an additional check on semantics (aggregation, calculation in extraction).
  - See “How To…Validate Data In the InfoCube By Comparing to Data In the PSA” and “How To…Reconcile data between SAP source systems and SAP NetWeaver BI”
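A reconciliation procedure of the kind listed above can be sketched as a comparison of aggregated key figures per key. The Python sketch below is only an illustration under assumptions: rows are plain dictionaries, and the field names passed in are hypothetical.

```python
from collections import defaultdict


def totals_by_key(rows: list, key_field: str, value_field: str) -> dict:
    """Sum one key figure per characteristic value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_field]] += row[value_field]
    return dict(totals)


def reconcile(source_rows: list, bi_rows: list, key_field: str,
              value_field: str, tolerance: float = 0.0) -> dict:
    """Return {key: (source_total, bi_total)} for every key whose totals differ."""
    source = totals_by_key(source_rows, key_field, value_field)
    target = totals_by_key(bi_rows, key_field, value_field)
    differences = {}
    for key in sorted(set(source) | set(target)):
        s, t = source.get(key, 0.0), target.get(key, 0.0)
        if abs(s - t) > tolerance:
            differences[key] = (s, t)
    return differences
```

Comparing totals rather than single records tolerates aggregation during extraction, which corresponds to the "additional check on semantics" case; a pure technical transfer check would compare record counts instead.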

Focus area: Data Reconciliation in SAP NetWeaver BI

[Figure: data reconciliation between an SAP source system and SAP NetWeaver BI. Labels: original DataSource, reconciliation DataSource, DataStore Object, VirtualProvider, MultiProvider.]

DataSource options shown:
- Generic DataSource
- Original DataSource with direct access
- Reconciliation DataSource in Business Content

New in SAP NetWeaver 2004s!
- The first DataSources have been shipped in Business Content for SAP NetWeaver 2004s BI
- No customer-defined transformations!

* Can be enhanced by a customer-specific exception that sends proactive alerts when a data quality issue occurs

Error Handling in SAP NetWeaver BI

Error Handling in SAP NetWeaver BI consists of:

- What to do in case of errors?
  - Abort data loading, or continue data loading with the valid data
  - Report or do not report on the valid data
  - Configure the threshold up to which invalid data is acceptable (termination after a certain number of errors)
- How to monitor invalid data?
  - Show the error status in the original requests in PSA and in a separate error stack
- How to correct the invalid data?
  - Manual correction or automated correction (to be implemented) of invalid data in the error stack (in SAP BW 3.x: error request)
  - More complex error resolution scenarios involving the source system, PSA, or other objects in your Data Warehouse (see the next chapter, “Error Resolution”)

Recommendation: Correct errors as early as possible in the data flow!

Error Handling in SAP BW 3.x

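[Figure: error handling in SAP BW 3.x. An InfoPackage loads data from the source system through the DataSource into PSA and on through Transfer Rules, InfoSource, and Update Rules into the InfoProvider. Labels: original request, valid records, invalid records, error request, manual or automated correction, corrected records.]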

Error Handling in SAP NetWeaver 2004s - I

[Figure: error handling in SAP NetWeaver 2004s. A Data Transfer Process (DTP) moves all records from the DataSource through Transformation, InfoSource, and Transformation; valid records reach the InfoProvider, invalid records go to the error stack. After manual or automated correction, an Error DTP updates the corrected records.]

- No error handling is available in InfoPackages
- Invalid data can be written to the error stack
- Termination after a certain number of errors can be configured (as in SAP BW 3.x)

Error Handling in SAP NetWeaver 2004s - II

[Figure: a Data Transfer Process from a DataSource to a DataStore Object, with an error stack (additional key) and an Error DTP.]

Maintenance of the error stack key in the Data Transfer Process:
- A semantic key can be defined
  - More key fields = potentially fewer records in the error stack
- New records with the same key are filtered out
  - In the same request and in subsequent requests
- Once a request is deleted in the InfoProvider, the related data records in the error stack are deleted automatically
- If DataStore Objects are connected, their key is taken as the initial default
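The effect of the semantic key can be sketched as follows. This Python illustration rests on one assumption about the behaviour described above: once a record with a given key is invalid, later records with the same key are routed to the error stack as well, so that they are not updated out of order. All names are hypothetical.

```python
def split_by_error_stack(records, key_fields, is_valid, blocked_keys=None):
    """Separate valid records from those destined for the error stack.

    blocked_keys carries keys already in the error stack from earlier requests.
    """
    blocked = set(blocked_keys or ())
    valid, error_stack = [], []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key in blocked or not is_valid(record):
            blocked.add(key)            # later records with this key follow
            error_stack.append(record)
        else:
            valid.append(record)
    return valid, error_stack, blocked
```

With a key of only the order number, every later item of a faulty order is held back; adding the item number to the key holds back only the faulty item, which is why more key fields mean potentially fewer records in the error stack.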

Focus Area: Data Transfer Process and Temporary Storage

Temporary storage in the data transfer process is used for:
- Efficient restart of the data transfer process in case of error
  - Temporary storage is used for restart in case of a complete abort of the process
  - Reloading of corrected records from the error stack is done using the Error DTP
- Easy monitoring of invalid data records

Settings:
- Activate temporary storage for each sub-step of the Data Transfer Process
- Identify the detail level of temporary storage
- Configure automatic deletion of temporary storage

Error Handling

Error Handling – Summary

Check the table for the impact of the error handling settings on erroneous records.

Settings:
(1) No error handling (InfoPackage)
(2) No update, no reporting
(3) Update valid records, no reporting
(4) Update valid records, reporting possible

Effect                      (1)    (2)    (3)    (4)
Monitor entry               X      X      X      X
Abort of update             X      X
Update of valid records                   X      X
Marked in temp. storage            X      X      X
Update into error stack                   X      X
Color of request            red    red    red    green
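The summary table can also be read as a lookup structure. The Python sketch below encodes it under the assumption that the X marks line up with the settings in the order the flattened slide suggests (abort for the first two settings, error stack for the last two, temporary storage for all DTP settings); the identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effect:
    monitor_entry: bool
    abort_update: bool
    update_valid_records: bool
    marked_in_temp_storage: bool
    update_error_stack: bool
    request_color: str


# Assumed reading of the summary table; one entry per error handling setting.
EFFECTS = {
    "no_error_handling": Effect(True, True, False, False, False, "red"),
    "no_update_no_reporting": Effect(True, True, False, True, False, "red"),
    "update_valid_no_reporting": Effect(True, False, True, True, True, "red"),
    "update_valid_reporting": Effect(True, False, True, True, True, "green"),
}


def settings_where(**conditions) -> list:
    """Names of all settings whose effects match every given condition."""
    return [
        name
        for name, effect in EFFECTS.items()
        if all(getattr(effect, attr) == want for attr, want in conditions.items())
    ]
```

For instance, `settings_where(update_error_stack=True)` lists the settings under which invalid records are retained for correction.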

Error Resolution in SAP NetWeaver BI

SAP NetWeaver BI offers the following error resolution options:

During data load (Error Handling):
- Automated correction of invalid data during data loading and transformation (to be implemented)
- Retain invalid data records in the error stack (error request), correct them manually (or automatically), and subsequently transfer the corrected data records from the error stack to the InfoProvider using the Error DTP functionality (created automatically within the DTP maintenance)

After data load:
- Correction of invalid data without deletion
  - by scheduling a (full) repair request against “overwriting” DataStore Objects of your Data Warehouse Layer
  - by loading (additive) cancellation records to the relevant InfoProviders
- (Selective) deletion of invalid data, followed by one of:
  - reconstructing (corrected) data from source tables (full upload)
  - reconstructing data from the delta queue (repeat delta update)
  - reconstructing (corrected) data from PSA
  - reconstructing (corrected) data from your Data Warehouse Layer
- Analysis and repair of BI objects (RSRV)
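The idea of additive cancellation records can be shown with a short Python sketch: for each key, the record to load is the difference between the corrected total and the total already loaded, so an additive InfoProvider ends up with the right value without any deletion. Field names and the dictionary row format are assumptions for illustration.

```python
def cancellation_records(loaded, corrected, key_fields, key_figure):
    """Return additive delta records that turn the loaded totals into the corrected ones."""

    def totals(rows):
        result = {}
        for row in rows:
            key = tuple(row[field] for field in key_fields)
            result[key] = result.get(key, 0) + row[key_figure]
        return result

    old, new = totals(loaded), totals(corrected)
    deltas = []
    for key in sorted(set(old) | set(new)):
        difference = new.get(key, 0) - old.get(key, 0)
        if difference != 0:
            record = dict(zip(key_fields, key))
            record[key_figure] = difference
            deltas.append(record)
    return deltas
```

A key that was loaded with quantity 10 but should be 8 yields one cancellation record with quantity -2; keys that already agree produce no record.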

Focus area: Overwrite data in SAP NetWeaver 2004s

Data in DataStore Object:

Order  Status  Quantity
719    C       12 KG
720    C       10 KG
721    O        5 KG

Data in source system:

Order  Status  Quantity
719    C       12 KG
720    C        8 KG
721    O        5 KG

Data in DataSource / PSA (loaded by InfoPackage, Full Update):

Order  Status  Quantity
720    C        8 KG

DTP, Delta Update (aggregation type = 'Overwrite'): invalid data gets overwritten!
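The overwrite example can be replayed in a few lines of Python. This sketch assumes that the first table is the DataStore Object content before the repair load and that the order number is the key; the function name is hypothetical.

```python
def overwrite_load(datastore: dict, package: list, key_field: str) -> dict:
    """Return a copy of the datastore in which each loaded record replaces its key."""
    result = {key: dict(row) for key, row in datastore.items()}
    for record in package:
        result[record[key_field]] = dict(record)  # overwrite, do not add
    return result


datastore = {
    "719": {"order": "719", "status": "C", "quantity": 12},
    "720": {"order": "720", "status": "C", "quantity": 10},  # invalid value
    "721": {"order": "721", "status": "O", "quantity": 5},
}
repair_package = [{"order": "720", "status": "C", "quantity": 8}]

repaired = overwrite_load(datastore, repair_package, "order")
```

Because the record replaces the stored one instead of being added to it, loading only order 720 again is enough to bring its quantity from 10 to 8 while orders 719 and 721 stay untouched.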

Focus area: Analysis and Repair of BI Objects

Transaction RSRV
- Checks the consistency of data and objects stored in SAP NetWeaver BI
- Some tests are capable of repairing inconsistencies and errors
- Individual test packages can be created by combining different elementary tests
- Test packages can be scheduled periodically using program RSRV_JOB_RUNNER in process chains

Examples of tests:
- Unauthorized characters in characteristic values
- Check characteristic values with conversion exit
- Consistency of the time dimension for an InfoCube
- PSA Duplicate Record Check
- Analysis of texts of BEx Objects for codepage errors
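The notion of a test package built from elementary tests can be sketched in Python. This is not how RSRV is implemented: the two elementary tests are simplified analogues of the duplicate record check and the unauthorized character check, and the permitted character set (uppercase ASCII letters, digits, space) is an assumption made for the example.

```python
from collections import Counter
import string

ALLOWED = set(string.ascii_uppercase + string.digits + " ")  # assumed rule


def duplicate_keys(records: list, key_fields: tuple) -> list:
    """Keys that occur more than once in the given records."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


def unauthorized_values(records: list, field: str) -> list:
    """Values of one field that contain characters outside ALLOWED."""
    return sorted({r[field] for r in records if not set(r[field]) <= ALLOWED})


def run_test_package(tests: dict) -> dict:
    """Run named elementary tests and keep only those that report findings."""
    findings = {name: test() for name, test in tests.items()}
    return {name: found for name, found in findings.items() if found}
```

A package is then just a dictionary of named checks, for example `{"duplicates": lambda: duplicate_keys(rows, ("order",))}`, which could be run on a schedule.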

Summary

Data Quality in Data Warehousing consists of several technical and semantic criteria. It has to be defined on a project basis by the business and IT departments.

SAP NetWeaver BI offers various tools for Data Validation & Error Handling that will assist you to easily detect and correct invalid data.

Your Enterprise Data Warehouse design plays an important role in providing a reliable Data Warehouse Layer for efficient Data Quality Management.

Some situations require more complex error resolution scenarios. If so, choose the appropriate error resolution measure by correcting the error as early as possible in your data flow.

Outlook

Integrate the existing BI-based solution capabilities by:
- Addressing IQ management integrated into the process platform, based on the Enterprise Information Management approach, to avoid and, if necessary, correct problems as early as possible in the information lifecycle.
- Providing one IQ design time that enables an ESA-integrated, model-driven approach and makes IQ models shareable/reusable in ALL application domains (process platform, application development, etc.). In this way, consolidate all existing solutions in the platform in one framework.

Offer IQM as an integrated solution, supporting the complete information control loop by providing the necessary:
- models
- processes
- technical capabilities (own / partners)
- tight application integration

Copyright 2006 SAP AG. All Rights Reserved
