Establishing a Robust Data Readiness Methodology
Prepared by: James Chi
Confidential GROM Associates, Inc.
Summary
The planning, execution, verification, and
documentation of the migration of application data
from legacy or source systems to SAP are critical
components to any successful SAP project
implementation. SAP requires and expects master
and transactional data of high quality for the
intended process integration benefits to be
realized.
Data Readiness is, however, one of the most
overlooked aspects of an implementation project.
This is partly because so much emphasis is placed
on re-engineering the business processes that the
quality and accuracy of data often takes a lesser
priority. However, based on our experience, we
would suggest that many SAP implementation
projects simply lack the tools and methodologies
to systematically identify and perform data
readiness and conversion activities and resolve
data quality issues.
Our Recommended Solution
The data readiness strategy and methodology
described below is the result of an evolutionary
process developed over many SAP
implementations with multiple clients in various
industry verticals. This methodology is intended
not only to deliver repeatable, predictable, and
demonstrable results, but also to bring visibility
to data quality issues early enough in the project
to mitigate them.
Data Readiness Components
Let us first introduce the distinct components that
make up a data migration landscape. As
illustrated in Figure 1, our recommended
methodology follows the traditional Extract,
Transform, and Load (ETL) data migration
component model.
Figure 1: Data Conversion Component Overview. Source applications and
manual data collection (via a data construction application, via Excel
and flat files, or directly in SAP) feed a central Data Staging &
Transformation tool, which exports data to the destination SAP systems
via LSMW, BDC/BDC Direct, CATT, custom ABAP, or manual input, following
the Extract, Transform, Load pattern.
Data Input Sources
Data for the project implementation comes from
sources identified in the functional
specifications. The data for loading into SAP
either already exists in an electronic format or is
manually captured in an approved electronic
format. Import programs need to be kept as
simple as possible for faster implementation and
easier traceability. Import data can come from the
following sources:
Source Application Data – Data from source
systems are either exported into a comma-delimited
text file or copied as tables when ODBC
database connections are available. Data are
extracted out of source applications following the
principle of “all data and records,” without any
filtering, translation, or formatting.
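The “all data and records” extraction principle can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 as a stand-in for a real ODBC connection (a project would typically use an ODBC driver library instead); the table and column names are hypothetical.

```python
import csv
import sqlite3

def extract_all_records(conn, table, out_path):
    """Dump every row of a source table to a comma-delimited file:
    no filtering, translation, or formatting, per the
    "all data and records" principle."""
    cur = conn.execute(f"SELECT * FROM {table}")
    headers = [col[0] for col in cur.description]
    rows = cur.fetchall()
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerows(rows)
    # The record count feeds the Source Data Reconciliation Report
    return len(rows)

# Demo with an in-memory stand-in for a source application table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, active INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, "Acme", 1), (2, "Globex", 0), (3, "Initech", 1)])
count = extract_all_records(conn, "customers", "customers_extract.csv")
```

Note that even the inactive record is exported; relevancy filtering happens later, inside the staging tool, where it is traceable.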
Manual Data Collection – Data may be manually
collected in situations where source data does not
exist. Based on the complexity and referential
dependency of the collected data, a data
construction application can be developed to help
facilitate the manual data collection and validation
process.
These data are subsequently provided to the
central Data Staging & Transformation tool.
Manual Data Collection in Excel and Flat File – In
some cases the need to collect data manually that
does not exist in the source system(s) is served by
MS-Excel Spreadsheets or Flat Text File. Based on
the complexity of the data that is needed, the
project team develops and distributes an Excel
spreadsheet application to help facilitate the
manual data collection process. The data is
subsequently uploaded to the central Data Staging
& Transformation tool.
Manual Data Collection in SAP – In certain
functional areas, where data do not exist in source
systems, the project can collect data manually
directly in the SAP system. It is sometimes
advantageous to build data directly in the SAP
environment and take advantage of existing
pre-defined data value tables and validation logic.
The data is subsequently extracted from SAP and
provided to the central Data Staging &
Transformation tool.
Figure 2: Data Readiness Process Overview. Numbered steps trace data
from source systems through extraction into the data staging area
(staged source data), transformation into target data, and loading of
uploaded target data into the target systems. Source data kickouts are
routed to the Data Owners, target data kickouts to the Data Owners and
Configuration Team, with process update reports at each checkpoint, and
referential & supplemental data is injected during transformation.
Data Staging
All master and transactional data loaded into the
SAP system should be staged in a central Data
Staging & Transformation tool. This repository
receives source data and outputs transformed
target data. It contains source data in its
originally supplied form, all the rules to convert,
translate, supplement and format this data into the
destination format, and intermediate tables
required for data readiness processing. The output
from the central Data Staging & Transformation
tool is used as the source of data loads into SAP.
Commercial ETL tools are designed for the purpose
of extracting, transforming, and loading data.
These tools should be leveraged on projects where
available. On projects where a commercial ETL
tool is not available, native database tools such as
Microsoft’s DTS or Oracle’s Warehouse Builder can
be used as well.
Once staged in their original or approved collection
format, all data is filtered, translated, and
formatted in a traceable and reportable fashion via
execution of individual data rules in the central
Data Staging & Transformation Tool. Exceptions to
this rule should only be permitted for manually
entered data objects.
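One way to make each data rule's execution traceable and reportable, as described above, is to run every rule through a small audit wrapper that records what it did. This is an illustrative sketch, not a feature of any particular ETL tool; all names are hypothetical.

```python
from datetime import datetime, timezone

class RuleLog:
    """Run staging rules one at a time and keep an audit trail of what
    each rule did: the basis for traceable, reportable processing."""

    def __init__(self):
        self.entries = []

    def apply(self, rule_name, rule_fn, records):
        before = len(records)
        result = rule_fn(records)
        # Record counts in and out, plus a timestamp, for each rule run
        self.entries.append({
            "rule": rule_name,
            "records_in": before,
            "records_out": len(result),
            "run_at": datetime.now(timezone.utc).isoformat(),
        })
        return result

log = RuleLog()
data = [{"id": 1, "status": "A"},
        {"id": 2, "status": "X"},
        {"id": 3, "status": "A"}]
# A hypothetical filtering rule: keep only active records
data = log.apply("drop_inactive",
                 lambda rs: [r for r in rs if r["status"] == "A"], data)
```

The audit entries can then be rolled up into the reconciliation reports described in the next sections.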
Data Export Destination Programs
Data is exported from the central Data Staging &
Transformation tool into SAP via standard SAP
data conversion methods and tools. Conversion
programs must be kept as simple as possible to
ensure quick development and better traceability
for troubleshooting and reconciliation purposes.
These conversion methods and tools are:
LSMW – Legacy System Migration
Workbench
BDC Programs – Batch Data Communication
CATT – Computer Aided Test Tool
Post Load Custom ABAP
Post Load Manual Input
Comprehensive Data Readiness Process
Let us now describe the steps involved in a robust
and comprehensive data readiness process. The
overall process is illustrated in Figure 2.
In order to ensure ongoing execution,
troubleshooting, and problem resolution
throughout the data conversion test cycles
described in the next section “Data Readiness
Approach and Methodology”, the Systematic Data
Readiness Process is followed for each data test
run. Following is a high-level overview of the
process.
Step 1: Extraction of Source Data
The conversion starts with the extraction of source
data. This extraction, depending upon its source
may be a direct ODBC connection, a spreadsheet
or flat file created programmatically, or a manually
loaded spreadsheet. Original spreadsheets and
flat files must be secured in a centralized location
for audit and validation purposes. In all cases, the
extract of source data must be accompanied by a
report that details the contents. A Source Data
Reconciliation Report should be produced for each
extract and must indicate the total number of
records contained in the source. Other metrics
should be supplied for key data fields such as
sums, totals, or hash totals of data columns
contained in the source. This information will be
very important in demonstrating that the source
data has been completely and accurately imported
into the central Data Staging & Transformation
tool.
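The Source Data Reconciliation Report metrics described above (record counts, column sums, hash totals) might be computed along these lines; the field names and the use of SHA-256 for the hash total are illustrative assumptions.

```python
import hashlib

def reconciliation_report(rows, sum_fields, hash_fields):
    """Compute the record count, column sums, and hash totals that
    accompany each source extract."""
    report = {"record_count": len(rows)}
    for field in sum_fields:
        # Numeric column totals for key quantity/value fields
        report[f"sum_{field}"] = sum(r[field] for r in rows)
    for field in hash_fields:
        # A hash total over a non-numeric column detects altered values
        digest = hashlib.sha256()
        for r in rows:
            digest.update(str(r[field]).encode())
        report[f"hash_{field}"] = digest.hexdigest()
    return report

# Hypothetical extract rows
rows = [{"material": "M-100", "qty": 40},
        {"material": "M-200", "qty": 60}]
report = reconciliation_report(rows, sum_fields=["qty"],
                               hash_fields=["material"])
```

Recomputing the same metrics after the upload into the staging tool, and comparing against this report, demonstrates that the import was complete and accurate.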
Step 2–3: Upload, Process, and Verification of Extracted Data & Data
Quality Checkpoint One
The next step in the process begins the upload of
data from source applications and manual
collection repositories in their native format into
the central Data Staging & Transformation tool. It
is critical for all data to be imported into the
staging tool in an “as-is” format. All source
application tables and/or spreadsheet rows and
columns are imported into the staging tool without
any filtering and manipulation. This ensures that
all data record filtering, translation, harmonization,
and formatting operations are performed in the
staging tool in an approved, auditable, traceable,
and reportable fashion via execution of business
rules at individual source level.
Once the data has been successfully extracted into
the central Data Staging & Transformation tool,
the source data is modified according to data
filtering rules. Data filtering refers to reducing the
dataset based upon rules documented in the
functional specifications and business relevancy
parameters. This filtering is performed in order to
ensure that only active and relevant data are
loaded into SAP. Additionally, source data can
now be subject to a variety of quality and integrity
checks to identify source data issues that can
either be resolved in the staging tool as a
transformation rule or be resolved back in the
source system. Data records that do not pass key
quality or integrity checks should be flagged as
such and omitted from subsequent transformation
and loading steps, and directed to Data Owners for
correction or clarification.
Data reconciliation activities are also performed.
All results are gathered and compared to the
Source Data Reconciliation Report. Results and
Kickouts are provided to Data Owners for review,
approval and correction.
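A minimal sketch of the filtering and quality-check flow in Steps 2 and 3, with failing records collected as kickouts for Data Owner review; the rules and field names are hypothetical examples, not actual project rules.

```python
def filter_and_check(records, filter_rules, quality_checks):
    """Apply documented filter rules, then flag records that fail
    quality or integrity checks as kickouts for the Data Owners."""
    # Filtering: keep only active, business-relevant records
    relevant = [r for r in records if all(rule(r) for rule in filter_rules)]
    staged, kickouts = [], []
    for r in relevant:
        failures = [name for name, check in quality_checks.items()
                    if not check(r)]
        if failures:
            # Flagged and omitted from subsequent transformation/loading
            kickouts.append({"record": r, "failed_checks": failures})
        else:
            staged.append(r)
    return staged, kickouts

records = [
    {"id": 1, "active": True,  "plant": "1000"},
    {"id": 2, "active": False, "plant": "1000"},  # filtered out: inactive
    {"id": 3, "active": True,  "plant": ""},      # kickout: missing plant
]
staged, kickouts = filter_and_check(
    records,
    filter_rules=[lambda r: r["active"]],
    quality_checks={"plant_present": lambda r: bool(r["plant"])},
)
```

The kickout list, with the name of each failed check attached, is what gets routed back to the Data Owners for correction or clarification.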
Step 4: Transformation of Staged Data
Once the source data has been filtered, all source
data are combined into a single staged target SAP
dataset, to which translation, supplementation,
and formatting rules specifically designed for the
target environment per the Design Specifications
are applied. Data
translation refers to replacing source system
coding, groupings, and other source system
application data characteristics to corresponding
SAP coding, groupings, and data characteristics.
Supplementation refers to supplying additional
referential or required data according to Design
Specifications that are not available from source
data. Data formatting refers to converting the
source data from its original record format to a
format that can be read by the SAP data upload
programs for loading into SAP. These data staging
rules define the main transformation of the
filtered source data into data that is coded and
formatted for SAP upload purposes. All data
formatting, filtering, and translation rules are
based on criteria documented in the functional
specifications. Data reconciliation activities are
performed to verify that all required business rules
defined in the functional specifications have been
completely and accurately applied.
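The three transformation operations described above (translation, supplementation, formatting) can be illustrated on a toy record. The mapping table, default plant, and 18-character field width are invented for illustration and should not be read as actual SAP layouts or project rules.

```python
# Hypothetical mapping table: legacy unit-of-measure codes to SAP codes
UOM_MAP = {"EACH": "EA", "CASE": "CS"}

def transform(record):
    """Translate legacy codes, supplement required defaults, and format
    the record for the SAP upload program."""
    out = dict(record)
    # Translation: replace source-system coding with SAP coding
    out["uom"] = UOM_MAP[record["uom"]]
    # Supplementation: supply required data absent from the source
    out.setdefault("plant", "1000")
    # Formatting: fixed-width, upper-case field for the upload program
    out["material"] = record["material"].upper().ljust(18)
    return out

staged = {"material": "m-100", "uom": "EACH"}
target = transform(staged)
```

In practice each of these operations would be an individually documented rule executed in the staging tool, so that every change to a field value is traceable back to the functional specifications.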
Step 5: Data Quality Checkpoint Two
Once the data has been successfully filtered,
translated, and formatted, the resulting dataset
can be subject to another set of quality and
integrity checks aimed at identifying target data
integrity and completeness issues. These issues
can be resolved in the staging tool as a
transformation rule, resolved in SAP, resolved in
the data construction application, or resolved back
in the source system. Data records which do not
pass key quality or integrity checks should be
flagged as such, omitted from subsequent loading
steps, and directed to the data owners and
configuration team for correction or clarification.
Data reconciliation activities are also performed
from the target SAP environment perspective. All
results are gathered and compared to verify that
all required business rules defined in the functional
specifications have been completely and accurately
applied. Results and kickouts reports are provided
to Data Owners and Configuration Team for review
and correction.
Step 6: Data Supplementation
Following review of target data results and
kickouts reports, data owners have the opportunity
to inject additional data into the transformation
process of staged data. Additional data refers to
missing data components that are required
according to functional or SAP system
specifications, and to cross-reference data that
maps legacy data to new SAP data per the Design
Specifications. The configuration team has the
opportunity to verify, validate, and correct data
values needed in the target SAP system in order to
load approved staged target data without errors.
Step 7–8: Loading of Target Data into SAP & Final Verification
Subsequent to the successful completion of data
quality checks, translated and formatted data will
be loaded into SAP via any of the mechanisms
described under the “Data Export Destination
Programs” section of this document and verified
for accuracy and completeness. This verification
will involve a combination of visual inspection and
technical checks, including record counts, sums,
and/or hash totals of data columns contained in
the export files and SAP tables.
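The final verification in Steps 7 and 8 amounts to comparing counts, sums, and key sets between the export files and the loaded SAP tables. A minimal sketch, assuming both sides are available as lists of records with hypothetical field names:

```python
def verify_load(export_rows, sap_rows, key, sum_field):
    """Compare record counts, column sums, and key coverage between the
    export file and the loaded SAP table; return a list of discrepancies
    (empty means the load reconciles)."""
    issues = []
    if len(export_rows) != len(sap_rows):
        issues.append(f"record count mismatch: "
                      f"{len(export_rows)} vs {len(sap_rows)}")
    exp_sum = sum(r[sum_field] for r in export_rows)
    sap_sum = sum(r[sum_field] for r in sap_rows)
    if exp_sum != sap_sum:
        issues.append(f"sum({sum_field}) mismatch: {exp_sum} vs {sap_sum}")
    missing = {r[key] for r in export_rows} - {r[key] for r in sap_rows}
    if missing:
        issues.append(f"keys missing from SAP: {sorted(missing)}")
    return issues

# Hypothetical reconciliation: one record failed to load
export = [{"id": "A", "qty": 10}, {"id": "B", "qty": 5}]
loaded = [{"id": "A", "qty": 10}]
issues = verify_load(export, loaded, key="id", sum_field="qty")
```

An empty issue list, together with visual spot checks in SAP, is what the data owners would sign off on.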
Data Readiness Approach and
Methodology
Now that we have introduced both data readiness
landscape components and process, we can finally
position how this all fits in the lifecycle of an SAP
implementation project.
What follows is a description of the various data
readiness activities as they are executed
throughout GROM’s Best Practice Data Readiness
Approach, an enhanced and refined complement to
the ASAP methodology that SAP implementation
projects typically follow.
Project Definition – The purpose of this phase is
to understand and define a data quality baseline
and a path forward with respect to data readiness
for the SAP implementation. Once the data quality
baseline has been defined and understood, the data
migration and readiness scope can be derived and
estimated in alignment with the business objectives
of the SAP implementation. Toolset selection can
be accomplished based on the scope of the
conversion. Finally, the effort and cost of the
conversion can be estimated for approval.
Project Preparation – The purpose of this phase is
to provide initial preparation and planning for the
SAP implementation project. The important data
readiness issues addressed during the project
preparation phase are:
Finalization of data migration scope and data
readiness strategy
On-boarding of data team
Installation of ETL toolset
Initiation of legacy system connection and
extraction
Business Blueprint – This phase defines the
business processes to be supported by the SAP
system and the functional requirements. Data
conversion and readiness activities begin with the
identification of data objects which require
conversion from the source application to the SAP
system. During this phase, all data and records
will be extracted from source systems and
profiled, business and SAP readiness requirements
will be defined, and Mapping Documentation will be
completed to support data quality report
development. The quality and integrity of the
source data will be assessed repeatedly during
this period.
Realization (Build) – This phase builds the system
based upon the requirements described in the
functional specifications and includes several
data readiness process development and individual
data object testing cycles. During the early part of
realization, functional specifications are developed
for the data conversion objects identified during
requirements gathering. These design
specifications serve as the basis for determining
which conversion mechanisms are used and
provide additional functional conversion program
development and testing details for a given data
object. The project team develops all required data
conversion rules and programs. These conversion
rules and programs are tested repeatedly in the
Q/A or Unit Test environments as illustrated in
Figure 3.
Realization (Test) – This phase is dedicated to the
testing and refinement of the conversion rules and
programs of the central Data Staging and
Transformation tool. As source data evolves in the
course of normal business operation over the
project timeline, new data issues may surface and
conversion rules may need to be updated or
refined through the Continual Improvement
Iterative Process. As the target SAP system in
each environment continues to mature into the
“To-Be” production system, data readiness will be
measured and reported against each environment
to confirm alignment with design and functional
specifications. Through this iterative, repeatable
testing process, data quality with respect to
readiness will rise toward the Transactionable
Data Quality Level prior to Go-Live, as illustrated
in Figure 4. By the end of this realization test
phase, the central Data Staging & Transformation
tool will have been tested with full data
conversions in 2 to 3 rounds of Unit Testing and
2 to 3 rounds of Integration Testing.
Figure 3: Continual Improvement Iterative Process. Source data flows
into the Data Staging Application and is pulled on demand into the Unit
Test, Integration Test, Cutover Rehearsal, and Production environments
across testing events spanning Business Blueprint to Go-Live; results,
user reports, and resolutions feed back into the staging application.

Figure 4: Data Quality with Continual Improvement Process. Along the
project timeline, data quality rises from low during Business Blueprint
toward the Transactionable Data Quality Level through Realization
(Build), Realization (Test), Final Preparation, Go-Live, and
Install/Run/Support, driven by the data readiness activities.
Final Preparation – Development of the central
Data Staging & Transformation tool is completed,
and cutover activities will be rehearsed in 2 to 3
rounds during this phase. As part of the final
production cutover, final source data extractions
and preparations will be performed and all master
and transactional data will be loaded into the
production environment. Production data
reconciliation and validation reports will be
prepared to ensure all records are accounted for.
Any additional manual data conversion activities
and manual configuration steps in SAP will be
executed according to the conversion plan. Finally,
data owners sign off on the production load and
validation reports as required by the SAP
implementation project.
Install / Run / Support – The purpose of this
phase is the transition from the pre-production
environment to live production operation; it is
used to closely monitor system transactions and
to optimize system performance. From a data
conversion perspective, any post go-live issues
related to data should be investigated, resolved,
and closed.
About the Author
James Chi is the Director of GROM’s Business
Consulting Group Enterprise Solutions Practice and
has overall delivery responsibilities for all GROM-
led projects. James joined GROM after spending
the last seventeen years delivering SAP solutions
in the pharmaceutical, medical device, and
consumer products industries. James’ strong
functional background in Supply Chain Planning
and Manufacturing Execution, combined with more
than fifteen years of Project Management
experience, has made him a well-rounded business
expert.
James has a BE in Electrical Engineering from
Stevens Institute of Technology.