Informatica PowerCenter Data Validation Option (Version 9.5.0) Installation and User Guide


Informatica PowerCenter Data Validation Option (Version 9.5.0)

Installation and User Guide


Informatica PowerCenter Data Validation Option

Version 9.5.0
July 2012

Copyright (c) 1998-2012 Informatica. All rights reserved.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international Patents and other Patents Pending.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in writing.

Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica On Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and Informatica Master Data Management are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights reserved. Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright © Meta Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated. All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved. Copyright © Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights reserved. Copyright © Information Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved. Copyright Cleo Communications, Inc. All rights reserved. Copyright © International Organization for Standardization 1986. All rights reserved. Copyright © ej-technologies GmbH. All rights reserved. Copyright © Jaspersoft Corporation. All rights reserved. Copyright © International Business Machines Corporation. All rights reserved. Copyright © yWorks GmbH. All rights reserved. Copyright © Lucent Technologies 1997. All rights reserved. Copyright (c) 1986 by University of Toronto. All rights reserved. Copyright © 1998-2003 Daniel Veillard. All rights reserved. Copyright © 2001-2004 Unicode, Inc. Copyright 1994-1999 IBM Corp. All rights reserved. Copyright © MicroQuill Software Publishing, Inc. All rights reserved. Copyright © PassMark Software Pty Ltd. All rights reserved. Copyright © LogiXML, Inc. All rights reserved. Copyright © 2003-2010 Lorenzi Davide, All rights reserved. Copyright © Red Hat, Inc. All rights reserved. Copyright © The Board of Trustees of the Leland Stanford Junior University. All rights reserved. Copyright © EMC Corporation. All rights reserved.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under the GNU Lesser General Public License Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California, Irvine, and Vanderbilt University, Copyright (©) 1993-2006, all rights reserved.

This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html.

This product includes Curl software which is Copyright 1996-2007, Daniel Stenberg, <[email protected]>. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.dom4j.org/license.html.

The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://dojotoolkit.org/license.

This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.

This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at http://www.gnu.org/software/kawa/Software-License.html.

This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project, Copyright © 2002 Cable & Wireless Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.

This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are subject to terms available at http://www.boost.org/LICENSE_1_0.txt.

This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at http://www.pcre.org/license.txt.

This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.eclipse.org/org/documents/epl-v10.php.

This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://www.stlport.org/doc/license.html, http://www.asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/license-agreements/fuse-message-broker-v-5-3-license-agreement, http://antlr.org/license.html, http://aopalliance.sourceforge.net/, http://www.bouncycastle.org/licence.html, http://www.jgraph.com/jgraphdownload.html, http://www.jcraft.com/jsch/LICENSE.txt, http://jotm.objectweb.org/bsd_license.html, http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231, http://developer.apple.com/library/mac/#samplecode/HelpHook/Listings/HelpHook_java.html, http://nanoxml.sourceforge.net/orig/copyright.html, http://www.json.org/license.html, http://forge.ow2.org/projects/javaservice/, http://www.postgresql.org/about/licence.html, http://www.sqlite.org/copyright.html, http://www.jaxen.org/faq.html, http://www.jdom.org/docs/faq.html, http://www.iodbc.org/dataspace/iodbc/wiki/iODBC/License, http://www.keplerproject.org/md5/license.html, http://www.toedter.com/en/jcalendar/license.html, http://www.edankert.com/bounce/index.html, http://www.net-snmp.org/about/license.html, http://www.openmdx.org/#FAQ, http://www.php.net/license/3_01.txt, http://srp.stanford.edu/license.txt, http://www.schneier.com/blowfish.html, http://www.jmock.org/license.html, and http://xsom.java.net.

This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary Code License Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the MIT License (http://www.opensource.org/licenses/mit-license.php) and the Artistic License (http://www.opensource.org/licenses/artistic-license-1.0).

This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab. For further information please visit http://www.extreme.indiana.edu/.

This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775; 6,640,226; 6,789,096; 6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,243,110; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,676,516; 7,720,842; 7,721,270; and 7,774,791, international Patents and other Patents Pending.

DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of noninfringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is subject to change at any time without notice.

NOTICES

This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software Corporation ("DataDirect") which are subject to the following terms and conditions:

1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.

2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.

Part Number: PC-DVO-95000-0001


Table of Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

Informatica Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

Informatica Customer Portal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

Informatica Documentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

Informatica Web Site. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

Informatica How-To Library. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

Informatica Knowledge Base. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

Informatica Multimedia Knowledge Base. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

Informatica Global Customer Support. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

Chapter 1: Data Validation Option Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Data Validation Option Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Data Validation Option Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Data Validation in an Enterprise. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

Data Validation Workflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

Architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

System Requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Data Validation Methodology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Basic Counts, Sums, and Aggregate Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Check Referential Integrity of Target Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

Enforce Constraints on Target Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

Compare Individual Records between Sources and Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

Data Validation with Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

Chapter 2: New Features and Behavior Changes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

New Features and Enhancements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

New Features and Enhancements in 9.5.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

New Features and Enhancements in 9.1.4.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

New Features and Enhancements in 9.1.2.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

New Features and Enhancements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

Behavior Changes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Behavior Changes in 9.1.4.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Behavior Changes in 9.1.2.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Behavior Changes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Chapter 3: Data Validation Option Client Layout. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Data Validation Option Client Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Tabs Available in Data Validation Option Client Layout. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Table of Contents i


Tests Tab. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

SQL Views Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Lookup Views Tab. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Join Views Tab. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

Folders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

Copying Folders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Copying Objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Menus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Settings Folder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Chapter 4: Data Validation Option Installation and Configuration. . . . . . . . . . . . . . . . . . . 20

Data Validation Option Installation and Configuration Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

PowerCenter Support for Data Validation Option Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

System Permissions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Information Required for Installation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Installing and Configuring Data Validation Option for the First Time. . . . . . . . . . . . . . . . . . . . . . . . . 22

Data Validation Option Configuration for Additional Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

Data Validation Option Upgrade. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Upgrading from Version 9.1.x to Version 9.5.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Upgrading from Version 3.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Upgrading from Version 3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

DVOCmd Installation on UNIX. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

Environment Variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

Modifying the License Key. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

JasperReports Server Setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Chapter 5: Data Validation Option Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

Data Validation Option Management Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

User Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

Preferences File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

Changing the User Configuration Directory through a Batch File. . . . . . . . . . . . . . . . . . . . . . . . 33

Changing the User Configuration Directory through an Environment Variable. . . . . . . . . . . . . . . 33

Multiple Data Validation Option Repository Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

Multiple PowerCenter Installation Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

Informatica Authentication. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

Data Validation Option Users and Informatica Users. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

Security. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Informatica Authentication Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Configuring Informatica Authentication. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Chapter 6: Repositories. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

Repositories Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36



Adding a Repository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

Editing Repositories. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Deleting Repositories. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Refreshing Repositories. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Exporting Repository Metadata. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Metadata Export and Import. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Exporting Metadata. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

Importing Metadata. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

Metadata Manager Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

Configuring Metadata Manager Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

Chapter 7: Table Pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

Table Pairs Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

Table Pair Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

Connection Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

Database Processing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

Pushing Test Logic to the Database. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

WHERE Clauses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

Table Joins. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

Bad Records Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

Parameterization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

Adding Table Pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

Editing Table Pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Deleting Table Pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Viewing Overall Test Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Chapter 8: Tests for Table Pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

Tests for Table Pairs Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

Test Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

Fields A and B. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Conditions A and B. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

Operator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

Threshold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Max Bad Records. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Case Insensitive. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Trim Trailing Spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Null = Null. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Comments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Expression Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Expression Tips. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Adding Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Editing Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57



Deleting Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Running Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

Automatic Test Generation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

Generating Table Pairs and Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

Generating Tests for Table Pairs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

Compare Columns by Position. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

Bad Records. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

Chapter 9: Single-Table Constraints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

Single-Table Constraints Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

Single Table Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

Connection Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

Bad Records Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

Parameterization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

Adding Single Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

Editing Single Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

Deleting Single Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

Viewing Overall Test Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

Chapter 10: Tests for Single-Table Constraints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

Tests for Single-Table Constraints Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

Test Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

Field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

Condition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

Operator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

Constraint Value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

Remaining Controls on Test Editor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

Adding Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

Editing Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

Deleting Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Running Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Bad Records. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Chapter 11: SQL Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

SQL Views Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

SQL View Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

Table Definitions and Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

Column Definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

SQL Statement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

Comment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

Adding SQL Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78


Editing SQL Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

Deleting SQL Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

Chapter 12: Lookup Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

Lookup Views Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

Lookup View Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

Selecting Source and Lookup Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

Selecting Connections. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Overriding Owner Name. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Source Directory and File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Description. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Source to Lookup Relationship. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Adding Lookup Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Editing Lookup Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Deleting Lookup Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Lookup Views Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Joining Flat Files or Heterogeneous Tables using a Lookup View. . . . . . . . . . . . . . . . . . . . . . . . . . 83

Chapter 13: Join Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

Join Views Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

Join View Data Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

Join View Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

Connection Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

Database Optimization in a Join View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Join Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Alias in Join View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Join Conditions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Adding a Join View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Configuring a Table Definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

Configuring a Join Condition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

Managing Join Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

Join View Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

Chapter 14: Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

Reports Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

Business Intelligence and Reporting Tools (BIRT) Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

BIRT Report Generation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

SQL and Lookup View Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

Custom Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

Viewing Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

Jasper Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

Status in Jasper Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

Configuring Jaspersoft Reporting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97


Generating a Report. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Jasper Report Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

Dashboards. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

Metadata Manager Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

Configuring Metadata Manager Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

Chapter 15: Command Line Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

Command Line Integration Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

CopyFolder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

CreateUserConfig. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

DisableInformaticaAuthentication. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

ExportMetadata. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

ImportMetadata. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

InstallTests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

Cache Settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

LinkDVOUsersToInformatica. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

PurgeRuns. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

RefreshRepository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

RunTests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

Cache Settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

UpdateInformaticaAuthenticationConfiguration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

UpgradeRepository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

Chapter 16: Troubleshooting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

Troubleshooting Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

Troubleshooting Initial Errors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

Troubleshooting Ongoing Errors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

Troubleshooting Command Line Errors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

Appendix A: Datatype Reference. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

Test, Operator, and Datatypes Matrix for Table Pair Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

Test, Operator, and Datatypes Matrix for Single-Table Constraints. . . . . . . . . . . . . . . . . . . . . . . . 118

Appendix B: BIRT Report Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Summary of Testing Activities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Table Pair Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Detailed Test Results – Test Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Detailed Test Results – Bad Records Page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Appendix C: Jasper Report Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Home Dashboard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Repository Dashboard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Folder Dashboard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126


Tests Run Vs Tests Passed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

Total Rows Vs Percentage of Bad Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Most Recent Failed Runs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Last Run Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

Appendix D: Reporting Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

Reporting Views Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

results_summary_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

rs_bad_records_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

results_id_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

meta_sv_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

meta_lv_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

meta_jv_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

meta_ds_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

meta_tp_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

rs_sv_id_view, rs_lv_id_view, and rs_jv_id_view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

Appendix E: Metadata Import Syntax. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

Metadata Import Syntax Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

Table Pair with One Test. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

Table Pair with an SQL View as a Source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

Table Pair with Two Flat Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

Single-Table Constraint. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

SQL View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

Lookup View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

Appendix F: Glossary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145


Preface

The PowerCenter Data Validation Option Installation and User Guide describes how you can test and validate data across multiple data sources. It is written for database administrators and testers who are responsible for validating enterprise data. This guide assumes you have knowledge of the data sources and PowerCenter.

Informatica Resources

Informatica Customer Portal

As an Informatica customer, you can access the Informatica Customer Portal site at http://mysupport.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library, the Informatica Knowledge Base, the Informatica Multimedia Knowledge Base, Informatica Product Documentation, and access to the Informatica user community.

Informatica Documentation

The Informatica Documentation team takes every effort to create accurate, usable documentation. If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation team through email at [email protected]. We will use your feedback to improve our documentation. Let us know if we can contact you regarding your comments.

The Documentation team updates documentation as needed. To get the latest documentation for your product, navigate to Product Documentation from http://mysupport.informatica.com.

Informatica Web Site

You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.

Informatica How-To Library

As an Informatica customer, you can access the Informatica How-To Library at http://mysupport.informatica.com. The How-To Library is a collection of resources to help you learn more about Informatica products and features. It includes articles and interactive demonstrations that provide solutions to common problems, compare features and behaviors, and guide you through performing specific real-world tasks.


Informatica Knowledge Base

As an Informatica customer, you can access the Informatica Knowledge Base at http://mysupport.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products. You can also find answers to frequently asked questions, technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

Informatica Multimedia Knowledge Base

As an Informatica customer, you can access the Informatica Multimedia Knowledge Base at http://mysupport.informatica.com. The Multimedia Knowledge Base is a collection of instructional multimedia files that help you learn about common concepts and guide you through performing specific tasks. If you have questions, comments, or ideas about the Multimedia Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

Informatica Global Customer Support

You can contact a Customer Support Center by telephone or through the Online Support. Online Support requires a user name and password. You can request a user name and password at http://mysupport.informatica.com.

Use the following telephone numbers to contact Informatica Global Customer Support:

North America / South America

  Toll Free
  Brazil: 0800 891 0202
  Mexico: 001 888 209 8853
  North America: +1 877 463 2435

Europe / Middle East / Africa

  Toll Free
  France: 0805 804632
  Germany: 0800 5891281
  Italy: 800 915 985
  Netherlands: 0800 2300001
  Portugal: 800 208 360
  Spain: 900 813 166
  Switzerland: 0800 463 200
  United Kingdom: 0800 023 4632

  Standard Rate
  Belgium: +31 30 6022 797
  France: +33 1 4138 9226
  Germany: +49 1805 702 702
  Netherlands: +31 306 022 797
  United Kingdom: +44 1628 511445

Asia / Australia

  Toll Free
  Australia: 1 800 151 830
  New Zealand: 09 9 128 901

  Standard Rate
  India: +91 80 4112 5738



C H A P T E R 1

Introduction to Data Validation Option

This chapter includes the following topics:

¨ Data Validation Option Overview, 1

¨ Data Validation Workflow, 2

¨ Architecture, 2

¨ System Requirements, 3

¨ Data Validation Methodology, 3

Data Validation Option Overview

Data Validation Option is a solution that you use with PowerCenter to validate data. You can validate target data to verify that it is accurate and that the transformation process did not introduce errors or inconsistencies.

Data validation is the process of verifying that moved or transformed data is complete and accurate and has not been changed by errors in the movement or transformation process. Use PowerCenter Data Validation Option to verify that your data is complete and accurate.

You might have the standard license or the enterprise license for Data Validation Option. If you have the enterprise license, you can use Data Validation Option in a production environment. If you have the standard license, you can use Data Validation Option only in a non-production environment.

Data Validation Option with the enterprise license includes the following additional features in 9.5.0:

¨ Parameterization

¨ Enhanced bad records storage

¨ Jasper Reports

Data Validation Option Users

There are many possible users of Data Validation Option:

¨ Business or Data Analysts

¨ Data Warehouse Testers

¨ ETL Developers

¨ Database Administrators


Data Validation in an Enterprise

Two types of data validation are generally performed in a data integration setting: source-to-target comparisons and production-to-development comparisons.

You can perform source-to-target validation at the end of development of a data integration project, on the initial load of a data warehouse, or as reconciliation of the ongoing daily or incremental loads.

You can also perform data validation to compare production and development environments when you upgrade data integration software or RDBMS software.

Finally, you can perform data validation as part of the testing process or as part of the production process, called the reconciliation or Audit/Balance/Control process. Data Validation Option supports all of the use cases described above. Data Validation Option reads table definitions from PowerCenter metadata repositories and checks the data at either end of the process. It does not check the correctness of transformations or mappings. Data Validation Option identifies problems or inconsistencies but does not attempt to identify the source of the problem in the ETL process.

Data Validation Workflow

A typical workflow for data validation consists of multiple tasks.

1. Data Validation Option reads one or more PowerCenter metadata repositories.

2. Define the validation rules in Data Validation Option.

3. Run the rules to ensure the data conforms to the validation rules. When you do this, Data Validation Option performs the following tasks:

¨ Creates and executes all tests through PowerCenter.

¨ Loads results into the Data Validation Option results database and displays them in the Data Validation Option Client.

4. Examine the results and identify sources of inconsistencies in the ETL process or the source systems.

5. Repeat this process for new records.

Architecture

Data Validation Option requires installation and setup of PowerCenter. Source and target data table and file definitions are imported from PowerCenter repositories. You set up table pairs and test rules in Data Validation Option. This test metadata is stored in the Data Validation Option repository.

When the tests are run, Data Validation Option communicates with PowerCenter through an API to create appropriate mappings, sessions, and workflows, and to execute them. PowerCenter, not Data Validation Option, connects to the data being tested. After the tests are executed, results are stored in the Data Validation Option repository and displayed in the Data Validation Option Client.

You can configure Data Validation Option to authenticate users based on Informatica domain login credentials. After you enable Informatica authentication, users must use their Informatica domain login credentials to log in to the Data Validation Option Client.


System Requirements

The PowerCenter Client must be installed on the machine where Data Validation Option is installed. Any system that supports Informatica PowerCenter will support Data Validation Option. However, Data Validation Option works best on a machine that has at least 1 GB of RAM.

Data Validation Methodology

The following section presents a sample methodology to help you design a rigorous data validation process.

Most users already have some kind of testing process in place, usually a combination of SQL code and Excel spreadsheets. A common temptation is to replicate the current SQL-based process. The first question often asked is, "How do I do this with Data Validation Option?"

Use the following guidelines to set up a data validation approach:

1. Test data, not mappings or workflows. Your test framework should not contain parallel mappings, sessions, and workflows. Testing mappings is unit testing, which is different from data validation.

2. Do not try to mimic SQL. Step back and think of what you are trying to accomplish. Data Validation Option can make things a lot easier.

3. Assume the worst. If data needs to be moved from last_name to last_name, it may have been moved to city by mistake. If an IF statement was used, assume it was coded wrong. It is always prudent to assume a mistake has been made and be pleasantly surprised when tests return no errors.

4. Do the easy things first. Complicated problems often manifest themselves in simple ways. Simple counts and constraints can point out some obvious errors.

5. Design the initial test framework without taking performance into account. After you are satisfied with your approach, begin to optimize.

6. Try to split complex SQL into more than one table pair. For example, if you see something like the following statements:

Select CASE (code='X', TableA.Fld1, TableB.Fld1)
Select CASE (code='X', TableA.Fld2, TableB.Fld2)

You can create two table pairs:

Table A vs. Target WHERE clause A: code='X'

Table B vs. Target WHERE clause A: code <> 'X'

7. Do not copy formulas from the ETL mapping into Data Validation Option. Sometimes when you need to test a complex transformation such as complex IF statements with SUBSTR, you might be tempted to just copy it from the mapping. This approach produces an obvious problem. If there is an error in the ETL mapping formula, you will replicate it in Data Validation Option, and Data Validation Option will not catch it. Therefore, you must always maintain a proper separation between ETL and testing.

8. Do not try to do everything in Data Validation Option. If you think that a particular step can be accomplished more easily with SQL, use SQL. If you run 95% of your validation in Data Validation Option, and can document it with the audit trail, this is more than enough.

Basic Counts, Sums, and Aggregate Tests

The goal of basic counts, sums, and aggregate tests is to make sure that all records were moved.


This approach detects the following problems:

¨ Lack of referential integrity in the source. (Child with no parent will not be moved.)

¨ Row rejection by the target system.

¨ Incorrect ETL logic in the WHERE clauses.

¨ Other problems that do not move all the required records.

Approach data validation in the following order:

¨ COUNT and COUNT_ROWS to count the number of records

¨ SUM for numeric fields

¨ COUNT DISTINCT to compare detail vs. aggregate tables
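The intent of these aggregate tests can be sketched outside the tool. The following Python sketch uses made-up rows and field names; it only illustrates what COUNT_ROWS, SUM, and COUNT DISTINCT comparisons assert, not Data Validation Option syntax:

```python
# Sketch of the basic reconciliation checks: row counts, sums of a numeric
# field, and distinct counts on a key. All rows and field names are made up.
source = [
    {"order_id": 1, "amount": 1000.0},
    {"order_id": 2, "amount": 500.0},
    {"order_id": 3, "amount": 250.0},
]
target = [
    {"order_id": 1, "amount": 1000.0},
    {"order_id": 2, "amount": 500.0},
    {"order_id": 3, "amount": 250.0},
]

# COUNT / COUNT_ROWS: verify that all records were moved.
assert len(source) == len(target)

# SUM on a numeric field: totals must match end to end.
assert sum(r["amount"] for r in source) == sum(r["amount"] for r in target)

# COUNT DISTINCT: compare detail vs. aggregate on a key column.
assert len({r["order_id"] for r in source}) == len({r["order_id"] for r in target})

print("aggregate checks passed")
```

If any of these checks fails, rows were dropped, duplicated, or rejected somewhere between source and target.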

Check Referential Integrity of Target Tables

Check referential integrity of target tables to find a lack of referential integrity in the target, either a child without a parent or a fact record without a corresponding dimension record.

In a scenario where Table A is the child table, Table B is the parent table, Field A is the child foreign key, and Field B is the parent primary key, approach data validation in the following order:

¨ Test is SETA_in_B. (Every child FK is in parent PK.)

¨ In the case of a star schema, the fact table is the child and the dimension table is the parent. (Every fact FK needs to be in the parent PK.)

¨ In case of composite keys, create an expression that concatenates the keys, and run these tests on that expression.
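A set test of this kind boils down to checking that every child foreign key value exists among the parent primary keys. The sketch below uses invented composite keys and shows the concatenation trick the guide suggests; it is an illustration, not product syntax:

```python
# Set containment check: every child FK must exist in the parent PK.
# Keys here are invented two-part composite keys.
parent_pk = {("US", 100), ("US", 101), ("DE", 200)}
child_fk = [("US", 100), ("US", 101), ("US", 101), ("DE", 200), ("FR", 300)]

# Composite keys: concatenate the parts into one comparable value,
# as the guide suggests doing with an expression.
parent_keys = {"|".join(map(str, k)) for k in parent_pk}
orphans = [fk for fk in child_fk if "|".join(map(str, fk)) not in parent_keys]

# A non-empty list means the test fails: a child row has no parent.
print("orphan foreign keys:", orphans)
```

Here ("FR", 300) is reported as an orphan because it has no matching parent key.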

Enforce Constraints on Target Tables

Often the errors in very complicated transformations manifest themselves in rather simple ways, such as NULLs in the target, missing rows, or incorrect formats. You can test for such scenarios by enforcing constraints on target tables.

This is one of the most overlooked yet most effective data testing strategies. The following examples explain target table constraints:

Unique Primary Keys

UNIQUE(PK)

For a composite PK, use UNIQUE(expression).

Valid Individual Values

For example:

VALUE(FldA) Between 10, 50

VALUE(FldB) In ('A','B','C')

VALUE(FldC) > 0

NOT_NULL(FldD)

FORMAT(Phone) = 'regular expression' (optional)

You can have more than one test on a specific field and you can create tests on an expression.
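To make the semantics of these constraints concrete, here is a rough Python equivalent of the checks listed above. The rows, value ranges, and phone pattern are illustrative assumptions, not Data Validation Option syntax:

```python
import re

# Illustrative target rows; field names follow the examples above.
rows = [
    {"pk": 1, "FldA": 25, "FldB": "A", "FldC": 3, "FldD": "x", "Phone": "555-1234"},
    {"pk": 2, "FldA": 40, "FldB": "C", "FldC": 7, "FldD": "y", "Phone": "555-9876"},
]

# UNIQUE(PK): no duplicate primary keys.
assert len({r["pk"] for r in rows}) == len(rows)

for r in rows:
    assert 10 <= r["FldA"] <= 50                       # VALUE(FldA) Between 10, 50
    assert r["FldB"] in ("A", "B", "C")                # VALUE(FldB) In ('A','B','C')
    assert r["FldC"] > 0                               # VALUE(FldC) > 0
    assert r["FldD"] is not None                       # NOT_NULL(FldD)
    assert re.fullmatch(r"\d{3}-\d{4}", r["Phone"])    # FORMAT(Phone) = pattern

print("constraint checks passed")
```

Each assertion corresponds to one constraint test; in practice a failing row would be reported as a bad record rather than raising an error.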


Aggregate Constraints

These are often used for sanity checks in terms of rows moved, totals, etc. For example, is this correct?

Source (staging file)          Target

1 laptop  1000                 1 laptop  1000
2 desktop  500                 2 desktop 1500

This looks correct, but if this is XYZ company’s daily sales, it is not correct, even though it was moved correctly. Somehow you know that XYZ sells more than $2500/day. Therefore, you can say that anything less than 1000 records, and anything less than $2m or more than $15m, is suspect.

Therefore:

COUNT_ROWS(any fld) > 1000

SUM(Amount) Between 1000000,2000000
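These two aggregate constraints amount to simple range checks on a row count and a total, with thresholds drawn from business knowledge rather than from the data itself. A minimal sketch with invented figures:

```python
# Sanity thresholds come from business knowledge, not from the source data.
# The figures below are illustrative.
row_count = 1200
total_amount = 1_500_000.0

rows_ok = row_count > 1000                         # COUNT_ROWS(any fld) > 1000
sum_ok = 1_000_000 <= total_amount <= 2_000_000    # SUM(Amount) Between 1000000, 2000000

print("plausible" if rows_ok and sum_ok else "suspect")
```

A load that technically succeeds but falls outside these bounds is flagged for investigation.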

Compare Individual Records between Sources and Targets

Whether a field was moved without transformation or was transformed, comparing individual records between sources and targets verifies that the value is correct.

This is another critical step in testing. Read the section that explains the difference between the VALUE and OUTER_VALUE tests and the expected results when using each test.

Approach data validation in the following order:

¨ Simple comparison. Create a table pair, join on common keys, and then either set up tests automatically (right-click/generate) or manually if field names are different.

¨ Any row-based expression (concatenation, calculation) can be tested similarly, for example:

VALUE(first_name || '_' || last_name || '@dvosoft.com' = email)
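A VALUE test of this kind can be pictured as joining source and target on a key, computing the expression on the source side, and comparing it with the stored target field. The sketch below uses made-up rows:

```python
# Row-level check of a derived value: the concatenated expression must match
# the stored email field after joining source and target on a key.
# All names and addresses here are invented.
source = {1: ("john", "smith"), 2: ("mary", "jones")}
target = {1: "john_smith@dvosoft.com", 2: "mary_jones@dvosoft.com"}

mismatches = []
for key, (first, last) in source.items():
    expected = f"{first}_{last}@dvosoft.com"
    if target.get(key) != expected:
        mismatches.append(key)

print("mismatched keys:", mismatches)
```

Any key reported here would surface in Data Validation Option as a bad record for the test.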

Data Validation with Views

You can use lookup views, SQL views, and join views to create complex data validation scenarios.

SQL Views

After you complete the other data validation methods, you can use Data Validation Option SQL views to construct complicated SQL-based scenarios that involve multiple tables and complicated transformations.

Lookup Views

Testing lookups is an important step in data testing. Data Validation Option lookup views allow you to test the validity of the lookup logic in your transformation layer.

Join Views

You can create complex join relationships between heterogeneous data sources in a join view. You can test the validity of data across related tables with join views.


C H A P T E R 2

New Features and Behavior Changes

This chapter includes the following topics:

¨ New Features and Enhancements, 6

¨ Behavior Changes, 10

New Features and Enhancements

This section contains information about new enhancements in different versions of Data Validation Option.

New Features and Enhancements in 9.5.0

In 9.5.0, Data Validation Option contains multiple new features and enhancements.

Authentication

LDAP Authentication

Administrators can configure LDAP authentication on the Informatica domain for a Data Validation Option schema with PowerCenter 9.0.1 or later.

Licensing

Data Validation Option is available with a standard license and an enterprise license. You must enter the license key after you install Data Validation Option.

Command Line Utility

InstallTests

When you run the InstallTests command, you can use the forceinstall option to recreate mappings for the tests that are already installed.

Reports

Jasper Reports

If you have the enterprise license, you can generate different reports and dashboards with the JasperReports Server.

6

Page 20: PC 950 DataValidationOption UserGuide En

Repository

Metadata Manager Integration

You can view the details of the PowerCenter repository objects in Metadata Manager.

SQL Views, Lookup Views, and Join Views

Expressions in Join Conditions

You can enter and validate PowerCenter expressions as fields in a join condition when you configure a join view.

Expression Validation

You can validate the PowerCenter expressions that you enter in the Data Validation Option Client when you configure a source-to-lookup relationship in a lookup view.

Table Pairs and Single Tables

Enhanced Bad Records Storage

If you have the enterprise license, you can store up to 16 million bad records in the Data Validation repository or as a flat file for each test.

Expression Validation

You can validate the PowerCenter expressions that you enter in the Data Validation Option Client when you configure table pairs and single tables.

Parameterization

If you have the enterprise license, you can configure a parameter file and use the parameters in the WHERE clause of a table pair or single table.

Copy Objects

You can copy table pairs and single tables in a folder to another folder.

Tests

Expression Validation

You can validate the PowerCenter expressions that you enter in the Data Validation Option Client when you configure test conditions and expression definitions.

Automatic Test Generation

You can generate tests based on the position of the columns in the tables of a table pair.

New Features and Enhancements in 9.1.4.0

In 9.1.4.0, Data Validation Option contains multiple new features and enhancements.

Authentication

Informatica Authentication

Administrators can configure Informatica authentication for a Data Validation Option schema with PowerCenter 9.0.1 or later. After you enable Informatica authentication, users must use Informatica domain credentials to log in to the Data Validation Option Client.


Repositories

Data Sources

You can use the following data sources in Data Validation Option:

¨ PowerExchange for DB2 z/OS

¨ Netezza

¨ SAP

¨ Salesforce.com

¨ SAS

Single Tables and Table Pairs

Threshold

You can enter the threshold for aggregate and value tests in table pairs and single tables as a percentage value in addition to the absolute value.

Max Bad Records

You can enter a percentage value for the maximum number of bad records for tests in table pairs and single tables in addition to the absolute value.

Number of Processed Records

After you run a test, the test report displays the details of the processed records. You can view the number of bad records, the total number of records processed through the join, and the number of records read from the database.

Join Views

You can create an object with complex join conditions in multiple heterogeneous data sources. You can use a join view as a table in table pairs and single tables.

New Features and Enhancements in 9.1.2.0

In 9.1.2.0, Data Validation Option contains multiple new features and enhancements.

Command Line Utility

RunTests Command

When you run tests with the RunTests command, you can use the following options:

¨ Send an email with the test results once Data Validation Option completes the test.

¨ Provide cache memory setting for mapping transformations.

InstallTests Command

When you install tests with the InstallTests command, you can provide the cache memory setting for mapping transformations.

Tests

Automatic Test Generation

You can generate count tests along with value tests in automatic test generation.


Repositories

Data Sources

Data Validation Option supports the following data sources through PowerExchange for ODBC:

¨ DB2 z/OS

¨ DB2 AS/400

¨ IMS

¨ Adabas

¨ VSAM

¨ Mainframe flat files

Metadata Import

Performance improves significantly when you import or refresh metadata from a PowerCenter repository.

New Features and Enhancements in 9.1.0

Data Validation Option version 9.1.0 contains new features and enhancements.

Client Layout

¨ Folders. You can organize single tables and table pairs by placing them in folders. When you upgrade from version 3.1, the installation program creates a Default folder for each user and places the table pairs and single tables in the folder. When you create a new user, Data Validation Option creates a Default folder.

¨ Error reporting. If a test fails to run, Data Validation Option displays the test run error on the Tests tab. Previously, you had to examine the PowerCenter session log file to view test run errors.

¨ Single Tables tab. The details area contains separate tabs for table pairs and single tables.

Command Line Utility

¨ New Commands. The Data Validation Option command line utility, DVOCmd.exe, contains new commands that allow you to create users and refresh repositories.

PowerCenter Version

¨ PowerCenter 9.1.0. Data Validation Option 9.1.0 works with PowerCenter versions 8.5 and later, except for PowerCenter version 9.0.

Reports

¨ Reports for tests in folders. You can run reports for all tests in a folder.

¨ Report information. Reports display the folder name, error messages, join expressions, and conditions, if applicable.

Repositories

¨ Refreshing repositories. When you refresh a repository, you can refresh the entire repository, the connection objects, the folders, or the sources and targets. You can also refresh repository folders individually.

Single Tables and Table Pairs

¨ Expressions in join conditions. When you join two tables in a table pair, you can enter a PowerCenter expression as a field in the join condition. Enter an expression to join tables with key fields that are not identical.


¨ Large table processing. When you include a large table in a table pair, you can optimize the way Data Validation Option joins table data. You can specify which table Data Validation Option uses for the master or detail table. You can also use sorted output for the join.

¨ Pushing sorting logic to the database. To increase the performance of table pair and single table tests, you can push the sorting logic for joins to the source database. Pushing sorting logic to the database causes the database to sort records before it loads them to PowerCenter, which minimizes disk input and output.
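The expressions-in-join-conditions feature described above can be illustrated with a sketch (field names are hypothetical). If one table stores a numeric customer ID and the other stores a zero-padded character key, an expression that normalizes one side, such as LPAD(TO_CHAR(CUST_ID), 10, '0') in PowerCenter expression syntax, lets the keys match. The Python sketch below shows the equivalent normalization:

```python
# Table A keys are integers; table B keys are zero-padded 10-character
# strings. Normalizing A's key before the join makes the keys identical.
a_keys = [42, 7]
b_rows = {"0000000042": "row-42", "0000000007": "row-7"}

for cust_id in a_keys:
    normalized = str(cust_id).zfill(10)  # analogous to LPAD(TO_CHAR(...), 10, '0')
    print(cust_id, "->", b_rows[normalized])
```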

Tests

¨ Filter conditions. You can apply a filter condition to table pair and single table tests. If you apply a filter condition to a table pair test, Data Validation Option applies the filter condition after it joins the tables in the table pair.

Behavior Changes

This section contains information on behavior changes in different versions of Data Validation Option.

Behavior Changes in 9.1.4.0

Effective in 9.1.4.0, Data Validation Option behavior changes in multiple ways.

Client Layout

SQL Views Tab and Lookup Views Tab

You can no longer add SQL views and lookup views to a table pair or single table from the right-click menu in the SQL Views tab and Lookup Views tab.

Repositories

PowerCenter Repository Support

You can use repositories from PowerCenter 8.6.1 HotFix 10 and later. Previously, you could use repositories from PowerCenter 8.5 and later.

Tests

Automatic Test Generation

You select whether to enable trim trailing spaces for the tests that you generate automatically. Previously, you had to manually update the automatically generated tests to enable trim trailing spaces.

You can also apply separate data source sorting for the tables in the table pair. Previously, you could not provide separate sorting for the tables.

Behavior Changes in 9.1.2.0

Effective in 9.1.2.0, Data Validation Option behavior changes in multiple ways.

Single Tables and Table Pairs

Database connection

The name of the database connection that you provide in Data Validation Option table pairs and single tables is no longer case sensitive. Previously, if you edited the database connection name to a different case in PowerCenter, the existing table pairs and single tables were invalidated.


Tests

Autogeneration of Tests

Use the Compare Tables menu item to autogenerate table pairs and tests between tables and flat files in folders. The menu item contains several options. Previously, you had to select the two folders and right-click to autogenerate table pairs and tests.

Behavior Changes in 9.1.0

Effective in 9.1.0, Data Validation Option behavior changes in multiple ways.

Client Layout

Data Sources tab

The Data Sources tab is removed.

Folders

Table pairs and single tables appear in the Default folder in the Navigator. Previously, single tables and table pairs appeared in the Single Tables and Table Pairs nodes in the Navigator.

Properties area

The Properties area is moved to the bottom right side of the Data Validation Option Client to make more room for the Navigator. Previously, the Properties area appeared in the bottom left side of the screen.

Results tab

The tab that lists bad records for tests is renamed to Results. The Results tab displays test summary information for table pairs, single tables, and tests. It also displays bad records for certain types of tests. Previously, Data Validation Option displayed the Details tab only for tests. It displayed bad records only.

Single Tables tab

In the details area, single tables are listed on the Single Tables tab. Previously, single tables were listed on the Table Pairs tab.

Installation

Executable name

The Data Validation Option executable file name is DVOClient.exe. Previously, the executable file name was DataValidator.exe.

Installation directory

The Data Validation Option default installation directory is C:\Program Files\Informatica<version>\DVO on 32-bit operating systems and C:\Program Files (x86)\Informatica<version>\DVO on 64-bit operating systems. Previously, the default installation directory was C:\Program Files\DVOSoft.

Reports

Reports for single-table constraints

Reports for single-table constraints display information about the single table only. Previously, reports for single-table constraints were the same as reports for table pairs except they displayed “Table B” values as null values.


Repositories

Importing metadata

When you add a PowerCenter repository, Data Validation Option imports folder names. If the repository is a target repository, Data Validation Option also imports connection metadata. To import source and target metadata, you must refresh the repository. Previously, when you saved a repository for the first time, Data Validation Option imported folder names and all source and target metadata. If the repository was a target repository, Data Validation Option also imported connection metadata.

Refreshing repositories

When you refresh a repository, you do not have to close and restart Data Validation Option. Previously, you had to restart Data Validation Option after you refreshed a repository.

Single Tables and Table Pairs

Single Tables Editor

You create single tables using the Single Table Editor dialog box. Previously, you created single tables using the Table Pairs Editor dialog box.

SQL and Lookup Views

SQL view definition

When you create an SQL view, you can select the tables and columns to use in the view. Data Validation Option detects the datatype, precision, and scale of the columns. Previously, when you created an SQL view, you had to define the SQL relationships across tables, define the columns, and define the datatype, precision, and scale of each column manually.


C H A P T E R 3

Data Validation Option Client Layout

This chapter includes the following topics:

¨ Data Validation Option Client Overview, 13

¨ Data Validation Option Client Tabs, 14

¨ Folders, 16

¨ Menus, 18

Data Validation Option Client Overview

The Data Validation Option Client contains multiple areas and menus that allow you to perform different tasks.

Statistics Area

The statistics area appears when you click on the Data Validation Option user. This area displays information about the number of repositories, table pairs, single tables, tests, views, and data sources that exist in the current instance of Data Validation Option. It also displays information about running tests and the user name.

Navigator

The Navigator is on the left side of the Data Validation Option Client. It contains the following objects:

Object Description

INFA Repositories Lists all PowerCenter repositories that you add to Data Validation Option. Expand a repository to see the repository folders and the sources and targets in each folder.

SQL views Lists the SQL views that you create.

Lookup views Lists the lookup views that you create.

Join views Lists the join views that you create.

Folders Lists the single tables and table pairs that you create.


Table Pairs Lists all the table pairs that you create.

Single Tables Lists all the single tables that you create.

Details Area

The details area is in the upper right section of the Data Validation Option Client. It contains tabs that display details about the objects you create in Data Validation Option such as tests, table pairs, single tables, and views.

When you click on a folder, table pair, or single table, the details area displays the following information about the tests associated with the object:

¨ Number of tests.

¨ Number of tests passed.

¨ Number of tests failed.

¨ Number of tests in progress.

¨ Number of tests not run because of errors.

¨ Number of tests not run by the user.

Properties Area

The Properties area is in the lower right section of the Data Validation Option Client. It displays the properties for the object that you select in the Navigator or details area.

Results Area

The Results area appears in the lower right section of the Data Validation Option Client when you select a test, table pair, or single table in the details area. The Results area displays test summary information, results, and the bad records written to the Data Validation Option repository.

Status Bar

The status bar appears below the Results area and displays the number of tests in progress and the number of tests in queue to be run.

Data Validation Option Client Tabs

The Data Validation Option Client contains tabs that display different information.

The Data Validation Option Client contains the following tabs:

¨ Tests

¨ Table Pairs

¨ Single Tables

¨ SQL Views

¨ Lookup Views

¨ Join Views


Tests Tab

The Tests tab in the details area displays all tests set up in this instance of Data Validation Option.

By default, tests are sorted in the order they were created. However, you can sort tests by clicking the column header.

The following table describes the columns on the Tests tab:

Column Description

Test status icon Indicates whether tests associated with the table pair have been run and the status of the most recent run. If you hold the pointer over the icon, Data Validation Option displays the meaning of the icon.

Name The test description.

Test type Type of test.

Table Pair/Single Table The name of the table pair or single table.

Test Run Date/Time The date and time that the tests were last run.

Test Run Error If a test failed, this column lists the error.

SQL Views Tab

The SQL Views tab in the details area displays all SQL views set up in this instance of Data Validation Option.

By default, SQL views are sorted in the order they were created. However, you can sort SQL views by clicking the column header.

The following table describes the columns on the SQL Views tab:

Column Description

Description SQL view description.

Table Name Tables you use to create the SQL view.

SQL Statement SQL statement that you run against the database to retrieve data for the SQL view.

The right-click menu in the SQL Views tab lists the following options:

¨ Add SQL View

¨ Edit SQL View

¨ Delete SQL View

¨ Export Metadata

Lookup Views Tab

The Lookup Views tab in the details area displays all lookup views set up in this instance of Data Validation Option.

By default, lookup views are sorted in the order they were created. However, you can sort lookup views by clicking the column header.


The following table describes the columns on the Lookup Views tab:

Column Description

Description Lookup view description.

Source Table Source table name.

Lookup Table Lookup table name.

The right-click menu in the Lookup Views tab lists the following options:

¨ Add Lookup View

¨ Edit Lookup View

¨ Delete Lookup View

¨ Export Metadata

Join Views Tab

The Join Views tab in the details area displays all join views set up in this instance of Data Validation Option.

By default, join views are sorted in the order they were created. However, you can sort join views by clicking the column header.

The following table describes the columns on the Join Views tab:

Column Description

Description Join view description.

Joined Tables List of tables joined in the join view.

The right-click menu in the Join Views tab lists the following options:

¨ Add Join View

¨ Edit Join View

¨ Delete Join View

¨ Export Metadata

Folders

Folders store the single tables and table pairs that you create.

By default, Data Validation Option places the single tables and table pairs that you create in the default folder. If you create a folder, you can create single tables or table pairs within the folder. You can move single tables or table pairs between folders. Within a folder, you can expand a single table or table pair to view the tests associated with it. Folder names are case sensitive.


You can also copy folders. You can copy the contents of a folder in your workspace to a different folder in your workspace or to a folder in another user workspace. You must copy folder contents to a new folder. You cannot copy folder contents to a folder that exists in the target workspace.

When you copy a folder, Data Validation Option copies all table pairs, single tables, and test cases in the source folder to the target folder. Data Validation Option does not copy test runs or the external IDs associated with table pairs or single tables.

If a table pair or single table in the source folder uses an SQL view or a lookup view, Data Validation Option copies the view to the target user workspace unless the workspace contains a view with the same name. If the target workspace contains a view with the same name, Data Validation Option gives you the following options:

¨ You can use the view in the target workspace.

¨ You can copy the view to the target workspace with another name. Data Validation Option names the view in the target workspace "Copy <number> <source view name>."

Before Data Validation Option copies a folder, it verifies that the repository and all data sources associated with the objects to copy exist in the target workspace. Object names in Data Validation Option are case sensitive. Therefore, the repository, data sources, and folders that contain the data sources must have identical names in the source and the target workspaces. If the repository or any required data source does not exist in the target workspace, Data Validation Option does not copy the folder.

Copying Folders

You can copy the contents of a folder in your workspace to a different folder in your workspace or to a folder in another user workspace.

1. Select Edit > Copy Folder.

The Copy Folder Contents dialog box opens.

2. Enter the following information:

Property Description

Copy from User Name of the source user. Data Validation Option copies the folder from this user workspace.

Copy from Folder Name of the source folder. Data Validation Option copies this folder.

Copy to User Name of the target user. Data Validation Option copies the folder to this user workspace. The source user and the target user can be the same user.

Copy to Folder Name of the target folder. The target folder must be unique in the target workspace.

3. Click OK.

Copying Objects

You can copy the objects in a folder in your workspace to a different folder in your workspace or to a folder in another user workspace.

1. Select the object that you want to copy.

You can also select multiple objects of the same type from different folders.

2. Right-click the object and select Copy.

The Copy Object(s) dialog box appears.


3. Enter the following information:

Property Description

User Name of the target user. Data Validation Option copies the objects to this user workspace. The source user and the target user can be the same user.

Folder Name of the target folder. The target folder must be unique in the target workspace.

Menus

The following table describes the Data Validation Option menu items:

Menu Menu Item Definition

File New Create a new Data Validation Option object.

Metadata Import, export, and reload metadata.

Settings Configure preferences and open the Data Validation Option folder in the “Documents and Settings” directory in Windows Explorer.

Exit Exit the Data Validation Option Client.

Edit Edit Edit the selected object.

Delete Delete the selected object.

Copy folder Copy the selected folder.

Move tables/table pair Move tables/table pairs to the selected folder.

Action Run Tests Run selected tests.

Add Test Add a new test.

Generate Value Tests Generate value tests for the selected object.

Compare Tables Autogenerate tests for all the table pairs.

Generate Report Generate a consolidated test report.

Dashboard Launches the dashboard. The feature is available only with the enterprise license.

Refresh All Repositories Refresh contents of all the repositories.

Everything Refresh all contents in the selected repository.

Folder List Refresh the list of folders in the selected repository.

Folder (Sources and Targets) Refresh the sources and targets in the selected repository.


Connections Refresh all connections in the selected repository.

Dashboards Home Launches the Home dashboard.

Repository Details Launches the Repository dashboard.

Folder Details Launches the Folder dashboard.

Table Details Launches the Table dashboard.

Help Help Opens the help file.

About Displays information about PowerCenter Data Validation Option. Click the dialog box to close it.

Change License Key Opens the Change License Key dialog box.

Note: You can see the Dashboards menu and menu items if you have the enterprise license.

Settings Folder

When you select File > Settings > Open Settings Folder, Windows Explorer displays the contents of the Data Validation Option folder in the Documents and Settings directory for that installation of the application. The data folder also contains an XML file that contains the information entered in the Preferences dialog box.


C H A P T E R 4

Data Validation Option Installation and Configuration

This chapter includes the following topics:

¨ Data Validation Option Installation and Configuration Overview, 20

¨ Prerequisites, 21

¨ System Permissions, 21

¨ Information Required for Installation, 22

¨ Installing and Configuring Data Validation Option for the First Time, 22

¨ Data Validation Option Configuration for Additional Users, 26

¨ Data Validation Option Upgrade, 27

¨ Upgrading from Version 9.1.x to Version 9.5.0, 27

¨ Upgrading from Version 3.0, 27

¨ Upgrading from Version 3.1, 28

¨ DVOCmd Installation on UNIX, 29

¨ Environment Variables, 30

¨ Modifying the License Key, 31

¨ JasperReports Server Setup, 31

Data Validation Option Installation and Configuration Overview

You must install the Data Validation Option Client before you can create and run data validation tests.

To install or upgrade Data Validation Option, complete the following tasks:

1. Review the prerequisites.

2. Review the required system permissions.

3. Install or Upgrade Data Validation Option.

4. Perform the Data Validation Option setup steps.

After you install Data Validation Option, run a test to verify that the installation was successful.


Prerequisites

You must complete the prerequisites to successfully install Data Validation Option.

Before you install Data Validation Option, complete the following prerequisites:

¨ Install PowerCenter 8.6.1 HotFix 10 or later on the same local area network as the Data Validation Option Client machine.

¨ The Informatica domain must contain at least one PowerCenter Integration Service.

¨ Install PowerCenter Client on the same machine where Data Validation Option will be installed.

¨ Set up at least one PowerCenter repository.

¨ Obtain the license file to use Data Validation Option. Data Validation Option has a separate license file. You cannot use the PowerCenter license file for Data Validation Option.

PowerCenter Support for Data Validation Option Features

You must install PowerCenter 8.6.1 HotFix 10 or later to use Data Validation Option 9.5.0. Certain features in Data Validation Option require a later version of PowerCenter.

The following table lists the Data Validation Option features and the supported PowerCenter versions:

Data Validation Option Feature Supported PowerCenter Version

SAP R/3 data source support PowerCenter 9.1.0 and later

SAS data source support PowerCenter 9.1.0 and later

Informatica Authentication PowerCenter 9.0.1 and later

DVOCmd in Data Validation Option 9.1.0 and later requires PowerCenter 8.6.1 HotFix 10 or later. Prior versions of DVOCmd work with previous versions of PowerCenter 8.6.1.

System Permissions

You require certain system permissions to complete Data Validation Option installation and configuration.

To complete Data Validation Option setup, verify that you have the permission to complete the following tasks:

¨ Create a database, including the ability to create schemas, tables, indexes, sequences, and views.

¨ Create a PowerCenter connection object in the Workflow Manager.

¨ Create a folder in a PowerCenter repository in the Repository Manager.

¨ Create and associate a PowerCenter Integration Service with the PowerCenter repository that the Data Validation Option user can access.

¨ Copy a JAR file onto the machine that hosts Informatica Services.

¨ Configure the Administrator tool.

¨ Modify the environment variables on the machine where you install Data Validation Option.

¨ Read and write on the Data Validation Option installation directory and subdirectories.


Information Required for Installation

Before you install Data Validation Option, gather the information that you need during installation.

Complete the following table with the values that you need to complete Data Validation Option setup:

Name Value

Informatica Domain

PowerCenter Integration Service

PowerCenter Repository Service

PowerCenter Repository user name

PowerCenter Repository password

Location of the domains.infa file on the client machine

Installing and Configuring Data Validation Option for the First Time

When you install and configure the Data Validation Option Client for the first time, you must configure the PowerCenter repository that holds the mappings and sessions for the data validation tests.

1. Verify that the user for the Data Validation Option repository has the privileges to create and modify tables, indexes, sequences, and views during installation. The user must have these privileges to create a Data Validation Option repository.

Note: If the Data Validation Option repository database is IBM DB2, the user name and schema name must be the same. Configure the page size in IBM DB2 to a minimum of 16 KB. You cannot install the Data Validation Option repository on a clustered IBM DB2 system.

2. Open the Administrator tool and create a PowerCenter Repository Service to store the Data Validation Option mappings. You can also use an existing PowerCenter Repository Service.

3. Verify that the code page set for the PowerCenter Integration Service is compatible with the NLS setting of the Data Validation Option repository database.

If the settings are not compatible, test results might be inaccurate.

4. Open the Workflow Manager and set up a connection to the Data Validation Option repository. Every Data Validation Option user must have the permission to use this connection.

Record the connection name:

________________________________________________________________________________

5. Open the Repository Manager and create a folder in the repository for Data Validation Option to store mappings that run tests. Use this folder only for storing Data Validation Option mappings. Every Data Validation Option user must have the privileges to use this folder.

Record the repository and folder names.


Repository name:

________________________________________________________________________________

Folder name:

________________________________________________________________________________

6. Verify that the domains.infa file is available in the following location: <PowerCenter installation directory>\clients\PowerCenterClient\.

The domains.infa file contains the Informatica domain and PowerCenter repository details. You must connect to the Informatica domain and the PowerCenter repository from the PowerCenter client tools to update the domains.infa file with the domain and repository details.

If the domains.infa file is not available on the PowerCenter Client machine, copy the file from the following location on the PowerCenter server machine: <Informatica Services installation directory>\<version>

7. On the Data Validation Option Client machine, create an environment variable called INFA_HOME and set the value to the location of the domains.infa file:

a. Select Control Panel > System > Advanced > Environment Variables.

b. Click New System Variable.

c. Enter INFA_HOME for the variable name.

d. Enter the domains.infa file path, excluding the domains.infa filename, for the variable value.

e. Click OK in each dialog box.

8. Verify that the environment variable is set up correctly:

a. Open the DOS command window and type set.

b. The environment variable that was just set up should appear in the list of environment variables. It should read as follows: INFA_HOME = C:\Informatica\<version>\clients\PowerCenterClient\. Configure the variable if it is not set.
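As an additional check, you can verify the variable programmatically. The following Python sketch is illustrative only; the path shown is an example value, not a required one:

```python
import os

# Use the existing INFA_HOME if set; otherwise fall back to an example path.
os.environ.setdefault("INFA_HOME", r"C:\Informatica\9.5.0\clients\PowerCenterClient")
infa_home = os.environ["INFA_HOME"]
domains_file = os.path.join(infa_home, "domains.infa")

print("INFA_HOME =", infa_home)
print("expecting domains.infa at:", domains_file)
```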

9. Install Data Validation Option on the client machine.

10. Create a folder on the machine that hosts Informatica Services and copy the dvoct.jar file from the C:\Program Files< (x86)>\Informatica<version>\DVO\powercenterlibs directory on the Data Validation Option Client to the new folder. Ensure that the PowerCenter Integration Service can access the location.

11. Update the Java SDK Classpath for the PowerCenter Integration Service:

a. Open the Administrator tool.

b. From the navigator, select the PowerCenter Integration Service.

c. Click the Processes tab.

d. Edit the Service Process Properties > General Properties.

e. Edit the Java SDK Classpath.

f. Enter the path to the dvoct.jar file on the machine that hosts the Informatica Services, including the dvoct.jar file name. If there is a value in Java SDK Classpath, add a semicolon (Windows) or colon (UNIX/Linux) after the classpath before you add the dvoct.jar file path.

If PowerCenter is installed in a grid environment, repeat this step for each node.
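Substep f above can be sketched as follows. This is a minimal illustration of building the classpath value with the platform-correct separator; the jar paths are illustrative placeholders, not shipped defaults.

```shell
# Sketch: build the Java SDK Classpath value described in substep f.
# EXISTING_CLASSPATH and DVOCT_JAR are hypothetical example paths.
EXISTING_CLASSPATH="/opt/java/libs/existing.jar"
DVOCT_JAR="/opt/infa/dvo_libs/dvoct.jar"
case "$(uname -s)" in
  CYGWIN*|MINGW*|MSYS*) SEP=";" ;;  # Windows-style environments use ";"
  *)                    SEP=":" ;;  # UNIX/Linux use ":"
esac
if [ -n "$EXISTING_CLASSPATH" ]; then
  # Existing value present: append dvoct.jar after the separator.
  JAVA_SDK_CLASSPATH="${EXISTING_CLASSPATH}${SEP}${DVOCT_JAR}"
else
  # No existing value: the classpath is just the dvoct.jar path.
  JAVA_SDK_CLASSPATH="$DVOCT_JAR"
fi
echo "$JAVA_SDK_CLASSPATH"
```

Enter the resulting value in the Java SDK Classpath property in the Administrator tool.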

12. Run the Data Validation Option Client.

Installing and Configuring Data Validation Option for the First Time 23


13. Enter the Data Validation Option repository information:

a. In Data Validation Option, select File > Settings > Preferences > Data Validation Option.

You can also right-click the Data Validation Option user in the Navigator to open the Preferences dialog box.

b. Enter the following information:

Option Description

User Enter a unique user name.

Database Type Select Oracle, SQL Server, or IBM DB2.

Database Driver This value is automatically populated by Data Validation Option. It does not need to be changed.

Database URL The value automatically populated by Data Validation Option for this field consists of a series of values. There are placeholders for the database host (server) name, database name, and port number, if appropriate. Remove the characters '<' and '>' when you enter the value.

Note: If Oracle RAC is used, the URL must be in the following format: jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=host1)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=host2)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=service))). See tnsnames.ora for the exact syntax.

Database User Enter the database user name.

Database Password Enter the database user password.
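For readability, the Oracle RAC URL from the Database URL note above is shown here with one descriptor per line. host1, host2, 1521, and service are the placeholders from that note; enter the value as a single line.

```
jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=on)
  (ADDRESS=(PROTOCOL=TCP)(HOST=host1)(PORT=1521))
  (ADDRESS=(PROTOCOL=TCP)(HOST=host2)(PORT=1521))
  (CONNECT_DATA=(SERVICE_NAME=service)))
```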

c. Click Test to make sure the database information is correct.

d. Click Save, and create the Data Validation Option repository schema when prompted.

14. Optionally, update the mapping properties based on the requirements and the environment:

a. Select File > Settings > Preferences > Mapping Properties.

b. Enter the following information:

Option Description

Max Bad Records for Reporting Maximum number of bad records for reporting written to the Data Validation Option repository for each test. The maximum value you can enter is 1000. Default is 100.

DTM Buffer Size Amount of memory allocated to the PowerCenter session from the DTM process. Default is Automatic. Data Validation Option uses the buffer size that you configure in PowerCenter if you do not enter a value. You can increase the DTM buffer size if the tests contain a large number of table pairs. You can specify a numeric value. If you enter 2000, the PowerCenter Integration Service interprets the number as 2000 bytes. Append KB, MB, or GB to the value to specify other units. For example, you can specify 512MB.

Max Concurrent Runs Maximum number of PowerCenter sessions run at the same time. Each table pair is run as one session, regardless of how many tests it has. Default is 10. The maximum value you can enter is 50.

c. If you have the enterprise license, you can configure the following detailed error rows analysis settings:

- Max Bad Records for Detailed Analysis. Maximum number of bad records stored for detailed error record analysis. Data Validation Option stores up to 16,000,000 records for error record analysis. Default is 5000.

- File delimiter. Delimiter character to separate the error records if you choose to store the bad records in a file.

15. Restart the Data Validation Option Client.

The Data Validation Option Client prompts you to enter the license key.

16. Click Browse and select the license key.

Data Validation Option Client prompts you to restart the application. Data Validation Option stores the license information in the Data Validation Option database schema.

17. Add the repository that contains the folder where Data Validation Option creates mappings:

a. Right-click INFA Repositories in the Navigator, and select Add Repository. The Repository Editor dialog box opens with the following options:

Option Description

Name Name for the repository, ideally with the word “target” to identify the repository type.

Client Location Location of the pmrep.exe file on the client machine. Use the Browse button to select it. Typically, the location is C:\Informatica\<version>\clients\PowerCenterClient\client\bin.

PowerCenter Version The PowerCenter version that runs Data Validation Option.

PowerCenter Domain Name of the Informatica domain.

Repository Name of the PowerCenter repository.

User name User name for the PowerCenter repository.

Password User password for the PowerCenter repository.

Security Domain LDAP security domain. Leave this field blank if you use native authentication.

Contains Target Folder Select true. You can configure only one PowerCenter repository with a target folder for a Data Validation Option schema.


Target Folder Enter the folder name that you defined earlier.

Integration Service Enter the name of the PowerCenter Integration Service.

Data Validation Option Results Warehouse Connection Enter the PowerCenter connection to the Data Validation Option repository that you created in Step 5.

Enable Metadata Manager Enable or disable integration with the Metadata Manager service.

Is secure connection Select the checkbox if the Metadata Manager service runs over a secure connection.

Server Host Name Enter the host name of the server where the Metadata Manager service runs.

Server Port Enter the Metadata Manager service port.

Resource Name Enter the name of the PowerCenter resource.

b. Click Save.

Data Validation Option offers to test the repository settings to make sure they are correct.

c. Test the repository settings to ensure that the settings are accurate.

When you confirm the settings, Data Validation Option imports the PowerCenter sources, targets, and connections.

18. Optionally, you can add other repositories. Make sure that Contains Target Folder is false because only one repository can have a target folder.

19. Create a table pair with one test and run it to make sure the installation was successful.

RELATED TOPICS:
- “Troubleshooting” on page 114

Data Validation Option Configuration for Additional Users

After you configure Data Validation Option for the first user, you can configure additional users on other client machines.

1. On the Data Validation Option Client machine, create an environment variable INFA_HOME and set the value to the location of the domains.infa file.

2. Install the Data Validation Option Client.

3. Start the Data Validation Option Client. The Data Validation Option Client prompts you to configure the repository.

4. Configure the Data Validation Option repository. You can use the details of an existing Data Validation Option repository.


5. Add the PowerCenter repository for Data Validation Option.

6. Optionally, configure additional PowerCenter repositories.

Note: You can also use the command CreateUserConfig to create additional Data Validation Option users.

RELATED TOPICS:
- “Command Line Integration” on page 103

Data Validation Option Upgrade

You can upgrade Data Validation Option 3.0 and 3.1 to Data Validation Option 9.x.

Upgrading to version 9.x does not affect test metadata or test results. Back up the Data Validation Option repository before you upgrade.

Upgrading from Version 9.1.x to Version 9.5.0

You can upgrade to the latest version of Data Validation Option from Data Validation Option 9.1.x.

1. Uninstall the Data Validation Option version 9.1.x Client.

2. Install the latest version of the Data Validation Option Client.

3. From the command line go to the Data Validation Option installation folder.

4. Enter the DVOCmd UpgradeRepository command, if this is the first upgrade of the client in a multi-user environment.

5. On the Data Validation Option Client machine, create an environment variable called INFA_HOME and set the value to the location of the domains.infa file, if the variable does not exist.

The domains.infa file is available in the following location: <PowerCenter installation directory>\clients\PowerCenterClient\

6. Restart the Data Validation Option Client.

7. Run the Data Validation Option Client.

The Data Validation Option Client prompts you to enter the license key.

8. Click Browse and select the license key.

Data Validation Option Client prompts you to restart the application. Data Validation Option stores the license information in the Data Validation Option database schema.

Upgrading from Version 3.0

You can upgrade from Data Validation Option 3.0.

1. In Data Validation Option version 3.0, go to Tools > Properties, and note the name of the connection to the Data Validation Option repository.


2. Uninstall Data Validation Option version 3.0.

3. Install Data Validation Option.

4. From the command line go to the Data Validation Option installation folder.

To find the Data Validation Option installation folder, right-click the Data Validation Option button and click Open File Location.

5. Enter the DVOCmd UpgradeRepository command, if this is the first upgrade of the client in a multi-user environment.

6. On the Data Validation Option Client machine, create an environment variable called INFA_HOME and set the value to the location of the domains.infa file.

7. Optionally, edit C:\Program Files< (x86)>\Informatica<version>\DVO\config\JMFProperties.properties and change the number of pmrep processes from 2 to 8.

Provide a higher number of pmrep processes to increase performance when you run several tests simultaneously.

8. Start Data Validation Option.

9. Edit the following repository information:

- PowerCenter version

- Name of the connection to the Data Validation Option repository.

10. Click Save.

A prompt appears and asks you to verify the test settings.

11. Right-click the repository name in the Navigator and click Refresh.

12. Restart Data Validation Option.

13. Run the Data Validation Option Client.

The Data Validation Option Client prompts you to enter the license key.

14. Click Browse and select the license key.

Data Validation Option Client prompts you to restart the application. Data Validation Option stores the license information in the Data Validation Option database schema.

Note: The database URL format has changed since version 3.1. If Data Validation Option fails to upgrade the database URL, select File > Settings > Preferences > Data Validation Option, and update the Database URL.

Upgrading from Version 3.1

You can upgrade from Data Validation Option 3.1.

1. Uninstall Data Validation Option version 3.1.

2. Install Data Validation Option.

3. From the command line go to the Data Validation Option installation folder.

To find the Data Validation Option installation folder, right-click the Data Validation Option button and click Open File Location.

4. Enter the DVOCmd UpgradeRepository command, if this is the first upgrade of the client in a multi-user environment.


5. On the Data Validation Option Client machine, create an environment variable called INFA_HOME and set the value to the location of the domains.infa file.

The domains.infa file is available in the following location: <PowerCenter installation directory>\clients\PowerCenterClient\.

6. Optionally, edit C:\Program Files< (x86)>\Informatica<version>\DVO\config\JMFProperties.properties and change the number of pmrep processes from 2 to 8.

Provide a higher number of pmrep processes to increase performance when you run tests simultaneously.

7. Restart Data Validation Option.

8. Run the Data Validation Option Client.

The Data Validation Option Client prompts you to enter the license key.

9. Click Browse and select the license key.

Data Validation Option Client prompts you to restart the application. Data Validation Option stores the license information in the Data Validation Option database schema.

Note: The database URL format has changed since version 3.1. If Data Validation Option fails to upgrade the database URL, select File > Settings > Preferences > Data Validation Option, and update the Database URL.

DVOCmd Installation on UNIX

DVOCmd is a command line program that you can use to run Data Validation Option tasks without the Data Validation Option Client.

You can install DVOCmd on a UNIX machine and run Data Validation Option commands through the shell. You do not need to run Informatica services on the machine. You must install and configure Informatica services in the network before you can use DVOCmd on UNIX. The installation is required for the command line programs, such as pmrep and pmcmd, and the libraries used by DVOCmd.

To install DVOCmd on UNIX, untar the installer package (install-dvo-<version>.tar) in a location with read-write permission. You can find the .tar file inside the .zip package (Install_DataValidator_<version>.zip) that Informatica provides.

For example, you can untar the DVOCmd package in the user home directory. DVOCmd and the associated files are available in the folder DVO.

cd $HOME
tar xvf install-dvo-9.5.0.0.tar

After you install DVOCmd, you must set the INFA_DOMAINS_FILE environment variable to the location of the domains.infa file. Copy the domains.infa file from a Windows machine running the PowerCenter Client.
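The environment setup described above can be sketched as follows. All paths here are illustrative stand-ins (the script even creates a dummy domains.infa so it is runnable as-is); substitute the real locations on your machine.

```shell
# Sketch: environment DVOCmd needs on UNIX, per the section above.
DEMO_DIR="${TMPDIR:-/tmp}/dvo_demo"
mkdir -p "$DEMO_DIR"
touch "$DEMO_DIR/domains.infa"   # stands in for the file copied from Windows
# Location of domains.infa, required to run DVOCmd from the command line:
INFA_DOMAINS_FILE="$DEMO_DIR/domains.infa"
export INFA_DOMAINS_FILE
# Location of the pmrep/pmcmd command line programs (hypothetical path):
PC_CLIENT_INSTALL_PATH="/opt/informatica/server/bin"
export PC_CLIENT_INSTALL_PATH
echo "INFA_DOMAINS_FILE=$INFA_DOMAINS_FILE"
```

Add the two export lines to the shell profile of the account that runs DVOCmd so the settings persist across sessions.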


Environment Variables

You can configure environment variables on the machine where you installed Data Validation Option.

The following table describes the environment variables used by Data Validation Option:

Environment Variable Description

DV_DOUBLE_COMPARISON_EPSILON Tolerance value for floating point numbers. Default is 1e-15. The tolerance value does not affect join keys. The variable is applicable only for the 'equal to' operator.

DV_TRANS_CACHE_SIZE_AUTO Controls the automatic cache size setting for the data validation mapping. Set the value to "Y" to set the cache size for the data validation mapping to auto. If you do not set the variable, Data Validation Option uses the default cache size. Default is 20 MB for the data cache and 10 MB for the index cache. You can also set the cache size when you run the InstallTests and RunTests DVOCmd commands with the --cachesize option.

DV_REPORT_ENGINE Controls the BIRT reporting engine. Set the value to "Y" to turn the reporting engine on and "N" to turn the reporting engine off. You may need to set the environment variable when you face issues with the BIRT reporting engine in a Citrix environment.

DV_MAPPING_STRING_MAX_PRECISION Precision of string fields in data validation mappings.

DV_RTRIM_JOIN_KEYS Controls removal of trailing spaces from join keys. Set the value to "Y" to remove trailing spaces from join keys.

PC_CLIENT_INSTALL_PATH Location of the pmrep command line program. You must set the variable to run DVOCmd from the UNIX command line.

INFA_DOMAINS_FILE Location of the domains.infa file. You must set the variable to run DVOCmd from the UNIX command line.

DV_CONFIG_DIR Location of the user configuration directory. Default is <Windows user configuration directory>\DataValidator. You can change the path to launch the Data Validation Option Client with a different user configuration directory.

INFA_HOME Location of the PowerCenter installation where domains.infa is available.

DV_PM_ROOT_DIR The service variable in the PowerCenter Integration Service that specifies the PowerCenter root directory. Use this variable to define other service variables. For example, you can use $PMRootDir to define subdirectories for other service process variable values. You can set the $PMSessionLogDir service process variable to $PMRootDir/SessLogs.
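The comparison that DV_DOUBLE_COMPARISON_EPSILON governs can be illustrated with a small sketch. Two doubles count as "equal to" when their absolute difference is within the tolerance; awk stands in here for the engine's comparison, and the values are illustrative.

```shell
# Sketch: epsilon comparison of two floating point numbers.
a=0.30000000000000004    # 0.1 + 0.2 in IEEE 754 double arithmetic
b=0.3
eps=1e-15                # the documented default tolerance
result=$(awk -v a="$a" -v b="$b" -v e="$eps" 'BEGIN {
  d = a - b
  if (d < 0) d = -d                     # absolute difference
  if (d <= e) print "equal within epsilon"
  else        print "not equal"
}')
echo "$result"
```

With the default tolerance, the two values above test as equal, which is why a small epsilon avoids spurious test failures from floating point rounding.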


Modifying the License Key

You can modify the license key when the license expires or if you decide to change the Data Validation Option license. If you modify the license from the enterprise license to the standard license, you will lose all the test details stored for parameterization, bad records, and all the Jasper reports.

1. Select Help > Change License Key.

The Change License Key dialog box appears.

2. Click Browse and select the license key.

3. Click OK.

Restart the Data Validation Option Client.

JasperReports Server Setup

You must set up a JasperReports Server before you can generate Jasper reports.

You can use the JasperReports Server available with Informatica Services or use a standalone JasperReports Server. The JasperReports Server available with Informatica Services is called the Reporting and Dashboards Service.

Informatica 9.1.0 HotFix 1 and HotFix 2 include JasperReports Server 4.0.1. Informatica 9.1.0 HotFix 3 and later include JasperReports Server 4.2.0. If you use Informatica 9.1.0 HotFix 1 or HotFix 2, contact Informatica Global Customer Support to obtain the required patch files that you must install to generate Jasper reports.

You can update the heap size of the JasperReports Server available with Informatica from the Administrator tool to improve the performance. Default heap size is 512MB. Update the heap size based on the requirement. In the Administrator tool, select the required Reporting and Dashboards Service and update Maximum Heap Size in the Advanced Properties tab.

If you want to use a standalone JasperReports Server, download and install JasperReports Server 4.2.1 for the Windows 32-bit platform, Windows 64-bit platform, or Linux 64-bit platform.

You can download the JasperReports Server installer from the following location: http://jasperforge.org/projects/jasperserver/downloads

You must update the standalone JasperReports Server with the DataDirect drivers available with Data Validation Option. Copy dwdb2.jar, dwsqlserver.jar, and dworacle.jar from the following location: <Data Validation Option installation directory>\lib. Paste the files in the following directory: <JasperReports Server installation directory>\tomcat\lib.
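The driver copy above can be sketched as follows. The paths here are /tmp stand-ins for the real <Data Validation Option installation directory>\lib and <JasperReports Server installation directory>\tomcat\lib locations, and the script creates empty placeholder jars so it is runnable as a demonstration.

```shell
# Sketch: copy the DataDirect drivers shipped with DVO into the
# JasperReports Server Tomcat lib directory. Demo paths, not defaults.
DVO_LIB="${TMPDIR:-/tmp}/dvo_demo_lib"
JASPER_LIB="${TMPDIR:-/tmp}/jasper_demo/tomcat/lib"
mkdir -p "$DVO_LIB" "$JASPER_LIB"
# Empty stand-ins for the three drivers named above:
for jar in dwdb2.jar dwsqlserver.jar dworacle.jar; do
  : > "$DVO_LIB/$jar"
done
# The actual copy step:
for jar in dwdb2.jar dwsqlserver.jar dworacle.jar; do
  cp "$DVO_LIB/$jar" "$JASPER_LIB/"
done
ls "$JASPER_LIB"
```

Restart the JasperReports Server after copying the drivers so Tomcat picks up the new jars.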

You must install the SSL certificate if you use a JasperReports Server that runs over an HTTPS connection.

From the command line, browse to the following location: <Data Validation Option Installation Directory>\DVO\jre\bin

Run the following command: keytool -importcert -alias <certificate alias name> -file "<certificate path>\<certificate filename>" -keystore ..\lib\security\cacerts


C H A P T E R 5

Data Validation Option Management

This chapter includes the following topics:

- Data Validation Option Management Overview, 32

- User Configuration, 32

- Multiple PowerCenter Installation Configuration, 34

- Informatica Authentication, 34

Data Validation Option Management Overview

Data Validation Option allows multiple users and multiple PowerCenter versions on the same client machine.

When you configure a user, Data Validation Option stores the settings in a user configuration directory. You can configure multiple users in the Data Validation Option Client to connect to multiple Data Validation Option repositories. You can also use multiple PowerCenter versions on the same client with Data Validation Option.

User Configuration

Data Validation Option creates a configuration directory for each user. The user configuration directory contains the user preferences file, preferences.xml, and the DVConfig.properties file. It also contains directories that store log files, reports, and temporary files that the user generates.

By default, the user configuration directory is one of the following directories:

- 32-bit operating systems: C:\Documents and Settings\<user name>\DataValidator\

- 64-bit operating systems: C:\Users\<user name>\DataValidator\

You can specify or change the user configuration directory through a batch file, through the --confdir option in the command line, or through the environment variable DV_CONFIG_DIR.


Preferences File

Data Validation Option stores connection information in the preferences file for each user. Data Validation Option creates the preferences file in the user configuration directory.

When the user opens Data Validation Option, the user does not have to enter Data Validation Option repository information. Data Validation Option reads the connection information from the preferences file.

If you do not want users to have access to the database password, you can create users and preference files through the DVOCmd CreateUserConfig command. Data Validation Option creates a preference file for each user. If you create users through the CreateUserConfig command, each additional user must still perform all the configuration steps.

Changing the User Configuration Directory through a Batch File

You can create a batch file with commands to run Data Validation Option with a specific user configuration directory.

Create a batch file containing the following command: "<Data Validation Option Installation Directory>\DVOClient.exe" <User Configuration Directory>

For example, "C:\Program Files\Informatica9.1.0\DVO\DVOClient.exe" C:\DVOConfig_Dev

RELATED TOPICS:
- “Command Line Integration” on page 103

Changing the User Configuration Directory through an Environment Variable

You can change the user configuration directory through an environment variable on the Data Validation Option Client machine.

To change the user configuration directory through an environment variable, create an environment variable,DV_CONFIG_DIR, with the value set as the full file path for the user configuration directory.

Multiple Data Validation Option Repository Access

The Data Validation Option repository stores the objects and tests that you create in Data Validation Option.

If you work with multiple repositories, you must specify a unique user configuration directory for each repository.

For example, you might want to separate the Data Validation Option repositories for your development and production environments. To specify unique user configuration directories for each repository, create a batch file for each repository that starts the Data Validation Option Client and specifies the user configuration directory.

Suppose you install Data Validation Option in the default directory on 32-bit Windows. You want to set the development user configuration directory to C:\DVOConfig_Dev and the production user configuration directory to C:\DVOConfig_Prod. Create two batch files that start the Data Validation Option Client.

For the development environment, enter the following text in the batch file: "C:\Program Files\Informatica9.1.0\DVO\DVOClient.exe" C:\DVOConfig_Dev.

For the production environment, enter the following text in the batch file: "C:\Program Files\Informatica9.1.0\DVO\DVOClient.exe" C:\DVOConfig_Prod.


Multiple PowerCenter Installation Configuration

You can configure a batch file to use a different PowerCenter version for Data Validation Option.

If you want to use multiple PowerCenter versions with an installation of Data Validation Option, create batch files for each PowerCenter version.

Create a batch file with the following entries:

@ECHO OFF
SET INFA_HOME=<INFA HOME PATH>
DVOClient.exe

In the batch file, you must set the INFA_HOME environment variable to the PowerCenter version that you need to access and launch Data Validation Option. After you create a batch file, you must modify the Data Validation Option shortcuts to call the batch file instead of DVOClient.exe.

If you want to use DVOCmd with a different PowerCenter installation, provide DVOCmd.exe instead of DVOClient.exe in the batch file.
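For instance, a DVOCmd variant of the batch file might look like the following. The <INFA HOME PATH> placeholder is unchanged from the example above, and the %* at the end is a hypothetical addition that simply forwards any arguments you pass to the batch file on to DVOCmd.

```
@ECHO OFF
SET INFA_HOME=<INFA HOME PATH>
DVOCmd.exe %*
```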

Informatica Authentication

Administrators can enable Informatica authentication so that the users must use valid Informatica domain credentials to use the Data Validation Option Client.

By default, the Data Validation Option Client does not validate the users that launch the Data Validation Option Client. You can enable Informatica authentication if you have PowerCenter 9.0.1 or later.

Informatica authentication validates over a secure connection with TLS if you have enabled TLS in the Informatica domain. To configure Informatica authentication, you must have an Informatica login credential with administrator privileges. The information you require to configure Informatica authentication is available in the nodemeta.xml file on the machine that hosts the Informatica services. You can also update the Informatica authentication properties through the DVOCmd command UpdateInformaticaAuthenticationConfiguration.

Data Validation Option Users and Informatica Users

Data Validation Option users and users with Informatica credentials can launch the Data Validation Option Client.

Users connect to a Data Validation Option schema when they use the Data Validation Option Client. You can enable Informatica authentication to enable valid Informatica domain users to log in to Data Validation Option.

To configure Informatica authentication, you must have an Informatica login credential with administrator privileges. To enable authentication, map each Data Validation Option user to an Informatica user. If the two users have the same name, Data Validation Option maps the users automatically. If the two users have different names, you can map Data Validation Option users to Informatica users with the DVOCmd command LinkDVOUsersToInformatica. The permissions of the Informatica user determine the PowerCenter metadata access for the associated Data Validation Option user, so it is important to ensure that those permissions and privileges are set correctly.


Security

You can enable the Transport Layer Security (TLS) protocol for secure authentication of Informatica domain users in Data Validation Option.

Informatica authentication in Data Validation Option enables connection in a secure network with TLS. Enable TLS in Informatica Administrator to connect to Data Validation Option over a secure connection. You can enable TLS in the Informatica Administrator from the Informatica domain properties. After you enable TLS for the domain, configure Informatica authentication in the Data Validation Option Client to use TLS. The properties for TLS are accessed from the Informatica domain.

Informatica Authentication Parameters

You must configure the host, port, and tlsEnabled elements available in the nodemeta.xml file to enable Informatica authentication.

The nodemeta.xml file is available in the following location:

<Informatica_services_installation_directory>\isp\config

The elements of the nodemeta.xml file contain all the attributes associated with the Informatica domain.

Configure the following elements available in the nodemeta.xml file to enable Informatica authentication:

Element Description

host Name of the machine that hosts Informatica services.

port Port through which Data Validation Option accesses the Informatica domain. Note: Do not use the value available in the httpport element.

tlsEnabled Indicates whether TLS is enabled in the domain. If TLS is not enabled in the Informatica domain, the tlsEnabled element is not available in the XML file.

Configuring Informatica Authentication

You can configure Informatica authentication in Data Validation Option to enable users to access the Data Validation Option Client with their Informatica credentials.

1. In the Data Validation Option Client, click File > Settings > Preferences.

The Preferences dialog box appears. You can also double-click the Data Validation Option user in the Navigator.

2. Select Informatica Authentication.

3. Select Enable Informatica User Authentication.

4. Enter the Informatica domain host name and port specified in the nodemeta.xml file.

5. Select Is Secure Connection to authenticate the user over a secure connection. Ensure that TLS is enabled in the Informatica domain.

6. Click Test to test the domain settings.

7. Click Save to save the domain settings.

Data Validation Option Client prompts you to enter the Informatica login credentials when you restart the application.


C H A P T E R 6

Repositories

This chapter includes the following topics:

- Repositories Overview, 36

- Adding a Repository, 36

- Editing Repositories, 37

- Deleting Repositories, 37

- Refreshing Repositories, 37

- Metadata Export and Import, 38

- Metadata Manager Integration, 39

Repositories Overview

Data Validation Option connects to a PowerCenter repository to import metadata for PowerCenter sources, targets, folders, and connection objects. Data Validation Option also connects to a PowerCenter repository to create mappings, sessions, and workflows in the Data Validation Option target folder.

When you add a repository to Data Validation Option, you add either a source or target repository. You can addone target repository and multiple source repositories.

The target repository contains metadata for PowerCenter sources, targets, folders, and connection objects. It also contains the Data Validation Option target folder. The target folder stores the mappings, sessions, and workflows that Data Validation Option creates when you run tests. Do not store other PowerCenter mappings, sessions, or workflows in this folder.

A source repository contains metadata for PowerCenter sources, targets, and folders. Add source repositories to Data Validation Option if you want to compare tables from different repositories. When you add a source repository, you must verify that all connection objects in the source repository also exist in the target repository. Data Validation Option uses the connection objects in the target repository when you run tests on table pairs.

The version number for a source repository can differ from the version number for the target repository. The version numbers for two source repositories can also differ.

Adding a Repository

1. Right-click INFA Repositories in the Navigator.


2. Select Add Repository.

The Repository Editor dialog box appears.

3. Enter the repository properties.

Set Contains Target Folder to true when you add a target repository and to false when you add a source repository.

4. Click Test to test the repository connection.

Data Validation Option verifies the connection properties. If the repository is a target repository, Data Validation Option also verifies the PowerCenter Integration Service, and that the Data Validation Option target folder and the results warehouse connection exist in the repository.

Editing Repositories

- To edit a repository, right-click the repository, and select Edit Repository, or double-click the repository.

The Repository Editor dialog box appears. You can update any property that is enabled.

Deleting Repositories

- To delete a repository from Data Validation Option, right-click the repository, and select Delete Repository.

Data Validation Option deletes the repository and all table pairs, single tables, tests, and views based on the repository data.

Refreshing Repositories

You refresh a source repository when the contents of the PowerCenter repository have changed. You usually refresh a target repository only when there are additions or changes to connection objects.

When you refresh a repository, Data Validation Option reimports source, target, folder, and connection metadata from the PowerCenter repository. Therefore, Data Validation Option objects that use changed or deleted PowerCenter objects might no longer be valid after you refresh a repository. If you created table pairs, single tables, or tests with tables that were deleted from the PowerCenter repository, Data Validation Option deletes them when you refresh the repository.

1. To refresh all repositories at once, right-click INFA Repositories in the Navigator, and select Refresh All Repositories. To refresh one repository, right-click the repository, and select Refresh Repository.


2. When you refresh one repository, you can select the objects to refresh. Select one of the following options:

Everything

Data Validation Option reimports all source, target, folder, and connection metadata. It updates the folder list, and the Sources and Targets folders in the Navigator.

Connections

Data Validation Option reimports connection metadata. Select this option when a PowerCenter user adds, removes, or updates connection objects.

Folder List

Data Validation Option reimports folder metadata. It updates the folder list in the Navigator. Select this option when a PowerCenter user adds or removes folders.

Folders (Sources and Targets)

Data Validation Option reimports source and target metadata. It refreshes the contents of the Sources and Targets folders in each folder in the repository. Select this option when a PowerCenter user adds, removes, or modifies sources or targets in folders.

You can also refresh repository folders individually. You might refresh a folder after you refresh the folder list and Data Validation Option imports a new folder. To refresh a repository folder, right-click the folder in the Navigator, and select Refresh Folder (Sources and Targets). Data Validation Option refreshes the contents of the Sources and Targets folders within the folder you refresh.

Note: Refreshing everything in a repository or refreshing all repositories can take several minutes to several hours, depending on the size of the repositories. If you work with a small number of repository folders, you can shorten refresh time by refreshing the folders individually.

Exporting Repository Metadata
You can export repository metadata to a file. You might want to export repository metadata when you migrate from a development to a production environment, or when Informatica Global Customer Support asks you to do so.

To export repository metadata from Data Validation Option, right-click the repository, and select Export Metadata. Data Validation Option prompts you for a file name and file path.

Metadata Export and Import
Data Validation Option allows you to export and import test metadata from the repositories. Metadata import and export allows users to share tests and allows rapid generation of tests through scripting.

Scripting is particularly useful in the following scenarios:

- You have a very large number of repetitive tests for different table pairs. In this situation, it might be faster to generate the tests programmatically.

- The source-to-target relationships and rules are defined in a spreadsheet. This often happens during data migration. You can script the actual Data Validation Option tests from the spreadsheets.
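The spreadsheet-driven approach can be sketched in a few lines of Python. Note that the XML element and attribute names below (TablePairs, TablePair, tableA, tableB) are illustrative placeholders, not the actual Data Validation Option metadata schema; substitute the element names described in "Metadata Import Syntax" before importing the file.

```python
import csv
import io
import xml.etree.ElementTree as ET

def build_import_xml(csv_text):
    """Turn spreadsheet rows of (table_a, table_b) pairs into skeleton
    import XML. The element names used here are hypothetical placeholders;
    replace them with the real metadata import syntax."""
    root = ET.Element("TablePairs")
    for row in csv.DictReader(io.StringIO(csv_text)):
        pair_name = "%s_%s" % (row["table_a"], row["table_b"])
        pair = ET.SubElement(root, "TablePair",
                             {"name": pair_name,
                              "tableA": row["table_a"],
                              "tableB": row["table_b"]})
        # Append the generate-tests command at the end of the table pair
        # definition so value tests are generated automatically on import.
        commands = ET.SubElement(pair, "Commands")
        commands.text = 'generate-tests("%s");' % pair_name
    return ET.tostring(root, encoding="unicode")

spreadsheet = "table_a,table_b\nCustDetail,CustStage\nOrdDetail,OrdStage\n"
print(build_import_xml(spreadsheet))
```

A script like this can be run against the spreadsheet each time the source-to-target mapping changes, and the resulting file imported through File > Import Metadata.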

You can import and export the following metadata:

- Table Pairs

- Single Tables

- PowerCenter Sources


- SQL views

- Lookup views

- Join views

RELATED TOPICS:
- “Metadata Import Syntax” on page 138

Exporting Metadata
Data Validation Option allows you to export selected objects, such as table pairs or SQL views, and all of their dependencies, such as tables, to an XML file. You can also export all objects to an XML file.

To export an object, right-click the object and select Export Metadata. To export all objects, select File > Export All Metadata.

Data Validation Option prompts you for the metadata export file name and directory.

Importing Metadata
When you import metadata from an XML file, you can overwrite repository objects such as table pairs and views that have the same type and same name as objects you are importing. To overwrite repository objects when you import, select File > Import Metadata (Overwrite). To import metadata without overwriting repository objects, select File > Import Metadata. When you import metadata without overwriting objects, Data Validation Option stops importing metadata if an object in the import file and an object in the repository are of the same type and have the same name.

When you import metadata, you can automatically generate value tests as you do when you right-click a table pair and select Generate Value Tests. To do this, use the generate-tests command in the import XML file. Place the command at the end of the metadata definition for the table pair.

For example, to generate value tests for a table pair named “CustDetail_CustStage,” add the following lines to the import XML file at the end of the table pair metadata definition:

<Commands>generate-tests("CustDetail_CustStage");</Commands>

Metadata Manager Integration
You can view the metadata of data sources if you configure Metadata Manager integration in Data Validation Option.

You can analyze the impact of test results on data sources if you enable Metadata Manager integration. You can view the metadata of the data source in the PowerCenter repository. To view the metadata of data sources, you must set up a Metadata Manager Service in the Informatica domain. You must also create a PowerCenter resource for the PowerCenter repository that contains the data source.

Right-click on a data source in the repository and select Get Metadata to view the metadata of the data source.


Configuring Metadata Manager Integration
Configure Metadata Manager integration in Data Validation Option to view the data lineage of data sources.

1. Right-click the repository for which you want to enable Metadata Manager integration and select Edit Repository.

2. Select Enable Metadata Manager Integration.

3. Select whether the Metadata Manager Service runs on a secure connection.

4. Enter the server host name.

5. Enter the Metadata Manager Service port.

6. Enter the name of the PowerCenter resource.

7. Click Test to test the settings.

8. Click Save.


Chapter 7

Table Pairs

This chapter includes the following topics:

- Table Pairs Overview
- Table Pair Properties
- Adding Table Pairs
- Editing Table Pairs
- Deleting Table Pairs
- Overall Test Results

Table Pairs Overview
A table pair is the basis for all tests that compare one table to another. You can select a relational table, flat file, lookup view, SQL view, or join view as one or both tables in a table pair.

Data Validation Option treats Table A as the master table and Table B as the detail table. When you select tables in a table pair, select the master table as Table A and the detail table as Table B to improve performance.

If you have the enterprise license, you can store all the error records after you run a test for the table pair. You can also use a parameter file that the PowerCenter Integration Service applies when Data Validation Option runs the sessions associated with the table pair.

Table Pair Properties
You can view table pair properties by selecting a table pair in either the Navigator or the Table Pairs tab and viewing the properties. The properties vary depending on the types of objects you select for the table pair.

The following table describes the table pair properties:

Property Description

Table A/B The first or second table in the table pair.

Conn A/B Connection details for the table.


Execute where clause A/B Filters the records that the PowerCenter Integration Service reads from the database. Enter a valid PowerCenter Boolean expression or an SQL WHERE clause without the WHERE keyword.

Optimization Level Controls which test logic Data Validation Option converts to a PowerCenter mapping and which test logic it pushes to the database. You can select one of the following options:
- Default. Data Validation Option converts all test logic to a PowerCenter mapping and applies sorting to the data source.
- WHERE clause, Sorting, and Aggregation in DB. Data Validation Option pushes the WHERE clause, sorting logic for joins, and all aggregate tests to the database. Data Validation Option converts all other test logic to a PowerCenter mapping.
- Already Sorted Input. The PowerCenter mapping does not sort the input. If the data source is not sorted, tests may fail.

Description Table pair description. By default, Data Validation Option uses "Joined <Table A>-<Table B>" for joined table pairs. It uses "<Table A>-<Table B>" for table pairs that are not joined.

External ID Identifier for the table pair that you can use when you run Data Validation Option tests at the command line.

Table Join Join condition for the tables.

RELATED TOPICS:
- “Single-Table Constraints” on page 62
- “Tests for Single-Table Constraints” on page 70

Connection Properties
Choose the connection properties based on the data source type.

You must select a connection for all data sources except flat files. For flat files, you must provide the source directory and the file name. Connections are PowerCenter connection objects created in the Workflow Manager.

Relational Connection Properties
Choose the relational connection properties for Microsoft SQL Server, Oracle, IBM DB2, Netezza, and PowerExchange for DB2 data sources.

Configure the following properties when you select a relational data source:

Property Description

Connection PowerCenter connection object to connect to the relational data source.

Override Owner Name
Overrides the database name and schema of the source. For example, a Microsoft SQL Server table is identified by <database>.<schema>.<table>. To override both the database and the schema, enter <new database name>.<new schema name> in the text box. To change only the schema, enter <new schema name>. You cannot change only the database name.


SAS and Salesforce Connection Properties
You can use a SAS or Salesforce data source in Data Validation Option.

Select the PowerCenter connection object when you select a SAS or Salesforce data source.

Note: You cannot override the owner name for SAS and Salesforce data sources.

SAP Connection Properties
You must configure the SAP authentication information to use SAP data sources.

Configure the following properties when you select an SAP data source:

Property Description

Connection PowerCenter connection object to connect to the SAP data source.

SAP User Name SAP source system connection user name. Must be a user for which you have created a source system connection.

SAP Password Password for the user name.

SAP Client SAP client number.

SAP Language Language you want for the mapping. Must be compatible with the PowerCenter Client code page. Data Validation Option does not authenticate the value. Ensure that you enter the correct value so that the tests run successfully.

SAP Connect String Type A or Type B DEST entry in saprfc.ini.

SAP Data Sources in Data Validation Option
You cannot override the owner name for an SAP data source. Data Validation Option uses the stream mode for installation of the ABAP programs and cannot use the FTP mode. SAP data source field names must not contain the slash (/) character.

Flat File Connection Properties
You can use flat file data sources in the PowerCenter repository.

Configure the following properties when you select a flat file data source:

Property Description

Source Dir Directory that contains the flat file. The path is relative to the machine that hosts Informatica Services.

Source File File name with the file name extension.

Database Processing
When you include a large table in a table pair, you can optimize the way Data Validation Option joins table data.

Data Validation Option uses joins in value and set tests. To run value tests on a table pair, you must join the tables based on the related keys in each table. When you run set tests, Data Validation Option joins the tables.


By default, Data Validation Option joins the tables with an inner equijoin and sorts the rows in the table. The join condition uses the following WHERE clause syntax:

Table A.column_name = Table B.column_name

If one of the tables in a table pair contains a large volume of data compared to the other table, you can improve test performance by designating the larger table as the detail source and the smaller table as the master source. Select the smaller table as Table A and the larger table as Table B. The PowerCenter Integration Service compares each row of the master source against the detail source when it determines whether to join two records.

If both tables in a table pair are large, you can improve test performance by using sorted input. When you use sorted data, the PowerCenter Integration Service minimizes disk input and output when it runs a test.

You can further increase performance for large relational tables by pushing the sorting logic to the database.

Pushing Test Logic to the Database
You can push some test logic for a relational table to the source database. By default, Data Validation Option creates a PowerCenter mapping for each table pair. When you run a test, PowerCenter runs the mapping logic in a session. In some cases, you can significantly increase test performance by processing logic in the database instead of in a PowerCenter session.

You can push the WHERE clause, logic for aggregate tests, and sorting logic for joins to the source database.

Pushing the WHERE clause to the database can reduce the number of rows the PowerCenter Integration Service reads when it runs a Data Validation Option test. For example, Table A is a table of U.S. customers, and you want to test data only for customers in California. You enter a WHERE clause such as STATE = 'CA'. Data Validation Option creates a mapping in PowerCenter that reads all U.S. customers. The mapping might contain a Filter transformation that removes all records for customers outside of California.

If you push the WHERE clause to the database, the database filters customer records. The PowerCenter Integration Service reads records for California customers only. Pushing the WHERE clause to the database increases the performance of the PowerCenter session because the Integration Service reads a small subset of records instead of all records in the table.
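The effect of pushing the filter down can be illustrated with a self-contained sketch. SQLite stands in for the source database purely for illustration; Data Validation Option itself generates the pushdown against your actual source database.

```python
import sqlite3

# Build a toy customer table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, state TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "CA"), (2, "NY"), (3, "CA"), (4, "TX")])

# Without pushdown: every row is read, then filtered on the client side,
# the way a Filter transformation discards rows inside the mapping.
all_rows = conn.execute("SELECT customer_id, state FROM customers").fetchall()
client_filtered = [row for row in all_rows if row[1] == "CA"]

# With pushdown: the database applies the WHERE clause, so only matching
# rows ever leave the database.
db_filtered = conn.execute(
    "SELECT customer_id, state FROM customers WHERE state = 'CA'").fetchall()

print(len(all_rows), len(client_filtered), len(db_filtered))  # prints: 4 2 2
```

Both approaches produce the same two California rows; the difference is that the pushdown variant reads two rows from the database instead of four.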

Pushing aggregate test logic to the database can also reduce the number of rows the PowerCenter Integration Service reads from the database. For example, you use a COUNT test to compare the number of non-null records in two Customer_ID columns. If you push the test logic to the database, the PowerCenter Integration Service does not have to import all customer records in order to count them.

Pushing the sorting logic for joins to the database causes the database to sort records before it loads them to PowerCenter. This minimizes disk input and output when Data Validation Option runs a test. The reduction in input and output is greatest for large tables, so you might want to push sorting logic to the database when you run tests on tables with large volumes of data.

Pushing test logic to the database slightly increases the load on the database. Before you push test logic to the database, you must decide whether the increased performance of the test outweighs the increased load on the database.

WHERE Clauses
Use a WHERE clause to limit the records returned from a data source and made available for a test.

Since PowerCenter pulls data from each data source individually, you can provide separate WHERE clauses or filters for the data pulled from each table. Enter the WHERE clause without the “WHERE” keyword, for example, CITY <> 'London'.

The PowerCenter Integration Service is case-sensitive when it reads WHERE clauses. This functionality corresponds to the use of the Filter transformation in PowerCenter.


Data Validation Option does not check the WHERE clause syntax. If the PowerCenter Integration Service executes the WHERE clause, any valid PowerCenter expression, including expressions that use PowerCenter functions, is allowed. If the PowerCenter syntax is not valid, a mapping installation error occurs.

Use the following guidelines if the data source executes the WHERE clause:

- Relational data source. The WHERE clause must be a valid SQL statement. If the SQL statement is not valid, a runtime error occurs. Enter a WHERE clause in the SQL format when you push down the WHERE clause into the database.

- SAP data source. The WHERE clause must be a valid SAP filter condition in the ERP source qualifier.

- Salesforce data source. The WHERE clause must be a valid SOQL filter condition.

- SAS data source. The WHERE clause must be a valid Whereclause Overrule condition in the SAS source qualifier.

When you enter a WHERE clause, consider the following issues:

- Data Validation Option uses two WHERE clauses instead of one. A typical SQL statement has one WHERE clause. Data Validation Option, however, has one WHERE clause for Table A and one for Table B. Therefore, it is possible that more data comes from one table than the other. For example, applying emp_id < 10 to Table A but not Table B results in only nine records coming from Table A and all records from Table B. This affects OUTER_VALUE and aggregate tests, which might or might not be what you intended. However, when you compare production to development where the production environment has three years of data and development only has two weeks, applying a WHERE clause to production equalizes the data sets.

- Certain validation problems can be solved through a nested SQL WHERE clause. For example, if you want to filter for employees with disciplinary issues, use the following WHERE clause (assuming it is executed in the database):

  emp_id IN (SELECT DISTINCT emp_id FROM table_discipline)

- Because the filter condition you enter in the WHERE clause applies to all tests in the table pair, Data Validation Option applies the WHERE clause before it joins the tables. This can improve performance when the WHERE clause filters a large percentage of rows from the source table because the PowerCenter Integration Service processes fewer rows later in the mapping. If you want to enter a condition that filters a small percentage of rows, or you want to apply different filters for different tests, you can enter a filter condition in the Table Pair Test Editor dialog box.

Table Joins
You can join or unjoin a table pair. You must join the tables if you want to run a VALUE or OUTER_VALUE test. Data Validation Option ignores joins for all set and aggregate tests.

To create a joined table pair, define one or more conditions based on equality between the tables. For example, if both tables in a table pair contain employee ID numbers, you can select EMPLOYEE_ID as the join field for one table and EMP_ID as the join field for the other table. Data Validation Option performs an inner equijoin based on the matching ID numbers.

You can join tables only on fields of like datatypes. For example, you can join an INT field to a DECIMAL field, but not to a DATETIME or VARCHAR field. Data Validation Option supports numeric, string, datetime, and binary/other datatypes. Joins are not allowed on binary/other datatypes.

You can create a join with one set of fields, or with multiple sets of fields from each table if the table requires this to produce unique records. Note that additional sets of fields increase the time necessary to join two sources. The order of the fields in the join condition can also impact the performance of Data Validation Option tests. If you use multiple sets of fields in the join condition, Data Validation Option compares the ports in the order you specify.


When you create a join, you can select a field from each table or enter an expression. Enter an expression to join tables with key fields that are not identical. For example, you have two customer tables that use different cases for the LAST_NAME field. Enter the following expression for one of the join fields: lower(LAST_NAME)

When you enter an expression for one or both join fields, you must specify the datatype, precision, and scale of the result. The datatype, precision, and scale of both join fields must be compatible. The expression must use valid PowerCenter expression syntax. Data Validation Option does not check the expression syntax.

Bad Records Configuration
If you have the enterprise license, you can store up to 16,000,000 bad records per test to perform advanced analysis, and up to 1000 bad records for reporting. If you have the standard license, you can store up to 1000 bad records for reporting and analysis.

For reporting, you can set the maximum number of bad records in the mapping properties in the Preferences dialog box. The default number of bad records is 100. You can set up to 1000 bad records for reporting.

For advanced analysis, you can enter the maximum number of bad records for detailed analysis, along with the file delimiter, in the Detailed Error Rows Analysis section of the mapping properties. The default number of bad records is 5000. You can set up to 16,000,000 bad records.

At the table pair or single table level, you can choose to store all the bad records from the tests for the table pair or single table. Select Save all bad records for test execution in the Advanced tab when you create or edit a table pair or single table.

Data Validation Option stores bad records for the following table pair tests:

- Value

- Outer Value

- Set

Data Validation Option stores bad records for the following constraint tests for single tables:

- Value

- Unique

You must select whether you want to store the bad records in a flat file or a table in the Data Validation Option schema. If you store bad records in a flat file, you can optionally enter the name of the file. Data Validation Option appends test information to the name and retains the file extension.

Note: If you modify the file delimiter in the preferences file, run the InstallTests command with the forceInstall option for the existing table pairs or single tables that you already ran. You can also edit and save the table pair or single table from the Data Validation Option Client before you run the test. If you modify the bad records value, you need not reinstall the tests.

Bad Records in Flat File
If you configure Data Validation Option to store bad records in a flat file, it creates the flat file on the machine that runs Informatica services.

The flat files that Data Validation Option generates after running the tests are stored in the following folder: <PowerCenter installation directory>\server\infa_shared\TgtFiles

You can modify the folder from the Administrator tool. Edit the $PMTargetFileDir property for the PowerCenter Integration Service.

Data Validation Option generates a folder for each table pair or single table. The folder name is in the following format: TablePairName_TestRunID or SingleTableName_TestRunID


Data Validation Option creates flat files for each test inside the folder. The name of the flat file is in the following format: <user defined file name>_TestCaseType_TestCaseColumnA_TestCaseColumnB_TestCaseIndex.<user defined file extension>

You can get the Test Case Index from the Properties tab and the Test Run ID from the results summary of the test in the detail area. You can get the Table Pair ID/Single Table ID from the Table Pair or Single Table properties.

For example, you enter the file name as BAD_ROWS.txt when you configure the table pair or single table, and you run an outer value test on fields FIRSTNAME and FIRSTNAME. The test case index is 1 and the test fields are expressions. The bad records file after you run the test has the following name: BAD_ROWS_OUTER_VALUE_ExprA_ExprB_1.txt
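The naming convention can be expressed as a small helper. This is a sketch of the documented pattern, taking the base name and extension that Data Validation Option splits from the user-defined file name:

```python
def bad_records_file_name(base_name, test_type, column_a, column_b,
                          test_case_index, extension):
    """Assemble a bad records file name following the documented pattern:
    <name>_<TestCaseType>_<ColumnA>_<ColumnB>_<TestCaseIndex>.<extension>"""
    return "%s_%s_%s_%s_%d.%s" % (base_name, test_type, column_a, column_b,
                                  test_case_index, extension)

# Reproduces the example from the text: BAD_ROWS.txt, an OUTER_VALUE test
# on expression fields, test case index 1.
print(bad_records_file_name("BAD_ROWS", "OUTER_VALUE", "ExprA", "ExprB", 1, "txt"))
# prints: BAD_ROWS_OUTER_VALUE_ExprA_ExprB_1.txt
```

A helper like this is handy when a downstream script needs to locate the bad records file for a given test run.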

Data Validation Option supports all the file delimiters that PowerCenter supports. When you enter non-printable characters as delimiters, enter the corresponding delimiter code in PowerCenter. When you import these files in PowerCenter, you have to manually create the different data fields because the code appears in place of the delimiter in the bad records file.

Caution: Data Validation Option uses a comma as a delimiter if there are multiple primary keys. Do not use a comma as the file delimiter.

When you run the tests for the table pair or single table, Data Validation Option stores the details of the bad records in the following format:

Table Pair Name
Table A Name
Table A Connection
Table B Name
Table B Connection
Test Definition
Test Run Time
Test Run By User
Key A[],Result A[],Key B[],Result B[]

If the tests pass, Data Validation Option still creates a flat file without any bad records information.

Bad Records in Database Schema Mode
If you choose to save detail bad records in the Data Validation Option schema, the bad records are written into the detail tables.

The following table describes the tables to which Data Validation Option writes the bad records based on the type of test:

Test Type: Value, Outer Value, Value Constraint
Table: ALL_VALUE_RESULT_DETAIL
Columns: TEST_RUN_ID, TEST_CASE_INDEX, KEY_A, VALUE_A, KEY_B, VALUE_B

Test Type: Unique Constraint
Table: ALL_UNIQUE_RESULT_DETAIL
Columns: TEST_RUN_ID, TEST_CASE_INDEX, VALUE

Test Type: Set
Table: ALL_SET_RESULT_DETAIL
Columns: TEST_RUN_ID, TEST_CASE_INDEX, VALUE_A, VALUE_B

Note: You must ensure that the database table has enough table space to hold all the bad records.

The test details of all the tests that you ran in Data Validation Option are available in the TEST_CASE table. The test installation details of the tests are available in the TEST_INSTALLATION table. You can obtain the TEST_ID of a test from the TEST_INSTALLATION table. You need the TEST_ID of a test to query the complete details of a test from the TEST_CASE table.


You can get the Test Case Index from the Properties tab and the Test Run ID from the results summary of the test in the detail area. You can get the Table Pair ID/Table ID from the Table Pair or Single Table properties.

For example, you ran a table pair with a value test and an outer value test.

The following SQL query is a sample query to retrieve the information of bad records of a test with Test Case Index as 5.

select ALL_VALUE_RESULT_DETAIL.*, TEST_CASE.*
from ALL_VALUE_RESULT_DETAIL, TEST_RUN, TEST_INSTALLATION, TEST_CASE
where ALL_VALUE_RESULT_DETAIL.TEST_RUN_ID = TEST_RUN.TEST_RUN_ID
and TEST_RUN.TEST_INSTALLATION_ID = TEST_INSTALLATION.TEST_INSTALLATION_ID
and TEST_INSTALLATION.TABLE_PAIR_ID = TEST_CASE.TEST_ID
and ALL_VALUE_RESULT_DETAIL.TEST_CASE_INDEX = TEST_CASE.TEST_CASE_INDEX
and ALL_VALUE_RESULT_DETAIL.TEST_RUN_ID = 220

Parameterization
If you have the enterprise license, you can perform incremental data validation through parameterization. You can use parameters in the WHERE clause in table pairs and single tables.

Before you use a parameter in the WHERE clause, you must enter the name of the parameter file and add the parameters in the Advanced tab of a table pair or single table definition. You must specify the data type, scale, and precision of the parameter. After you add the parameters, you can use them in a WHERE clause to perform incremental validation. You can validate the expression with the parameters that you enter in a table pair or single table.

The parameter file must be in a location accessible to the Data Validation Option Client. The parameters in the parameter file must be in the format $$<parameter name>=value. Ensure that the parameter that you add in a table pair or single table is available in the parameter file.
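For example, a parameter file for incremental validation might contain entries such as the following (the parameter names and values are illustrative):

```
$$LAST_RUN_DATE=2012-07-01
$$REGION=CA
```

A table pair that declares $$LAST_RUN_DATE in its Advanced tab could then use a WHERE clause such as UPDATED_DATE > '$$LAST_RUN_DATE' so that each run validates only rows changed since the previous run.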

When you run a test, Data Validation Option looks up the value of the parameter from the parameter file and runs the WHERE clause based on that value.

If the Informatica services run on a Windows machine, you can place the parameter files in a folder on the server. Ensure that the Data Validation Option Client can access the folder. In the Data Validation Option Client, enter the parameter file location as a network path to the parameter file on the server in the following format: \\<server machine host name>\<shared folder path>\parameter file name

If the Informatica services run on a UNIX machine, you can place the parameter files in a folder on the server. Install DVOCmd on the server. You can configure parameterization in the Data Validation Option Client, provide the absolute path of the parameter file, and run the tests from the server with DVOCmd. Alternatively, you can ensure that the Data Validation Option Client can access the folder and enter the parameter file location as a network path to the parameter file on the server. Then run the tests from the Data Validation Option Client.

Adding Table Pairs
You can create a table pair from the File menu or from the shortcut in the menu bar.

1. Select the folder to which you want to add the table pair.

2. Click the table pair shortcut on the menu bar, or click File > New > Table Pair.

The Table Pair Editor window appears.

3. Browse and select the data sources that you want to use as Table A and Table B of the table pair.


You can search for a data source by name or path. You can search for lookup views, SQL views, and join views only by their names.

4. Click Edit and configure the connection properties for the table.

5. Enter the WHERE clauses you want to execute on the data sources.

Enter the WHERE clause in the data source format if you run the WHERE clause in the database. Otherwise, enter the WHERE clause in the PowerCenter format.

6. Enter the description for the table pair.

7. Enter the external ID for the table pair.

You can use the external ID to execute the table pair from the command line.

8. If the data source is a relational source or an application source, you can choose whether to execute the WHERE clause within the data source.

If you choose to run the WHERE clause within the data source, the PowerCenter Integration Service passes the WHERE clause to the data source before it loads data. Ensure that the WHERE clause you enter is in the data source format.

9. Select the database optimization level.

10. Enter the join fields for the two tables.

11. If you have the enterprise license, click the Advanced tab.

You can select whether to store all the bad records for test execution and enter the storage details. You can also enter the details of the parameter file and the parameters.

12. Click Save.

Editing Table Pairs
To edit a table pair, right-click the table pair in the Navigator and select Edit Table Pair. You can also edit a table pair by double-clicking the table pair.

When you edit a table pair, the Table Pair Editor dialog box opens.

Deleting Table Pairs
To delete a table pair, right-click the table pair in the Navigator and select Delete Table Pair. You can also select a table pair and press the Delete key.

When you delete a table pair, Data Validation Option deletes all of the associated tests. Data Validation Option does not delete lookup, join, or SQL views used in the table pair.

Overall Test Results

After you add a table pair to Data Validation Option, you can view the properties by selecting the table pair in the Navigator.

When you select a table pair in the Navigator, all the tests run on the table pair appear in the right pane, and the run summary of all the tests appears at the top of the page. Select a test to view the details of that test in the bottom pane. You can view the test properties on the Properties tab and the test results on the Results tab.

You can view important test details, such as the test case ID, on the Properties tab along with the other test details.

C H A P T E R 8

Tests for Table Pairs

This chapter includes the following topics:

- Tests for Table Pairs Overview

- Test Properties

- Adding Tests

- Editing Tests

- Deleting Tests

- Running Tests

- Automatic Test Generation

- Bad Records

Tests for Table Pairs Overview

You can run the following types of tests on table pairs:

Aggregate

Includes COUNT, COUNT_DISTINCT, COUNT_ROWS, MIN, MAX, AVG, and SUM.

Set

Includes AinB, BinA, and AeqB.

Value

Includes VALUE and OUTER_VALUE.

Note: When you run tests, the target folder must be closed in the Designer and Workflow Manager. If the target folder is open, Data Validation Option cannot write to the folder, and the tests return an error.

Test Properties

You can configure properties for each table pair test.

The following table describes the test properties:

Property Description

Function The test you run, such as COUNT, COUNT_DISTINCT, AinB, or VALUE.

Field A/B The field that contains the values you want to compare when you run the test. You must select a field from each table in the table pair.

Condition A/B Allows you to filter records after Data Validation Option joins the tables in the table pair. Enter a valid PowerCenter Boolean expression.

Operator The arithmetic operator that defines how to compare each value in Field A with each value in Field B.

Threshold The allowable margin of error for an aggregate or value test that uses the approximate operator. You can enter an absolute value or a percentage value.

Max Bad Records The number of records that can fail comparison for a test to pass. You can enter an absolute value or a percentage value.

Case Insensitive Ignores case when you run a test that compares string data.

Trim Trailing Spaces Ignores trailing spaces when you run a test that compares string data. Data Validation Option does not remove leading spaces in the string data.

Null=Null Allows null values in two tables to be considered equal.

Comments Information about a test. Data Validation Option displays the comments when you view test properties in the Properties area.

Field A/B is Expression Allows you to enter an expression for Field A or Field B.

Datatype The datatype for the expression if Field A or Field B is an expression.

Precision The precision for the expression if Field A or Field B is an expression.

Scale The scale for the expression if Field A or Field B is an expression.

Tests

The following table describes the table pair tests:

Test Description

COUNT Compares the number of non-null values for each of the selected fields. This test works with any datatype. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime.

COUNT_DISTINCT Compares the distinct number of non-null values for each of the selected fields. This test works with any datatype except binary. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime.

COUNT_ROWS Compares the total number of values for each of the selected fields. This test counts nulls, unlike the COUNT and COUNT_DISTINCT tests. This test works with any datatype.

MIN Compares the minimum value for each of the selected fields. This test works with any datatype except binary. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime.

MAX Compares the maximum value for each of the selected fields. This test works with any datatype except binary. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime.

AVG Compares the average value for each of the selected fields. This test can only be used with numeric datatypes.

SUM Compares the sum of the values for each of the selected fields. This test can only be used with numeric datatypes.

SET_AinB Determines whether the entire set of values for Field A exists in the set of values for Field B. This test works with any datatype except binary/other. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime. You can use this test to confirm that all values in a field exist in a lookup table. This test examines all values for a column instead of making a row-by-row comparison.

SET_BinA Determines whether the entire set of values for Field B exists in the set of values for Field A. This test works with any datatype except binary/other. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime. You can use this test to confirm that all values in a field exist in a lookup table. This test examines all values for a column instead of making a row-by-row comparison.

SET_AeqB Determines whether the sets of values for the selected fields are exactly the same when compared. This test works with any datatype except binary. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime. You can use this test to confirm that all values in a field exist in a lookup table. This test examines all values for a column instead of making a row-by-row comparison.

SET_ANotInB Determines whether there are any common values between the selected fields. If there are common values, the test fails. If there are no common values, the test passes.

VALUE For joined table pairs, this test compares the values for the fields in each table, row by row, and determines whether they are the same. Rows that exist in one table but not the other are disregarded, which implies an inner join between the tables. If the fields are both null and the Null=Null option is disabled, this pair of records fails the test. This test works with any datatype except binary. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime.

OUTER_VALUE For joined table pairs, this test compares the values for the fields in each table, row by row, and determines whether they are the same. Rows that exist in one table but not the other are listed as not meeting the test rules, which implies an outer join between the tables. For the test to pass, the number of rows for the tables, as well as the values for each set of fields, must be equal. If the fields are both null and the Null=Null option is disabled, this set of records fails the test. This test works with any datatype except binary. The fields you compare must be of the same general datatype, for example, numeric-to-numeric or datetime-to-datetime.
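The inner-join and outer-join semantics of VALUE and OUTER_VALUE can be sketched in Python. This is an illustrative model only, not the Data Validation Option implementation; the tables are hypothetical key-to-value mappings.

```python
def value_test(a, b, null_eq_null=True):
    """VALUE semantics: inner join, keys missing from either side are ignored."""
    bad = []
    for key in sorted(a.keys() & b.keys()):   # compare only joined rows
        va, vb = a[key], b[key]
        if va is None and vb is None:
            if not null_eq_null:
                bad.append(key)               # Null=Null disabled: null pair fails
        elif va != vb:
            bad.append(key)
    return bad

def outer_value_test(a, b, null_eq_null=True):
    """OUTER_VALUE semantics: unmatched rows are also bad records."""
    bad = value_test(a, b, null_eq_null)
    bad += sorted(a.keys() ^ b.keys())        # rows that exist in only one table
    return sorted(bad)

# Hypothetical sample data: join key -> field value.
table_a = {1: 'x', 2: 'y', 3: None}
table_b = {1: 'x', 2: 'z', 4: 'w'}           # key 3 missing, key 4 extra

print(value_test(table_a, table_b))          # [2] - unmatched keys disregarded
print(outer_value_test(table_a, table_b))    # [2, 3, 4] - unmatched keys fail
```

The only difference between the two tests in this model is whether keys present on just one side count as bad records.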

Fields A and B

To create a test, you must select the fields that contain the values you want to compare from each table in the table pair. The fields available in each table appear in Field A and Field B. Select a field from each table.

Conditions A and B

You can filter the values for each field in a VALUE or OUTER_VALUE test to exclude rows that do not satisfy the test condition. For example, you want to exclude telephone extension numbers that contain fewer than three characters. Use the following VALUE test:

- Table A.EXT = Table B.EXT, Condition A = LENGTH(EXT)<3, Condition B = LENGTH(EXT)<3

The filter condition you enter for a test differs from the WHERE clause you enter for a table pair. Data Validation Option applies the WHERE clause to all tests in the table pair, before it joins the tables. It applies the test filter condition after it joins the tables. You might want to use a test filter condition instead of a filter condition in the WHERE clause when the filter condition does not remove a large percentage of rows. This can improve performance if you run one test on the table pair.

Data Validation Option does not check the condition syntax. Any valid PowerCenter expression, including expressions that use PowerCenter functions, is allowed. If the PowerCenter syntax is not valid, a mapping installation error occurs when you run the test.

To enter a filter condition for either field, enter the filter condition in the Condition A or Condition B field. Because the PowerCenter Integration Service processes the filter condition, it must use valid PowerCenter syntax. Enter the field name in the filter condition, for example, Emp_ID > 0. Do not include the WHERE keyword.
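The order of operations described above can be sketched as follows. This is a simplified model with hypothetical rows, not the actual mapping logic: the WHERE clause prunes each table before the join, and the per-test Condition A/B filters the joined rows.

```python
# Hypothetical rows: (emp_id, ext)
rows_a = [(1, '100'), (2, '22'), (3, '4567')]
rows_b = [(1, '100'), (2, '23'), (3, '4567')]

# 1. WHERE clause: applied to each table before the join, for every test.
where = lambda r: r[0] > 0
a = [r for r in rows_a if where(r)]
b = [r for r in rows_b if where(r)]

# 2. Join on the key, then apply the per-test condition to the joined rows.
joined = [(ra, rb) for ra in a for rb in b if ra[0] == rb[0]]

# Condition A/B from the example above: keep extensions under 3 characters.
cond = lambda ext: len(ext) < 3
compared = [(ra, rb) for ra, rb in joined if cond(ra[1]) and cond(rb[1])]

# Only the filtered, joined rows are compared by the VALUE test.
bad = [(ra, rb) for ra, rb in compared if ra[1] != rb[1]]
print(bad)   # [((2, '22'), (2, '23'))]
```

Only the short extensions survive the condition, so the three-character and longer extensions never reach the comparison step.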

Operator

The operator defines how to compare the test result for Field A with the test result for Field B. Enter an operator for aggregate and value tests.

The following table describes the operators available in the Operator field:

Operator Definition Description

= Equals Implies that the test result for Field A is the same as the test result for Field B. For example, SUM(Field A) is the same as SUM(Field B).

<> Does not equal Implies that the test result for Field A is not the same as the test result for Field B.

< Is less than Implies that the test result for Field A is less than the test result for Field B.

<= Is less than or equal to Implies that the test result for Field A is less than or equal to the test result for Field B.

> Is greater than Implies that the test result for Field A is greater than the test result for Field B.

>= Is greater than or equal to Implies that the test result for Field A is greater than or equal to the test result for Field B.

~ Is approximately the same as Implies that the test result for Field A is approximately the same as the test result for Field B. An approximate test must have a threshold value. You can use this operator with numeric datatypes only.

Note: Data Validation Option compares string fields using an ASCII table.

RELATED TOPICS:
- "BIRT Report Examples" on page 119

Threshold

A threshold is a numeric value that defines an acceptable margin of error for a test. You can enter a threshold for aggregate tests and for value tests with numeric datatypes.

An aggregate test that uses the approximate operator fails if the difference between the two test results exceeds the threshold value. For example, you run a COUNT test that uses the "~" operator and set the threshold to 10. The test passes when the results are within 10 records of each other.

In a value test, the threshold defines the numeric margin of error used when comparing two values. For example, you run a VALUE test that uses the "=" operator. The test compares a REAL field with a value of 100.99 to an INTEGER field with a value of 101. The test passes when the threshold value is at least 0.01.

You can enter an absolute value or a percentage value as the threshold. To enter a percentage value as the threshold, suffix the number with a percent (%) sign, for example, 0.1%.

You must configure the threshold if the test uses the approximate operator.
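The threshold logic for the approximate operator can be sketched as below. This is an illustrative model; the function name is invented, and the base used for a percentage threshold (the Field B result here) is an assumption of the sketch.

```python
def within_threshold(a, b, threshold):
    """Approximate comparison: ABS(A - B) <= margin.

    threshold is an absolute number ('10') or a percentage ('0.1%').
    For a percentage, the margin is assumed relative to the Field B result.
    """
    s = str(threshold).strip()
    if s.endswith('%'):
        margin = abs(b) * float(s[:-1]) / 100.0
    else:
        margin = float(s)
    return abs(a - b) <= margin

print(within_threshold(995, 1000, '10'))     # True: counts within 10 records
print(within_threshold(995, 1000, '0.1%'))   # False: margin is only 1 record
print(within_threshold(100.99, 101, '0.02')) # True: REAL vs INTEGER example
```

The same absolute-or-percentage parsing applies to the max bad records setting described in the next section.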

Max Bad Records

Data Validation Option lists records that do not compare successfully as bad records. You can configure a tolerance value for the acceptable number of bad records.

By default, for a set or value test to pass, all records must compare successfully. You can configure an acceptable value for the maximum number of bad records. The test passes if the number of bad records does not exceed the max bad records value.

You can enter an absolute value or a percentage value for the max bad records. To enter a percentage value as the max bad records, suffix the number with a percent (%) sign, for example, 0.1%.

Value and set tests display bad records on the Results tab.

Case Insensitive

String comparison in PowerCenter is case-sensitive. If you want the PowerCenter Integration Service to ignore case when you run a test that compares strings, enable the Case Insensitive option. This option is disabled by default.

Trim Trailing Spaces

By default, string comparison fails if two strings are identical except that one string contains extra spaces. For example, one field value in a test is 'Data Validation' and the other field value is 'Data Validation   ' (with three blank spaces after the last character). If you do not trim trailing spaces, the test produces a bad record. If you trim trailing spaces, the comparison passes because the extra spaces are ignored.

You might want to trim trailing spaces when you use the CHAR datatype, which pads a value with spaces to the right, out to the length of the field. A field of CHAR(20) compared to CHAR(30) fails, even if both fields have the same value, unless you trim the trailing spaces.

Enable the Trim Trailing Spaces option if there are spaces after the value entered in a field that should be ignored in the comparison. This option is disabled by default.
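A minimal model of the two string options above, Case Insensitive and Trim Trailing Spaces. This is a sketch for illustration; note that only trailing spaces are trimmed, so leading spaces always count.

```python
def strings_match(a, b, case_insensitive=False, trim_trailing=False):
    """Compare two strings the way the options above describe."""
    if trim_trailing:
        a, b = a.rstrip(' '), b.rstrip(' ')   # trailing spaces only
    if case_insensitive:
        a, b = a.upper(), b.upper()
    return a == b

# CHAR padding example: the same value stored in CHAR(20) vs CHAR(15).
print(strings_match('Data Validation     ', 'Data Validation',
                    trim_trailing=True))       # True: padding ignored
print(strings_match('Data Validation     ', 'Data Validation'))  # False
print(strings_match('ACME', 'acme', case_insensitive=True))      # True
print(strings_match(' x', 'x', trim_trailing=True))  # False: leading space kept
```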

Null = Null

If null values in Field A and Field B should be considered equal, enable the Null = Null option. For example, a current employee in a table that contains employee information has a null termination date. If two records with null termination dates were compared by a database, they would not be considered equal because SQL does not consider a null value in one field to be equal to a null value in another field.

Because business users often consider null values to be equal, the Null = Null option is enabled by default.

Comments

You can enter information about a test in the Comments field. Data Validation Option displays comments in the Properties window when you select the test in the Navigator or the Tests tab.

Expression Definitions

To substitute an expression for a database value in a test field, enable the Field A/B is Expression option. When you enable this option, Data Validation Option disables the Field A or Field B control in the Table Pair Test Editor dialog box and enables the expression-related fields.

The expression must use valid PowerCenter expression syntax. Click Validate after you write the expression. The Data Validation Option Client does not allow you to save the table pair unless you enter a valid expression.

The datatype value represents the datatype of the expression after calculation. If you do not select the correct datatype, the test might produce an error. The datatypes you can select are PowerCenter datatypes.

The precision and scale must match the precision and scale used in PowerCenter for the datatype, or the test might produce an error. The scale for any string or text datatype is zero. The precision and scale for a datetime datatype are 23 and 3, respectively.

Note: Data Validation Option does not support the following functions:

- User-defined

- Custom

- Lookup

- Variable

Expression Tips

Testing often requires the use of different expressions. PowerCenter functions are described at the end of this guide.

The following examples demonstrate how to use expressions for data validation.

Concatenation, RTRIM, and SUBSTR

Often data transformation involves concatenation or the use of substring functions.

The following example tests the result of concatenation transformation:

- Expression A: UPPER(first_name || ' ' || last_name)
- Field B: full_name
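The concatenation test above can be modeled in Python with hypothetical row values. This is a sketch of what the test checks per row, not how the mapping evaluates it:

```python
# Expression A, evaluated per row: UPPER(first_name || ' ' || last_name)
expr_a = lambda first_name, last_name: (first_name + ' ' + last_name).upper()

# Field B holds the pre-concatenated value in the target table.
row = {'first_name': 'Ada', 'last_name': 'Lovelace', 'full_name': 'ADA LOVELACE'}
print(expr_a(row['first_name'], row['last_name']) == row['full_name'])  # True
```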

IF Statements

The IF function is arguably the most popular testing function. The syntax for the IF function is as follows:

IF(condition, if_true_part, if_false_part)

The following example shows the IF function used in testing:

Table A Table B

sales_usa region (either 'USA' or 'INTL')

sales_intl sales

The aggregate validation can be accomplished by two tests:

- Test 1: SUM, Field A: sales_usa, Expression B: IF(region='USA', sales, 0)

- Test 2: SUM, Field A: sales_intl, Expression B: IF(region='INTL', sales, 0)
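The two tests above can be modeled in Python with hypothetical rows. Each Expression B mirrors IF(region='USA', sales, 0), turning the single sales column into a USA-only or INTL-only amount before the SUM comparison:

```python
# Table A: one row of regional totals.  Table B: per-transaction rows.
table_a = {'sales_usa': 300, 'sales_intl': 120}
table_b = [
    {'region': 'USA',  'sales': 100},
    {'region': 'USA',  'sales': 200},
    {'region': 'INTL', 'sales': 120},
]

# Test 1: SUM(sales_usa) vs SUM(IF(region='USA', sales, 0))
test1 = table_a['sales_usa'] == sum(
    r['sales'] if r['region'] == 'USA' else 0 for r in table_b)

# Test 2: SUM(sales_intl) vs SUM(IF(region='INTL', sales, 0))
test2 = table_a['sales_intl'] == sum(
    r['sales'] if r['region'] == 'INTL' else 0 for r in table_b)

print(test1, test2)   # True True
```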

Adding Tests

You can add tests to table pairs one at a time or you can generate tests in batches. You can add any test manually.

To add a test to a table pair, right-click the name of the table pair in the Navigator or Table Pairs tab, or right-click in the Tests tab, and select Add Test. The Table Pair Test Editor dialog box opens.

You can generate value tests in a batch for table pairs that have tables with matching field names and datatype families. You can also generate value tests in a batch for tables or files within two target folders that have matching table or file names, field names, and datatype families.

RELATED TOPICS:
- Automatically Generating Value Tests

- Comparing Repository Folders

Editing Tests

To edit a test, right-click the test in the Navigator or Tests tab, and select Edit Test. You can also double-click the test name. The Table Pair Test Editor dialog box opens.

Deleting Tests

To delete a test, right-click the test in the Navigator or the Tests tab, and select Delete Test. You can also use the Delete key.

Running Tests

You can run tests from the Data Validation Option Client or the command line.

Use one of the following methods to run tests:

- Select one or more table pairs and click Run Tests.

- Right-click a folder in the Navigator and select Run Folder Tests.

- Right-click a test in the Tests tab and select Run Selected Tests.

Data Validation Option runs all tests for a table pair together. You cannot run tests individually unless only one test is set up for the table pair. If you select an individual test, Data Validation Option runs all tests for the table pair.

After you run a test, you can view the results on the Results tab.

Data Validation Option uses the following logic to determine whether a test passes or fails:

- An aggregate test is calculated as "A <operator> B." If this relationship is true, the test passes. If the operator chosen is approximate, then the test is calculated as ABS(A-B) <= Threshold.

- A value or set test must produce fewer or an equal number of records that do not match compared to the threshold value. If there is no threshold value and there are no records that do not match, the test passes.
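The two rules above can be sketched as a pair of decision functions. This is an illustrative model; the function names and operator strings are invented for the sketch:

```python
import operator

OPS = {'=': operator.eq, '<>': operator.ne, '<': operator.lt,
       '<=': operator.le, '>': operator.gt, '>=': operator.ge}

def aggregate_passes(a, b, op, threshold=0):
    """Aggregate test: A <operator> B, or ABS(A-B) <= Threshold for '~'."""
    if op == '~':
        return abs(a - b) <= threshold
    return OPS[op](a, b)

def value_or_set_passes(bad_record_count, max_allowed=0):
    """Value/set test: non-matching records must not exceed the allowed maximum."""
    return bad_record_count <= max_allowed

print(aggregate_passes(1000, 995, '~', threshold=10))  # True
print(aggregate_passes(1000, 995, '='))                # False
print(value_or_set_passes(0))                          # True: no bad records
print(value_or_set_passes(3, max_allowed=2))           # False
```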

When you run tests, the target repository folders must be closed in the Designer or Workflow Manager. If the target folders are open, Data Validation Option cannot write to the folders, and the tests fail.

Automatic Test Generation

You can compare two repository folders and generate all table pairs, value tests, and count tests between the tables in the two folders. You can also automatically generate tests for existing table pairs.

You can generate value and count tests to compare tables during a PowerCenter upgrade or migration from development to production. You can generate tests based on the column name or column position in the tables.

Data Validation Option generates the following tests:

- OUTER_VALUE test for the set of fields used for the join if the field names and field datatypes match. The OUTER_VALUE test reports any difference when join field values exist.

- VALUE test for fields in which the field names and datatypes match. The VALUE test reports any difference in actual values for each set of fields.

- COUNT_ROWS test for the set of fields used for the join if the field names and field datatypes match. A COUNT_ROWS test reports whether there is a difference in the number of rows between the tables.

Data Validation Option does not generate tests for fields when the field names or datatypes do not match. It also does not generate tests for binary fields.

Generating Table Pairs and Tests

Use the Compare Tables dialog box to generate table pairs between tables in any two folders and generate tests associated with the table pairs.

1. In Data Validation Option, click Action > Compare Tables.

The Compare Tables dialog box appears.

2. Select the repositories that contain the tables that you want to compare.

3. Select the folders that contain the tables that you want to compare.

4. Select Sources or Targets from the sub-folders.

5. Select the database connection for each folder.

If there are tables in the folder that require a different database connection, modify the database connection in table pairs and tests after auto-generation is complete.

6. If the database is IBM DB2, enter the database owner name.

7. If the data source is SAP, click Select Source Parameters and configure the SAP source parameters.

8. If the folders contain flat files, enter the path that contains the source files and target files in the Source Dir field.

9. Select whether to compare columns by name or position.

10. If you select compare columns by position, select the columns that you want to skip in the table.

11. Select the folder to store the table pairs and tests.

12. Choose to generate count tests or count and value tests.

13. Select Sort in DB to sort the tables before Data Validation Option generates tests.

14. Choose whether to trim the trailing spaces in the tests.

15. To generate value tests for flat files, Salesforce tables, or tables without primary keys, specify the name and location of the text file that contains primary key information for the flat files or tables.

The file must contain the name of the flat file or table and the primary key, separated by a comma. Each entry must be on a new line. If a table has more than one key column, each key column must be a separate entry.

For example:

flatfile_dictionary_10rows,FIELD1
flatfile_dictionary_10rows,FIELD2
flatfile_dictionary_10rows_str,FIELD1

16. Select whether you want to skip generation of all the tests or generate count tests if a primary key does not exist for a table.

Data Validation Option generates table pairs even if you skip generation of all the tests.

17. If you have the enterprise license, you can select whether to save all the bad records.

18. If you choose to save all the bad records, select whether to save the bad records in a flat file or the Data Validation Option schema.

19. Click Create.

Data Validation Option displays the summary of table pairs and tests to create and to skip.

20. Click OK.

Data Validation Option generates all the possible table pairs and tests.
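The primary key file format described in step 15 (one `table,key` pair per line, with repeated lines for composite keys) can be parsed as sketched below. The function name is invented; this only illustrates the grouping the format implies.

```python
def read_key_file(lines):
    """Group 'table,key' lines into a table -> [key columns] mapping."""
    keys = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        table, column = line.split(',', 1)
        keys.setdefault(table, []).append(column)
    return keys

sample = [
    'flatfile_dictionary_10rows,FIELD1',
    'flatfile_dictionary_10rows,FIELD2',   # second entry = composite key
    'flatfile_dictionary_10rows_str,FIELD1',
]
print(read_key_file(sample))
# {'flatfile_dictionary_10rows': ['FIELD1', 'FIELD2'],
#  'flatfile_dictionary_10rows_str': ['FIELD1']}
```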

Generating Tests for Table Pairs

Use the Generate Tests option to generate the associated count tests and value tests of an existing table pair.

1. In the Navigator, select the table pair.

2. Right-click on the table pair and select Generate Value Test.

3. Select whether to compare columns by name or position.

4. If you select compare columns by position, select the columns that you want to skip in the table.

5. Click Yes.

Data Validation Option generates the tests for the table pair.

Compare Columns by Position

You can generate tests that compare columns of the tables in a table pair by position instead of name.

The following examples describe the various scenarios when you compare columns by position:

Table Pair with an SQL View and a Table

Suppose you have an SQL view, sample_view, as Table A and a table, sample1, as Table B of a table pair.

The following table lists the columns in sample_view and sample1:

sample_view sample1

SampleA.column1 column1

SampleA.column2 column2

SampleB.column1

SampleB.column2

You need to compare SampleB.column1 and SampleB.column2 with column1 and column2.

Select compare columns by position. Select Table A and enter 2 as the offset.

Table Pair with tables that have the same number of columns and different names

Suppose you have SampleA as Table A and SampleB as Table B of a table pair.

The following table lists the columns in SampleA and SampleB:

SampleA SampleB

column1 col1

column2 col2

column3 col3

You need to compare column1, column2, and column3 in SampleA with col1, col2, and col3 in SampleB.

Select compare columns by position. Enter 0 as the offset. You can select Table A or Table B.

Table Pair with tables that have different number of columns

Suppose you have SampleA as Table A and SampleB as Table B of a table pair.

The following table lists the columns in SampleA and SampleB:

SampleA SampleB

column1 col1

column2 col2

column3 col3

column1

column2

column3

You need to compare column1, column2, and column3 in SampleA with column1, column2, and column3 in SampleB.

Select compare columns by position. Select Table B and enter 3 as the offset.
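The three scenarios above can be modeled as a pairing function: skip `offset` leading columns on the chosen table, then zip the remaining columns by position. This is a sketch of the idea; Data Validation Option's actual matching logic may differ.

```python
def pair_by_position(cols_a, cols_b, skip_table='A', offset=0):
    """Pair columns positionally after skipping `offset` columns on one side."""
    if skip_table == 'A':
        cols_a = cols_a[offset:]
    else:
        cols_b = cols_b[offset:]
    return list(zip(cols_a, cols_b))

# Scenario 1: SQL view vs table, skip the first 2 view columns (Table A, offset 2).
view = ['SampleA.column1', 'SampleA.column2', 'SampleB.column1', 'SampleB.column2']
print(pair_by_position(view, ['column1', 'column2'], skip_table='A', offset=2))
# [('SampleB.column1', 'column1'), ('SampleB.column2', 'column2')]

# Scenario 3: Table B has 3 extra leading columns (Table B, offset 3).
b_cols = ['col1', 'col2', 'col3', 'column1', 'column2', 'column3']
print(pair_by_position(['column1', 'column2', 'column3'], b_cols,
                       skip_table='B', offset=3))
# [('column1', 'column1'), ('column2', 'column2'), ('column3', 'column3')]
```

With offset 0 (scenario 2), the function simply zips the two column lists in order, which is how same-shaped tables with different column names get paired.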

Bad Records

Bad records are the records that fail a value or a set test.

When you select a test on the Tests tab, the bad records appear on the Results tab. The columns that appear on the Results tab differ depending on the test.

Aggregate Tests

Aggregate tests do not display bad records. The Results tab displays the test result value from Field A, the test result value from Field B, and the comparison operator.

Value Tests

Value tests display the following columns for each bad record:

- The key for Table A

- The field or expression from Table A being compared

- The key for Table B

- The field or expression from Table B being compared

Set Tests

Set tests display the following columns for each bad record:

- The result from Table A

- The result from Table B

Note: If you compare two fields where one field has a value and the other is empty, Data Validation Option considers the record a bad record. If the repository is on Oracle, the database stores the empty field as NULL. A bad record with a NULL value in an Oracle repository can therefore represent either a NULL or an empty field.

C H A P T E R 9

Single-Table Constraints

This chapter includes the following topics:

- Single-Table Constraints Overview

- Single Table Properties

- Adding a Single Table

- Editing Single Tables

- Deleting Single Tables

- Viewing Overall Test Results

Single-Table Constraints Overview

Use a single-table constraint to run tests on a single table. Single-table constraints define valid data within a table. You can enforce valid values, aggregates, formats, and uniqueness. For example, you might want to verify that no annual salary in an employee table is less than $10,000.

Errors in complex logic often manifest themselves in very simple ways, such as NULL values in the target. Therefore, setting aggregate, value, NOT_NULL, UNIQUE, and FORMAT constraints on a target table is a critical part of any testing plan.

To run single-table constraints, you must create a single table. You can select a relational table, flat file, lookup view, SQL view, or join view as a single table.

Single Table Properties

You can view single table properties by selecting a table in either the Navigator or the Single Tables tab and viewing the properties. Most properties come from the values entered in the Single Table Editor. Other properties come from the tests set up for and run on the table.

Edit single table properties in the Single Table Editor dialog box. Data Validation Option displays the Single Table Editor dialog box when you add or edit a single table. The properties vary depending on the type of object you select for the table.

The following table describes the single table properties:

Property Description

Table The table name.

Conn Connection properties for the table.

Optimize in Database Controls which test logic Data Validation Option converts to a PowerCenter mapping and which test logic it pushes to the database. You can select one of the following options:
- Default - Data Validation Option converts all test logic to a PowerCenter mapping and applies sorting to the data source.
- WHERE clause, Sorting, and Aggregation in DB - Data Validation Option pushes the WHERE clause, sorting logic for joins, and all aggregate tests to the database. Data Validation Option converts all other test logic to a PowerCenter mapping.
- Already Sorted Input - The PowerCenter mapping does not sort the input. If the data source is not sorted, tests may fail.

Where clause Allows you to limit the number of records that the PowerCenter Integration Service reads from the database. Enter a valid PowerCenter Boolean expression or an SQL WHERE clause without the WHERE keyword.

Description Single table description. By default, Data Validation Option uses the table name.

External ID Identifier for the single table that you can use when you run Data Validation Option tests at the command line.

Primary Key Column Primary key column or columns for the table.

Single table properties function in the same way as table pair properties.

RELATED TOPICS:
- "Table Pair Properties" on page 41

Connection Properties

Choose the connection properties based on the data source type.

You must select a connection for all the data sources except for flat files. For flat files, you must provide the source directory and the file name. Connections are PowerCenter connection objects created in the Workflow Manager.

Relational Connection Properties

Choose the relational connection properties for Microsoft SQL Server, Oracle, IBM DB2, Netezza, and PowerExchange for DB2 data sources.

Configure the following properties when you select a relational data source:

Property Description

Connection PowerCenter connection object to connect to the relational data source.

Override Owner Name

Override the database name and schema of the source. For example, a Microsoft SQL Server table is identified by <database>.<schema>.<table>. To override the database and the schema, enter <new database name>.<new schema name> in the text box. To change only the schema, enter <new schema name> in the text box. You cannot change only the database name.

SAS and Salesforce Connection Properties

You can use a SAS or Salesforce data source in Data Validation Option.

Select the PowerCenter connection object when you select a SAS or Salesforce data source.

Note: You cannot override the owner name for SAS and Salesforce data sources.

SAP Connection Properties

You must configure the SAP authentication information to use SAP data sources.

Configure the following properties when you select an SAP data source:

Property Description

Connection PowerCenter connection object to connect to the SAP data source.

SAP User Name SAP source system connection user name. Must be a user for which you have created a source system connection.

SAP Password Password for the user name.

SAP Client SAP client number.

SAP Language Language you want for the mapping. Must be compatible with the PowerCenter Client code page. Data Validation Option does not validate the value. Ensure that you enter the correct value so that the tests run successfully.

SAP Connect String Type A or Type B DEST entry in saprfc.ini.

SAP Data Sources in Data Validation Option

You cannot override the owner name for an SAP data source. Data Validation Option uses the stream mode for installation of the ABAP programs and cannot use the FTP mode. SAP data sources must not contain the slash (/) character in the field names.

64 Chapter 9: Single-Table Constraints


Flat File Connection Properties

You can use flat file data sources from the PowerCenter repository.

Configure the following properties when you select a flat file data source:

Property - Description

Source Dir - Directory that contains the flat file. The path is relative to the machine that hosts Informatica Services.

Source File - File name with the file name extension.

Bad Records Configuration

If you have the enterprise license, you can store up to 16,000,000 bad records per test to perform advanced analysis and up to 1000 bad records for reporting. If you have the standard license, you can store up to 1000 bad records for reporting and analysis.

For reporting, you can set the maximum number of bad records in the mapping properties in the Preferences dialog box. The default number of bad records is 100. You can set up to 1000 bad records for reporting.

For advanced analysis, you can enter the maximum number of bad records for detailed analysis, along with the file delimiter, in the Detailed Error Rows Analysis section of the mapping properties. The default number of bad records is 5000. You can set up to 16,000,000 bad records.

At the table pair or single table level, you can choose to store all the bad records from the tests for the table pair or single table. Select Save all bad records for test execution on the Advanced tab when you create or edit a table pair or single table.

Data Validation Option stores the following tests for table pairs:

- Value
- Outer Value
- Set

Data Validation Option stores the following constraint tests for single tables:

- Value
- Unique

You must select whether you want to store the bad records in a flat file or a table in the Data Validation Option schema. If you store bad records in a flat file, you can optionally enter the name of the file. Data Validation Option appends test information to the name and retains the file extension.

Note: If you modify the file delimiter in the preferences file, run the InstallTests command with the forceInstall option for the existing table pairs or single tables that you already ran. You can also edit and save the table pair or single table from the Data Validation Option Client before you run the test. If you modify the bad records value, you need not reinstall the tests.

Bad Records in Flat File

If you configure Data Validation Option to store bad records in a flat file, it creates the flat file on the machine that runs Informatica services.

The flat files that Data Validation Option generates after running the tests are stored in the following folder: <PowerCenter installation directory>\server\infa_shared\TgtFiles


You can modify the folder from the Administrator tool. Edit the $PMTargetFileDir property for the PowerCenter Integration Service.

Data Validation Option generates a folder for each table pair or single table. The folder name has the following format: TablePairName_TestRunID or SingleTableName_TestRunID

Data Validation Option creates flat files for each test inside the folder. The flat file name has the following format: <user defined file name>_TestCaseType_TestCaseColumnA_TestCaseColumnB_TestCaseIndex.<user defined file extension>

You can get the Test Case Index from the Properties tab and the Test Run ID from the results summary of the test in the detail area. You can get the Table Pair ID or Single Table ID from the Table Pair or Single Table properties.

For example, you enter the file name BAD_ROWS.txt when you configure the table pair or single table, and you run an outer value test on the fields FIRSTNAME and FIRSTNAME. The test case index is 1 and the test fields are expressions. The bad records file after you run the test is named BAD_ROWS_OUTER_VALUE_ExprA_ExprB_1.txt.
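The naming convention above can be sketched as a small helper. This is an illustration only, not part of the product; Data Validation Option builds these names internally.

```python
import os

def bad_records_file_name(user_file, test_type, col_a, col_b, test_case_index):
    """Sketch of the bad-records file naming convention described above.

    Data Validation Option appends test information to the user-defined
    name and retains the original file extension.
    """
    base, ext = os.path.splitext(user_file)
    return f"{base}_{test_type}_{col_a}_{col_b}_{test_case_index}{ext}"

# Outer value test on two expression fields, test case index 1:
name = bad_records_file_name("BAD_ROWS.txt", "OUTER_VALUE", "ExprA", "ExprB", 1)
# → BAD_ROWS_OUTER_VALUE_ExprA_ExprB_1.txt
```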

Data Validation Option supports all the file delimiters that PowerCenter supports. If you enter a non-printable character as the delimiter, enter the corresponding delimiter code used in PowerCenter. When you import these files in PowerCenter, you must manually create the data fields because the code appears in place of the delimiter in the bad records file.

Caution: Data Validation Option uses a comma as the delimiter if there are multiple primary keys. Do not use a comma as the file delimiter.

When you run the tests for the table pair or single table, Data Validation Option stores the details of the bad records in the following format:

Table Pair Name
Table A Name
Table A Connection
Table B Name
Table B Connection
Test Definition
Test Run Time
Test Run By User
Key A[], Result A[], Key B[], Result B[]

If the tests pass, Data Validation Option still creates a flat file, but without any bad records information.

Bad Records in Database Schema Mode

If you choose to save detailed bad records in the Data Validation Option schema, the bad records are written into the detail tables.

The following table describes the tables to which Data Validation Option writes the bad records, based on the type of test:

Test Type - Table - Columns

Value, Outer Value, Value Constraint - ALL_VALUE_RESULT_DETAIL - TEST_RUN_ID, TEST_CASE_INDEX, KEY_A, VALUE_A, KEY_B, VALUE_B

Unique Constraint - ALL_UNIQUE_RESULT_DETAIL - TEST_RUN_ID, TEST_CASE_INDEX, VALUE

Set - ALL_SET_RESULT_DETAIL - TEST_RUN_ID, TEST_CASE_INDEX, VALUE_A, VALUE_B

Note: Ensure that the database has enough table space to hold all the bad records.


The test details of all the tests that you ran in Data Validation Option are available in the TEST_CASE table. The test installation details of the tests are available in the TEST_INSTALLATION table. You can obtain the TEST_ID of a test from the TEST_INSTALLATION table. You need the TEST_ID of a test to query the complete details of the test from the TEST_CASE table.

You can get the Test Case Index from the Properties tab and the Test Run ID from the results summary of the test in the detail area. You can get the Table Pair ID or Table ID from the Table Pair or Single Table properties.

For example, you run a table pair with a value test and an outer value test.

The following sample SQL query retrieves the bad records information of a test with Test Case Index 5:

select ALL_VALUE_RESULT_DETAIL.*, TEST_CASE.*
from ALL_VALUE_RESULT_DETAIL, TEST_RUN, TEST_INSTALLATION, TEST_CASE
where ALL_VALUE_RESULT_DETAIL.TEST_RUN_ID = TEST_RUN.TEST_RUN_ID
and TEST_RUN.TEST_INSTALLATION_ID = TEST_INSTALLATION.TEST_INSTALLATION_ID
and TEST_INSTALLATION.TABLE_PAIR_ID = TEST_CASE.TEST_ID
and ALL_VALUE_RESULT_DETAIL.TEST_CASE_INDEX = TEST_CASE.TEST_CASE_INDEX
and ALL_VALUE_RESULT_DETAIL.TEST_RUN_ID = 220
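The repository join above can be tried out against a throwaway SQLite database. The table and column names follow the schema described in this section; the inserted rows (and the TEST_TYPE column) are invented purely for illustration and are not part of the Data Validation Option repository.

```python
import sqlite3

# Sketch: run the documented repository join against mock tables.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE TEST_RUN (TEST_RUN_ID INT, TEST_INSTALLATION_ID INT);
CREATE TABLE TEST_INSTALLATION (TEST_INSTALLATION_ID INT, TABLE_PAIR_ID INT);
CREATE TABLE TEST_CASE (TEST_ID INT, TEST_CASE_INDEX INT, TEST_TYPE TEXT);
CREATE TABLE ALL_VALUE_RESULT_DETAIL (
    TEST_RUN_ID INT, TEST_CASE_INDEX INT,
    KEY_A TEXT, VALUE_A TEXT, KEY_B TEXT, VALUE_B TEXT);
INSERT INTO TEST_RUN VALUES (220, 10);
INSERT INTO TEST_INSTALLATION VALUES (10, 77);
INSERT INTO TEST_CASE VALUES (77, 5, 'VALUE');
INSERT INTO ALL_VALUE_RESULT_DETAIL VALUES (220, 5, 'K1', 'A', 'K1', 'B');
""")
rows = con.execute("""
SELECT d.*, c.*
FROM ALL_VALUE_RESULT_DETAIL d, TEST_RUN r, TEST_INSTALLATION i, TEST_CASE c
WHERE d.TEST_RUN_ID = r.TEST_RUN_ID
  AND r.TEST_INSTALLATION_ID = i.TEST_INSTALLATION_ID
  AND i.TABLE_PAIR_ID = c.TEST_ID
  AND d.TEST_CASE_INDEX = c.TEST_CASE_INDEX
  AND d.TEST_RUN_ID = 220
""").fetchall()
```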

Parameterization

If you have the enterprise license, you can perform incremental data validation through parameterization. You can use parameters in the WHERE clause in table pairs and single tables.

Before you use a parameter in the WHERE clause, you must enter the name of the parameter file and add the parameters on the Advanced tab of a table pair or single table definition. You must specify the data type, scale, and precision of the parameter. After you add the parameters, you can use them in a WHERE clause to perform incremental validation. You can validate the expression with the parameters that you enter in a table pair or single table.

The parameter file must be in a location accessible to the Data Validation Option Client. The parameters in the parameter file must be in the format $$<parameter name>=value. Ensure that the parameter that you add in a table pair or single table is available in the parameter file.

When you run a test, Data Validation Option looks up the value of the parameter from the parameter file and runsthe WHERE clause based on that value.
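The documented entry format, $$<parameter name>=value, one entry per line, can be sketched with a small parser. This helper and the parameter names in it (such as $$LastRunDate) are hypothetical illustrations, not Data Validation Option's own parser.

```python
def read_parameters(text):
    """Sketch of parsing a parameter file in the $$<name>=value format."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("$$") and "=" in line:
            # Drop the "$$" prefix and split on the first "=".
            name, _, value = line[2:].partition("=")
            params[name] = value
    return params

# A WHERE clause such as LOAD_DATE > $$LastRunDate would then be resolved
# against the looked-up value before the test runs.
params = read_parameters("$$LastRunDate=2012-07-01\n$$Region=WEST\n")
```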

If the Informatica Services run on a Windows machine, you can place the parameter files in a folder on the server. Ensure that the Data Validation Option Client can access the folder. In the Data Validation Option Client, enter the parameter file location as a network path to the parameter file on the server in the following format: \\<server machine host name>\<shared folder path>\<parameter file name>

If the Informatica Services run on a UNIX machine, you can place the parameter files in a folder on the server. Install DVOCmd on the server. You can configure parameterization on the Data Validation Option Client, provide the absolute path of the parameter file, and run the tests from the server with DVOCmd. Alternatively, you can ensure that the Data Validation Option Client can access the folder and enter the parameter file location as a network path to the parameter file on the server. Run the tests from the Data Validation Option Client.

Adding a Single Table

You can create a single table from the File menu or from the shortcut on the menu bar.

1. Select the folder to which you want to add the single table.


2. Click the single table shortcut on the menu bar, or click File > New > Single Table.

The Single Table Editor window appears.

3. Browse and select the data source that you want to use for the single table.

You can search for a data source by name or path. You can search for lookup views, SQL views, and join views only by their names.

4. Click Edit and configure the connection properties for the table.

5. Enter the WHERE clause you want to execute on the data source.

6. Enter the description for the single table.

7. Enter the external id for the single table.

You can use the external id to execute the single table from the command line.

8. If your data source is relational, you can choose whether to execute the WHERE clause within the data source.

If you choose to execute the WHERE clause within the database, the PowerCenter Integration Service passes the WHERE clause to the database for execution before it loads the data.

9. Select the database optimization level.

The following options are available:

- Default. Data Validation Option creates a PowerCenter mapping based on the test logic and applies sorting on the data source.
- WHERE clause, sorting, and aggregation in DB. Data Validation Option pushes the WHERE clause and sorting to the database. Applicable for relational data sources.
- Already sorted input. Data Validation Option creates a PowerCenter mapping based on the test logic. If the data source is not sorted, the tests may fail.

10. Select the primary key for the table in the Key Column pane.

Editing Single Tables

To edit a single table, right-click it in the Navigator or Single Tables tab, and select Edit Single Table. You can also edit a single table by double-clicking it in the Single Tables tab.

When you edit a single table, the Single Table Editor dialog box opens.

Deleting Single Tables

To delete a single table, right-click it in the Navigator or Single Tables tab, and select Delete Single Table. You can also delete a single table by selecting it and pressing the Delete key. When you delete a single table, Data Validation Option deletes all of its tests. Data Validation Option does not delete lookup or SQL views used in the single table.


Viewing Overall Test Results

When you select a single table in the Navigator, all the tests run on the single table appear in the right pane. The run summary of all the tests appears at the top of the page. Select a test to view the details of that test in the bottom pane. You can view the test properties on the Properties tab and the test results on the Results tab.

You can view important test details, such as the Test Case ID, on the Properties tab along with other test details.


Chapter 10

Tests for Single-Table Constraints

This chapter includes the following topics:

- Tests for Single-Table Constraints Overview
- Test Properties
- Adding Tests
- Editing Tests
- Deleting Tests
- Running Tests
- Bad Records

Tests for Single-Table Constraints Overview

Single-table constraints are tests based on a single table. Data Validation Option allows you to run an aggregate test or a VALUE test on single tables. Note that there are no set tests or OUTER_VALUE tests for single tables. However, there are some additional tests available for single tables that you cannot create for a table pair.

Most single-table constraints allow you to enter a constraint value for the test. The constraint value defines the value or values to which you want to compare the values in a field. For example, you might want to verify that a SALARY field contains values greater than $10,000. Enter the minimum salary as the constraint value.

Note: When you run tests, the target folder must be closed in the Designer and Workflow Manager. If the target folder is open, Data Validation Option cannot write to the folder, and the tests return an error.

Test Properties

When you select a test in the Navigator or on the Tests tab, the properties for that test appear in the Properties area. Most properties come from the values you enter in the Single Table Test Editor dialog box. Other properties apply to the most recent test run.

Edit test properties in the Single Table Test Editor dialog box when you add or edit a test.


The following table describes the test properties:

Property - Description

Function - The test you run, such as COUNT, COUNT_DISTINCT, VALUE, or NOT_NULL.

Field - The field that contains the values you want to test.

Condition - Filter condition for the test. Enter a valid PowerCenter Boolean expression.

Operator - The operator that defines how to compare each value in the field with the constraint value.

Constraint Value - The value or values you want to compare the field values to.

Threshold - The allowable margin of error for an aggregate or value test that uses the approximate operator. You can enter an absolute value or a percentage value.

Max Bad Records - The number of records that can fail comparison for a test to pass. You can enter an absolute value or a percentage value.

Case Insensitive - Ignores case when you run a test on string data.

Trim Trailing Spaces - Ignores trailing spaces when you run a test on string data. Data Validation Option does not remove the leading spaces in the string data.

Comments - Information about a test. Data Validation Option displays the comments when you view test properties in the Properties area.

Field is Expression - Allows you to enter an expression for the field.

Datatype - The datatype for the expression, if the field is an expression.

Precision - The precision for the expression, if the field is an expression.

Scale - The scale for the expression, if the field is an expression.

Tests

The following table describes the single table tests:

Test - Description

COUNT - Compares the number of non-null values for the selected field to the constraint value. This test works with any datatype.

COUNT_DISTINCT - Compares the distinct number of non-null values for the selected field to the constraint value. This test works with any datatype except binary.

COUNT_ROWS - Compares the total number of values for the selected field to the constraint value. This test counts nulls, unlike the COUNT and COUNT_DISTINCT tests. This test works with any datatype.

MIN - Compares the minimum value for the selected field to the constraint value. This test works with any datatype except binary.

MAX - Compares the maximum value for the selected field to the constraint value. This test works with any datatype except binary.

AVG - Compares the average value for the selected field to the constraint value. This test can only be used with numeric datatypes.

SUM - Compares the sum of the values for the selected field to the constraint value. This test can only be used with numeric datatypes.

VALUE - Examines the values for the field, row by row, and compares them to the constraint value. This test works with any datatype except binary.

FORMAT - Determines whether the values in the field match the pattern in the constraint value. The PowerCenter Integration Service uses the REG_MATCH function for this test. This test cannot be used with binary datatypes.

UNIQUE - Confirms that the value in the field is unique. This test does not use a constraint value. This test cannot be used with binary datatypes.

NOT_NULL - Confirms that the value in the field is not null. This test does not use a constraint value. This test cannot be used with binary datatypes.

NOT_BLANK - If the value in the field is a string value, this test confirms that the value is not null or an empty string. If the value in the field is a numeric value, this test confirms that the value is not null or zero. This test does not use a constraint value. This test cannot be used with datetime or binary datatypes.
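The FORMAT test's pattern matching can be sketched with Python's re module. This is an illustration only; PowerCenter evaluates the pattern with its own REG_MATCH function, and the phone-number pattern and field values below are hypothetical.

```python
import re

def format_test_bad_records(values, pattern):
    """Sketch of a FORMAT constraint: a value is a bad record when it
    does not match the regular-expression pattern (akin to REG_MATCH
    returning false for that row). Nulls are left to the NOT_NULL test."""
    return [v for v in values if v is not None and not re.fullmatch(pattern, v)]

# Hypothetical check that phone numbers look like 999-999-9999:
bad = format_test_bad_records(["650-385-5000", "not a phone"], r"\d{3}-\d{3}-\d{4}")
```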

Field

To create a single-table constraint, you must select the field that contains the values you want to test. The fields available in the single table appear in the Field drop-down list. Select a field from the list.

Condition

You can filter the values for the test field in a VALUE, FORMAT, NOT_NULL, or NOT_BLANK test. Data Validation Option does not test records that do not satisfy the filter condition. For example, you want to test rows in an ORDERS table only if the store ID number is not “1036.” Enter STORE_ID <> 1036 in the Condition field.

Data Validation Option does not check the condition syntax. Any valid PowerCenter expression, including expressions that use PowerCenter functions, is allowed. If the PowerCenter syntax is not valid, a mapping installation error occurs when you run the test.

Enter the filter condition in the Condition field. Because the PowerCenter Integration Service processes the filter condition, it must use valid PowerCenter syntax. Do not include the WHERE keyword.

Operator

The operator defines how to compare the test result for the field with the constraint value. Enter an operator for aggregate, VALUE, and FORMAT tests.


The following table describes the operators available in the Operator field:

Operator - Definition - Description

= - Equals - The test result for the field is the same as the constraint value. For example, SUM(field) is the same as the constraint value.

<> - Does not equal - The test result for the field is not the same as the constraint value.

< - Is less than - The test result for the field is less than the constraint value.

<= - Is less than or equal to - The test result for the field is less than or equal to the constraint value.

> - Is greater than - The test result for the field is greater than the constraint value.

>= - Is greater than or equal to - The test result for the field is greater than or equal to the constraint value.

~ - Is approximately the same as - The test result for the field is approximately the same as the constraint value. The approximate operator requires a threshold value. It only applies to numeric datatypes.

Between - Is between two values entered - The test result for the field is between the two constants entered as the constraint value. This operator is generally used for numeric or datetime datatypes.

Not Between - Is not between two values entered - The test result for the field is not between the two constants entered as the constraint value. This operator is generally used for numeric or datetime datatypes.

In - Is included in a list of values entered - The test result for the field is in the list of constants entered as the constraint value.

Not In - Is not included in a list of values entered - The test result for the field is not in the list of constants entered as the constraint value.

Note: Data Validation Option compares string fields using an ASCII table.

RELATED TOPICS:
- “BIRT Report Examples” on page 119

Constraint Value

The constraint value represents a constant value to which you want to compare the field values. For example, you might want to verify that all values in the ORDER_DATE field fall between January 1, 2010 and December 31, 2010. Or, you might want to verify that the minimum ORDER_ID number is greater than 1000.

The constraint value must be a string, numeric, or datetime constant. The datatype of the constraint value depends on the test.


The following table lists the constraint value datatype allowed for each test:

Test - Datatype

COUNT, COUNT_DISTINCT, COUNT_ROWS - Integer

MIN, MAX, SUM, VALUE - Same as the Field datatype.

FORMAT - String

AVG - Double

UNIQUE, NOT_NULL, NOT_BLANK - These tests do not use a constraint value.

Enter a constraint value in the Constraint Value field. Enter a constant or list of constants separated by commas.

The number of constants you enter as the constraint value depends on the operator you use:

Arithmetic operator such as =, <>, or ~

Enter a single constant.

Between or Not Between operator

Enter two constants separated by a comma.

In or Not In operator

Enter multiple constants separated by commas.

Enclose each string value, datetime value, or format pattern within single quotes. Datetime values must match the PowerCenter standard datetime format of MM/DD/YYYY HH24:MI:SS.
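The constant-count rules above can be sketched as a small validator. The helper names and the sample constraint values are hypothetical illustrations; Data Validation Option performs this validation internally.

```python
def expected_constants(operator):
    """Sketch of how many constants a constraint value needs per operator,
    following the rules above."""
    if operator in ("Between", "Not Between"):
        return 2            # exactly two constants, separated by a comma
    if operator in ("In", "Not In"):
        return None         # any number of constants
    return 1                # arithmetic operators such as =, <>, or ~

def validate_constraint(operator, constraint_value):
    constants = [c.strip() for c in constraint_value.split(",")]
    need = expected_constants(operator)
    return need is None or len(constants) == need

# Two datetime constants for a Between test on ORDER_DATE:
ok = validate_constraint("Between", "'01/01/2010 00:00:00', '12/31/2010 00:00:00'")
```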

Remaining Controls on Test Editor

The remaining controls on the Single Table Test Editor are used in the same manner as for table pairs.

Adding Tests

To add a test to a single table, right-click the name of the table in the Navigator or Single Tables tab, or right-click in the Tests tab, and select Add Constraint Test. The Single Table Test Editor dialog box opens.

Editing Tests

To edit a test, right-click the test in the Navigator or Tests tab, and select Edit Test. You can also double-click the test name. The Single Table Test Editor dialog box opens.


Deleting Tests

To delete a test, right-click the test in the Navigator or the Tests tab, and select Delete Test. You can also use the Delete key.

Running Tests

Use one of the following methods to run tests:

- Select one or more table pairs and click Run Tests.
- Right-click a folder in the Navigator and select Run Folder Tests.
- Right-click a test in the Tests tab and select Run Selected Tests.

Data Validation Option runs all tests for a single table together. You cannot run tests individually unless only one test is set up for the table. If you select an individual test, Data Validation Option runs all tests for the single table.

After you run a test, you can view the results on the Results tab.

Data Validation Option uses the following logic to determine whether a test passes or fails:

- An aggregate test is calculated as “value <operator> constraint.” If this relationship is true, the test passes. If the operator chosen is approximate, the test is calculated as ABS(value - constraint) <= threshold.
- A VALUE test must produce fewer or an equal number of records that do not match compared to the threshold value. If there is no threshold value and there are no records that do not match, the test passes.
- A FORMAT test is calculated as “value <operator> constraint.” If this relationship is true, the test passes.
- A UNIQUE, NOT_NULL, or NOT_BLANK test passes if the field value is unique, is not null, or is not blank, respectively. For string values, not blank means the string is not null or empty. For numeric values, not blank means the number is not null or 0.
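The aggregate-test logic above can be sketched as follows. This is an illustration of the documented pass/fail rule, not the product's implementation, and the example salary figures are invented.

```python
import operator as op

# "value <operator> constraint" passes when true; the approximate
# operator (~) passes when ABS(value - constraint) <= threshold.
OPS = {"=": op.eq, "<>": op.ne, "<": op.lt, "<=": op.le, ">": op.gt, ">=": op.ge}

def aggregate_test_passes(value, operator_symbol, constraint, threshold=0):
    if operator_symbol == "~":
        return abs(value - constraint) <= threshold
    return OPS[operator_symbol](value, constraint)

# SUM(SALARY) must be approximately 1,000,000 within a threshold of 500:
result = aggregate_test_passes(1000250, "~", 1000000, threshold=500)
# → True
```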

When you run tests, the target repository folders must be closed in the Designer and Workflow Manager. If the target folders are open, Data Validation Option cannot write to the folders, and the tests fail.

Bad Records

When you select a test on the Tests tab, the records that fail the test appear on the Results tab. Different columns appear on the Results tab depending on the test.

Aggregate Tests

Aggregate tests do not display bad records. The Results tab displays the test result value.

VALUE, FORMAT, NOT_NULL, and NOT_BLANK Tests

These tests display the following columns for each bad record:

- The key or expression for the field
- The field value

UNIQUE Tests

UNIQUE tests display the non-unique field values.


Chapter 11

SQL Views

This chapter includes the following topics:

- SQL Views Overview
- SQL View Properties
- Adding SQL Views
- Editing SQL Views
- Deleting SQL Views

SQL Views Overview

SQL views facilitate the use of more complex functionality for single tables and table pairs. An SQL view allows you to use several tables and several calculations in a query to produce a set of fields that you can use as a table in a single table or table pair. This functionality is similar to the SQL override in PowerCenter or a view in a relational database. You can use any valid SQL statement to create an SQL view.

SQL View Properties

You can view SQL view properties by selecting an SQL view in either the Navigator or the SQL Views tab and viewing the properties. Most properties come from the values entered in the SQL View Editor. Other properties come from the tests set up for and run on the SQL view.

Edit SQL view properties in the SQL View Editor dialog box when you add or edit an SQL view.

The following table describes the SQL view properties:

Property - Description

Description - SQL view description.

Table Definitions - Tables used to create the SQL view. If you identify the table with an alias, enter the alias name with the table name. All tables you use in the SQL view must exist in the same database.

Connection - PowerCenter connection for the tables.

Column Definition - The columns that make up the SQL view. Data Validation Option imports all columns from the tables you select. You can create, delete, and rearrange columns.

SQL Statement - SQL statement you run against the database to retrieve data for the SQL view.

Comment - Information about an SQL view. Data Validation Option displays the comment when you view the SQL view in the Properties area.

Description

Enter a description so you can identify the SQL view. Data Validation Option displays the description in the Navigator and on the SQL Views tab. The description can include spaces and symbols.

Table Definitions and Connection

To provide Data Validation Option with the information it needs to create an SQL view, you must specify the tables that the SQL statement is based on and the corresponding database connection. When you provide the tables and connection information, Data Validation Option can access the metadata that is necessary for the view to function correctly.

To add a table, click Add Table. The Choose Data Source dialog box opens. This dialog box displays all of the relational tables available in the repositories. You can sort the information in this dialog box by clicking the column headers. You can reduce the number of items to select by typing one or more letters of the table, file, or view name in the Search field. Select a table and click Select. All of the tables you use in an SQL view must exist in the same database.

Note: You cannot create an SQL view with a self-join. Use a join view to create self-joins.

If you identify the table with an alias in the SQL statement you use to create the view, enter the alias name next to the table name.

When you finish adding tables, select the PowerCenter connection for the tables from the Connection list.

Column Definition

After you specify the tables on which the SQL view is based, you must specify the columns that make up the view. The number of columns you define for the view must match the SQL statement.

To import the columns from the tables you select, click Populate. Data Validation Option imports all columns in the tables. Delete the columns that you do not want to use. You can rearrange the columns in the view.

You can also create a column. To do this, open a column field in the Column and Expression Definition list and select Column. Enter the column name in the SQL View Column Editor dialog box. You must also specify the datatype, precision, and scale for the column. The datatype, precision, and scale information must match the transformation datatype that Informatica uses for the column. For datetime, string, and integer datatypes, the scale must be zero.

SQL Statement

Enter an SQL statement to retrieve data for the SQL view.


The statement that you enter runs as a query against the database, so it must use valid database syntax. Also, the columns that you enter in the SELECT statement must match the columns in the Column Definition list in number, position, and datatype.

To avoid errors when you run tests, test the SQL statement in the database before you paste it into the SQL Statement field. Data Validation Option does not check the SQL statement syntax.

You can call a stored procedure from an SQL view. The connection that you specify in the SQL view for the source must have permission on the stored procedure.

Comment

You can associate a comment with the view. Data Validation Option displays the comment when you view the SQL view in the Properties area.

Adding SQL Views

To create an SQL view, right-click SQL Views in the Navigator or right-click in the SQL Views tab, and select Add SQL View. The SQL View Editor dialog box opens.

Editing SQL Views

To edit an SQL view, right-click the SQL view in the Navigator or SQL Views tab, and select Edit SQL View. The SQL View Editor dialog box opens.

Deleting SQL Views

To delete an SQL view, right-click the SQL view in the Navigator or SQL Views tab, and select Delete SQL View. You can also select the SQL view and press the Delete key.

When you delete an SQL view, Data Validation Option deletes all table pairs, single tables, and tests that use the SQL view.


Chapter 12

Lookup Views

This chapter includes the following topics:

- Lookup Views Overview
- Lookup View Properties
- Adding Lookup Views
- Editing Lookup Views
- Deleting Lookup Views
- Lookup Views Example
- Joining Flat Files or Heterogeneous Tables using a Lookup View

Lookup Views Overview

Data Validation Option lookup views allow you to test the validity of the lookup logic in your transformation layer.

Lookup views allow you to validate the process of looking up a primary key value in a lookup, or reference, table using a text value from a source, and then storing the lookup table primary key in the target fact table. For example, a product name in the source system might be in a dimension that serves as the lookup table. The data transformation process involves looking up the product name and placing the primary key from the lookup table in the target fact table as a foreign key. You must validate the product name in the source table against the foreign key in the target table.

The following table lists the keys used in the example:

Source Table - source_id, product_name
Lookup Table - lookup_id, product_name
Target Table - target_id, source_id, lookup_id

The source table product name field is found in the lookup table. After the product name is found, the primary keyfrom the lookup table is stored in the target table as a foreign key.

To test the validity of the lookup table foreign key in the target table, complete the following tasks:

1. Create the lookup view. Add the source table and the lookup table to the lookup view. Then create a relationship between the product name in the source and lookup tables.


2. Create a table pair with the lookup view and the table that is the target of the data transformation process. Join the tables on the source table primary key, which is stored in the target table as a foreign key.

3. Create an OUTER_VALUE test that compares the primary key of the lookup table to the lookup ID that is stored as a foreign key in the target table.

The OUTER_VALUE test checks the validity of the lookup table primary key stored in the target table against the contents of the source table. The test also finds any orphans, which are records in the target table that do not match any records in the lookup table.

Lookup View Properties

Lookup view properties describe the source and lookup tables in a lookup view.

You can view lookup view properties by selecting a lookup view in either the Navigator or the Lookup Views tab and viewing the properties. Most properties come from the values entered in the Lookup View Editor dialog box. Other properties come from the tests set up for and run on the lookup view.

Edit lookup view properties in the Lookup View Editor dialog box when you add or edit a lookup view.

The following table describes the lookup view properties:

Property                       Description
Source Table                   Source table name.
Source Conn                    PowerCenter connection for the source table.
Override Owner Name            Overrides the schema or owner name for the source table.
Source Dir                     Source file directory if the source table is a flat file. The path is relative to the machine that hosts Informatica Services.
Source File                    File name, including file extension, if the source table is a flat file.
Lookup Table                   Lookup table name.
Lookup Conn                    PowerCenter connection for the lookup table.
Lookup Source Dir              Source file directory if the lookup table is a flat file. The path is relative to the machine that hosts Informatica Services.
Lookup Source File             File name, including file extension, if the lookup table is a flat file.
Description                    Lookup view description.
Source to Lookup Relationship  The fields on which the source table and lookup table are joined.

Selecting Source and Lookup Tables

A lookup view consists of a source table and a lookup table. Use the Browse button and the Select Data Sources dialog box to select the source and lookup table in the same way that you select tables for table pairs.


Selecting Connections

Select the correct connections for the source and lookup tables in the same way that you select connections for table pairs.

Overriding Owner Name

You can override the owner name for the source table, but not for the lookup table. To specify a different owner or schema name for the lookup table, create a connection in the Workflow Manager, and use that connection for the lookup table.

Source Directory and File

If the source or lookup table is a flat file, specify the source directory and file name, including the file extension, in the same way that you specify source directories and file names for table pairs.

Description

Data Validation Option automatically generates the description for a lookup view based on the tables you select. You can change the description.

Source to Lookup Relationship

In the source and lookup tables, select the values you want to look up in the lookup table.

Adding Lookup Views

To create a lookup view, right-click Lookup Views in the Navigator or right-click in the Lookup Views tab, and select Add Lookup View. The Lookup View Editor dialog box opens.

Select the source table and lookup table, and create the lookup relationship between them. That is, select the field to look up in the lookup table. You cannot use expressions for lookup view join fields.

The lookup view you create includes fields from both the source and the lookup table, joined on the lookup relationship fields. Data Validation Option precedes the source table field names with "S_."

You can use the lookup view to validate data in the target table, where the lookup table primary key is stored as a foreign key.

Editing Lookup Views

To edit a lookup view, right-click the lookup view in the Navigator or Lookup Views tab, and select Edit Lookup View. The Lookup View Editor dialog box opens. You cannot modify the sources in a lookup view. You can modify the lookup relationship.


Deleting Lookup Views

To delete a lookup view, right-click the lookup view in the Navigator or Lookup Views tab, and select Delete Lookup View. You can also select the lookup view and press the Delete key.

When you delete a lookup view, Data Validation Option deletes all table pairs and tests that use the lookup view.

Lookup Views Example

Use a lookup view to test the validity of the foreign key stored in the target, or fact, table and to confirm that there are no orphans.

The following tables display sample data that is typical of data used to build a target table.

Source Table

ORDER_ID PRODUCT_NAME AMOUNT

101 iPod 100

102 Laptop 500

103 iPod 120

Product Lookup Table

LKP_PRODUCT_ID LKP_PRODUCT_NAME

21 iPod

22 Laptop

Target Table

TARGET_ID ORDER_ID LKP_PRODUCT_ID AMOUNT

1 101 21 100

2 102 22 500

3 103 21 120

To test the validity of the lookup table foreign key in the target table, perform the following steps:

Create the lookup view.

Create a lookup view with the source and lookup tables. The lookup relationship uses the product name fields in both the source and the lookup tables. The lookup view includes the following fields:

- S_ORDER_ID
- S_PRODUCT_NAME


- S_AMOUNT
- LKP_PRODUCT_ID
- LKP_PRODUCT_NAME

Note that the fields that originate in the source table have "S_" as a prefix.

Create the table pair.

Create a table pair using the lookup view and the target table. Create a join relationship between the source table primary key and the same field stored in the target table as a foreign key as follows:

S_ORDER_ID and ORDER_ID

Create an OUTER_VALUE test.

Create an OUTER_VALUE test. Compare LKP_PRODUCT_ID in both the lookup table and the target table as follows:

LKP_PRODUCT_ID and LKP_PRODUCT_ID
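Data Validation Option runs this test as a generated PowerCenter mapping, but the logic can be sketched in ordinary SQL. The following Python snippet is an illustration only: it uses sqlite3 and the sample data above, and approximates the OUTER_VALUE test's full outer join with a left outer join from the lookup view to the target.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE src (ORDER_ID INT, PRODUCT_NAME TEXT, AMOUNT INT);
CREATE TABLE lkp (LKP_PRODUCT_ID INT, LKP_PRODUCT_NAME TEXT);
CREATE TABLE tgt (TARGET_ID INT, ORDER_ID INT, LKP_PRODUCT_ID INT, AMOUNT INT);
INSERT INTO src VALUES (101,'iPod',100),(102,'Laptop',500),(103,'iPod',120);
INSERT INTO lkp VALUES (21,'iPod'),(22,'Laptop');
INSERT INTO tgt VALUES (1,101,21,100),(2,102,22,500),(3,103,21,120);
""")
# The lookup view joins source and lookup on product name. The test then
# joins the view to the target on ORDER_ID and flags rows where the lookup
# IDs disagree or the target row is missing.
bad_rows = cur.execute("""
    SELECT v.S_ORDER_ID, v.LKP_PRODUCT_ID, t.LKP_PRODUCT_ID
    FROM (SELECT s.ORDER_ID AS S_ORDER_ID, l.LKP_PRODUCT_ID
          FROM src s JOIN lkp l ON s.PRODUCT_NAME = l.LKP_PRODUCT_NAME) v
    LEFT JOIN tgt t ON v.S_ORDER_ID = t.ORDER_ID
    WHERE t.LKP_PRODUCT_ID IS NULL OR v.LKP_PRODUCT_ID <> t.LKP_PRODUCT_ID
""").fetchall()
print(bad_rows)  # prints [] -- every target foreign key matches the lookup
```

With the sample data, every target row carries the correct foreign key, so the check returns no bad rows.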

Joining Flat Files or Heterogeneous Tables Using a Lookup View

One disadvantage of the SQL view is that it does not allow the use of flat files or heterogeneous database tables. You can join two heterogeneous sources with a lookup view. You can think of the source to lookup relationship as an inner join between the two tables or files.
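As an illustration of that inner-join behavior, the following sketch joins a flat-file source against a relational lookup table. The data and column names are hypothetical; Data Validation Option performs the equivalent work inside a PowerCenter mapping rather than in client code.

```python
import csv
import io
import sqlite3

# A StringIO stands in for a flat file; a sqlite table stands in for the
# relational lookup table.
flat = io.StringIO("ORDER_ID,PRODUCT_NAME\n101,iPod\n102,Laptop\n")
rows = list(csv.DictReader(flat))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lkp (LKP_PRODUCT_ID INT, LKP_PRODUCT_NAME TEXT)")
conn.executemany("INSERT INTO lkp VALUES (?, ?)", [(21, "iPod"), (22, "Laptop")])
ids = {name: pid for pid, name in conn.execute("SELECT * FROM lkp")}

# Inner-join semantics: keep only flat-file rows whose product name exists
# in the lookup table, pairing each with the lookup primary key.
joined = [(int(r["ORDER_ID"]), ids[r["PRODUCT_NAME"]])
          for r in rows if r["PRODUCT_NAME"] in ids]
print(joined)  # [(101, 21), (102, 22)]
```

Rows whose lookup value has no match in the lookup table would simply drop out of the result, which is exactly the inner-join behavior described above.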


Chapter 13: Join Views

This chapter includes the following topics:

- Join Views Overview
- Join View Data Sources
- Join View Properties
- Adding a Join View
- Join View Example

Join Views Overview

A join view is a virtual table that contains columns from related heterogeneous data sources joined by key columns.

Use a join view to run tests on several related columns across different tables. You can create a join view instead of multiple SQL views with joins. For example, assume that the Employee table contains employee details, the Inventory table contains sales details, and the Customer table contains customer details. If you create a join view with these tables, you can obtain a consolidated view of the inventory sold by each partner and the revenue generated by the employees associated with the partners. You can run tests with the join view to validate data across the tables.

You can create a join view with different types of data sources. For example, you can create a join view with a flat file, an SAP table, and an Oracle table.

You can add a join view in a single table or a table pair. You can then create tests with the table pair or single table to validate the data in the join view. Add multiple data sources to the join view and add join conditions to define the relationship between the data sources.

Join View Data Sources

You can create a join view with tables from multiple data sources such as relational databases, applications, and flat files.

You can use the following sources when you create a join view:

- Oracle
- IBM DB2
- Microsoft SQL Server
- Netezza


- PowerExchange for DB2/zOS
- Flat files
- SAP
- SAS
- Salesforce.com

Join View Properties

Join view properties include table definitions and join conditions for each data source.

The following table describes the join view properties:

Property      Description
Description   Description of the join view.
Order         Sequence of data sources in the join view. You can join a data source with any of the preceding data sources.
Table Name    Data source that you add in the join view.
Alias         Alias name for the table.
Join Type     Type of join used to join the data sources.
Where Clause  The WHERE clause to filter rows in the data source.
Left Field    Left field of the join condition. The left field is the key column of the data source to which you create the join.
Right Field   Right field of the join condition. The right field is the key column of the data source for which you create the join.

Connection Properties

Choose the connection properties based on the data source type.

You must select a connection for all data sources except flat files. For flat files, you must provide the source directory and the file name. Connections are PowerCenter connection objects created in the Workflow Manager.


Relational Connection Properties

Choose the relational connection properties for Microsoft SQL Server, Oracle, IBM DB2, Netezza, and PowerExchange for DB2 data sources.

Configure the following properties when you select a relational data source:

Property             Description
Connection           PowerCenter connection object to connect to the relational data source.
Override Owner Name  Overrides the database name and schema of the source. For example, a Microsoft SQL Server table is identified by <database>.<schema>.<table>. To override both the database and the schema, enter <new database name>.<new schema name> in the text box. To change only the schema, enter <new schema name>. You cannot change only the database name.

SAS and Salesforce Connection Properties

You can use an SAS or Salesforce data source in Data Validation Option.

Select the PowerCenter connection object when you select an SAS or Salesforce data source.

Note: You cannot override the owner name for SAS and Salesforce data sources.

SAP Connection Properties

You must configure the SAP authentication information to use SAP data sources.

Configure the following properties when you select an SAP data source:

Property            Description
Connection          PowerCenter connection object to connect to the SAP data source.
SAP User Name       SAP source system connection user name. Must be a user for which you have created a source system connection.
SAP Password        Password for the user name.
SAP Client          SAP client number.
SAP Language        Language you want for the mapping. Must be compatible with the PowerCenter Client code page. Data Validation Option does not authenticate the value. Ensure that you enter the correct value so that the tests run successfully.
SAP Connect String  Type A or Type B DEST entry in saprfc.ini.

SAP Data Sources in Data Validation Option

You cannot override the owner name for an SAP data source. Data Validation Option uses the stream mode for installation of the ABAP programs and cannot use the FTP mode. SAP data source field names must not contain the slash (/) character.


Flat File Connection Properties

You can use the flat file data sources in the PowerCenter repository.

Configure the following properties when you select a flat file data source:

Property     Description
Source Dir   Directory that contains the flat file. The path is relative to the machine that hosts Informatica Services.
Source File  File name with the file name extension.

Database Optimization in a Join View

You can optimize the data sources in a join view for better performance. You can choose to select a subset of data in the data source and aggregate the rows for a relational data source.

To improve the read performance of the join view, you can provide a WHERE clause. The WHERE clause ensures that the data source uses a subset of data that satisfies the condition specified in the WHERE clause.

Data Validation Option does not check the WHERE clause syntax. If the PowerCenter Integration Service executes the WHERE clause, any valid PowerCenter expression, including expressions that use PowerCenter functions, is allowed. If the PowerCenter syntax is not valid, a mapping installation error occurs.

Use a PowerCenter expression when you do not push down the WHERE clause to the data source.

Use the following guidelines if the data source executes the WHERE clause:

- Relational data source. The WHERE clause must be a valid SQL statement. If the SQL statement is not valid, a runtime error occurs.
- SAP data source. The WHERE clause must be a valid SAP filter condition in the ERP source qualifier.
- Salesforce data source. The WHERE clause must be a valid SOQL filter condition.
- SAS data source. The WHERE clause must be a valid Where clause Overrule condition in the SAS source qualifier.

You can choose one of the following optimization levels when you configure a data source in a join view:

- Default. Data Validation Option converts all test logic to a PowerCenter mapping and applies sorting to the data source.
- WHERE clause, Sorting, and Aggregation in DB. Data Validation Option pushes the WHERE clause, sorting logic for joins, and all aggregate tests to the database. Data Validation Option converts all other test logic to a PowerCenter mapping. You can choose this option with relational data sources.
- Already Sorted Input. The PowerCenter mapping does not sort the input. Ensure that you sort the data so that the tests run successfully.
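The pushdown choice changes where the filtering happens, not the result. A minimal sketch of the two evaluation paths, using a hypothetical table and sqlite3 purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (ORDER_ID INT, REGION TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "EMEA"), (2, "APAC"), (3, "EMEA")])

# WHERE clause pushed down: the database filters before rows reach the mapping.
pushed = conn.execute(
    "SELECT ORDER_ID FROM orders WHERE REGION = 'EMEA'").fetchall()

# Same condition applied in the mapping: all rows are read, then filtered.
all_rows = conn.execute("SELECT ORDER_ID, REGION FROM orders").fetchall()
in_mapping = [(oid,) for oid, region in all_rows if region == "EMEA"]

print(pushed == in_mapping)  # True: identical rows, but pushdown reads fewer
```

Either path yields the same subset; pushing the clause down simply moves the work to the database so less data crosses the wire.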

Join Types

You can choose join types to link fields in different data sources.

You can create the following types of joins between two tables in a join view:

Inner join

An inner join creates a result table by combining column values in two tables A and B based on the join condition. Data Validation Option compares each row in A with each row in B to find all pairs of rows that satisfy the join condition.


Left outer join

A left outer join between tables A and B contains all records of the left table A, even if the join condition does not find any matching record in the right table B. The resulting join contains all rows from table A and the rows from table B that match the join condition.

Right outer join

A right outer join between tables A and B contains all records of the right table B, even if the join condition does not find any matching record in the left table A. The resulting join contains all rows from table B and the rows from table A that match the join condition.

Full outer join

A full outer join contains all the rows from tables A and B. The resulting join has null values for every column in the tables that does not have a matching row.
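The four join types can be compared side by side. This sqlite3 sketch uses hypothetical tables A and B; the full outer join is emulated with a UNION because older SQLite versions do not support FULL OUTER JOIN directly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INT, name TEXT);
CREATE TABLE b (id INT, qty INT);
INSERT INTO a VALUES (1,'x'),(2,'y');
INSERT INTO b VALUES (2,20),(3,30);
""")

# Inner join: only the rows that satisfy the join condition.
inner = conn.execute("SELECT a.id, b.qty FROM a JOIN b ON a.id = b.id").fetchall()

# Left outer join: every row of A, with NULL where B has no match.
left = conn.execute("SELECT a.id, b.qty FROM a LEFT JOIN b ON a.id = b.id").fetchall()

# Full outer join, emulated: left join plus the B-only rows.
full = conn.execute("""
    SELECT a.id AS a_id, b.id AS b_id FROM a LEFT JOIN b ON a.id = b.id
    UNION ALL
    SELECT a.id, b.id FROM b LEFT JOIN a ON a.id = b.id WHERE a.id IS NULL
""").fetchall()

print(inner)  # [(2, 20)] -- only id 2 appears in both tables
print(left)   # id 1 pairs with None; id 2 pairs with 20
print(full)   # ids 1, 2, and 3 all appear, with None on the unmatched side
```

A right outer join is simply the left outer join with the table roles swapped.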

Alias in Join View

The alias of a data source in a join view helps you identify data sources that share the same name.

By default, Data Validation Option assigns the data source name as the alias name. If you select a data source with the same name as another data source in the join view, Data Validation Option appends a number to the alias name.

Note: If you edit the alias name after you create a join view, Data Validation Option deletes the join conditions. Create the join conditions again to save the join view.

Join Conditions

You must configure join conditions for all the data sources in the join view.

When you configure a join condition, select the data source in the table definition list and specify the left and right fields of the join condition. Data Validation Option displays the alias names of the data sources. The left field of the join condition consists of the output fields from the alias name you choose. The right field consists of the output fields of the data source for which you configure the join condition.

You can create a join condition with any of the data sources in the previous rows in a join view. You can create multiple join conditions for the same data source. You cannot save a join view unless you create at least one valid join condition for all the data sources.

Adding a Join View

You must configure all the table definitions and join conditions when you configure a join view.

1. Click File > New > Join View.

The Join View Editor dialog box appears.

2. Enter a description for the join view.

3. Click Add in the Table Definitions pane.

The Choose Data Source dialog box appears.

4. Select the data source.

5. Configure the table definition for the data source.


6. Optionally, click Output Fields and select the fields that you want to view when you create the join condition and when you configure the tests.

7. Configure multiple table definitions as required.

You can change the sequence of data sources in the join view and delete table definitions. When you make any change to the data sources in the join view, you must re-create the join conditions.

8. Configure join conditions for all the data sources.

You can join a data source with any of the preceding data sources in the Table Definitions pane. You need not specify a join condition for the first data source.

Configuring a Table Definition

Configure a table definition after you add a data source to the join view in the Join View Editor dialog box.

1. Select the data source in the Join View Editor dialog box.

2. Click Edit in the Table Definitions pane.

The Edit Table dialog box appears.

3. Enter an alias name for the data source.

By default, the data source name appears as the alias name.

4. Select the join type for the data source.

Select join types for all the data sources except the first data source in the join view.

5. Configure the connection details for the table. The connection details vary depending on the data source type.

6. Optionally, enter the WHERE clause.

Data Validation Option runs the WHERE clause when it fetches data from the table.

7. If the table is relational, you can choose to push down the WHERE clause to the database.

8. Select the database optimization level.

9. Click OK.

Configuring a Join Condition

Add join conditions to specify the relationship between data sources. You can create join conditions for a data source with any other data source in the previous rows.

1. Select the data source in the Join View Editor dialog box.

2. Click Add in the Join Conditions pane.

The Join Condition dialog box appears.

3. Select the alias name of any of the data sources in the previous rows.

4. Select the left field of the join condition.

The left field of the join condition consists of the output fields from the alias name you choose. You can also configure and validate a PowerCenter expression as the field. When you enter a field name in the expression, prefix it with the alias name followed by an underscore. For example, if the alias name of the table is customer1 and you want to use the CustID field in an expression, enter the expression as customer1_CustID > 100.

5. Select the right field of the join condition.

The right field consists of the output fields of the data source for which you configure the join condition. You can also configure and validate a PowerCenter expression as the field.

6. Click OK.


Managing Join Views

You can edit and delete join views that you create in Data Validation Option.

1. Click Join Views in the Navigator.

Join views appear in the right pane.

2. Edit or delete the join view.

If you modify the join view, re-create the join conditions in the join view. If you delete a join view, you must re-create the table pairs or single tables that contain the join view.

Join View Example

You need to validate the inventory sales made by the employees and partners to cross-check against the annual sales report.

The Account table in an SAP system holds the information of an employee account. The Partner table is a Salesforce table that contains the information of the inventory sold to a partner associated with an employee. Inventory is a flat file that contains the details of the inventory sold. Account_History is an Oracle table that contains the history of activities done by the account.

The current requirement is to validate data across the tables based on the inventory sales of an account. You also need to validate the account details against the historic account details to check for discrepancies. You can create a join view with the tables so that you can run single-table tests to validate the data.

Tables and Fields

The following lists show the join view tables and their columns:

Account (SAP) contains the following columns:
- Account ID
- Account Name
- Collection
- Inventory

Partner (Salesforce) contains the following columns:
- Partner ID
- Partner Name
- Inventory
- Cost
- Associated Account ID


Inventory (flat file) contains the following columns:
- Inventory ID
- Quantity
- Associated Partner ID
- Associated Account ID

Account_History (Oracle) contains the following columns:
- Historic Account ID
- Account Name
- Total Inventory
- Total Collection

Creating the Join View

1. Enter Account_Cumulative as the description.

2. Add Account as the first table in the join view.

3. Add Partner, Inventory, and Account_History tables in that order.

4. Configure the table definitions with the required join types.

5. Create join conditions for Partner, Inventory, and Account_History.

Table Definition Configuration

The following list describes the tables and their join types when you configure the table definitions:

Partner

You want to capture the details of partners associated with each account. Configure an inner join for the Partner table so that Data Validation Option adds the details of the partners for which there are corresponding accounts to the join view.

Inventory

You want to capture the details of the inventory sold by the partners. Configure an inner join for the Inventory table so that Data Validation Option adds the details of the inventory for which there are corresponding partners to the join view.

Account_History

You want to capture the historic details of an account. Configure a left outer join for the Account_History table so that Data Validation Option adds all the historic account details to the join view.

Adding Join Conditions

Configure the following join conditions for the tables:

Partner

Select Account as the join table. Select the Account ID output field from the Account table as the left field and the Associated Account ID output field from the Partner table as the right field of the join.

Inventory

Select Partner as the join table. Select the Partner ID output field from the Partner table as the left field and the Associated Partner ID output field from the Inventory table as the right field of the join.

Account_History

Select Account as the join table. Select the Account ID output field from the Account table as the left field and the Historic Account ID output field from the Account_History table as the right field of the join.
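The table definitions and join conditions above amount to a chain of joins. As an illustration only, with hypothetical column names and sample rows standing in for the SAP, Salesforce, flat-file, and Oracle sources, the equivalent SQL can be sketched with sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas and sample rows for the four sources.
conn.executescript("""
CREATE TABLE account (account_id INT, account_name TEXT);
CREATE TABLE partner (partner_id INT, associated_account_id INT);
CREATE TABLE inventory (inventory_id INT, quantity INT, associated_partner_id INT);
CREATE TABLE account_history (historic_account_id INT, total_inventory INT);
INSERT INTO account VALUES (1, 'Acme'), (2, 'Beta');
INSERT INTO partner VALUES (10, 1);
INSERT INTO inventory VALUES (100, 5, 10);
INSERT INTO account_history VALUES (1, 40);
""")
# Inner joins keep only accounts with partners and partners with inventory;
# the left outer join keeps those rows even when no history record matches.
view = conn.execute("""
    SELECT a.account_id, p.partner_id, i.inventory_id, h.historic_account_id
    FROM account a
    JOIN partner p ON a.account_id = p.associated_account_id
    JOIN inventory i ON p.partner_id = i.associated_partner_id
    LEFT JOIN account_history h ON a.account_id = h.historic_account_id
""").fetchall()
print(view)  # [(1, 10, 100, 1)] -- account 2 drops out of the inner joins
```

Account 2 has no partner, so the inner joins exclude it; account 1 survives the chain and picks up its history row through the left outer join.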


The following figure illustrates the formation of the join view with the table relationship:

After you create the join view, create a single table with the join view. Generate and run tests on the single table to validate the data in the join view.

Removing a Table from the Join View

After you run the required tests for the join view with all the tables, you might want to remove the partner information and run tests solely for the account. If you remove the Partner table from the join view, the join condition for the Inventory table is no longer valid. You need to create another join condition for the Inventory table to save the join view.

The following figure illustrates the broken join view:

Add a join condition to the Inventory table with Account as the join table. Join the Account ID field in the Account table with the Associated Account ID field in the Inventory table.

The following figure illustrates the join view without the Partner table:


Chapter 14: Reports

This chapter includes the following topics:

- Reports Overview
- Business Intelligence and Reporting Tools (BIRT) Reports
- Jasper Reports
- Jasper Report Types
- Dashboards
- Metadata Manager Integration

Reports Overview

Data Validation Option stores all test definitions and test results in the Data Validation Option repository. You can run reports to display test definitions and results.

You can use the BIRT reporting engine to generate the reports. If you have the enterprise license, you can use JasperReports Server to generate the reports.

If you use the JasperReports Server, you can also generate dashboards at different levels. You can also view the metadata properties of the data source in a test if you configure Metadata Manager integration for the Data Validation Option repository.

Business Intelligence and Reporting Tools (BIRT) Reports

You can use the BIRT reporting engine available in Data Validation Option to generate reports.

You can generate a report for one or more table pairs or single tables.

You can generate the following BIRT reports in Data Validation Option:

Summary of Testing Activities

The Summary of Testing Activities report displays the number of table pairs or single tables, the number of tests for each table pair or single table, and the overall test results.


Table Pair Summary

The Table Pair Summary report lists each table pair or single table with the associated tests. Data Validation Option displays each table pair or single table on a separate page. The report includes a brief description of each test and result.

Detailed Test Results

The Detailed Test Results report displays each test on a separate page with a detailed description of the test definition and results. If one of the test sources is an SQL view or a lookup view, the report also displays the view definition.

Note: Report generation can take several minutes, especially when the report you generate contains hundreds of tests or test runs.

BIRT Report Generation

You can generate a report for one or more table pairs or single tables. You can also generate a report for all table pairs and single tables in a folder or a test.

Right-click the objects for which you want to generate a report, and select Generate Report. The Report Parameters dialog box appears.

The following table describes the report options:

Option           Description
User             User that created the tests. By default, Data Validation Option generates a report for the tests that the current user creates and runs. Select All to display tests created and run by all users. Select a user name to display tests created and run by that user.
Table Pair       Table pairs or single tables for which you want to run a report. You can generate a report on all table pairs and single tables, on the table pairs and single tables in a folder, or on a test. Data Validation Option gives you different options depending on what you select in the Navigator or details area. For example, you can generate a report on all table pairs and single tables or on the table pairs and single tables in a folder.
Recency          Test runs for which you want to run a report. You can select the latest test runs or all test runs.
Result Type      Test results for which you want to run a report. You can select all results, tests that pass, or tests that do not pass.
Run Dates        The test run dates. Enter the from date, to date, or both in the format MM/DD/YYYY.
Report Subtitle  Subtitle that Data Validation Option displays on each page of the report.

Note: If you change the Data Validation Option repository after you configure BIRT reports, you must restart the Data Validation Option Client to generate accurate reports.

SQL and Lookup View Definitions

If a test contains either an SQL view or a lookup view as a source, Data Validation Option prints the view definition as part of the report.

You cannot print a report showing the definition of the view by itself because each view is tied to a specific test definition. For example, you create an SQL view, use it in a table pair, and run a test. You then update the view by changing the SQL statement, and re-run the test. Each test result is based on a different view definition, which is why the view definition must be tied to a specific result.


Custom Reports

You can write custom reports against database views in the Data Validation Option schema.

All Data Validation Option reports run against database views that are set up as part of the installation process. You can write custom reports based on the database views. Do not write reports against the underlying database tables because the Data Validation Option repository metadata can change between versions.

Related Topics:
- "Reporting Views Overview"

Viewing Reports

Data Validation Option displays reports in a browser window.

Use the arrows in the upper right corner to scroll through the pages of the report. To display a specific page, enter the page number in the Go To Page field.

You can display or hide the table of contents. To do this, click the table of contents icon in the upper left corner of the report. When you display the table of contents, you can click a heading to display that section of the report.

You can also print a report or export it to a PDF file. Click the print icon in the upper left corner of the report.

Jasper Reports

You can generate and view reports in the JasperReports Server if you have the enterprise license.

You can use the JasperReports Server available with Informatica Services or a standalone JasperReports Server. If you use the JasperReports Server bundled with Informatica Services, you can use single sign-on when you view the Jasper reports.

You can generate reports at the following levels:

- Table pairs and single tables
- Folders
- Data Validation Option user
- PowerCenter repository
- Views

When you generate reports, you can also select multiple table pairs and single tables, or folders. If you have a large test data set, use the report annotations to view the report details.

Data Validation Option provides the following administrative reports:

- External IDs Used In Table Pairs
- Views Used in Table Pairs/Tables
- Sources Used In Table Pairs/Tables/Views


The following table lists all the Jasper reports and the corresponding levels of reporting:

Report Name                                 Report Level
Run Summary                                 User, Folder, Table Pair/Single Table
Table Pair Summary                          User, Folder, Table Pair/Single Table
Detailed Test Results                       User, Folder, Table Pair/Single Table
Table Pair Run Summary                      Table Pair/Single Table
Last Run Summary                            User, Folder, Table Pair/Single Table
Percentage of Bad Rows                      User, Folder, Table Pair/Single Table
Percentage of Tests Passed                  User, Folder, Table Pair/Single Table
Tests Run Vs Tests Passed                   User, Folder, Table Pair/Single Table
Total Rows Vs Percentage of Bad Rows        User, Folder, Table Pair/Single Table
Bad Rows                                    Folder, Table Pair/Single Table
Most Recent Failed Runs                     User, Folder
Failed Runs                                 Folder
Failed Tests                                Folder
Validation Failures                         User, Folder
External IDs Used In Table Pairs            User, Folder, Table Pair/Single Table
Views Used in Table Pairs/Tables            Views
Sources Used In Table Pairs/Tables/Views    PowerCenter Repository

You can export Jasper reports in the following formats:

- PDF
- DOC
- XLS
- XLSX
- CSV
- ODS
- ODT


Status in Jasper Reports

Reports display the status of a test, table pair, or single table based on the report type.

The following statuses are available in Jasper reports for tests:

- Pass. The test passed.
- Fail. The test failed.
- Error. The test encountered a run error or the test has no result.

The following statuses are available in Jasper reports for table pairs and single tables:

- Pass. Pass status for a table pair or single table can occur in the following scenarios:
  - All the tests passed.
  - At least one test has a pass status and the rest of the tests have no results.
- Fail. If one of the tests in a table pair or single table fails, reports display the status as fail.
- Error. If all the tests in a table pair or single table have an error or no result, reports display the status as error.

Tests and table pairs or single tables with error status do not appear in the bar charts.

Configuring Jaspersoft Reporting

Configure the Jasper Reporting Service settings before you generate a report.

1. Click File > Settings > Preferences.

The Preferences dialog box appears.

2. Click Jaspersoft Reports.

3. Select Enable Jaspersoft Reporting.

4. Enter the JasperReports Server host name.

5. Enter the JasperReports Server port.

6. Enter the Jaspersoft web app name.

If you want to use a standalone JasperReports Server, enter the Jaspersoft web app name based on the JasperReports Server configuration. If you want to use the JasperReports Server available with Informatica Services, enter the Jaspersoft web app name as ReportingandDashboardsService.

7. Enter the folder in the JasperReports Server to store your reports.

8. Click Test to validate the settings.

9. Click Configure.

Data Validation Option copies the jrxml files from the <Data Validation Option installation directory>\DVO\jasper_reports folder to the JasperReports Server.

10. Enter the login details for the JasperReports Server if you use a standalone JasperReports Server, or the login details of the Informatica Administrator if you use the JasperReports Server available with Informatica services.

The user must have administrative privileges in the JasperReports Server.

11. Click OK.

Data Validation Option creates the root folder in the Jaspersoft server.

12. Click Save to save the settings.

Note: If you change the Data Validation Option repository after you configure Jaspersoft Reporting, click Configure before you generate new reports.


Generating a Report

You can generate a report at the table pair, folder, repository, or user level.

1. Select the object for which you want to generate a report. You can select multiple objects of the same type.

2. Click Action > Generate Report.

The Report Parameters dialog box appears. You can also right-click the object and select Generate Report if you want to generate a report at the table pair or folder level.

3. Select the report type.

The following table displays the report options that you can select for each report:

Report Name                            User                               Recency      Run Date
Run Summary                            All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Table Pair Summary                     All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Detailed Test Results                  All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Table Pair Run Summary                 N/A                                N/A          Last 24 hours/Last 30 days/Custom date range
Last Run Summary                       All Users/Current User/Any User    N/A          Last 24 hours/Last 30 days/Custom date range
Percentage of Bad Rows                 All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Percentage of Tests Passed             All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Tests Run Vs Tests Passed              All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Total Rows Vs Percentage of Bad Rows   All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Bad Rows                               N/A                                All/Latest   Last 24 hours/Last 30 days/Custom date range
Most Recent Failed Runs                All Users/Current User/Any User    All/Latest   N/A
Failed Runs                            N/A                                N/A          Last 24 hours/Last 30 days/Custom date range
Failed Tests                           N/A                                All/Latest   Last 24 hours/Last 30 days/Custom date range
Validation Failures                    All Users/Current User/Any User    All/Latest   Last 24 hours/Last 30 days/Custom date range
Table pairs/Tables with External ID    N/A                                N/A          N/A

Note: You can select the user only if you generate the report from the user level.

4. Optionally, enter the report subtitle.

You can use the report subtitle to identify a specific report.

5. Click Run.

The report appears in the browser.

Jasper Report Types

If you have the enterprise license, you can generate different Jasper reports in Data Validation Option.

You can generate the following Jasper reports in Data Validation Option:

Summary of Tests Run

Displays the number of table pairs or single tables, the number of tests for each table pair or single table, and the overall test results.

Table Pair/Table Summary

Lists each table pair or single table with the associated tests. Data Validation Option displays each table pair or single table on a separate page. The report includes a brief description of each test and result.

Detailed Test Result

Displays each test on a separate page with a detailed description of the test definition and results. If one of the test sources is an SQL, join, or lookup view, the report also displays the view definition.

The following information is available in the Detailed Test Result report:

- Test details
- Table pair/single table details
- Runtime information
- Bad record details

Table Pair/Table Run Summary

Displays the summary of all the tests run for a table pair or single table for a given time period. You can click on the date in the report to view the Detailed Test Results report.

Last Run Summary

Displays the details of the last run of tests at the table pair or single table, folder, or user level. The report lists the last test runs for the objects in the given time period.

If you generate the report at the user level, the report displays the folder summary. You can click on the folder to view the Last Run Summary report for that folder.


If you generate the report at the folder level or table pair/single table level, you can view the last run for all the table pairs or single tables for the given time period. You can click on a table pair or single table to view the Detailed Test Results report.

Percentage of Bad Rows

Displays the aggregate percentage of bad records in comparison with all the rows processed over a period of time. You can generate the report at the table pair or single table, folder, or user level. You can click on a bar for a particular date on the graph to view the Bad Rows report.

The bars on the graph appear for the days on which you ran the tests.

Percentage of Tests Passed

Displays the percentage of passed tests in comparison with the total number of tests over a period of time. You can generate the report at the table pair or single table, folder, or user level.

The bars on the graph appear for the days on which you ran the tests.

Tests Run Vs Tests Passed

Displays the total number of tests run over a period of time as a bar chart. The number of tests passed is plotted across the bar chart. You can generate the report at the table pair or single table, folder, or user level. You can click on the tests passed points on the graph to view the Validation Failure by Day report.

The bars on the graph appear for the days on which you ran the tests.

Total Rows Vs Percentage of Bad Rows

Displays the total number of rows tested over a period of time as a bar chart. The percentage of bad records is plotted across the bar chart. You can generate the report at the table pair or single table, folder, or user level. You can click on the percentage of bad records points on the graph to view the Bad Rows report.

The bars on the graph appear for the days on which you ran the tests.

Bad Rows

Displays the number of bad records across test runs over a period of time in the form of a bar chart. You can click on a bar to view the Table Pair/Table Run Summary report. You can generate the Bad Rows report at the folder level or the table pair/single table level.

Most Recent Failed Runs

Displays the top ten most recent failed runs. You can run the Most Recent Failed Runs report at the folder or user level.

If you run the report at the user level, the report displays the top ten failed runs across all the folders in the Data Validation Option repository in the given time period. If you click on a folder, you can view the Most Recent Failed Runs report for the folder. You can also click on the table pair or single table to view the Detailed Test Result report.

If you run the report at the folder level, the report displays the top ten failed runs for that particular folder inthe given time period. You can click on the table pair or single table to view the Detailed Test Result report.

Note: Select the recency as Latest if you want to get the most recent state of the table pair or single table.

Failed Runs

Displays the number of failed runs for each table pair or single table over a period of time in the form of a bar chart. Each bar represents the number of failed runs for a table pair or single table. You can click on a bar to view the Table Pair/Table Run Summary report. You can run the Failed Runs report at the folder level.

Failed Tests

Displays the number of failed tests across test runs over a period of time in the form of a bar chart. Each bar represents the number of failed tests for a table pair or single table. If you click the Table hyperlink, you can view the report in a tabular format. You can click on a bar or the table pair/single table name to view the Detailed Test Result report. You can run the Failed Tests report at the folder level.

Validation Failure by Folder

Displays the number of failed table pairs or single tables as a bar chart for a given time period. Each bar represents the folder in which the failure occurred. If you click the Table hyperlink, you can view the report in a tabular format. You can click on a bar or the folder name to view the Failed Tests report, available as a bar chart and in tabular format. You can run the Validation Failure by Folder report at the user or folder level.

External IDs Used In Table Pairs

Displays the list of table pairs or single tables with external IDs across all users. You can run the External IDs Used In Table Pairs report at the user, folder, or table pair level.

Sources Used In Table Pairs/Tables/Views

Displays the list of table pairs, single tables, and views in which a data source is used. Right-click on the PowerCenter repository, a repository folder, or a data source in the repository and select Get Source Usage In Table Pairs/Tables/Views to generate this report. You cannot generate a report at the repository folder level or repository level if there are more than 100 sources.

Views Used In Table Pairs/Tables/Views

Displays the list of table pairs and single tables in which a view is used. Right-click on the view and select Get View Usage In Table Pairs/Tables/Views to generate this report.

Note: Data Validation Option reports might display the details of table pairs or single tables that you previously deleted from the Data Validation Option Client. If you generate a report after you modify the description of a table pair or a single table, the report might display two entries for the object, with the object ID appended to both the new and the old descriptions, in the report or in the annotations of the bar chart.

Dashboards

You can generate Data Validation Option dashboards to get an overview of the testing activities and test results.

Dashboards display multiple reports in a single page. You need not generate individual reports to get an overview of test results across a fixed time period.

You can view the following dashboards in Data Validation Option:

Home Dashboard

Displays the following reports for the past 30 days:

- Tests Run Vs Tests Passed
- Total Rows vs Percentage of Bad Rows
- Percentage of Tests Passed
- Percentage of Bad Rows

You can click through the Total Rows vs Percentage of Bad Rows report and the Percentage of Bad Rows report to view the Bad Rows report. You can click through Tests Run Vs Tests Passed to view the Validation Failures report.

Repository Dashboard

Displays the Validation Failures for Folders report for the past 24 hours and 30 days in both graphical and tabular formats. The dashboard also displays the Most Recent Failed Runs report for the repository.


You can click through the Validation Failure by Folder report to view the Failed Tests report. You can click through the Most Recent Failed Runs report to view the Most Recent Failed Runs report for a folder and the Detailed Test Result report.

Folder Dashboard

Displays the Bad Rows report and Failed Tests report for the past 24 hours and 30 days in both graphical and tabular formats.

You can click through the Failed Tests report to view the Table Pair/Table report and the Bad Rows report to view the Detailed Test Result report.

Table Dashboard

The Table dashboard displays the following reports for the past 30 days:

- Tests Passed Vs Tests Failed
- Bad Rows
- Table Pair/Table Run Summary

You can click through the Bad Rows report to view the Detailed Test Result report.

Metadata Manager Integration

You can view the metadata of data sources if you configure Metadata Manager integration in Data Validation Option.

You can analyze the impact of test results on data sources if you enable Metadata Manager integration. You can view the metadata of the data source in the PowerCenter repository. To view the metadata of data sources, you must set up a Metadata Manager Service in the Informatica domain. You must create a PowerCenter resource for the PowerCenter repository that contains the data source.

Right-click on a data source in the repository and select Get Metadata to view the metadata of the data source.

Configuring Metadata Manager Integration

Configure Metadata Manager integration in Data Validation Option to view the data lineage of data sources.

1. Right-click the repository for which you want to enable Metadata Manager Integration and select Edit Repository.

2. Select Enable Metadata Manager Integration.

3. Select whether the Metadata Manager Service runs on a secure connection.

4. Enter the server host name.

5. Enter the Metadata Manager Service port.

6. Enter the name of the PowerCenter resource.

7. Click Test to test the settings.

8. Click Save.


Chapter 15

Command Line Integration

This chapter includes the following topics:

- Command Line Integration Overview
- CopyFolder
- CreateUserConfig
- DisableInformaticaAuthentication
- ExportMetadata
- ImportMetadata
- InstallTests
- LinkDVOUsersToInformatica
- PurgeRuns
- RefreshRepository
- RunTests
- UpdateInformaticaAuthenticationConfiguration
- UpgradeRepository

Command Line Integration Overview

You can invoke Data Validation Option capabilities at the command line. For example, you can create and run tests without using the Data Validation Option Client.

Running tests at the command line allows you to schedule test execution. It also allows you to embed a specific test as part of the ETL workflow or as part of another process. For example, you can create an ETL process that moves data from source to staging, runs validation, and then moves data into the target or an error table based on the validation results.

The command line utility writes regular messages to the STDOUT output stream. It writes error messages to the STDERR output stream. Normally, command line utility messages appear on the screen. To capture messages to a file, use the redirection operator.
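The redirection described above can be sketched as follows. This is a hypothetical example: "abc123" is a placeholder external ID, and the guard that checks for DVOCmd is an assumption added so the sketch is safe to run anywhere.

```shell
# Sketch: capture DVOCmd messages with shell redirection operators.
# "abc123" is a placeholder external ID for a table pair.
# Regular messages go to run_log.txt, error messages to run_errors.txt.
if command -v DVOCmd >/dev/null 2>&1; then
    DVOCmd RunTests abc123 > run_log.txt 2> run_errors.txt
fi
```

Omit the guard in a scheduled job that always runs on the Data Validation Option client machine.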

On a Windows machine, the Data Validation Option command line utility is DVOCmd.exe. DVOCmd.exe exists in one of the following directories:

- 32-bit operating systems: C:\Program Files\Informatica<version>\DVO\
- 64-bit operating systems: C:\Program Files (x86)\Informatica<version>\DVO\

Important: You must run DVOCmd from the Data Validation Option installation directory.


DVOCmd uses the following syntax:

DVOCmd Command [Argument] [--Options] [Arguments]

To get help on a command, enter the command at the prompt without any argument. For example: $HOME/DVO/DVOCmd RunTests

To enable users to run tests from a UNIX machine, run the following command from the DVOCmd installation directory:

$HOME/DVO/DVOCmd dos2unix DVOCmd

Note: In the syntax descriptions, options and arguments enclosed in square brackets are optional.

CopyFolder

Copies the contents of a folder in a user workspace to a different folder in the same workspace or to another user workspace. The target folder must be a new folder. The target folder cannot exist in the target workspace.

The CopyFolder command copies the table pairs, single tables, and test cases that exist within the source folder. It does not copy test runs or the external IDs associated with table pairs or single tables.

If the table pairs or single tables in the source folder use an SQL or lookup view, the CopyFolder command copies the SQL or lookup view to the target user workspace unless the workspace contains a view with the same name.

Before Data Validation Option copies a folder, it verifies that all data sources associated with the objects being copied exist in the target workspace. If any data source is missing, Data Validation Option does not copy the folder.

The CopyFolder command uses the following syntax:

DVOCmd CopyFolder [--confdir conf_dir] --fromUser source_user --fromFolder source_folder --toUser target_user [--toFolder target_folder] [--reuseViews Yes] [--username Username] [--password Password]

The following list describes CopyFolder options and arguments:

--confdir conf_dir
    The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Because Windows directories often contain spaces, you must enclose the file path in quotes.

--fromUser source_user
    Name of the source user. Data Validation Option copies the folder in this user workspace.

--fromFolder source_folder
    Name of the source folder. Data Validation Option copies this folder.

--toUser target_user
    Name of the target user. Data Validation Option copies the folder to this user workspace. The source user and the target user can be the same user.

--toFolder target_folder
    Name of the target folder. The target folder must be unique in the target workspace. If you do not specify a target folder, Data Validation Option creates a folder in the target workspace with the same name as the source folder.

--reuseViews Yes
    Reuses an SQL or lookup view in the target workspace when the workspace contains a view with the same name as a source SQL or lookup view. If you specify this option, Data Validation Option does not copy the source view to the target workspace. If you do not specify this option, Data Validation Option prompts you for the action to take when views with the same name are found.

--username User name
    Informatica domain user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password
    Password for the Informatica user name. Required if you configure Informatica authentication.
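A CopyFolder invocation can be sketched as follows; the user and folder names are placeholders, not values from this guide, and the DVOCmd guard is an assumption so the sketch runs safely anywhere.

```shell
# Sketch: copy the Orders_Tests folder from one workspace to another.
# User and folder names are placeholders. --reuseViews Yes avoids the
# prompt when the target workspace already contains same-named views.
if command -v DVOCmd >/dev/null 2>&1; then
    DVOCmd CopyFolder --fromUser dev_user --fromFolder Orders_Tests \
        --toUser qa_user --toFolder Orders_Tests_QA --reuseViews Yes
fi
```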

CreateUserConfig

Creates Data Validation Option users with the specified user names.

The CreateUserConfig command uses the following syntax:

DVOCmd CreateUserConfig user_name [, user_name, …] [--confdir conf_dir] --outputdir output_dir [--overwrite]

The command creates a preferences file called <username>-preferences.xml for each user in the output directory. The preferences file contains connection information for the Data Validation Option repository. Copy each preferences file from the output directory to the user configuration directory and rename it to preferences.xml. This allows each user to access the Data Validation Option repository.

The following list describes CreateUserConfig options and arguments:

user_name
    The name of the Data Validation Option user. To create multiple users, enter multiple user names separated by commas.

--confdir conf_dir
    The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Enclose the file path in quotes.

--outputdir output_dir
    Directory in which to store user preferences files. Enclose the file path in quotes. Use double quotes if the path has spaces or special characters.

--overwrite
    Overwrites the configuration files.
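The create-and-copy workflow can be sketched as follows; the user names and output directory are placeholders, and the guard around DVOCmd is an assumption so the sketch runs safely where the tool is not installed.

```shell
# Sketch: create preferences files for users "alice" and "bob"
# (placeholder names), writing them to a placeholder output directory.
if command -v DVOCmd >/dev/null 2>&1; then
    DVOCmd CreateUserConfig alice,bob --outputdir "dvo_prefs" --overwrite
fi
# Copy-and-rename step for one user; the configuration directory
# location varies by environment, so this path is an assumption:
# cp dvo_prefs/alice-preferences.xml <config_dir>/preferences.xml
```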

RELATED TOPICS:
- "Data Validation Option Configuration for Additional Users" on page 26

DisableInformaticaAuthentication

Disables Informatica authentication in a Data Validation Option schema.


The DisableInformaticaAuthentication command uses the following syntax:

DVOCmd DisableInformaticaAuthentication [--confdir conf_dir] [--username User name] [--password Password]

The following list describes DisableInformaticaAuthentication options and arguments:

--confdir conf_dir
    The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Enclose the file path in quotes.

--username User name
    Informatica domain administrator user name for the domain to which you configured Informatica authentication.

--password Password
    Password for the Informatica user name.

ExportMetadata

Exports all Data Validation Option metadata to an XML file.

The ExportMetadata command uses the following syntax:

DVOCmd ExportMetadata file_name [--confdir conf_dir] [--username User name] [--password Password]

The following list describes ExportMetadata options and arguments:

file_name
    The file to which you want to export metadata.

--confdir conf_dir
    The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Because Windows directories often contain spaces, you must enclose the file path in quotes.

--username User name
    Informatica domain user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password
    Password for the Informatica user name. Required if you configure Informatica authentication.


RELATED TOPICS:
- Metadata Export and Import

ImportMetadata

Imports metadata into Data Validation Option from an export XML file.

The ImportMetadata command uses the following syntax:

DVOCmd ImportMetadata file_name [--confdir conf_dir] [--overwrite] [--username Username] [--password Password]

The following list describes ImportMetadata options and arguments:

file_name
    The file that contains metadata to be imported.

--confdir conf_dir
    The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Because Windows directories often contain spaces, you must enclose the file path in quotes.

--overwrite
    Overwrites existing objects.

--username User name
    Informatica domain user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password
    Password for the Informatica user name. Required if you configure Informatica authentication.
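Together, ExportMetadata and ImportMetadata support a simple backup-and-restore workflow. A sketch follows; the file name is a placeholder, and the DVOCmd guard is an assumption so the sketch runs safely anywhere.

```shell
# Sketch: back up all DVO metadata to XML, then restore it,
# overwriting existing objects. The file name is a placeholder.
if command -v DVOCmd >/dev/null 2>&1; then
    DVOCmd ExportMetadata "dvo_backup.xml"
    DVOCmd ImportMetadata "dvo_backup.xml" --overwrite
fi
```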

RELATED TOPICS:
- Metadata Export and Import

InstallTests

Prepares all tests for a single table or table pair. For each test, this command generates the PowerCenter mapping in the Data Validation Option target folder in the PowerCenter repository.

The InstallTests command uses the following syntax:

DVOCmd InstallTests external_ID [, external_ID, …] [--confdir conf_dir] [--cacheSize CACHESIZE] [--forceInstall] [--username User name] [--password Password]


The following list describes InstallTests options and arguments:

external_ID
    The external ID for the single table or table pair.

--confdir conf_dir
    The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Enclose the file path in quotes.

--cacheSize CACHESIZE
    Memory allocation to generate transformations in PowerCenter mappings. Increase the cache size for tests that contain multiple joins and lookups. Default is 20 MB for the data cache and 10 MB for the index cache. Specify "Auto" to enable PowerCenter to compute the cache size.

--username User name
    Informatica domain administrator user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password
    Password for the Informatica user name. Required if you configure Informatica authentication.

--forceInstall
    Creates new mappings for table pairs in the repository. DVOCmd uses the DTM buffer size value configured in preferences.xml. If you modify the DTM buffer size value in the preferences file, run the InstallTests command with the --forceInstall option for existing table pairs before you run the RunTests command.
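An InstallTests invocation can be sketched as follows; "abc123" is a placeholder external ID, and the DVOCmd guard is an assumption so the sketch runs safely anywhere.

```shell
# Sketch: regenerate the mappings for a table pair, letting
# PowerCenter compute the cache size. "abc123" is a placeholder ID.
if command -v DVOCmd >/dev/null 2>&1; then
    DVOCmd InstallTests abc123 --cacheSize Auto --forceInstall
fi
```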

Cache Settings

You can configure the cache settings for the PowerCenter transformations that Data Validation Option generates for the tests.

Data Validation Option generates a PowerCenter mapping with Joiner transformations and Lookup transformations for a test that contains joins and lookups. Joiner transformations and Lookup transformations require a large amount of memory. Configure the cache settings for a complex test that contains joins and lookups.

The value that you specify as cache settings for a test persists until you modify the test. For information about optimal cache settings for the tests, see the PowerCenter Advanced Workflow Guide.

LinkDVOUsersToInformatica

Links the existing Data Validation Option users with Informatica domain users.

Create a text file that contains a list of the Data Validation Option users and Informatica domain users in the following format:

<dvo_user_name1>,<Informatica_user_name1>
<dvo_user_name2>,<Informatica_user_name2>
...
<dvo_user_nameN>,<Informatica_user_nameN>
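A sketch that builds such a mapping file and then links the users follows; the user names are placeholders, and the DVOCmd guard is an assumption so the sketch runs safely anywhere.

```shell
# Sketch: write a two-user mapping file in the documented format,
# then link the users. The user names are placeholders.
printf '%s\n' 'dvo_alice,infa_alice' 'dvo_bob,infa_bob' > dvo_user_map.txt
if command -v DVOCmd >/dev/null 2>&1; then
    DVOCmd LinkDVOUsersToInformatica "dvo_user_map.txt"
fi
```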


The LinkDVOUsersToInformatica command uses the following syntax:

DVOCmd LinkDVOUsersToInformatica [file_name]

The following list describes the LinkDVOUsersToInformatica argument:

file_name
    Name of the file that contains the mapping between Data Validation Option users and Informatica users. Enclose the file path in quotes.

PurgeRuns

Purges test runs from the Data Validation Option repository. You can purge deleted test runs or purge test runs by date. When you purge test runs by date, you can purge all test runs that occur on or after a specified date, before a specified date, or between two dates.

The PurgeRuns command uses the following syntax:

DVOCmd PurgeRuns [--confdir conf_dir] [--deleted] [--fromdate from_date] [--todate to_date] [--username User name] [--password Password]

If you configure Informatica authentication for the Data Validation Option schema, enter --username and --password.

The following list describes PurgeRuns options and arguments:

--confdir conf_dir
    The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Because Windows directories often contain spaces, you must enclose the file path in quotes.

--deleted
    Purges deleted test runs.

--fromdate from_date
    Purges test runs that occur on or after this date.

--todate to_date
    Purges test runs that occur before this date.

--username User name
    Informatica domain user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password
    Password for the Informatica user name. Required if you configure Informatica authentication.
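A PurgeRuns sketch follows. The date format accepted by --fromdate and --todate is not shown in this guide, so the format below is an assumption; verify it for your installation. The DVOCmd guard is likewise an assumption so the sketch runs safely anywhere.

```shell
# Sketch: purge deleted test runs, then purge runs before a cutoff.
# The date format is an assumption; verify it for your installation.
if command -v DVOCmd >/dev/null 2>&1; then
    DVOCmd PurgeRuns --deleted
    DVOCmd PurgeRuns --todate 2012-07-01
fi
```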


RefreshRepository

Refreshes a source or target repository.

The RefreshRepository command uses the following syntax:

DVOCmd RefreshRepository repo_name [--confdir conf_dir] [--all] [--connections] [--folderlist] [--allfolders] [--folder folder_name] [--username User name] [--password Password] [--dryrun]

Tip: Use the --folderlist and --folder options to get the sources and targets in a new PowerCenter repository folder. For example, if the repository name is "DVTgtRepo" and the new folder name is "NewOrders," enter the following command:

DVOCmd RefreshRepository DVTgtRepo --folderlist --folder NewOrders

Important: The RefreshRepository command fails to run from the UNIX command line. If you want to run the RefreshRepository command from the command line, use a Windows machine.

The following table describes RefreshRepository options and arguments:

Option Argument Description

n/a repo_name Name of the repository you want to refresh.Note: This can take several minutes to several hours depending on the size of therepository.

--confdir conf_dir The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Use double quotes if the path has spaces or special characters.

--all n/a Refreshes all source, target, folder, and connection metadata for the repository. Note: This option can take several minutes to several hours depending on the size of the repository.

--connections n/a Refreshes connection metadata for the target repository.

--folderlist n/a Refreshes the folder list for the repository.

--allfolders n/a Refreshes source and target metadata in all folders in the repository.

--folder folder_name Refreshes source and target metadata for the named folder.

--username User name Informatica domain user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password Password for the Informatica user name. Required if you configure Informatica authentication.

--dryrun n/a Checks whether the import from the PowerCenter Repository Service works. RefreshRepository reads the sources and targets in the PowerCenter Repository Service, but does not write the objects into the Data Validation Option repository.
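For example, combining --folder with --dryrun verifies that a folder's metadata can be imported without writing anything to the Data Validation Option repository. A minimal sketch, reusing the repository and folder names from the Tip above:

```shell
# Sketch: dry-run refresh of one folder; nothing is written to the
# Data Validation Option repository.
REPO='DVTgtRepo'
FOLDER='NewOrders'
CMD="DVOCmd RefreshRepository $REPO --folder $FOLDER --dryrun"
echo "$CMD"
```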


RELATED TOPICS:
- "Repositories Overview" on page 36

RunTests

Runs all tests for a single table or table pair.

For example, to run tests for a table pair with the external ID "abc123," you might enter the following command:

DVOCmd RunTests abc123

If one test fails, the overall result also fails. The exit status code for a successful test is 0. A non-zero status code designates a failed test or an error.

The following table describes the status codes:

Status Code Description

0 The test is successful.

1 The test fails to install.

2 The test fails to run.

3 The test returns no results.

4 The test fails.
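Because the overall result is reported through the exit status, a wrapper script can branch on it. In this sketch, run_tests is a hypothetical stand-in for the real DVOCmd RunTests call so the exit-status handling can be shown on its own:

```shell
# Stand-in for: DVOCmd RunTests abc123 (returns 4 to simulate a failed test).
run_tests() { return 4; }

run_tests
rc=$?
case $rc in
  0) msg='all tests passed' ;;
  1) msg='a test failed to install' ;;
  2) msg='a test failed to run' ;;
  3) msg='a test returned no results' ;;
  4) msg='a test failed' ;;
  *) msg="unexpected status $rc" ;;
esac
echo "exit status $rc: $msg"
```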

The RunTests command uses the following syntax:

DVOCmd RunTests external_ID [, external_ID, …] [--confdir conf_dir] [--email email_ID,...] [--sendEmailNotPass] [--cacheSize CACHESIZE] [--username User name] [--password Password]

The following table describes RunTests options and arguments:

Option Argument Description

n/a external_ID The external ID for the single table or table pair.

--confdir conf_dir The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Enclose the file path in quotes.

--email email_ID The email address to which Data Validation Option sends an email when the tests are complete. You can provide multiple email addresses separated by commas. The email specifies whether the test has passed or failed and provides a link to the test results. Note: Configure the SMTP settings for the outgoing email server on the PowerCenter Integration Service with the following custom properties: SMTPServerAddress, SMTPPortNumber, SMTPFromAddress, and SMTPServerTimeout. If you want to use Microsoft Outlook to send email, enter the Microsoft Exchange profile in the MSExchangeProfile configuration property in the PowerCenter Integration Service.


--sendEmailNotPass n/a Limits Data Validation Option to sending an email only if a test fails.

--cacheSize CACHESIZE Memory allocation to generate transformations in PowerCenter mappings. Increase the cache size for tests that contain multiple joins and lookups. Default is 20 MB for the data cache and 10 MB for the index cache. Specify "Auto" to enable PowerCenter to compute the cache size. When you run the RunTests command and set the cache size, Data Validation Option installs the tests in the repository before it runs the test.

--username User name Informatica domain user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password Password for the Informatica user name. Required if you configure Informatica authentication.

Cache Settings

You can configure the cache settings for the PowerCenter transformations that Data Validation Option generates for the tests.

Data Validation Option generates a PowerCenter mapping with Joiner transformations and Lookup transformations for a test that contains joins and lookups. Joiner transformations and Lookup transformations require a large amount of memory. Configure the cache settings for a complex test that contains joins and lookups.

The value that you specify as cache settings for a test persists until you modify the test. For information about optimal cache settings for the tests, see the PowerCenter Advanced Workflow Guide.
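For example, the cache size is passed on the RunTests command line, and "Auto" delegates the computation to PowerCenter. A sketch, reusing the hypothetical external ID abc123 from the earlier example; the numeric value 64 is illustrative only:

```shell
# Sketch: the same tests run with an explicit cache size and with "Auto".
CMDS=$(for size in 64 Auto; do
  echo "DVOCmd RunTests abc123 --cacheSize $size"
done)
echo "$CMDS"
```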

UpdateInformaticaAuthenticationConfiguration

Updates the Informatica authentication properties in the Data Validation Option schema.

The UpdateInformaticaAuthenticationConfiguration command uses the following syntax:

DVOCmd UpdateInformaticaAuthenticationConfiguration [--confdir conf_dir] [--infahostname INFAHOSTNAME] [--infahttpport INFAHTTPPORT] [--isSecureConnection ISSECURECONNECTION] [--infaAdminUserName INFAADMINUSERNAME] [--infaAdminPassword INFAADMINPASSWORD]

You can obtain these parameters from the nodemeta.xml file available in the following location:

<InformaticaInstallationDir>/server/isp

The following table describes UpdateInformaticaAuthenticationConfiguration options and arguments:

Option Argument Description

--confdir conf_dir The user configuration directory. Specify the configuration directory if you have multiple Data Validation Option repositories on a client machine. If you have one Data Validation Option repository on a client machine and have not changed the configuration directory, you do not need to specify this option. Because Windows directories often contain spaces, you must enclose the file path in quotes.

--infahostname INFAHOSTNAME Host name of the Informatica gateway.

--infahttpport INFAHTTPPORT Port number to connect to the Informatica gateway. Enter the value of the port element in the nodemeta.xml file.

--isSecureConnection ISSECURECONNECTION Set the argument as true if TLS is enabled in the Informatica gateway.

--infaAdminUserName INFAADMINUSERNAME Informatica administrator user for the Informatica gateway.

--infaAdminPassword INFAADMINPASSWORD Password for the Informatica administrator user.
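Assembled into a single call, the command might look as follows. Every argument value here is a hypothetical placeholder; the real host name and port come from nodemeta.xml as described above:

```shell
# Sketch: all argument values below are hypothetical placeholders.
CMD='DVOCmd UpdateInformaticaAuthenticationConfiguration --infahostname infa_gateway.example.com --infahttpport 6005 --isSecureConnection false --infaAdminUserName Administrator --infaAdminPassword MyPassword'
echo "$CMD"
```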

UpgradeRepository

Upgrades the Data Validation Option repository. Use this command when you upgrade from a previous version of Data Validation Option.

The UpgradeRepository command uses the following syntax:

DVOCmd UpgradeRepository [--username User name] [--password Password]

The following table describes UpgradeRepository options and arguments:

Option Argument Description

--username User name Informatica domain user name for the domain to which you configured Informatica authentication. Required if you configure Informatica authentication.

--password Password Password for the Informatica user name. Required if you configure Informatica authentication.

RELATED TOPICS:
- "Data Validation Option Installation and Configuration" on page 20


Chapter 16

Troubleshooting

This chapter includes the following topics:

- Troubleshooting Overview
- Troubleshooting Initial Errors
- Troubleshooting Ongoing Errors
- Troubleshooting Command Line Errors

Troubleshooting Overview

When you run a test, Data Validation Option performs the following tasks:

1. Creates a mapping in the specified PowerCenter folder.

2. Creates the PowerCenter workflow.

3. Runs the PowerCenter session.

Besides initial installation problems, Data Validation Option errors can occur in one of the following steps:

Installation Error

Data Validation Option cannot create a PowerCenter mapping.

Run Error

Data Validation Option cannot create or run a PowerCenter workflow.

No Results

The PowerCenter session runs but fails, or there are no results in the results database.

Troubleshooting Initial Errors

This section assumes that Data Validation Option has just been installed and no successful tests have been executed. It also assumes that the first test is a simple test that does not contain expressions or SQL views.


The following table describes common initial errors:

Error Possible Cause and Solution

Cannot connect to the Data Validation Option repository: Database credentials are incorrect. Check the server, port, and database name specified in the URL line. If the problem persists, contact your database administrator. If the problem is the database user name or password, the error message explicitly states this.

Cannot read the PowerCenter repository: Check the repository settings up to Security Domain. (Informatica domain names are not used until later.) If you cannot resolve the error, contact the Informatica administrator. You can also troubleshoot this error by trying to log in to the repository through the pmrep command line utility. If you get an out-of-memory error when you access large repositories, increase the Java heap size of the Data Validation Option client with the following command: DVOClient.exe -J-Xmx<heapsize value> Default is 1024 MB.

Installation Error: Verify that the Data Validation Option folder is closed in the Designer, Workflow Manager, and Repository Manager. The Workflow Monitor can be open. Verify that the INFA_HOME environment variable is set. Verify that the Data Validation Option folder exists.

Run Error: Verify that the Data Validation Option folder is closed. Check the Informatica domain name and Integration Service names. Verify that they are running. Verify that the PowerCenter connection name (connection to the Data Validation Option repository) is correct, and that the user has the privilege to use it. Open the session log and look for session errors. Most session failures are caused by an incorrect connection. If the error is "Cannot get object class for dvo/infact/PivotPluginImpl," the dvoct.jar file cannot be read, either because it is not on the server, because of its privileges, or because the information entered in the Administrator tool is incorrect. Verify that the user has the privilege to use the connection to the Data Validation Option repository specified in Tools > Preferences > Data Validation Option database. This will also be apparent in the session log. If you install PowerCenter 9.0.1 or earlier and the data source is SAP, install the ABAP program on the mapping generated by the test in the Designer tool.

No Results: Verify that there is data in the data set you are analyzing. Tables should have records, and filters and joins should not result in an empty set. Verify that the connection to the Data Validation Option repository specified in the Workflow Manager points to the Data Validation Option repository.

Troubleshooting Ongoing Errors

This section assumes that successful tests have been created and run before the error occurred.

In general, you should always check for the following sources of errors:

- Incorrect connection
- The Data Validation Option folder is open in the Designer, Workflow Manager, or Repository Manager


The following table describes common ongoing errors:

Error Possible Cause and Solution

Installation or run errors: Verify that the Data Validation Option folder is closed in the Designer, Workflow Manager, and Repository Manager. Verify that the PowerCenter environment is functioning correctly, for example, services are running and the repository is up. If the error occurred right after you created an expression, either in a test editor dialog box or as a WHERE clause, check the syntax. Open the session log and verify that the PowerCenter connections are correct.

No results: Verify that there is data in the data set you are analyzing. Tables should have records. Filters or joins should not result in an empty set.

Inability to generate reports: Verify that you have read and write permissions on the Data Validation Option installation directory and subdirectories.

Inability to copy folders: Verify that the repository, data sources, and folders that contain the data sources have identical names in the source and the target workspaces. Object names in Data Validation Option are case sensitive. Verify that all data sources associated with the objects to copy exist in the target workspace in the same location and that the names match.

Troubleshooting Command Line Errors

I ran a DVOCmd command that got an error and used the redirection operator to write the messages to a file. The redirection operator does not redirect all messages to the output file.

When you run a DVOCmd command, the command line utility writes regular messages to the STDOUT output stream and writes error messages to the STDERR output stream. You use the redirection operator to write messages to an output file. To merge messages from both output streams, enter "2>&1" after the output file name.

For example, you encounter an error while refreshing folder "MyFolder" in Data Validation Option repository "DVORepo." To write all messages, including the error messages, to a text file called "Log.txt," use the following command:

DVOCmd RefreshRepository DVORepo --folder MyFolder > C:\Log.txt 2>&1

To append messages to an existing log file called "DVOLog.txt," use the following command:

DVOCmd RefreshRepository DVORepo --folder MyFolder >> C:\DVOLog.txt 2>&1
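The stream behavior can be demonstrated without DVOCmd. In this sketch, a command group writes one message to STDOUT and one to STDERR; because "2>&1" follows the output redirection, both end up in the same log file:

```shell
# Demonstrate merging STDOUT and STDERR into one log file.
log=$(mktemp)
{ echo 'regular message'; echo 'error message' >&2; } > "$log" 2>&1
cat "$log"
```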


Appendix A

Datatype Reference

This appendix includes the following topics:

- Test, Operator, and Datatypes Matrix for Table Pair Tests
- Test, Operator, and Datatypes Matrix for Single-Table Constraints

Test, Operator, and Datatypes Matrix for Table Pair Tests

Table pair tests use the following datatypes:

- s = string datatypes
- n = numeric datatypes
- d = date/time datatypes
- b = binary/other datatypes

The following table describes the operators and datatypes for table pair tests:

Test Operators Allowed Datatypes Allowed: Approx. Operator Datatypes Allowed: All Other

COUNT All s,n,d,b s,n,d,b

COUNT_DISTINCT All s,n,d s,n,d

COUNT_ROWS All s,n,d,b s,n,d,b

MIN All n s,n,d

MAX All n s,n,d

AVG All n n

SUM All n n

SET_AinB -- s,n,d s,n,d

SET_BinA -- s,n,d s,n,d


SET_AeqB -- s,n,d s,n,d

VALUE All n s,n,d

OUTER_VALUE All n s,n,d

Note: SET tests do not use operators and allow string, numeric and date/time datatypes.

Test, Operator, and Datatypes Matrix for Single-Table Constraints

Single-table constraints use the following datatypes:

¨ s = string datatypes

¨ n = numeric datatypes

¨ d = date/time datatypes

¨ b = binary/other datatypes

The following table describes the operators and datatypes for single-table constraints:

Test Operators Allowed Datatypes Allowed Result Expression Datatype

COUNT All s,n,d,b n

COUNT_DISTINCT All s,n,d n

COUNT_ROWS All s,n,d,b n

MIN All s,n,d n

MAX All s,n,d n

AVG All n n

SUM All n n

VALUE All s,n,d s,n,d

FORMAT =, <> s,n,d s

UNIQUE -- s,n,d --

NOT_NULL -- s,n,d --

NOT_BLANK -- s,n,d --


Appendix B

BIRT Report Examples

This appendix includes the following topics:

- Summary of Testing Activities
- Table Pair Summary
- Detailed Test Results – Test Page
- Detailed Test Results – Bad Records Page

Summary of Testing Activities

The following figure shows an example of a Summary of Testing Activities report:


Table Pair Summary

The following figure shows an example of a Table Pair Summary report:



Detailed Test Results – Test Page

The following figure shows an example of a test page in the Detailed Test Results report:


Detailed Test Results – Bad Records Page

The following figure shows an example of a bad records page in the Detailed Test Results report:


Appendix C

Jasper Report Examples

This appendix includes the following topics:

- Home Dashboard
- Repository Dashboard
- Folder Dashboard
- Tests Run Vs Tests Passed
- Total Rows Vs Percentage of Bad Records
- Most Recent Failed Runs
- Last Run Summary

Home Dashboard

The Home dashboard displays the test details in the Data Validation Schema over the past 30 days.


Repository Dashboard

The Repository dashboard displays the details of tests run in the Data Validation Option repository over the past 30 days and the past 24 hours.


Folder Dashboard

The Folder dashboard displays the details of tests run in the repository over the past 30 days and the past 24 hours.

Tests Run Vs Tests Passed

The following figure shows an example of a Tests Run Vs Tests Passed report:


Total Rows Vs Percentage of Bad Records

The following figure shows an example of a Total Rows Vs Percentage of Bad Records report:

Most Recent Failed Runs

The following figure shows an example of a Most Recent Failed Runs report:


Last Run Summary

The following figure shows an example of a Last Run Summary report:


Appendix D

Reporting Views

This appendix includes the following topics:

- Reporting Views Overview
- results_summary_view
- rs_bad_records_view
- results_id_view
- meta_sv_view
- meta_lv_view
- meta_jv_view
- meta_ds_view
- meta_tp_view
- rs_sv_id_view, rs_lv_id_view, and rs_jv_id_view

Reporting Views Overview

All Data Validation Option reports run against database views that are set up as part of the installation process. You can write custom reports against these views. Do not write reports against the underlying database tables because the Data Validation Option repository metadata can change between versions.

results_summary_view

This view combines all table pair and test metadata and all test results. This view consists of the following general sections:

- tp_. Table pair metadata
- tc_. Test metadata
- tc_rs_. Test results
- tr_, ti_. Table pair runtime information


The following table describes table pair information:

Metadata Description

tp_user_id User ID of the person who ran this test

tp_username User name of the person who ran this test

tp_obj_id Unique table pair ID

tp_version Table pair version

tp_name Table pair description

tp_time_stamp Time table pair was last edited

tp_comments Table pair comments

tp_description Table pair description. Same as tp_name.

tp_table_a, tp_table_b Either full table name, including the PowerCenter repository directory, or view name

tp_table_version_a, tp_table_version_b Version name for SQL and lookup views, otherwise empty

tp_type_a, tp_type_b A and B source type: 1 = relational, 2 = flat file, 3 = SQL view, 4 = lookup view

tp_conn_name_a, tp_conn_name_b A and B connection names

tp_owner_name_a, tp_owner_name_b A and B owner names

tp_src_dir_a, tp_src_file_a, tp_src_dir_b, tp_src_file_b A and B directory and file names if flat files

tp_in_db_a, tp_in_db_b Run aggregations in database for A or B

tp_where_clause_a, tp_where_clause_b A and B WHERE clauses

tp_is_where_clause_dsq_a, tp_is_where_clause_dsq_b Run WHERE clause in database for A or B

tp_join_list_str Joined fields as one string

tp_external_id Table pair external ID


The following table describes test information:

Metadata Description

tc_index Internal ID

tc_description Test description

tc_comment Test comment

tc_type AGG if aggregate, otherwise equal to test (VALUE, OUTER_VALUE, etc.)

tc_agg_func Aggregate test if AGG (COUNT, SUM, etc.), otherwise blank

tc_column_a Field A

tc_operator Operator

tc_column_b Field B

tc_tables_type 0 = two-table pair, 1 = one-table constraint

tc_threshold Threshold

tc_max_bad_records Maximum bad records

tc_is_case_insensitive Case insensitive checkbox: 0 = false, 1 = true

tc_is_treat_nulls_equal Null = Null, 0 = false, 1 = true

tc_is_trim_right_ws Trim trailing spaces: 0 = false, 1 = true

tc_expression_a Expression A

tc_expression_b Expression B

The following table describes test results information:

Metadata Description

tc_rs_result Test result: 1 = pass, 0 = fail, -1 = no_results, -2 = error

tc_rs_failure_count Number of bad records

tc_rs_processed_count Number of records processed.

tc_rs_count_rows_a, tc_rs_count_rows_b Number of records in A and B

tc_rs_agg_value_a, tc_rs_agg_value_b Aggregate results A and B


The following table describes table pair runtime information:

Metadata Description

tr_id Unique internal run ID incremented for each run

tr_state Result state. 2 = install_error, 4 = run_success, 5 = run_error

tr_start_time Table pair run time start, in milliseconds, since 1970 UTC

tr_finish_time Run time finish

tr_is_latest Whether this is the latest run of a given table pair: 1 = latest; 0 = not latest

tr_error_msg Error message for a run error.

ti_id Internal ID. Do not use.

ti_folder_name PowerCenter folder name where mapping was created

ti_mapping_name PowerCenter mapping name

ti_session_name PowerCenter session name

ti_workflow_name PowerCenter workflow name

rs_bad_records_view

This view is a detailed view of all bad records. It can be joined with results_summary_view on tr_id and tc_index.

The following table describes bad records information:

Metadata Description

tr_id Run ID, joined to tr_id in results_summary_view

tc_index Joined to tc_index in results_summary_view

tc_rs_br_key_a Key A

tc_rs_br_value_a Value A

tc_rs_br_key_b Key B

tc_rs_br_value_b Value B
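Because the view joins to results_summary_view on tr_id and tc_index, a custom report can show each bad record next to its test metadata. The sketch below only prints the SQL; the selected columns are illustrative, and you would run the statement with your own database client:

```shell
# Sketch: SQL joining bad-record detail to test metadata for the latest runs.
sql=$(cat <<'EOF'
SELECT s.tp_name,
       s.tc_description,
       b.tc_rs_br_key_a,
       b.tc_rs_br_value_a,
       b.tc_rs_br_value_b
FROM rs_bad_records_view b
JOIN results_summary_view s
  ON b.tr_id = s.tr_id
 AND b.tc_index = s.tc_index
WHERE s.tr_is_latest = 1
EOF
)
echo "$sql"
```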


results_id_view

This view uses results_summary_view as the source and aggregates it at the table pair level. This view contains the table pair result.

The following table describes results ID information:

Metadata Description

tr_id Internal run ID incremented for each run

tr_is_latest Whether this is the latest run of a given table pair: 1 = latest; 0 = not latest

tr_start_time Table pair run time start, in milliseconds, since 1970 UTC

ti_id Internal ID

tp_obj_id Table pair ID

tp_version Table pair version

tp_user_id User ID of the person who ran this test

tp_rs_result Table pair result: 1 = pass, 0 = fail, -1 = no results, -2 = error

meta_sv_view

This view returns all SQL view information.

The following table describes SQL view information:

Metadata Description

sv_id Unique ID

sv_name Internal name

sv_obj_id View object ID

sv_version View version

sv_description Description

sv_comment Comment

sv_owner_name Owner name

sv_conn_name Connection

sv_dsname Table name


sv_sql_query SQL statement

svf_name Column name

svf_business_name Not used

svf_datatype Column datatype

svf_precision Column precision

svf_scale Column scale

svf_is_key Not used

meta_lv_view

This view returns all lookup view information.

The following table describes lookup view information:

Metadata Description

lv_id Unique ID

lv_name Internal name

lv_obj_id View object ID

lv_version View version

lv_tp_name Table pair description

lv_time_stamp Time lookup view was last edited

lv_comments Comments

lv_description Lookup view description

lv_table_a Source table name

lv_table_b Lookup table name

lv_type_a, lv_type_b Source and lookup types: 1 = relational, 2 = flat file

lv_conn_name_a, lv_conn_name_b Source and lookup connections

lv_owner_name_a, lv_owner_name_b Source and lookup owner names

lv_src_dir_a, lv_src_file_a, lv_src_dir_b, lv_src_file_b Source and lookup directory and file names if flat files

lv_where_clause_a, lv_where_clause_b Not used

lv_is_where_clause_dsq_a, lv_is_where_clause_dsq_b Not used

lv_join_list_str Lookup relationship (join) as a string description

meta_jv_view

This view returns all join view information.

The following table describes join view information:

Metadata Description

jv_id Unique ID

jv_name Internal name

jv_obj_id View object ID

jv_version View version

jv_description Description

jv_table_name Table name

jv_alias_name Alias name of the table

jv_table_join_type Type of join

jv_table_position Position of join in the table

jv_is_live Whether the view is in use.


meta_ds_view

This view returns all the data sources used by table pairs, SQL views, lookup views, and join views.

The following table describes data source information:

Metadata Description

id Unique ID of the table pair, SQL view, lookup view, or join view.

object_name Name of the data source.

object_description Description of the data source.

object_folder_name Folder in which the data source is available.

object_id Unique ID of the data source.

object_version Version of the data source in the PowerCenter repository.

object_type Type of the data source.

object_is_live Whether the table pair, SQL view, lookup view, or join view is available to theusers.

object_table_name Table name of the data source.

object_user_name User who used the data source.

object_user_id ID of the user who used the data source.

meta_tp_view

This view returns all table pair information.

The following table describes table pair information:

Metadata Description

tp_user_id User ID

tp_username User name

tp_obj_id Unique table pair ID

tp_version Table pair version

tp_name Table pair description

tp_time_stamp Time table pair was last edited

tp_comments Table pair comments


tp_description Table pair description. Same as tp_name.

tp_table_a, tp_table_b Either full table name, including the PowerCenter repository directory, or view name

tp_table_version_a, tp_table_version_b Version name for SQL and lookup views, otherwise empty

tp_type_a, tp_type_b A and B source type: 1 = relational, 2 = flat file, 3 = SQL view, 4 = lookup view

tp_conn_name_a, tp_conn_name_b A and B connection names

tp_owner_name_a, tp_owner_name_b A and B owner names

tp_src_dir_a, tp_src_file_a, tp_src_dir_b, tp_src_file_b A and B directory and file names if flat files

tp_in_db_a, tp_in_db_b Run aggregations in database for A or B

tp_where_clause_a, tp_where_clause_b A and B WHERE clauses

tp_is_where_clause_dsq_a, tp_is_where_clause_dsq_b Run WHERE clause in database for A or B

tp_join_list_str Joined fields as one string

tp_external_id Table pair external ID

tp_is_live Indicates whether the table pair is active

rs_sv_id_view, rs_lv_id_view, and rs_jv_id_view

These views return the SQL view, lookup view, or join view IDs and columns for the querying criteria in the results_summary_view view. The views may contain duplicate IDs. You can use SELECT DISTINCT to avoid duplicate IDs.


Appendix E

Metadata Import Syntax

This appendix includes the following topics:

- Metadata Import Syntax Overview
- Table Pair with One Test
- Table Pair with an SQL View as a Source
- Table Pair with Two Flat Files
- Single-Table Constraint
- SQL View
- Lookup View

Metadata Import Syntax Overview

The following sections display examples of metadata syntax definitions.

Table Pair with One Test

<TablePair>
    Name = "CLIDETAIL_CLISTAGE"
    Description = "CLIDETAIL-CLISTAGE"
    ExternalID = "cliTest"
    SaveDetaliedBadRecords = true
    SaveBadRecordsTo = "SCHEMA"
    BadRecordsFileName = ""
    ParameterFile = "C:\\test_param.txt"
    <TableA>
        Name = "pc_repository/cli_demo/Sources/demo_connection/cliDetail"
        Connection = "dvo_demo_connection"
        WhereClause = ""
        WhereClauseDSQ = false
        InDB = false
    <TableB>
        Name = "pc_repository/cli_demo/Targets/cliStage"
        Connection = "dvo_demo_connection"
        WhereClause = ""
        WhereClauseDSQ = false
        InDB = false
    <Parameter>
        Name = "$$NewParameter1"
        Type = "string"
        Precision = "10"


        Scale = "0"
    <TestCase>
        TestType = "AGG"
        Aggregate = "SUM"
        ColumnA = "ProductAmount"
        ColumnB = "CustomerAmount"
        Operator = "="
        Comments = ""
        CaseInsensitive = false
        TrimRightWhitespace = false
        TreatNullsEqual = true

Table Pair with an SQL View as a Source

<TablePair>
    Name = "Joined_MYVIEW_FACTORDERS"
    Description = "Joined MYVIEW-FACTORDERS"
    ExternalID = ""
    <TableA>
        Name = "SQLView_470"
        WhereClause = ""
        WhereClauseDSQ = false
        InDB = false
    <TableB>
        Name = "pc_repository/dvo_demo/Targets/factOrders"
        Connection = "dvo_demo_connection"
        WhereClause = ""
        WhereClauseDSQ = false
        InDB = false
    <Join>
        ColumnA = "MyID"
        ColumnB = "LineID"
        ColumnA = "MyCurrency"
        ColumnB = "CurrencyName"
    <TestCase>
        TestType = "VALUE"
        ColumnA = "MyCurrency"
        ColumnB = "CurrencyName"
        Operator = "="
        Comments = ""
        CaseInsensitive = false
        TrimRightWhitespace = true
        TreatNullsEqual = true
    <TestCase>
        TestType = "AGG"
        Aggregate = "SUM"
        <ExpressionA>
            Expression = "if(MyID>10,10,MyID)"
            Datatype = "integer"
            Precision = 10
            Scale = 0
        ColumnB = "LineID"
        Operator = "="
        Comments = ""
        CaseInsensitive = false
        TrimRightWhitespace = false
        TreatNullsEqual = true

Table Pair with Two Flat Files

<TablePair>
    Name = "FLATFILE_FLATFILE"
    Description = "FLATFILE-FLATFILE"


    ExternalID = ""
    SaveDetaliedBadRecords = true
    SaveBadRecordsTo = "FLAT_FILE"
    BadRecordsFileName = "test.txt"
    <TableA>
        Name = "pc_repository/dvo_demo/Sources/FlatFile/FlatFile"
        SourceDirectory = "C:\\FlatFile\\Sourcess"
        SourceFilename = "flatfile.txt"
        WhereClause = ""
        WhereClauseDSQ = false
        InDB = false
    <TableB>
        Name = "pc_repository/dvo_demo/Sources/FlatFile/FlatFile"
        SourceDirectory = "C:\\FlatFiles\\Targets"
        SourceFilename = "flatfile2.txt"
        WhereClause = ""
        WhereClauseDSQ = false
        InDB = false
    <TestCase>
        TestType = "SET_ANotInB"
        ColumnA = "order_name"
        ColumnB = "order_name"
        Operator = "="
        Comments = ""
        CaseInsensitive = true
        TrimRightWhitespace = true
        TreatNullsEqual = true

Single-Table Constraint

<SingleTable>
   Name = "DIMEMPLOYEES"
   Description = "DIMEMPLOYEES"
   ExternalID = ""
   <TableA>
      Name = "pc_repository/dvo_demo/Targets/dimEmployees"
      Connection = "dvo_demo_connection"
      WhereClause = ""
      WhereClauseDSQ = false
      InDB = false
   <Key>
      ColumnA = "EmployeeID"
   <TestCase>
      TestType = "AGG"
      Aggregate = "COUNT"
      ColumnA = "EmployeeID"
      <ExpressionB>
         Expression = "100,200"
         Datatype = "integer"
         Precision = 10
         Scale = 0
      Operator = "Between"
      Comments = ""
      CaseInsensitive = false
      TrimRightWhitespace = false
      TreatNullsEqual = true
   <TestCase>
      TestType = "NOT_NULL"
      ColumnA = "LastName"
      ColumnB = ""
      Operator = "="
      Comments = ""
      CaseInsensitive = false
      TrimRightWhitespace = false
      TreatNullsEqual = true

140 Appendix E: Metadata Import Syntax


SQL View

<SQLView>
   Name = "SQLView_991"
   Description = "MyView991"
   <Table>
      Name = "pc_repository/dvo_demo/Sources/demo_connection/srcOrders"
      Connection = "dvo_demo_connection"
   SQLQuery = "Select * from srcOrders"
   Comments = "This is a comment"
   <Columns>
      <Column>
         Name = "MyID"
         Datatype = "int"
         Precision = 10
         Scale = 0
      <Column>
         Name = "MyCurrency"
         Datatype = "varchar"
         Precision = 30
         Scale = 0
      <Column>
         Name = "MyAmount"
         Datatype = "decimal"
         Precision = 10
         Scale = 2

Lookup View

<LookupView>
   Name = "LookupView"
   Description = "Lookup srcOrders --> dimProducts"
   <SourceTable>
      Name = "pc_repository/dvo_demo/Sources/demo_connection/srcOrders"
      Connection = "dvo_demo_connection"
   <LookupTable>
      Name = "pc_repository/dvo_demo/Targets/dimProducts"
      Connection = "dvo_demo_connection"
   <Join>
      ColumnA = "ProductID"
      ColumnB = "ProductID"
      ColumnA = "ProductName"
      ColumnB = "ProductName"
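The import metadata above reduces to `name = "value"` attribute pairs under each element tag. As a rough illustration only (not part of Data Validation Option, which imports this syntax through the DVOCmd ImportMetadata command), a small Python sketch that extracts those pairs from one line of metadata text:

```python
import re

# Hypothetical helper: matches  Name = "quoted value"  or  Name = bareword
# pairs as they appear in the metadata import examples above.
ATTR_RE = re.compile(r'(\w+)\s*=\s*("[^"]*"|\S+)')

def parse_attributes(line):
    """Return the attribute name/value pairs found on one metadata line."""
    attrs = {}
    for name, raw in ATTR_RE.findall(line):
        attrs[name] = raw.strip('"')
    return attrs

sample = 'TestType = "VALUE" ColumnA = "MyCurrency" CaseInsensitive = false'
print(parse_attributes(sample))
# {'TestType': 'VALUE', 'ColumnA': 'MyCurrency', 'CaseInsensitive': 'false'}
```

The element tags themselves (<TablePair>, <TestCase>, and so on) would still need structural handling; this sketch only shows how regular the attribute syntax is.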


A P P E N D I X F

Glossary

A

aggregate tests
Tests that check for lack of referential integrity in the source, incorrect ETL logic in the WHERE clauses, and row rejection by the target system.

B

bad records
Records that fail a test. You can view the bad records in the Reports tab. Aggregate tests do not display bad records.

C

constraint value
A representation of a constant value against which you can compare the field values.

count test
A test that compares the number of values in each field of the tables in a table pair.

D

data validation
The process of testing and validating data in a repository.

Data Validation Option repository
The database that stores the test metadata and test results.

Data Validation Option user
A user who creates and runs tests in Data Validation Option. Data Validation Option stores the settings for each user in a unique user configuration directory.

DVOCmd
The command line program for Data Validation Option that you can use to perform data validation tasks.

Page 156: PC 950 DataValidationOption UserGuide En

Data Validation Option folder
The folder in the Data Validation Option repository that stores the test metadata.

F

format test
A test that checks whether the datatypes of the fields in the source and target tables match.

I

inner join
A join that returns all rows from multiple tables where the join condition is met.

J

Join View
A virtual table that contains columns from related heterogeneous data sources joined by key columns. You can use a join view in a table pair or single table.

L

lookup view
A view that looks up a primary key value in a lookup table or reference table with a text value from a source. A lookup view stores the primary key in the target fact table. A lookup view allows you to test the validity of the lookup logic in your transformations.
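As a rough illustration of the idea (hypothetical table names and keys, not DVO's implementation), a lookup replaces a text value from the source with the primary key it maps to:

```python
# Hypothetical lookup table: text value -> primary key.
lookup_table = {"Dollar": 101, "Euro": 102}

# Source rows carry the text value, not the key.
source_rows = [{"order_id": 1, "currency": "Dollar"},
               {"order_id": 2, "currency": "Euro"}]

# The lookup view exposes the rows with the primary key substituted in,
# as a target fact table would store it.
fact_rows = [{"order_id": r["order_id"],
              "currency_key": lookup_table[r["currency"]]}
             for r in source_rows]
print(fact_rows)
# [{'order_id': 1, 'currency_key': 101}, {'order_id': 2, 'currency_key': 102}]
```

Testing against such a view lets you confirm that every source text value resolved to the key you expected.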

O

outer join
A join that returns all rows from one table and those rows from a secondary table where the joined fields are equal.
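The two join definitions above can be sketched in a few lines of Python (hypothetical data, not from the guide):

```python
table_a = [(1, "USD"), (2, "EUR"), (3, "JPY")]
table_b = {1: "Dollar", 2: "Euro"}  # no entry for key 3

# Inner join: only rows where the join condition is met on both sides.
inner = [(k, code, table_b[k]) for k, code in table_a if k in table_b]

# Left outer join: every row from table_a, with None where table_b
# has no matching row.
outer = [(k, code, table_b.get(k)) for k, code in table_a]

print(inner)  # [(1, 'USD', 'Dollar'), (2, 'EUR', 'Euro')]
print(outer)  # [(1, 'USD', 'Dollar'), (2, 'EUR', 'Euro'), (3, 'JPY', None)]
```

The row for key 3 survives only the outer join, which is why outer joins are useful for spotting missing reference data.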

S

single table
Data Validation Option object that references a database table or flat file in a PowerCenter repository. Use a single table to create tests that require data validation on a table.

SQL view
A set of fields created from several tables and several calculations in a query. You can use an SQL view as a table in a single table or table pair.

T

table pair
A pair of sources from the PowerCenter repository or lookup views and SQL views that you create in Data Validation Option. You can select a relational table or flat file.


threshold
Numeric value that defines an acceptable margin of error for a test. You can enter a threshold for aggregate tests and for value tests with numeric datatypes. If the margin exceeds the threshold, Data Validation Option marks the record as a bad record.
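A minimal sketch of the threshold check described above (hypothetical values; the actual evaluation happens inside Data Validation Option):

```python
def within_threshold(value_a, value_b, threshold):
    """True when the difference between two results stays inside the
    acceptable margin of error."""
    return abs(value_a - value_b) <= threshold

# SUM on side A is 1000.0, SUM on side B is 999.4, threshold is 1.0.
print(within_threshold(1000.0, 999.4, 1.0))  # True
print(within_threshold(1000.0, 997.0, 1.0))  # False
```

A record (or aggregate result) failing this check is what gets marked as bad.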

U

unique test
A test that checks whether the values in a field are unique.

V

value test
A test that compares the values for fields in each row of the tables in a table pair to determine whether the values match.
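A minimal sketch of the comparison a value test performs (hypothetical rows; the failed rows are the bad records defined earlier in this glossary):

```python
# Paired rows from side A and side B of a table pair: (key, amount).
rows_a = [("A-1", 10.0), ("A-2", 20.0), ("A-3", 30.0)]
rows_b = [("A-1", 10.0), ("A-2", 25.0), ("A-3", 30.0)]

# A row pair fails the value test when the compared values differ.
bad_records = [(a, b) for a, b in zip(rows_a, rows_b) if a[1] != b[1]]
print(bad_records)  # [(('A-2', 20.0), ('A-2', 25.0))]
```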


I N D E X

A
architecture
   Data Validation Option 2
automatic generation
   compare folder 58
   table pair 58

B
bad records
   single-table constraints 75
   table pair tests 61
behavior changes in 9.1.0
   Data Validation Option 11
behavior changes in 9.1.2.0
   Data Validation Option 10
behavior changes in 9.1.4.0
   Data Validation Option 10

C
client
   Data Validation Option 13
client layout
   Join Views tab 16
   Lookup Views tab 15
   SQL Views tab 15
   Tests tab 15
configuration
   instructions for additional users 26
CopyFolder command
   syntax 104
CreateUserConfig command
   syntax 105

D
data validation
   purpose 1
   testing approach 3
   typical workflow 2
Data Validation Option
   architecture 2
   behavior changes in 9.1.0 11
   behavior changes in 9.1.2.0 10
   behavior changes in 9.1.4.0 10
   client 13
   configuration for additional users 26
   installation for first user 22
   installation overview 20
   installation prerequisites 21
   installation required information 22
   menus 18
   new features 9
   new features in 9.1.2.0 8
   new features in 9.1.4.0 7
   overview 1
   required system permissions 21
   Settings folder 19
   system requirements 3
   UNIX 29
   upgrade 27
   upgrading from version 3.0 27
   upgrading from version 3.1 28
   users 1
datatypes
   single table tests 118
   table pair tests 117
DisableInformaticaAuthentication command
   syntax 106
DVOCmd
   CopyFolder command 104
   CreateUserConfig command 105
   DisableInformaticaAuthentication command 106
   ExportMetadata command 106
   ImportMetadata command 107
   InstallTests command 107
   location 103
   overview 103
   PurgeRuns command 109
   RefreshRepository command 110
   RunTests command 111
   syntax 103
   troubleshooting 116
   UpgradeRepository command 113

E
ExportMetadata command
   syntax 106

F
folders
   copying 17
   copying at the command line 104
   overview 16
   refreshing 37
   restrictions on copying 16
   Settings folder 19
   troubleshooting copying folders 115


I
ImportMetadata command
   syntax 107
installation
   instructions for first user 22
   overview 20
   prerequisites 21
   required information 22
   required permissions 21
   troubleshooting 114, 115
   UNIX 29
   upgrading from version 3.0 27
   upgrading from version 3.1 28
InstallTests command
   syntax 107

J
join views
   adding 88
   overview 84
   properties 85
   WHERE clause 87
Join Views tab
   Data Validation Option 16
joins
   heterogeneous sources 83
   table pairs 45

L
lookup views
   adding 81
   deleting 82
   description 81
   directory for file sources 81
   editing 81
   example 82
   joining flat files 83
   joining heterogeneous tables 83
   metadata import syntax 141
   name for file sources 81
   overriding source table owner 81
   overview 79
   properties 80
   selecting connections 81
   selecting the lookup table 80
   selecting the source table 80
   source to lookup relationship 81
Lookup Views tab
   Data Validation Option 15

M
menus
   description 18
metadata export
   exporting at the command line 106
   exporting objects 39
   overview 38
metadata import
   importing at the command line 107
   importing objects 39
   overview 38
metadata import syntax
   lookup view 141
   single-table constraint 140
   SQL view 141
   table pair with an SQL view source 139
   table pair with one test 138
   table pair with two flat files 139

N
new features
   Data Validation Option 9
new features in 9.1.2.0
   Data Validation Option 8
new features in 9.1.4.0
   Data Validation Option 7

P
permissions
   Data Validation Option installation 21
prerequisites
   Data Validation Option installation 21
PurgeRuns command
   syntax 109

R
RefreshRepository command
   syntax 110
report views
   meta_ds_view 136
   meta_jv_view 135
   meta_lv_view 134
   meta_sv_view 133
   meta_tp_view 136
   overview 129
   results_id_view 133
   results_summary_view 129
   rs_bad_records_view 132
   rs_jv_id_view 137
   rs_lv_id_view 137
   rs_sv_id_view 137
reports
   custom 95
   Detailed Test Results-Bad Records example 123
   Detailed Test Results-Test example 122
   exporting 95
   filtering 94
   generating 93
   lookup view definitions 94
   overview 93
   printing 95
   scrolling 95
   SQL view definitions 94
   Summary of Testing Activities example 119
   table of contents 95
   Table Pair Summary example 120
   troubleshooting 115
   viewing 95

Reports
   overview 93

repositories
   adding 36
   deleting 37


   editing 37
   exporting metadata 38
   overview 36
   refreshing 37
   refreshing at the command line 110
   refreshing folders 37
   saving 36
   testing connections 36
   upgrading at the command line 113
RunTests command
   cache settings 107, 111
   send email 111
   syntax 107, 111

S
scripting
   metadata export and import 38
Settings folder
   displaying 19
single table tests
   adding 74
   bad records 75
   condition 72
   constraint value 73
   datatypes 118
   deleting 75
   descriptions 71
   editing 74
   field 72
   filter condition 72
   operators 72
   overview 70
   preparing at the command line 107
   properties 70
   purging test runs at the command line 109
   running 75
   running at the command line 111
   troubleshooting 114, 115
   values to test 72
single tables
   adding 67
   deleting 68
   editing 68
   properties 62
   test properties 70
   viewing test results 69
single-table constraints
   adding 74
   bad records 75
   datatypes 118
   deleting 75
   editing 74
   metadata import syntax 140
   overview 62
   properties 62
   running 75
   test properties 70
   viewing test results 69
SQL views
   adding 78
   adding comments 78
   column definitions 77
   connections 77
   deleting 78
   description 77
   editing 78
   metadata import syntax 141
   overview 76
   properties 76
   retrieving data 78
   SQL statement 78
   table definitions 77
SQL Views tab
   Data Validation Option 15
system requirements
   Data Validation Option 3

T
table pair tests
   adding 57
   automatic generation 58
   bad records 61
   comments 56
   concatenating strings 56
   conditions A and B 54
   datatypes 117
   deleting 57
   descriptions 52
   editing 57
   excluding bad records 55
   expression tips 56
   fields A and B 53
   filter condition 54
   IF statements 56
   ignoring case 55
   margin of error 55
   null values 56
   operators 54
   preparing at the command line 107
   properties 52
   purging test runs at the command line 109
   running 58
   running at the command line 111
   substrings 56
   threshold 55
   trimming trailing spaces 55
   troubleshooting 114, 115
   types 51
   using expressions for fields 56
   values to test 53
table pairs
   adding 48
   deleting 49
   editing 49
   joining tables 45
   metadata import syntax, SQL view source 139
   metadata import syntax, one test 138
   metadata import syntax, two flat files 139
   overview 41
   processing large tables 43
   properties 41
   pushing logic to the source 44
   test properties 52
   test types 51
   viewing test results 50
   WHERE clause 44
testing and methodology
   comparing sources and targets 5
   counts, sums, and aggregate tests 3
   data testing approach 3
   enforcing constraints on target tables 4
   target table referential integrity 4
   validating logic with SQL views 5


Tests tab
   Data Validation Option 15
troubleshooting
   command line errors 116
   copying folders 115
   initial errors 114
   installation errors 114, 115
   no test results 114, 115
   ongoing errors 115
   overview 114
   report generation 115
   repository connections 114
   run errors 114, 115

U
UNIX
   Data Validation Option 29
upgrade
   Data Validation Option 3.0 27
   Data Validation Option 3.1 27
UpgradeRepository command
   syntax 113
upgrading
   version 3.0 to current version 27
   version 3.1 to current version 28
user configuration directory
   changing with a batch file 33
   changing with an environment variable 33
   creating at the command line 105
   location 32
users
   Data Validation Option 1

W
WHERE clause
   join views 87
   table pairs 44
