Informatica Proactive Monitoring for Data Quality (Version 2.0) Solutions Guide



Informatica Proactive Monitoring for Data Quality Solutions Guide

Version 2.0

September 2014

Copyright (c) 2003-2014 Informatica Corporation. All rights reserved.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international Patents and other Patents Pending.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in writing.

Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica On Demand, Informatica Cloud, AddressDoctor, Agent Logic, Latency Busters, Parallel Persistence, PowerPartner, RTAM, Real Time Alert Manager, RulePoint, Siperian, Ultra Messaging, Event Detection and Response, User-Driven Complex Event Processing, "To Detect and Respond," "CEP for Humans," L2H, Low-to-High, High-to-Low, Enterprise Agent Server are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Firefox is a trademark of the Mozilla Foundation. Intel and Pentium are registered trademarks of Intel Corporation in the United States, other countries, or both. Microsoft, Active Directory, Internet Explorer, NetMeeting, PowerPoint, SQL Server, Windows 98, Windows 2000, Windows 2003, Windows NT, and WordPad are either registered trademarks or trademarks of Microsoft Corporation in the United States, other countries, or both. Sun Microsystems, Sun, AnswerBook, Java, JVM, Solaris, Solaris JumpStart, StarOffice, Sun Ray, SunForum, Ultra, and Trusted Solaris are either registered trademarks or trademarks of Sun Microsystems, Inc., in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States, other countries, or both. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Apache Tomcat and Tomcat are trademarks of the Apache Software Foundation in the United States, other countries, or both. BEA WebLogic is a registered trademark of BEA Systems, Inc., in the United States, other countries, or both. IBM and WebSphere are registered trademarks of International Business Machines Corporation in the United States, other countries, or both. All other company and product names may be trade names or trademarks of their respective owners.

Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Microsoft Corporation. All rights reserved. Copyright (c) The Regents of the University of California. All rights reserved.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This product includes software which is licensed under the GNU Lesser General Public License Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.

This product includes software licensed under the terms at http://www.hpsearch.org/, http://www.antlr.org/license.html, http://displaytag.sourceforge.net/11/license.html, http://openmap.bbn.com/license.html, http://dist.codehaus.org/janino/new_bsd_license.txt, https://github.com/jquery/jquery/blob/master/MIT-LICENSE.txt, http://www.jython.org/license.html, http://madrobby.github.com/scriptaculous/license/, http://xdoclet.sourceforge.net/xdoclet/licenses/xdoclet-license.html, http://xstream.codehaus.org/license.html, and http://developer.yahoo.com/yui/license.html

This product includes software licensed under the Common Development and Distribution License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the BSD License (http://www.opensource.org/licenses/bsd-license.php), the Eclipse Public License (http://www.eclipse.org/org/documents/epl-v10.php), the Sun Binary Code License Agreement, and the MIT License (http://www.opensource.org/licenses/mit-license).

This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775; 6,640,226; 6,789,096; 6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,720,842; 7,721,270; and 7,774,791, international Patents and other Patents Pending.

DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is subject to change at any time without notice.

NOTICES

This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software Corporation ("DataDirect") which are subject to the following terms and conditions:

1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.

2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.

Part Number: PMDQ-SLG-20000-0001


Table of Contents

Preface
    Informatica Resources
        Informatica My Support Portal
        Informatica Documentation
        Informatica Web Site
        Informatica How-To Library
        Informatica Knowledge Base
        Informatica Support YouTube Channel
        Informatica Marketplace
        Informatica Velocity
        Informatica Global Customer Support

Chapter 1: Introduction to Proactive Monitoring for Data Quality
    Proactive Monitoring for Data Quality Solutions Overview
    Solution Capabilities

Chapter 2: Monitoring a PowerCenter Environment
    Introduction to Complex Event Processing
        RulePoint Architecture
    Solution Architecture
        Solution Contents

Chapter 3: Installation and Configuration
    Installation and Configuration Overview
    Before You Install
        Verify System Requirements
    Installing Proactive Monitoring for Data Quality
    After You Install
    Validating the Installation
    Configured Channels in Real-Time Alert Manager

Appendix A: Template Rules

Appendix B: Advanced Rules

Appendix C: Source Services

Appendix D: Predefined Responses

Index

Preface

The Proactive Monitoring for Data Quality Solutions Guide describes the solution provided for proactively monitoring Data Quality operations. This guide also describes how to install and configure Proactive Monitoring for Data Quality.

The Proactive Monitoring for Data Quality Solutions Guide is written for data quality developers, analysts, and system administrators. This guide assumes that you have an understanding of data quality concepts, flat file and relational database concepts, and the database engines in your environment to install and deploy Proactive Monitoring for Data Quality 2.0.

Informatica Resources

Informatica My Support Portal

As an Informatica customer, you can access the Informatica My Support Portal at http://mysupport.informatica.com.

The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library, the Informatica Knowledge Base, Informatica Product Documentation, and access to the Informatica user community.

Informatica Documentation

The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation team through email at [email protected]. We will use your feedback to improve our documentation. Let us know if we can contact you regarding your comments.

The Documentation team updates documentation as needed. To get the latest documentation for your product, navigate to Product Documentation from http://mysupport.informatica.com.

Informatica Web Site

You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.


Informatica How-To Library

As an Informatica customer, you can access the Informatica How-To Library at http://mysupport.informatica.com. The How-To Library is a collection of resources to help you learn more about Informatica products and features. It includes articles and interactive demonstrations that provide solutions to common problems, compare features and behaviors, and guide you through performing specific real-world tasks.

Informatica Knowledge Base

As an Informatica customer, you can access the Informatica Knowledge Base at http://mysupport.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products. You can also find answers to frequently asked questions, technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base team through email at [email protected].

Informatica Support YouTube Channel

You can access the Informatica Support YouTube channel at http://www.youtube.com/user/INFASupport. The Informatica Support YouTube channel includes videos about solutions that guide you through performing specific tasks. If you have questions, comments, or ideas about the Informatica Support YouTube channel, contact the Support YouTube team through email at [email protected] or send a tweet to @INFASupport.

Informatica Marketplace

The Informatica Marketplace is a forum where developers and partners can share solutions that augment, extend, or enhance data integration implementations. By leveraging any of the hundreds of solutions available on the Marketplace, you can improve your productivity and speed up time to implementation on your projects. You can access Informatica Marketplace at http://www.informaticamarketplace.com.

Informatica Velocity

You can access Informatica Velocity at http://mysupport.informatica.com. Developed from the real-world experience of hundreds of data management projects, Informatica Velocity represents the collective knowledge of our consultants who have worked with organizations from around the world to plan, develop, deploy, and maintain successful data management solutions. If you have questions, comments, or ideas about Informatica Velocity, contact Informatica Professional Services at [email protected].

Informatica Global Customer Support

You can contact a Customer Support Center by telephone or through the Online Support.

Online Support requires a user name and password. You can request a user name and password at http://mysupport.informatica.com.

The telephone numbers for Informatica Global Customer Support are available from the Informatica web site at http://www.informatica.com/us/services-and-training/support-services/global-support-centers/.


CHAPTER 1

Introduction to Proactive Monitoring for Data Quality

This chapter includes the following topics:

• Proactive Monitoring for Data Quality Solutions Overview

• Solution Capabilities

Proactive Monitoring for Data Quality Solutions Overview

Informatica Data Quality and Informatica Analyst users can proactively monitor and identify data quality issues to ensure the consistency and completeness of data.

The Proactive Monitoring for Data Quality solution identifies data that differ from the expected trend by a certain percentage or deviation. A set of pre-configured, prepackaged sources, rules, and alert rules is provided with this solution. These rules perform completeness and conformity checks on data used by Informatica Analyst Tool for column profiling. When RulePoint detects events that match the conditions specified in a rule, it sends alerts to the users specified in Real-Time Alert Manager.
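The idea of flagging data that differs from an expected trend by a percentage or deviation can be sketched in a few lines of Python. This is only an illustration of the concept and not the logic of the shipped rules: the function name `deviates_from_trend`, its parameters, and the sample null counts are all hypothetical.

```python
from statistics import mean, pstdev


def deviates_from_trend(history, current, max_percent=None, max_sigmas=None):
    """Return True when `current` is too far from the historical average.

    `max_percent` is an allowed percentage change from the mean, and
    `max_sigmas` is an allowed number of standard deviations. Both names
    are assumptions made for this sketch.
    """
    baseline = mean(history)
    if max_percent is not None and baseline != 0:
        percent_change = abs(current - baseline) / abs(baseline) * 100
        if percent_change > max_percent:
            return True
    if max_sigmas is not None:
        spread = pstdev(history)
        if spread > 0 and abs(current - baseline) > max_sigmas * spread:
            return True
    return False


# Hypothetical null counts from five earlier profile runs (mean is 100).
null_counts = [100, 104, 98, 102, 96]
print(deviates_from_trend(null_counts, 150, max_percent=20))  # 50% above the mean
print(deviates_from_trend(null_counts, 103, max_percent=20))  # 3% above the mean
```

A percentage check suits metrics with a stable baseline, while a standard-deviation check adapts to metrics that normally fluctuate.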

The proactive monitoring and alert operation is performed by the complex event processing products, Informatica RulePoint and Informatica Real-Time Alert Manager.

For more information about Informatica RulePoint and Real-Time Alert Manager, see the documentation for these products.

Solution Capabilities

Data analysts can automate monitoring tasks for Informatica Data Quality.

When incorrect or poor data is used in profiling and analyzing large sets of data, the downstream processes that rely on the data generate inaccurate results. Checking for data quality problems manually or with custom scripts is time consuming and labor intensive.

Proactive monitoring contains extensible rules and templates that you can modify to meet your data quality requirements. The data analyst or business user will receive an alert when data that matches the rule is detected. Alerts include a link to the Informatica Analyst with the details of the data quality issue.


The proactive monitoring solution offers the following advantages for an Informatica Data Quality analyst:

Accelerate Data Quality deployment

You can augment and accelerate your Informatica Data Quality deployment with proactive data quality monitoring capabilities. The monitoring solution delivers alerts for faster responses when data quality thresholds are exceeded. It is a cost-effective method to mitigate the negative impact of poor or questionable data on your applications and processes.

Proactively monitor thresholds

The solution monitors and sends alerts if the data quality values exceed a certain threshold. For example, the monitoring solution evaluates the null count against a specified threshold, percentage of uniqueness against a supplied minimum threshold percent, and the count of unique patterns against a supplied maximum pattern count threshold.

Improve data quality continually

You can monitor various data quality metrics and compare historical values to improve data quality continually and operate more efficiently.

Customize templates and rules

You can define new template rules using the existing templates based on your monitoring requirements. The monitoring solution contains pre-built rules for self-service editing and tuning, along with the tools for creating new rules, which helps organizations scale and manage rules consistently.
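The three threshold checks named under "Proactively monitor thresholds" can be pictured with a minimal Python sketch. It assumes a simple, hypothetical `ColumnProfile` record and made-up threshold values. The solution itself implements these checks as RulePoint rules, not as Python code.

```python
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Hypothetical summary of one profiled column."""
    column: str
    row_count: int
    null_count: int
    unique_count: int
    pattern_count: int


def threshold_alerts(profile, max_nulls, min_unique_percent, max_patterns):
    """Return a list of alert messages for every threshold that is violated."""
    alerts = []
    # Null count against a specified threshold.
    if profile.null_count > max_nulls:
        alerts.append(f"{profile.column}: null count {profile.null_count} exceeds {max_nulls}")
    # Percentage of uniqueness against a minimum threshold percent.
    unique_percent = (
        100.0 * profile.unique_count / profile.row_count if profile.row_count else 0.0
    )
    if unique_percent < min_unique_percent:
        alerts.append(f"{profile.column}: uniqueness {unique_percent:.1f}% is below {min_unique_percent}%")
    # Count of unique patterns against a maximum pattern count threshold.
    if profile.pattern_count > max_patterns:
        alerts.append(f"{profile.column}: pattern count {profile.pattern_count} exceeds {max_patterns}")
    return alerts


profile = ColumnProfile("customer_id", row_count=1000, null_count=25,
                        unique_count=900, pattern_count=7)
for message in threshold_alerts(profile, max_nulls=10,
                                min_unique_percent=95.0, max_patterns=5):
    print(message)
```

With the sample values, all three checks fire because the column has 25 nulls, 90.0% uniqueness, and 7 patterns.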


CHAPTER 2

Monitoring a PowerCenter Environment

This chapter includes the following topics:

• Introduction to Complex Event Processing

• Solution Architecture

Introduction to Complex Event Processing

Proactive Monitoring for Data Quality uses the complex event processing product Informatica RulePoint to process and analyze Informatica Analyst profiling events. RulePoint can monitor or listen in near-real time to data across diverse sources.

RulePoint is a complex event processing software platform that you can use to identify patterns in real-time event flows and batch data to proactively alert people, systems and processes.

You can integrate multiple types of data sources seamlessly with RulePoint. Data sources are exposed to rule writers as topics. RulePoint users can use multiple modes, from simple templates to advanced rules that describe patterns that they want identified.

When events match the specified patterns, RulePoint sends alerts based on the configured responses. Responses are delivered to recipients through email or dashboards and into systems. The responses contain details of the error or event. You can customize additional sources and responses through the SDK.

RulePoint Architecture

RulePoint monitors specified data streams for predetermined events and then acts on those events with the configured rules.

RulePoint uses services, which are configurable interfaces that link to another software system, to collect information from relevant streams of data flowing from different data sources. Each piece of information is referred to as an event. These events are published into RulePoint and grouped into categories familiar to and defined by users, which are referred to as topics.

RulePoint then uses other services to coordinate responses to those events based on user‑defined event processing rules. A rule encapsulates the business logic of analyzing event data from multiple sources to detect specific events based on logical conditions, and then responds to the appropriate party with the proper information.

Users can create and modify rules themselves, ensuring that an organization’s responsiveness to changing conditions is not hindered by the traditional software development cycle.

When RulePoint detects events that match the conditions specified in a rule, it executes the response specified in the rule. These responses can be simple, such as sending an email, instant message, or text message, or complex, such as updating a database, triggering a web service, initiating other processes across the enterprise, or creating new events used by other rules.
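The flow described above (events published to topics, rules that test conditions, responses that run on a match) can be illustrated with a toy Python model. This is a conceptual sketch only: `Event`, `Rule`, and `MiniEngine` are hypothetical names invented here and do not correspond to the RulePoint API or SDK.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    """One piece of information published under a topic."""
    topic: str
    data: Dict[str, object]


@dataclass
class Rule:
    """A condition on events of one topic, plus the response to run on a match."""
    name: str
    topic: str
    condition: Callable[[Event], bool]
    response: Callable[[Event], None]


class MiniEngine:
    """Toy stand-in for a rule engine: evaluates every rule for each published event."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def publish(self, event: Event) -> int:
        """Run matching responses and return how many rules fired."""
        fired = 0
        for rule in self.rules:
            if rule.topic == event.topic and rule.condition(event):
                rule.response(event)
                fired += 1
        return fired


sent: List[str] = []  # stands in for an email or dashboard responder
engine = MiniEngine()
engine.add_rule(Rule(
    name="too_many_nulls",
    topic="profile_stats",
    condition=lambda e: e.data["null_count"] > 10,
    response=lambda e: sent.append(f"ALERT {e.data['column']}: {e.data['null_count']} nulls"),
))

engine.publish(Event("profile_stats", {"column": "email", "null_count": 42}))
engine.publish(Event("profile_stats", {"column": "email", "null_count": 3}))
print(sent)
```

Keeping the condition and the response as separate callables mirrors the separation the text describes between detecting an event and acting on it.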

Solution Architecture

The proactive monitoring solution works with your Informatica Analyst environment with minimum configuration requirements. Predefined RulePoint services integrate with the Profiling Warehouse.

The proactive monitoring solution is a RulePoint application that runs on a Java application server. The application server and RulePoint are installed on a machine separate from the Informatica Data Quality installation. Through a web browser, you can access and manage data sources, users, rule writing, and alert definitions.

The following illustration shows the solution architecture:

The proactive monitoring solution runs on a RulePoint instance and connects to the Profiling Warehouse to collect profiling statistics and scorecard data.

The alert information is displayed in the Real-Time Alert Manager dashboard.

Solution Contents

The proactive monitoring solution contains RulePoint objects and database scripts that help you quickly configure and monitor the Informatica Data Quality Profiling Warehouse.

The installation program installs the required components on the specified RulePoint instance and configures all RulePoint objects, such as pre-defined sources, topics, connections, templates, advanced rules, template rules, SQL analytics, watchlists, responders, and responses.

The following users are configured by default with the proactive monitoring solution:

dquser

Any data quality user or subject matter expert.

dqmonitor

Any user who receives alerts for data quality issues detected by the monitoring solution.


CHAPTER 3

Installation and Configuration

This chapter includes the following topics:

• Installation and Configuration Overview

• Before You Install

• Installing Proactive Monitoring for Data Quality

• After You Install

• Validating the Installation

• Configured Channels in Real-Time Alert Manager

Installation and Configuration Overview

You can install Proactive Monitoring for Data Quality 2.0 on a Windows, Linux, Solaris, or AIX machine. Complete the pre-installation tasks to prepare for the installation.

Before You Install

Before you install Proactive Monitoring for Data Quality, ensure that your environment meets the minimum software and hardware requirements.

Complete the following prerequisite before you install Proactive Monitoring for Data Quality:

• Install RulePoint 6.1 in $RULEPOINT_HOME.

$RULEPOINT_HOME is the path of the RulePoint installation directory.


Verify System Requirements

The following table lists the platforms supported by Proactive Monitoring for Data Quality:

Domain                               Supported Platforms

Operating Systems                    - Windows
                                     - Linux
                                     - Solaris
                                     - AIX

Database Servers                     - Oracle
                                     - IBM DB2
                                     - Microsoft SQL Server

Recommended Hardware Requirements    - 64-bit Intel or AMD-compatible, Xeon equivalent or better, 1.7 GHz minimum CPU
                                     - 12-16 GB RAM
                                     - 5-10 GB application disk space
                                     - 1 GB Ethernet network connection

Informatica RulePoint                RulePoint 6.1

Informatica PowerCenter              Informatica PowerCenter 9.1.0 and above

For more information about product requirements and supported platforms, see the Product Availability Matrix on the Informatica My Support Portal: https://mysupport.informatica.com/community/my-support/product-availability-matrices

Installing Proactive Monitoring for Data Quality

You can install Proactive Monitoring for Data Quality on Windows, Linux, Solaris, or AIX.

1. Create a database user for Informatica Analyst Service Profiling Warehouse with read-only permission.

The installation package has the database scripts to create the Profiling Warehouse read-only user with the required privileges for the databases. The database administrator must run these scripts.

Use one of the following files to create the read-only user, and grant privileges according to the database repository:

• Oracle: ..\DQPM\ddl\oracle\create_user_dis_wh_ro.ddl.sql

• IBM DB2: ..\DQPM\ddl\db2\create_user_dis_wh_ro.ddl.sql

• Microsoft SQL Server: ..\DQPM\ddl\mssql\create_user_dis_wh_ro.ddl.sql

2. Create views and synonyms on the repository database.

The installation package has the scripts to create the views and synonyms. The database user must run these scripts.

a. Log in as dis_wh_ro or the selected user.


b. Use one of the following files to create views and synonyms according to the repository database type:

• Oracle: ..\DQPM\ddl\oracle\dis_wh_ro.ddl.sql

• IBM DB2: ..\DQPM\ddl\db2\dis_wh_ro.ddl.sql

• Microsoft SQL Server: ..\DQPM\ddl\mssql\dis_wh_ro.ddl.sql

3. Import the XML files to RulePoint.

You need the XML files to import the custom services of Proactive Monitoring for Data Quality to RulePoint. Import the RulePoint objects using one of the following XML files for the Profiling Warehouse that you want to monitor:

• Oracle: ..\DQPM\exports\rulepoint\PMDQ_oracle_v2_0.xml

• IBM DB2: ..\DQPM\exports\rulepoint\PMDQ_db2_v2_0.xml

• Microsoft SQL Server: ..\DQPM\exports\rulepoint\PMDQ_mssql_v2_0.xml

To import the XML files to a project in RulePoint, perform the following tasks:

a. Log in to RulePoint using the administrator credentials.

b. From the Actions menu on the Design tab, create a project, PMDQ.

c. On the Administration tab, click the Import view.

d. Select the PMDQ project in the left pane, and then click Upload File.

e. In the Available Files view in the contents panel, select the uploaded file, and click Start Import from the Actions menu on the upper-right pane to import the objects into the PMDQ project.

f. In the Import dialog box, select Update to overwrite the properties of objects on collision.

g. Click Import.

A message appears that indicates successful import.

h. Click OK.

You can view the import status of the file in the Import History view.

4. To edit the SQL connection in RulePoint, perform the following tasks:

a. On the Design tab, click the Connections view.

b. Select pmdq_connection, and then select Edit from the menu.

c. Update the Connection URL field with the JDBC connection URL that connects to the profiling warehouse.

d. Update the user name and password of the database user with read-only permissions for Informatica Analyst Service Profiling Warehouse.

e. Click Save.

5. To configure the rules for alert hyperlink and Real-Time Alert Manager users, perform the following tasks:

a. On the Design tab, click the Rules view.

b. Select the rule in the contents panel, click Edit, and then change the following parameters of the rule, if applicable, to suit the alert mechanism you configured:

• Profile Name

• Field Name

• Field Value

• Profile URL


6. To configure new rules, perform the following tasks:

a. On the Design tab, click the Template view.

b. Select the template in the contents panel, and select Create Template Rule from the menu.

c. In the Details section, provide a name for the rule.

d. In the Parameters section, edit the configurations as required.

e. Click Save.

7. To change the value of the tstamp parameter to the current time of installation in the SQL source, perform the following tasks:

a. On the Design tab, click the Sources view.

b. Select the SQL source in the contents pane, and then select Edit from the menu to set the tstamp parameter to the current time.

If you set the tstamp parameter to an earlier time, a large number of unwanted profiling events generated since that time might enter RulePoint. The format of tstamp is yyyy-mm-dd hh:mm:ss.

8. To deploy the SQL source and connected objects, perform the following steps:

a. On the Design tab, click Actions > Deploy > Rules, Sources & Responders.

b. In the Deploy Rules, Sources & Responders dialog box, select the objects that you want to deploy, and then click Deploy.

The supporting objects associated with the rule, source, and responder are also deployed. The state of the source, rule, responder, and the supporting objects changes from Draft to Deployed. You can view the activations on the Dashboard tab.
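For step 7, the tstamp value must follow the yyyy-mm-dd hh:mm:ss format. The following is a minimal Python sketch for producing and checking such a value before you paste it into the SQL source. The function names are hypothetical, and the sketch assumes a 24-hour clock and the local time of the machine where it runs, which might differ from the time zone of the profiling warehouse.

```python
from datetime import datetime

# Format named in this guide: yyyy-mm-dd hh:mm:ss (24-hour clock assumed).
TSTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def format_tstamp(moment: datetime) -> str:
    """Render a datetime as a tstamp string (hypothetical helper)."""
    return moment.strftime(TSTAMP_FORMAT)


def is_valid_tstamp(value: str) -> bool:
    """Return True if the string parses as yyyy-mm-dd hh:mm:ss."""
    try:
        datetime.strptime(value, TSTAMP_FORMAT)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # Print the current local time in the tstamp format.
    print(format_tstamp(datetime.now()))
```

Setting tstamp to the printed value at installation time keeps profiling events that predate the installation out of RulePoint.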

After You Install

After installation, configure the environment for the new installation. Perform the post-installation tasks to ensure that Proactive Monitoring for Data Quality runs properly.

• Log in to Real-Time Alert Manager with each of the following two user IDs. Logging in with each ID configures Real-Time Alert Manager to receive alerts for that user.

The following table provides the list of user IDs and passwords to log in to RTAM:

User Name    Password
dquser       dquser123
dqmonitor    dqmonitor123

Validating the Installation

If the installation is successful, you can view events and responses on the dashboard after you deploy the objects.

1. Log in to RulePoint.


2. On the Dashboard tab, view the events generated. Perform the following tasks:

a. To view the statistics for the SQL source, select the source controller on the left pane.

The contents panel displays the sources and topics deployed in that source controller. It also displays the number of events generated for the SQL source and its properties. The lower panel displays the state, aggregate count, and a graph depicting the number of events that occurred per second.

b. To view specific events and their properties, click the Topics view in the contents panel, select the topic, and then select View Topic from the Actions menu on the Activity and Status pane.

The topic details for the source display the details of the event, such as the event name, the source type, and the timestamp of the event.

3. To view responses, perform the following tasks:

a. Click the responder controller on the left pane.

b. Click the Responses view in the contents panel.

The contents panel displays the number of alerts generated for the RTAM responder.

c. Select the RTAM response, and click View Responses from the Actions menu on the Activity and Status pane.

The details for the RTAM response type sent from the source to an administrator, the time stamp, and the property of the response are displayed.

Configured Channels in Real-Time Alert Manager

Channels group Real-Time Alert Manager alerts into logical categories. Each channel associates a set of alerts with a common theme.

The following channels are predefined in Proactive Monitoring for Data Quality:

• Completeness

• Conformity

• Value Count


APPENDIX A

Template Rules

You can create rules from these templates by specifying parameter values based on the requirements.

The following table lists the predefined template rules that are available by default when you install Proactive Monitoring for Data Quality:

Template Rule Name: DQ_PT1 Number of patterns in the profiled column name exceeds the threshold
Description: Checks if the number of patterns in the profiled column name exceeds the threshold.
Properties:
- Topic: dq_profile_patterns
- Sources: Data Quality column pattern frequency
- Analytics: -
- Response: Data Quality Real-Time Alert Manager alert
- Channel: Conformity

Template Rule Name: DQ_PT2 Number of NULL values for the profiled column exceeds the threshold
Description: Checks if the number of null values for the profiled column exceeds the threshold.
Properties:
- Topic: dq_profile_details
- Sources: Data Quality column profiling statistics
- Analytics: -
- Response: Data Quality Real-Time Alert Manager alert
- Channel: Completeness

Template Rule Name: DQ_PT3 Number of occurrences of the value in a profiled column exceeds the threshold
Description: Checks if the number of occurrences of the value in a profiled column exceeds the threshold.
Properties:
- Topic: dq_profile_values
- Sources: Data Quality column data frequency
- Analytics: -
- Response: Data Quality Real-Time Alert Manager alert
- Channel: Value Count
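The way a template becomes a rule once you supply parameter values can be sketched in Python. This is an illustration only, not RulePoint rule syntax: the class, the factory function, the event keys (profile_name, field_name, count), and the profile name CUSTOMER_PROFILE are all hypothetical. The sketch assumes that "exceeds the threshold" means strictly greater than.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical stand-in for a rule created from a template."""
    name: str
    profile_name: str
    field_name: str
    threshold: int
    channel: str

    def matches(self, event: dict) -> bool:
        # The rule fires only for its own profile and column, and only
        # when the count is strictly greater than the threshold.
        return (
            event.get("profile_name") == self.profile_name
            and event.get("field_name") == self.field_name
            and event.get("count", 0) > self.threshold
        )


def null_count_rule(profile_name: str, field_name: str, threshold: int) -> ThresholdRule:
    """Mimic filling in the parameters of a DQ_PT2-style template."""
    return ThresholdRule(
        name=f"NULL values for {field_name} exceed {threshold}",
        profile_name=profile_name,
        field_name=field_name,
        threshold=threshold,
        channel="Completeness",  # channel listed for DQ_PT2
    )


# Example: a rule on the PHONE column with a threshold of 40.
rule = null_count_rule("CUSTOMER_PROFILE", "PHONE", 40)
event = {"profile_name": "CUSTOMER_PROFILE", "field_name": "PHONE", "count": 41}
print(rule.matches(event))
```

The same shape applies to the other two templates, with the pattern count or the value count in place of the null count and with the Conformity or Value Count channel.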


APPENDIX B

Advanced Rules

Advanced rules do not have parameters. You can extend these rules once you are familiar with how they work.

The following table lists the predefined advanced rules that are available by default upon installing Proactive Monitoring for Data Quality:

Rule Name: DQ_P1 Number of occurrences of a pattern in the profiled column name TIER exceeds 2
Description: Checks if the number of occurrences of a pattern in the profiled column name TIER exceeds 2.
Properties:
- Topic: dq_profile_patterns
- Sources: Data Quality column pattern frequency
- Analytics: -
- Response: Data Quality Real-Time Alert Manager alert
- User to be alerted: dqmonitor

Rule Name: DQ_P2 Number of NULL values for the profiled column PHONE exceeds 40
Description: Checks if the number of NULL values for the profiled column PHONE exceeds 40.
Properties:
- Topic: dq_profile_details
- Sources: Data Quality column profiling statistics
- Analytics: -
- Response: Data Quality Real-Time Alert Manager alert
- User to be alerted: dqmonitor

Rule Name: DQ_P3 Number of occurrences of the value I in the profiled column STATUS exceeds 100
Description: Checks if the number of occurrences of the value I in the profiled column STATUS exceeds 100.
Properties:
- Topic: dq_profile_values
- Sources: Data Quality column data frequency
- Analytics: -
- Response: Data Quality Real-Time Alert Manager alert
- User to be alerted: dqmonitor


APPENDIX C

Source Services

The source services fetch data from the Data Quality profiling warehouse. This data eventually triggers the predefined rules.

The following table lists the predefined source services that are available by default upon installing Proactive Monitoring for Data Quality:

Source Service Name: Data Quality column profiling statistics
Description: Retrieve the profiling statistics of the profiled columns.
Properties:
- Type: SQL
- Topic: dq_profile_details
- Connected to: Data Quality Profile Warehouse
- Default: 10 minutes

Source Service Name: Data Quality column data frequency
Description: For each profiled column, retrieve the number of times a value occurs and its percentage.
Properties:
- Type: SQL
- Topic: dq_profile_values
- Connected to: Data Quality Profile Warehouse
- Default: 10 minutes

Source Service Name: Data Quality column pattern frequency
Description: For each profiled column, retrieve the number of patterns that occur in the column.
Properties:
- Type: SQL
- Topic: dq_profile_patterns
- Connected to: Data Quality Profile Warehouse
- Default: 10 minutes
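The role of the tstamp parameter in a SQL source can be sketched as a lower bound on the rows that a query returns. The Python example below is a rough illustration under stated assumptions: an in-memory SQLite database stands in for the profiling warehouse, and the table name profile_details and its columns are hypothetical, not the views created by dis_wh_ro.ddl.sql. Only rows profiled after tstamp are returned, which is why an early tstamp can flood RulePoint with old events.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory stand-in for the warehouse (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE profile_details ("
        "profile_name TEXT, column_name TEXT, "
        "null_count INTEGER, profiled_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO profile_details VALUES (?, ?, ?, ?)",
        [
            ("CUSTOMER", "PHONE", 12, "2014-08-01 09:00:00"),
            ("CUSTOMER", "PHONE", 45, "2014-09-15 10:30:00"),
        ],
    )
    return conn


def fetch_new_rows(conn: sqlite3.Connection, tstamp: str) -> list:
    """Return rows profiled strictly after tstamp, oldest first.

    Timestamps in yyyy-mm-dd hh:mm:ss form sort correctly as text,
    so a plain string comparison is enough for this sketch.
    """
    cursor = conn.execute(
        "SELECT profile_name, column_name, null_count, profiled_at "
        "FROM profile_details WHERE profiled_at > ? "
        "ORDER BY profiled_at",
        (tstamp,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    # Only the September row is newer than this tstamp.
    for row in fetch_new_rows(warehouse, "2014-09-01 00:00:00"):
        print(row)
```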


APPENDIX D

Predefined Responses

A response defines what happens when an event matches the rule condition. You can configure a response to function like an action.

You can configure responses to be sent to a single user or to groups of users through the Real-Time Alert Manager user interface.

The following table lists the predefined responses that are available by default upon installing Proactive Monitoring for Data Quality:

Response Name              Description
Data Quality RTAM alert    This response sends alerts to Real-Time Alert Manager.

