Creating the Foundation for Enterprise Information Management


About Global IDs

+ The Problem
+ Global IDs Focus
+ Software Functionality
+ Automation
+ Enterprise-Level Scalability
+ Global IDs Value Offering

Organizations have growing cost & complexity

Most large organizations have many thousands of databases that drive up complexity and cost.

• Complex data ecosystems
• Highly dynamic
• Limited traceability
• Systemic risk: hard to measure

[Figure: Data Landscapes]

Global IDs ©2015

Global IDs Focus

To help large organizations reduce the cost of managing enterprise information by using software that is

• Systematic
• Automated
• Scalable

and can manage complexity in enterprise data landscapes.

Software Functionality

1. Scan

2. Analyze

3. Map / Organize

4. Govern

Automation

Enterprise-Level Scalability

Metadata Discovery Tasks

Data Quality Tasks

Data Profiling Tasks

Data Mapping Tasks

Data Monitoring Tasks

Massively parallel distributed computing

Global IDs Value Offering

• Improve Transparency: Using Metadata Governance
• Improve Quality: Using Data Quality Audits
• Improve Traceability: Using Data Lineage Analysis
• Increase Trust in Data: Master Data Governance
• Increase Operational Efficiency
• Reduce Costs
• Increase Revenues
• Identify Redundancy: Graph Analysis of Duplicate Data
• Support Automation: Identify Manual Steps for Automation
• Run-time Data Validation & Cleansing: Accelerates Data Cleansing Upwards of 500%
• Customer 360 Analysis: Using Linked Data
• Customer-Product Predictive Analysis: Using Graph Databases

Trying out Global IDs on your own trial server

+ Introduction
+ Why download Global IDs trial server
+ How to get a trial server

Introduction

Try out Global IDs software on the Amazon Web Services platform without downloading it. You can now get hands-on experience with the product suite in the cloud.

The portal environment lets you scan and discover databases, profile the data in them, recognize and organize information, and map it appropriately. Global IDs provides a set of useful workflows that simplify your work.

Try out Global IDs software independently without downloading it.

Why try out Global IDs server

• Understand your data better by running it through the standard data management processes that the trial server offers
• Learn Data Management and Governance using Global IDs
• Get certified on the Global IDs platform
• Assess Global IDs for use in your organization

How to get a trial server

You can request the trial server from our website (www.globalids.com) by performing the following steps.

• Go to http://www.globalids.com/request-global-ids-server
• The Request Global IDs Server form appears. Fill in the form and click the Submit button.
• An email with details about the trial server is sent to you. This takes 5-10 minutes.
• Click the link in the email for accessing the trial server to browse the Data Transparency Portal.

[Screenshot: Form for getting access to your trial server]

What to do after you get a trial server

+ Add connections
+ Run workflows
+ Inspect metadata and profiling results
+ Outliers
+ Organize
+ Map

Working with connections

• Notice two connections already added
• Add connections in bulk from an Excel sheet
• Add connections manually

Working with already added connections

For the two connections already added, you can run any of the following workflows.

Discovery, Profiling and Recognition Workflow - This workflow lets you discover, profile, organize and map tables.

Discovery Workflow - This workflow discovers tables, their metadata and explicit relationships.

Add connections manually

• Click Discover menu > Connections section > Actions drop-down menu > Add Connection.
• You can connect to the PRODUCT_HUB database on the trial server.
• In the wizard, provide:
    • Connection name
    • System name - select the trial server name
    • Instance name - PRODUCT_HUB
    • Database type - Derby
    • Port - 1530
    • Username - sample & password - sample. Click Test Connection to test the connection.
    • Database Owner’s name, Database Administrator’s name & Data Steward’s name
    • Database category & Business entity
    • Password expiration date

The added connection shows up in the tabular view of the Connections section under the Discover menu when the table is refreshed. After the connection is added, the Discovery, Profiling and Recognition Workflow can be run to profile, organize and map the tables.
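The wizard fields for the manual connection can be pictured as a small connection record. The sketch below is only an illustration and not part of the Global IDs product: the class ConnectionSpec, its field names and the host trial-server.example.com are hypothetical, and it assumes the trial database is reachable through the standard Derby network client JDBC URL form jdbc:derby://host:port/database.

```python
from dataclasses import dataclass


@dataclass
class ConnectionSpec:
    """Hypothetical record mirroring the Add Connection wizard fields."""
    connection_name: str
    system_name: str      # host name of the trial server (placeholder below)
    instance_name: str    # database instance, e.g. PRODUCT_HUB
    database_type: str
    port: int
    username: str
    password: str

    def jdbc_url(self) -> str:
        """Build a Derby network client JDBC URL from the wizard values."""
        if self.database_type.lower() != "derby":
            raise ValueError("this sketch only covers Derby connections")
        return f"jdbc:derby://{self.system_name}:{self.port}/{self.instance_name}"


# Values taken from the slide, except the placeholder host and connection name.
spec = ConnectionSpec(
    connection_name="product-hub-trial",
    system_name="trial-server.example.com",
    instance_name="PRODUCT_HUB",
    database_type="Derby",
    port=1530,
    username="sample",
    password="sample",
)
url = spec.jdbc_url()
```

A URL of this shape is roughly what a Test Connection step would need in order to reach the database, given the host, port and instance entered in the wizard.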

Add connections in bulk from an Excel sheet

• Click Discover menu > Connections section > Actions drop-down menu > Add Connections in Bulk.
• In the wizard:
    • Select the connection file from the server system or the local system. A sample file is selected by default. Use this file.
    • Database User - sample & Database Password - sample.

The connections listed in the Excel file are created and displayed in the Connections section under the Discover menu when the table is refreshed.

The Discovery, Profiling and Recognition Workflow can be run on each of the added connections to scan, profile, classify and organize the data landscape.

Run Workflows

Menu | Workflow | Purpose
Discover | Add Connection and Create Transparency Workflow | This workflow connects to a database instance and automatically starts the discovery, profiling, classification and recognition processes.
Discover | Discovery, Profiling and Recognition Workflow | This workflow discovers and profiles database instances and also runs the classification and recognition process.
Discover | Discovery Workflow | This workflow discovers tables in a database, along with their metadata and documented relationships.
Discover | Add and Profile File Workflow | This workflow scans delimited and Excel files, discovers their metadata, loads their data into the database and then profiles them.
Profile | Run Profile Relationship Workflow | This workflow finds documented, undocumented and user-defined relationships between tables in databases.
Map | Find Database Clones Workflow | This workflow finds similar database instances in the landscape that have a 95% match.

Add Connection and Create Transparency Workflow

The Add Connection and Create Transparency Workflow performs the following activities:

• Creates connection to database
• Discovers tables, metadata & explicit relationships
• Generates Metadata Report
• Profiles columns, relationships & domains
• Generates Profile Report
• Classifies and organizes tables

Add Connection and Create Transparency Workflow (cont.)

• Click Discover menu > Connections section > Actions drop-down menu > Add Connection and Create Transparency Workflow.
• In the wizard:
    • Select the connection name.
    • System name - select the trial server name.
    • Instance name - PRODUCT_HUB
    • Database Type - Derby
    • Port - 1530
    • Username - sample & password - sample. Click Test Connection to test the connection.
    • Database Owner’s name, Database Administrator’s name & Data Steward’s name
    • Database category & Business entity
    • Password expiration date

Discovery, Profiling and Recognition Workflow

The Discovery, Profiling and Recognition Workflow performs the following activities:

• Discovers tables, metadata & explicit relationships
• Generates Metadata Report
• Profiles columns, relationships & domains
• Generates Profile Report
• Classifies and organizes tables

Pre-requisites:

• Connection to database

Discovery, Profiling and Recognition Workflow (cont.)

• Click Discover menu > Metadata section > Actions drop-down menu > Discovery, Profiling and Recognition Workflow.
• In the wizard:
    • Select the connection name.
    • Select the schemas on which you want to run the workflow.
    • Provide the user credentials. For the two already added connections, the credentials are provided by default.
    • Select the appropriate checkboxes for discovering tables, views and synonymous tables, and for finding the row count of tables from system tables.

Discovery Workflow

The Discovery Workflow performs the following activities:

• Discovers tables in databases
• Scans metadata of discovered tables
• Discovers documented/explicit relationships
• Generates Metadata Report

Pre-requisites:

• Connection to database

Discovery Workflow (cont.)

• Click Discover menu > Connections section > Actions drop-down menu > Add Connection and Create Transparency Workflow.
• In the wizard:
    • Select the connection name.
    • System name - select the trial server name.
    • Instance name - PRODUCT_HUB
    • Database Type - Derby
    • Port - 1530
    • Username - sample & password - sample. Click Test Connection to test the connection.
    • Database Owner’s name, Database Administrator’s name & Data Steward’s name
    • Database category & Business entity
    • Password expiration date

Track Workflow Progress

Click Account menu > My Actions sub-menu.

You can view:

• Running Tasks - This screen lists the currently running task groups submitted by the workflow.
• Running Workflows - This screen shows the currently running workflow.

[Screenshots: Running Tasks, Running Workflows]

Track Workflow Progress (cont.)

Click a running task group on the Running Tasks screen.

You can view:

• Running Tasks - This screen lists the currently running tasks in the task group.
• Summary - This screen shows a summary of the tasks that are running or have finished.
• All Tasks - This screen lists all the tasks along with their status.
• Scheduled Tasks - This screen shows the tasks that have been scheduled.
• Paused Tasks - This screen shows the tasks that have been paused.
• Failed Tasks - This screen lists the tasks that have failed.
• History - This screen displays the history of all the tasks in the workflow.

Inspect Metadata and Profiling Results

Results of Data Discovery:

Click Discover menu > Metadata section.

This screen lists all the schemas whose metadata has been discovered. Click a schema to view the following:

• Tables - Lists all tables in the schema along with their metadata
• Relationships - Shows the various documented relationships between tables in the schema
• Table Descriptions - Displays the documented descriptions of tables
• SQL Editor - Allows running SQL queries on the discovered tables

Drill-down is available for viewing metadata of columns:

• Metadata
• Data
• Relationships
• Indexes
• Descriptions

Inspect Metadata and Profiling Results (cont.)

Results of Column Profiling:

Click Profile menu > Profile section.

This screen lists all the schemas that have been column profiled. Click a schema to view the following:

• Profile Summary - Displays a summary of the profiling results for all tables in the schema whose metadata has been discovered
• Outliers - Lists various outliers detected while profiling
• Highlighted Outliers - Displays all outliers that have been highlighted
• Metadata - Displays metadata of columns of tables that have been profiled
• Indexes - Lists indexes in the schema
• Relationships - Lists the various Primary Key – Foreign Key relationships between tables in the schema (both documented and undocumented/inferred)
• Table Descriptions - Displays documented descriptions of tables
• SQL Editor - Allows running SQL queries on the tables

Drill-down is available for viewing profiling results of columns:

• Profile
• Outliers
• Highlighted Outliers
• Descriptions
• Profile History

Inspect Metadata and Profiling Results (cont.)

Results of Relationship Profiling:

Click Profile menu > Relationships section.

This screen lists all the schemas that have been profiled. Click a schema to view the following:

• Documented Relations - Lists all documented/explicit Primary Key – Foreign Key relationships between tables
• Inferred Relations - Lists all Primary Key – Foreign Key relationships between tables that have been detected through relationship profiling
• User Defined Relations - Displays all user-defined relationships between tables. Also allows creation of Primary Key – Foreign Key relationships

Relationship Analysis:

• Cardinality (Average, Maximum & Minimum)
• Orphan Values
• Childless Values
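The relationship analysis measures can be illustrated with a short sketch. The slides do not define them, so this assumes the usual meanings: an orphan value is a foreign key value with no matching primary key, a childless value is a primary key value that no child row references, and cardinality counts child rows per referenced parent key. The function name and the sample keys are hypothetical, and the product's actual computation may differ.

```python
from collections import Counter


def relationship_analysis(parent_keys, child_keys):
    """Summarize a Primary Key - Foreign Key relationship between two columns."""
    parents = set(parent_keys)
    child_counts = Counter(child_keys)   # child rows per referenced key
    referenced = set(child_counts)

    orphans = sorted(referenced - parents)      # child values without a parent
    childless = sorted(parents - referenced)    # parent values without children

    # Cardinality is computed over parents that have at least one child row.
    per_parent = [child_counts[k] for k in parents & referenced]
    cardinality = None
    if per_parent:
        cardinality = {
            "average": sum(per_parent) / len(per_parent),
            "maximum": max(per_parent),
            "minimum": min(per_parent),
        }
    return {"orphans": orphans, "childless": childless, "cardinality": cardinality}


result = relationship_analysis(parent_keys=[1, 2, 3, 4], child_keys=[1, 1, 2, 5])
```

Here key 5 is an orphan, keys 3 and 4 are childless, and the referenced parents have between one and two child rows. Counting childless parents as zero when averaging would be an equally reasonable choice and would lower the average.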

Reports

• Metadata Report - Discover menu > Reports
• Profile Report - Profile menu > Reports
• Domain Report - Organize menu > Domain Reports

Outliers

After profiling, the software highlights certain values as outliers or possible errors.

Click Profile menu > Outliers section.

• Datatype Outlier - Values with the least occurring data type (column level) get marked as outliers.
• Character Distribution Outlier - Values with improper/unusual character distribution (column level) get marked as outliers.
• Frequency Outlier - Values occurring once where other values in the column occur multiple times, and vice versa, get marked as outliers.
• Length Outlier - Values of the least occurring length distribution in a column get marked as outliers.
• Pattern Outlier - Values with the least occurring regex pattern in a column get marked as outliers.
• Value Outlier - Calculated on the basis of Six Sigma methodology. Values outside the +6 and -6 sigma range get marked as outliers.
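Three of the outlier types can be approximated in a few lines. This is a minimal sketch of the ideas on the slide, not the product's implementation: the function names, the letter/digit pattern encoding and the sample data are assumptions, and the value outlier check uses the population standard deviation of the column.

```python
import re
import statistics
from collections import Counter


def value_outliers(values, k=6.0):
    """Numeric values more than k standard deviations away from the column mean."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return []  # a constant column has no value outliers
    return [v for v in values if abs(v - mean) > k * sigma]


def shape(value):
    """Coarse pattern of a string: every letter becomes 'A', every digit '9'."""
    return re.sub(r"[0-9]", "9", re.sub(r"[A-Za-z]", "A", value))


def rarest_key_outliers(values, key):
    """Values whose key (length, pattern, ...) is the least frequent in the column."""
    counts = Counter(key(v) for v in values)
    if len(counts) < 2:
        return []  # only one length or pattern present, nothing stands out
    rarest = min(counts.values())
    return [v for v in values if counts[key(v)] == rarest]


amounts = [10.0] * 99 + [1000.0]
phones = ["555-1234", "555-9876", "555-0000", "5551234"]

value_hits = value_outliers(amounts)                  # value outlier
length_hits = rarest_key_outliers(phones, key=len)    # length outlier
pattern_hits = rarest_key_outliers(phones, key=shape) # pattern outlier
```

With a population standard deviation, a single value can lie at most sqrt(n - 1) sigma from the mean, so a 6 sigma rule only fires on columns with a few dozen rows or more. When several lengths or patterns tie for least frequent, this sketch flags all of them.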

Domains and Semantic Objects

Domains are business-specific concepts represented in the data landscape. They are associated with reference values or patterns.

Types of Domains:

• Global
• Business
• Auto Generated

View domains from Organize menu > Recognition Domains.

The product organizes the domains into groupings called semantic objects.

View semantic objects from Map menu > Semantic Objects.

[Figure: the semantic object CUSTOMER groups the domains ID, First Name, Last Name, Phone Number, City and Country]

Interpret results of domain recognition

The software automatically creates a “bucket” for each domain and places each column that it recognizes into the appropriate domain.

• Click Organize menu > Recognize section > Domain Profiling section to view the total number of columns domain profiled in the schema.
• Click the schema name to view:
    • Domain Profile Status - Shows a summary of the profiling activity
    • Domains - Lists tables in the schema and shows the details of columns recognized during the domain profiling activity
    • Domains (table level) - Shows domain profiling details for the columns of a table

[Screenshots: Domains Tab, Domains Tab (table level)]
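Placing columns into domain buckets can be sketched as follows. Since domains are described as being associated with reference values or patterns, the sketch treats a column as recognized when a large enough share of its values matches a domain. The Domain class, the recognize function, the 0.8 threshold and the sample columns are all hypothetical and stand in for whatever rules the product actually applies.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Domain:
    """Hypothetical domain defined by reference values and/or a regex pattern."""
    name: str
    pattern: Optional[str] = None
    reference_values: frozenset = frozenset()

    def matches(self, value: str) -> bool:
        if value in self.reference_values:
            return True
        return bool(self.pattern and re.fullmatch(self.pattern, value))


def recognize(columns, domains, threshold=0.8):
    """Put each column into every domain bucket that enough of its values match."""
    buckets = {d.name: [] for d in domains}
    for column_name, values in columns.items():
        if not values:
            continue
        for d in domains:
            match_rate = sum(d.matches(v) for v in values) / len(values)
            if match_rate >= threshold:
                buckets[d.name].append(column_name)
    return buckets


domains = [
    Domain("Country", reference_values=frozenset({"US", "IN", "DE"})),
    Domain("Phone Number", pattern=r"\d{3}-\d{3}-\d{4}"),
]
columns = {
    "CTRY": ["US", "IN", "US"],
    "PHONE": ["609-555-0100", "212-555-0199"],
    "NOTES": ["hello", "US"],
}
buckets = recognize(columns, domains)
```

The NOTES column stays out of every bucket because only half of its values match the Country reference list, which is below the assumed threshold.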

Interpret results of object classification

The software automatically locates similarities between semantic objects and tables. These similar tables get classified with objects under 4 classification heads:

• Location
• Product
• People
• Organization

Similarities are detected in two ways:

• Domain based - based on data similarity between domains of objects and columns of tables
• Name based - based on similarity between domain and column names
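Name-based similarity can be illustrated with Python's standard difflib module. This is a sketch under stated assumptions rather than the product's algorithm: the normalization step, the 0.8 cutoff, the scoring rule (share of an object's domains that have a similarly named column) and the sample table are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name and drop spaces, underscores and other punctuation."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def name_based_score(object_domains, table_columns, cutoff=0.8):
    """Share of the object's domains that have a similarly named table column."""
    if not object_domains:
        return 0.0
    matched = sum(
        1
        for domain in object_domains
        if any(name_similarity(domain, column) >= cutoff for column in table_columns)
    )
    return matched / len(object_domains)


customer_domains = ["ID", "First Name", "Last Name", "Phone Number", "City", "Country"]
table_columns = ["ID", "FIRST_NAME", "LAST_NAME", "CITY", "ORDER_TOTAL"]
score = name_based_score(customer_domains, table_columns)
```

In this example four of the six CUSTOMER domains find a matching column name. A domain-based check would compare the column data itself against each domain's reference values or patterns instead of the names.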

Review results of automatic domain recognition

Results can be reviewed in two ways:

• Map menu > Semantic Mappings section
• Map menu > Semantic Objects section > click a semantic object from the list > Mapping Details section

[Screenshot: Mapping Details]

Create Semantic Objects

• Click Map menu > Semantic Objects section > Actions drop-down menu > Add Object.
• In the wizard, provide the following:
    • Scope - select Global or Business
    • Name - name of the semantic object
    • Logical name
    • Description
    • Notes
    • URL
    • Domains - select the domains to include in the semantic object

The added semantic object is listed under the Semantic Objects section when the tabular view is refreshed.

Map Semantic Objects

You can map domains of objects to columns of tables:

• manually
• from maps suggested by the software

From suggested mappings:

• Click Map menu > Semantic Objects section > click a semantic object from the list > Suggested Mappings section.
• Select a row at the top.
• The Suggestions tab at the bottom shows the various domains of the object on the left. On the right, drop-down lists are provided for selecting the appropriate columns of the selected table.
• Notice that in some drop-down lists columns are already selected. Those are the mappings suggested by the software through the domain recognition activity.
• Select the checkboxes next to all the columns you want to map. Select the appropriate columns from the drop-down lists. Click the Save Mappings button.
• The mappings are listed in the Mapping Details section.

Manually:

Objects can also be mapped from Mapping Details section > Map button.

Contact Information

For more info, please contact us at:

Email: [email protected] (globalids.com)
Phone: 609-683-1066

Thank you