19
M3UA Data Sharing and Indexing Michelle Ayers March 2018

M3UA Data Sharing and Indexing

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

M3UA Data Sharing and Indexing

Michelle AyersMarch 2018

…to Big Iron to Big Data

2

Trusted Industry Leadership

3

500+Experienced & Talented

Data Professionals

>6,000Customers

196850 Years of Market Leadership

& Award-Winning Customer Support

84of Fortune 100 are Customers

3xRevenue GrowthIn Last 12 Months

The global leader in Big Iron to Big Data

Driving Value for Customers across Multiple Domains

▪ Reduce computing costs on legacy data systems including mainframes & IBM i Power Systems

▪ Improve availability, reliability and integrity

▪ Meet growing security and compliance requirements

Data Liberation, Integration & Integrity

▪ Unlock mainframe and IBM i data for machine learning and advanced analytics

▪ Access, transform, integrate & deliver data to analytic environments

▪ Ensure data quality, lineage, security ▪ Enable data consumption on premise

and in the cloud

4

Data Infrastructure Optimization & Availability

Big Iron to Big Data

Data Sharing: Demands of an Application Database

Many demands on central database

All users/applications using same

data therefore data needs to be

consistent

Issues:

Performance – too much access of same data

Users running inefficient queries

Increased data security risks

Can lead to productivity issues for

core business processes

Impact of indexing…

5

Application Database

Apps

Website

ERP Application

Business Intelligence

User Queries

Interfaces

Impact of Indexing

What is an index?

Why are indexes important?

Why might indexes cause issues?

– Index maintenance

– Too many indexes

– Need to understand how to use them efficiently

Understanding indexes…

6

Terminology - Table

Table (Physical File)

Column (Field)

Row (Record)

dd

7

OOHEADCustomer Order

Header

Rows

Columns

Terminology - Index

An index orders, or sequences, the data in a table using a ‘key’ comprising one or more columns.

Unique index – special type of indexing where the ‘key columns’ ensure that no rows can have duplicate entries on the ‘key column’ values.

Unique indexes help ensure data integrity – keeping data consistent and reliable

8

Unique Index – M3 Example

M3 uses general naming convention where the unique index for a table has same name as the table with ‘00’ appended

OOHEAD Customer Order Header has unique index OOHEAD00

Unique index OOHEAD00 uses columns OACONO (Company) and OAORNO (Order Number) for unique key

Note that OODIVI (Division) column is not included in the unique key

9

Unique Key Columns

Querying Data

Suppose you want to query Customer Order Header for a particular order number, or say list all open orders for a given customer… …how would you write such a query?

Perhaps SQL: SELECT * FROM OOHEAD WHERE OAORNO = ‘8527536’

OR SELECT * FROM OOHEAD WHERE OACUNO = ‘AA56987’ AND OAORSL < ‘77’

How will the database locate the required data?

10

Indexes on Table OOHEAD

M3 Customer Order table OOHEAD is supplied with several indexes:

– OOHEAD00 Unique OACONO, OAORNO

– OOHEAD10 Non unique OACONO, OACUNO, OAORSL, OAORNO

– OOHEAD20 Non unique OACONO, OACUNO, OARLDT, OAORNO

– etc

Most of the supplied M3 indexes contain aaCONO (Company Number) as the first column in the index key… …when no suitable index exists the database system builds a new temporary index to order the data… …building indexes causes overhead…

Consider SQL: SELECT * FROM OOHEAD WHERE OACONO = 100 AND OAORNO = ‘852736’

OR SELECT * FROM OOHEAD WHERE OACONO = 100 AND OACUNO = ‘AA56987’ AND OAORSL < ‘77’

11

Finding Related Tables – Another Use for Indexes

Unique indexes define the most important columns on tables

These columns are good candidate columns for joining one table to another

Unique key columns on one table can be used as foreign key(s)

Consider joining OOHEAD Customer Order Header and OOLINE Customer Order Line tables – use OACONO = OBCONO and OAORNO = OBORNO

Consider OCUSMA Customer Master, unique key OCUSMA00 on key columns OKCONO and OKCUNO. From earlier slide OOHEAD Customer Order Header contains column OACUNO which can be used as foreign key to join to OCUSMA

Understanding indexes can also be useful technique to understanding database relations

12

Data Sharing: Impacts for M3 Application Database

M3 database with supplied indexes

for M3 application

BI and user queries often want

different sequence

Also answer complex questions using

say nested queries and/or aggregated data

Requirements to restrict data access

Need to use latest data

Need to ensure data consistency

13

Application Database

Apps

Website

ERP Application

Business Intelligence

User Queries

Interfaces

Data Sharing – Batch Approach

Offloaded data warehouse using overnight batch load

Most tables are fully reloaded every night

Limitations:

– Data is out of date quickly

– Every day the database tables grow a little bit, therefore each night the load job takes a little bit longer…

– Cloud providers often charge for volumes transmitted

14

Application DatabaseERP Application

Business Intelligence

User Queries

Data Warehouse

Data Share – Real Time (CDC Change Data Capture) Approach

Alternative approach – ‘CDC’ Change Data Capture

Customer example, application interface and BI/Reporting

15

Two-directional replication

IBM System i DB2

Manufacturing ERP

Introduced automated manufacturing inspection tracking and also faster

Data Warehouse Load.

MS SQL Server

Real time CDC replication

with transformation

Production line information

is transformed and sent to

ERP

Production System

APPLICATION INTEGRATION PROJECT

Query reports

MS SQL Server

Replaced batch reload of data

warehouse with CDC continual feed

of incremental extracts of last

changed row, and deleted

transactions.

Third Party Inspection

Application

Data Sharing Case Study

Alternative Approach – ‘CDC’ Change Data Capture

Customer Story – Red Wing Shoes (USA)

16

OFFLOAD QUERY & REPORTING PROJECT

Manufacturing

MS SQL Server

Query reports

Data Warehouse load

Real time CDC replication

with transformation

Reduce CPU and I/O overhead

on production system

improve user response times

Many cost effective tools available

on MS SQL server

platform for query reports

Data is already partially

‘scrubbed’ and available for

loading data warehouses and

data marts without performance

impact on production system

Production System Offload Query System

IBM System i DB2

Infor M3

Data Sharing ‘CDC’ Change Data Capture Advantages

Target data is near real time – always up-to-date

Offloading user queries, reporting and BI from production environment

Create customer indexes suitable for reporting requirements on target tables

Maintaining data integrity and consistency across processes

Sending actual changes not whole tables each day

Optionally

– Transform data for easier reporting, eg date formats

– OR masking sensitive column data

– Use different schema, eg more user friendly column names in target tables

– Omit non-required columns – fewer bytes to send and security compliance

17

Thank you!

18

twitter.com/VSI_Power

linkedin.com/companies/6542

blog.visionsolutions.com

visionsolutions.com

twitter.com/Syncsort

linkedin.com/company/12576

blog.syncsort.com

syncsort.com

Michelle Ayers – [email protected]

Leading the next technology [r]evolution:

Big Iron to Big Data

Big Iron to Big Data: a fast-growing market segment composed of solutions that optimize traditional data systems and deliver mission-critical data from these systems to next-generation analytic environments.

19