Copyright © 2016, SAS Institute Inc. All rights reserved. DATA MANAGEMENT USER GROUP, MANCHESTER, 9TH MARCH 2016

Data Management User Group PowerPoint Presentation

Page 1: Data Management User Group PowerPoint Presentation

DATA MANAGEMENT USER GROUP

MANCHESTER 9TH MARCH 2016

Page 2: Data Management User Group PowerPoint Presentation

AGENDA: SAS DATA MANAGEMENT USER GROUP

9th March 2016, SAS Manchester

Time              Topic                    Speaker
9.00-9.30am       Coffee and networking
9.30-9.35am       Introductions            All
9.35-10.15am      SAS Data Integration     Janice Newell
10.15-11.00am     SAS Data Quality         Rajeeve Narula
11.00am           Coffee Break
11.20-11.40am     SAS Data Governance      Dave Smith
11.40am-12.20pm   SAS Data Federation      Dave Smith
12.20-12.30pm     Wrap up                  Sophie Ainley
12.30-2.00pm      Lunch and networking     All

Page 3: Data Management User Group PowerPoint Presentation

THE DATA PROCESS: A DAY IN THE LIFE

[Diagram: the stages of the data process (Data Access, Data Assessment, Data Cleansing, Data Transformation, Data Monitoring, Data Governance), each labelled with the tools that support it: DM Studio, SAS DI, Fed Server, BDN and Lineage.]

Page 4: Data Management User Group PowerPoint Presentation

DATA INTEGRATION

Page 5: Data Management User Group PowerPoint Presentation

Copyright © 2013, SAS Institute Inc. All rights reserved.

DATA INTEGRATION

[Diagram labels: Transactional Systems; External Data Feeds; Spreadsheets; Decision Making; Regulatory Data Reporting Requirements. Processing steps: Reshape, Adjust Time Dim, Clean, Analyse, Standardise.]

How did you get there?

Page 6: Data Management User Group PowerPoint Presentation

SAS DATA INTEGRATION SERVER

• Generates code through GUI
• SAS Code
• Database Code
• Library of transformations and functions
• Manages Metadata
• Records steps taken to build output data
• Able to trace any table or column forwards and backwards through the process
• Much more efficient than hand coding
• Usually 50% faster through clarity, re-usability, coding speed
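
Tracing a table forwards and backwards, as described in the bullets above, can be pictured as a walk over a dependency graph. The following is a minimal Python sketch of that idea only; it is not how SAS Data Integration Server stores or queries its metadata, and the table names and the `EDGES` list are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source table, target table).
EDGES = [
    ("crm.customers", "stage.customers_clean"),
    ("erp.orders", "stage.orders_clean"),
    ("stage.customers_clean", "mart.customer_orders"),
    ("stage.orders_clean", "mart.customer_orders"),
    ("mart.customer_orders", "report.monthly_sales"),
]


def build_graph(edges):
    """Index edges in both directions so either kind of trace is one lookup."""
    downstream = defaultdict(set)
    upstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return downstream, upstream


def trace(start, neighbours):
    """Breadth-first walk returning every table reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


DOWNSTREAM, UPSTREAM = build_graph(EDGES)

# Forwards: what is affected if erp.orders changes?
impact = trace("erp.orders", DOWNSTREAM)
# Backwards: what feeds report.monthly_sales?
sources = trace("report.monthly_sales", UPSTREAM)
```

Keeping both a downstream and an upstream index means impact analysis and source tracing use the same walk, just over a different index.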

Page 7: Data Management User Group PowerPoint Presentation

ANALYTICAL PROCESS: DATA REQUIREMENT

[Diagram: an analytical process (Source Data, Analysis Ready Data, Visualisation, Modelling) shown with a Productionised Process (Source Data, Analysis Ready Data, Dashboards, Deployed Models). Surrounding labels: Granular Security, Repeatability, Assured Quality, Governance and Clarity. Stakeholders: Analysts, Data Managers, Business, IT Support.]

Page 8: Data Management User Group PowerPoint Presentation

ANALYTICAL DATA INTEGRATION

• Preparing data for analysis
• Can pass data mining metadata to Enterprise Miner (Target etc.)
• Embedding analytical procedures
• Summarisation, especially medians (on all data)
• Time series preparation
• Multi-row data operations
• Creating correlation indexes
• Scoring Data
• Rapid model deployment
• Including managing in-database scoring
• Model monitoring data creation

Page 9: Data Management User Group PowerPoint Presentation

DI DEMO

Page 10: Data Management User Group PowerPoint Presentation

DEMO: DAY IN THE LIFE OF DI ANALYST

Define data
• SAS data
• External files

Join tables
• Create new fields
• Map data

Test job
• Check errors
• Control order

Impact analysis
• Tables
• Fields

Page 11: Data Management User Group PowerPoint Presentation

ANALYTICAL DATA MANAGEMENT: DETAIL

What SAS DMA provides, with the corresponding benefit under each item:

• Development framework to create SAS job flows, including a documentation framework
  Clear evidence of process and development ownership
• Inbuilt versioning framework
  Historical traceability of DI changes
• Inbuilt custom transformation framework to provide re-use of complex processing
  Simplify DI flows and speed up development
• Metadata impact analysis and search facility
  Importance/usage of data items within a SAS data flow
• Deployment of SAS data flows to a scheduling tool
  Integration with production processes
• SAS analytical modelling code integration (uses Enterprise Miner)
  Support predictive model factory concept
• Data quality integration framework: bring in DQ processing (uses DMA)
  Ensure trust in results, build defensive process controls
• Data governance framework: share lineage through a browser (DMA 9.4M2)
  Build business driven definitions of data items

Page 12: Data Management User Group PowerPoint Presentation

DATA QUALITY

Page 13: Data Management User Group PowerPoint Presentation

THE DATA PROCESS: A DAY IN THE LIFE

[Diagram: the data quality stages Data Access, Data Assessment, Data Cleansing and Data Monitoring, each labelled DM Studio. Further labels: Data Connection, Profiling, Business Rules, Standardise, Dashboard, Data Job.]

Page 14: Data Management User Group PowerPoint Presentation

DATA GOVERNANCE

Page 15: Data Management User Group PowerPoint Presentation

Financial organisation

WHY GOVERN DATA?

• Regulation

• Risk

• Efficiency

• Opportunity

Page 16: Data Management User Group PowerPoint Presentation

CHALLENGE: MAP EVERYTHING

• Customers are under increasing pressure to be able to link data in disparate systems at a logical level, that is, to show how metadata is connected
• At the same time, with the advent of Big Data systems and the concept of the “Data Lake”, it is ever more important, from a practical, user-driven point of view, to have a system that tells data users where the data resides

[Diagram: a Business Term and a Data Item separated by question marks, captioned “Where in the Lake is my data?”]

Page 17: Data Management User Group PowerPoint Presentation

[Diagram: a data landscape linking Mainframe Data via COBOL, ETL, Data Quality, three Databases, a Logical Data Model, a Physical Data Model, and Analytics and Reporting.]

Page 18: Data Management User Group PowerPoint Presentation

TYPICAL REQUIREMENTS

Metadata Lineage
• Automated collection of metadata
• Services to manage the metadata
• Maintain relationships
• Provide context
• Metadata analysis

Data Monitoring
• Users specify data quality controls/checks
• Proactively monitor data
• Validate data
• Enforce policy
• Alerts

Business Glossary
• Search / Discover data & metadata
• Business terms & technical data attributes
• Ownership, Structure, Usage
• Context
• Secured, governed, and workflow enabled

Consensus, Collaboration, Transparency

Page 19: Data Management User Group PowerPoint Presentation

METADATA LINEAGE

SAS RELATIONSHIP SERVICE AND LINEAGE VIEWER

Page 20: Data Management User Group PowerPoint Presentation

SAS RELATIONSHIP (LINEAGE) REPOSITORY

SAS Relationship Repository: one repository to store metadata from multiple environments.

Page 21: Data Management User Group PowerPoint Presentation

A ROBUST SET OF SERVICES TO MANAGE THE REPOSITORY

[Diagram: Metabridge Loader, Relationship Loader and REST Services on one side of the Relationship Repository; Relationship Reporter, Lineage Viewer and REST Services on the other.]

• Automated metadata collection
• Easy to access
• Many ways to analyze

Page 22: Data Management User Group PowerPoint Presentation

RESULT: DATASTAGE ETL JOB

Page 23: Data Management User Group PowerPoint Presentation

CLEAR GOVERNANCE AND OWNERSHIP

SAS BUSINESS DATA NETWORK

Page 24: Data Management User Group PowerPoint Presentation

DATA GOVERNANCE: PEOPLE, PROCESS, TECHNOLOGY

[Diagram: the Business Data Network, with the labels Business User, Business Term, Technical User, Rule, Data Stewards and Alerts.]

Page 25: Data Management User Group PowerPoint Presentation

Page 26: Data Management User Group PowerPoint Presentation

DEMO

Page 27: Data Management User Group PowerPoint Presentation

SAS FEDERATION SERVER

Page 28: Data Management User Group PowerPoint Presentation

DATA FEDERATION: WHAT IS IT?

[Diagram: Federation Server sits between Source 1, Source 2, Source 3 and Applications.]

Federation Server features shown:
• Federated Views
• Caching views
• Row and Column Access Control
• Logging
• Web Administration
• Scheduling

Page 29: Data Management User Group PowerPoint Presentation

DATA FEDERATION: ENABLING COLLABORATION

[Diagram: Organisation 1, Organisation 2 and Organisation 3 connected to a Collaboration Environment through a secure data filter.]

Page 30: Data Management User Group PowerPoint Presentation

DATA FEDERATION: MASKING SENSITIVE INFORMATION

[Diagram: Data Masking applied between the Data Lake and the Analysis Environment.]

Page 31: Data Management User Group PowerPoint Presentation

DATA FEDERATION: AUDITING ANALYTICAL USAGE

[Diagram: the EDW and the Analysis Environment, with Logging, Notifiable Queries and Workflow.]

Page 32: Data Management User Group PowerPoint Presentation

SAS FEDERATION SERVER

WHAT’S NEW IN 4.2?

Page 33: Data Management User Group PowerPoint Presentation

WHAT’S NEW? SUMMARY

• SAS Metadata Server and Web Infrastructure Platform (WIP) integration
• SAS Metadata Server replaces DataFlux Authentication Server for authentication and persistence of users, groups, logins (for example, personal, group, and shared) and domains

Page 34: Data Management User Group PowerPoint Presentation

WHAT’S NEW? SUMMARY

• Read/Write access to Hadoop (HIVE) using the new SAS Federation Server Driver for Apache Hive
• Access to SAS data sets secured with metadata bound libraries
• Access to shared data sources across multiple SAS Federation Servers using a new Federation Server Driver
• Enhanced data masking and encryption support

Page 35: Data Management User Group PowerPoint Presentation

WHAT’S NEW? SUMMARY

• Embedded data quality and cleansing functions in data views

• Support for SAS DS2

• Cache enhancements that include in-memory data cache

• A new migration guide is available for SAS Federation Server 4.2.

• Proc ASExport

Page 36: Data Management User Group PowerPoint Presentation

USES SAS METADATA SERVER

SAS Metadata Server replaces Authentication Server for authentication and other permission-based functions

• SAS Metadata Server provides
  • access for user and group objects
  • other permission-based functions such as shared logins and trusted users.

Screenshot callout: this refresh icon can be used to show newly created Authentication Domains

Page 37: Data Management User Group PowerPoint Presentation

ENHANCED DATA MASKING

Enhanced data masking and encryption support

• New data masking features include TRANC, which transliterates characters from the input string to characters in the output string.
  • Transliterate: to change (letters, words, etc.) into corresponding characters of another alphabet or language
• A series of random data masking rules are also available.

[Screenshot: the current set of available Data Masking Functions]
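
The character-for-character transliteration described above can be illustrated with a simple substitution table. This is a Python sketch of the general technique only, not the TRANC function itself: the slides do not show TRANC's arguments, so the `SOURCE` and `TARGET` alphabets and the `mask` helper are assumptions made for illustration.

```python
import string

# Hypothetical substitution alphabets; the real TRANC arguments are not
# shown in the slides, so these are assumptions for illustration only.
SOURCE = string.ascii_lowercase + string.ascii_uppercase + string.digits
TARGET = (
    "qwertyuiopasdfghjklzxcvbnm"   # replaces a-z
    "QWERTYUIOPASDFGHJKLZXCVBNM"   # replaces A-Z
    "9876543210"                   # replaces 0-9
)

# Maps each source character to the target character at the same position.
MASK_TABLE = str.maketrans(SOURCE, TARGET)


def mask(value: str) -> str:
    """Replace each character with its counterpart; others pass through."""
    return value.translate(MASK_TABLE)


masked = mask("Jane Smith, 0161-555-0100")
```

A fixed one-to-one substitution keeps the length and shape of the value and is reversible by anyone who knows the table, which is one reason a random masking rule may be preferred for highly sensitive columns.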

Page 38: Data Management User Group PowerPoint Presentation

ENHANCED DATA MASKING

• TRANC, which transliterates characters
• RANDOM rules are also available

[Screenshots: the current set of available Data Masking Functions; example of masking a character column; example of masking a numeric column]

Page 39: Data Management User Group PowerPoint Presentation

CACHE ENHANCEMENTS

Cache enhancements that include cache refresh for data held in memory

• Federation Server now has the capability of refreshing cached data, including MDS, after a server restart.
• In previous releases, cached data that was held in memory was deleted if the server was restarted or shut down.

We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE

Page 40: Data Management User Group PowerPoint Presentation

CACHE ENHANCEMENTS: FED. SERVER 4.2

We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE

After a Fed. Server restart the views are re-run in the background
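
The restart behaviour described above (the view definitions survive, the in-memory data is lost, and the views are then re-run in the background) can be sketched as follows. This is a toy Python model under stated assumptions, not Federation Server's implementation; `ViewCache`, `slow_sales_view` and the sample rows are hypothetical.

```python
import threading


class ViewCache:
    """In-memory cache of view results that repopulates itself after a restart."""

    def __init__(self, views):
        # `views` maps a view name to a zero-argument function that runs the
        # underlying query. The definitions survive a restart; the data does not.
        self._views = dict(views)
        self._data = {}
        self._lock = threading.Lock()

    def get(self, name):
        """Return cached rows, running the view on a cache miss."""
        with self._lock:
            if name in self._data:
                return self._data[name]
        rows = self._views[name]()
        with self._lock:
            self._data[name] = rows
        return rows

    def restart(self):
        """Simulate a server restart: drop memory, then refresh in the background."""
        with self._lock:
            self._data.clear()
        worker = threading.Thread(target=self._refresh_all, daemon=True)
        worker.start()
        return worker

    def _refresh_all(self):
        for name, run_view in self._views.items():
            rows = run_view()
            with self._lock:
                self._data[name] = rows

    def is_cached(self, name):
        with self._lock:
            return name in self._data


calls = []


def slow_sales_view():
    # Stand-in for an expensive federated query.
    calls.append("sales")
    return [("north", 120), ("south", 95)]


cache = ViewCache({"sales": slow_sales_view})
first = cache.get("sales")          # runs the view
second = cache.get("sales")         # served from memory
cache.restart().join()              # wait for the background refresh
after_restart = cache.is_cached("sales")
```

Refreshing on a background thread means the first query after a restart is not blocked behind every cached view being rebuilt.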

Page 41: Data Management User Group PowerPoint Presentation

EMBEDDED DATA QUALITY: FED. SERVER 4.2

Data Quality function      Example
Parsing                    Mr. Roy G Biv Jr
Extraction                 Blue mens long-sleeved button-down collar denim shirt
Pattern Analysis           999-999-9999
Identification Analysis    John Smith = Name / SAS = Organization
Gender Analysis            Jane Smith = F - Sam Adams = M
Standardization, Casing    919.6778000 = (919) 677-8000
Matching                   John Smith / J. Smith / Mr. Jon Smith

Where the DQ functions live…
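
Two of the examples above, standardization and pattern analysis, can be imitated with a few lines of Python. This is a rough sketch of the idea only: the SAS Quality Knowledge Base uses locale-specific rule definitions, whereas the `standardize_phone` and `value_pattern` helpers below are hypothetical and rely on plain regular expressions.

```python
import re


def standardize_phone(raw: str) -> str:
    """Format a 10-digit North American number as (AAA) EEE-NNNN.

    Returns the input unchanged when it does not contain exactly 10 digits,
    so unexpected values are left for review rather than guessed.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        return raw
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def value_pattern(raw: str) -> str:
    """Pattern analysis: reduce a value to its shape (9 = digit, A = letter)."""
    shape = re.sub(r"[0-9]", "9", raw)
    return re.sub(r"[A-Za-z]", "A", shape)


standardized = standardize_phone("919.6778000")
pattern = value_pattern("919-677-8000")
```

Here `standardized` is `(919) 677-8000` and `pattern` is `999-999-9999`, matching the Standardization and Pattern Analysis examples on the slide.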

Page 42: Data Management User Group PowerPoint Presentation

EMBEDDED DATA QUALITY

Embedded data quality and cleansing functions in data views

• Implemented using SAS Quality Knowledge Base (QKB) with FedSQL and DS2.
• The data quality methods use data quality rules from the SAS QKB in order to cleanse data.

[Screenshot: the ‘standardized’ primary_state_code]

Page 43: Data Management User Group PowerPoint Presentation

EMBEDDED DATA QUALITY: FED. SERVER 4.2

Page 44: Data Management User Group PowerPoint Presentation

DATA STEP 2 (DS2) LANGUAGE

• SAS Federation Server now supports the DATA Step 2 (DS2) language.
  • includes additional data types
  • ANSI SQL types
  • programming structure elements
  • user-defined methods and packages.
• Processing gets automatically pushed down
  • if Code Accelerator is present in the corresponding data platform
  • if DS2 code conforms to a ‘pushable’ format (e.g. threads defined, etc.)

To invoke DS2, you must configure a DSN that uses the DS2 dialect

Page 45: Data Management User Group PowerPoint Presentation

DATA STEP 2 (DS2) LANGUAGE: FED. SERVER 4.2

[Screenshot: DS2 equivalent]

Actually, our DQ functions are DS2 methods invoked from SQL

• Customers can write any DS2 code with if/then/else logic, iterating over column data and producing programmatic results
• This integrates nicely with SQL and is a very useful way to use DS2
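
The pattern of a user-written method with if/then/else logic being invoked from SQL can be shown by analogy. In the sketch below Python and SQLite stand in for DS2 and FedSQL; this is not Federation Server syntax, and the `credit_band` function and `accounts` table are hypothetical.

```python
import sqlite3


def credit_band(balance):
    """Row-level if/then/else logic of the kind a user-defined method would hold."""
    if balance is None:
        return "UNKNOWN"
    if balance < 0:
        return "OVERDRAWN"
    if balance < 1000:
        return "STANDARD"
    return "PREMIUM"


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it by name.
conn.create_function("credit_band", 1, credit_band)

conn.execute("CREATE TABLE accounts (id INTEGER, balance REAL)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [(1, -50.0), (2, 250.0), (3, 4200.0), (4, None)],
)

# The function is evaluated once per row, like any built-in SQL function.
rows = conn.execute(
    "SELECT id, credit_band(balance) FROM accounts ORDER BY id"
).fetchall()
conn.close()
```

The query author only sees a function name; the procedural logic stays in one place and can be reused across views.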

Page 46: Data Management User Group PowerPoint Presentation

READ/WRITE ACCESS TO HADOOP (HIVE)

Read/Write access to Hadoop (HIVE) using the SAS Federation Server Driver for Apache Hive

• The Driver for Hive
  • uses FedSQL and also provides limited support for HiveQL.
  • supports multiple versions of Hadoop.
  • can use Kerberos
  • does not support Write operations such as insert, update, and delete

Page 47: Data Management User Group PowerPoint Presentation

READ/WRITE ACCESS TO HADOOP (HIVE): FED. SERVER 4.2

Access Hadoop using SAS Studio to Federation Server

Create a table in Hadoop

The configuration of the Hadoop Data Service using the native Apache HIVE driver

Page 48: Data Management User Group PowerPoint Presentation

QUESTIONS?

Page 49: Data Management User Group PowerPoint Presentation

CUSTOMER LOYALTY UK USER GROUPS

To register:

www.sas.com/uk/usergroups

Page 50: Data Management User Group PowerPoint Presentation

USEFUL INFORMATION

Page 51: Data Management User Group PowerPoint Presentation

THANK YOU FOR YOUR TIME