Data Management User Group PowerPoint Presentation

Copyright © 2016, SAS Institute Inc. All rights reserved.

DATA MANAGEMENT USER GROUP

MANCHESTER 9TH MARCH 2016

AGENDA SAS DATA MANAGEMENT USER GROUP

9th March 2016, SAS Manchester

Time | Topic | Speaker
9.00-9.30am | Coffee and networking |
9.30-9.35am | Introductions | All
9.35-10.15am | SAS Data Integration | Janice Newell
10.15-11.00am | SAS Data Quality | Rajeeve Narula
11.00am | Coffee Break |
11.20-11.40am | SAS Data Governance | Dave Smith
11.40am-12.20pm | SAS Data Federation | Dave Smith
12.20-12.30pm | Wrap up | Sophie Ainley
12.30-2.00pm | Lunch and networking | All

THE DATA PROCESS A DAY IN THE LIFE

[Diagram: the data process. Stages: Data Access, Data Assessment, Data Cleansing, Data Transformation, Data Monitoring, Data Governance. Tools shown against the stages: DM Studio, BDN, SAS DI, Fed Server, Lineage.]

DATA INTEGRATION

Copyright © 2013, SAS Institute Inc. All rights reserved.

DATA INTEGRATION

[Diagram labels: Transactional Systems; External Data Feeds; Spreadsheets; Reshape; Adjust Time Dim; Clean; Analyse; Standardise; Decision Making; Regulatory Data Reporting Requirements. Caption: How did you get there?]

SAS DATA INTEGRATION SERVER

• Generates code through GUI

• SAS Code

• Database Code

• Library of transformations and functions

• Manages Metadata

• Records steps taken to build output data

• Able to trace any table or column forwards and backwards through the process

• Much more efficient than hand coding

• Usually 50% faster through clarity, re-usability, coding speed
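
The idea of tracing a table or column forwards and backwards can be pictured as a walk over a directed graph of build steps. The following is a minimal Python sketch, not SAS code and not the SAS metadata API; the class name, method names and table names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph of data flow: an edge source -> target means
    the target is built from the source (hypothetical structure)."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_step(self, source, target):
        # Record one build step in both directions so either trace is cheap.
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _walk(start, edges):
        # Breadth-first walk collecting everything reachable from start.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def trace_forwards(self, item):
        """Everything that depends on item (impact analysis)."""
        return self._walk(item, self._downstream)

    def trace_backwards(self, item):
        """Everything item was built from (provenance)."""
        return self._walk(item, self._upstream)


# Illustrative table names only.
graph = LineageGraph()
graph.add_step("crm.customers", "staging.customers_clean")
graph.add_step("staging.customers_clean", "mart.customer_summary")
graph.add_step("web.orders", "mart.customer_summary")

impacted = graph.trace_forwards("crm.customers")
origins = graph.trace_backwards("mart.customer_summary")
```

Storing each step in both directions is a design choice that makes impact analysis and provenance the same traversal over different edge maps.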

ANALYTICAL PROCESS DATA REQUIREMENT

[Diagram: Analytical process: Source Data, Analysis Ready Data, Visualisation, Modelling. Productionised Process: Source Data, Analysis Ready Data, Dashboards, Deployed Models. Requirements: Granular Security, Repeatability, Assured Quality, Governance and Clarity. Stakeholders: Analysts, Data Managers, Business, IT Support.]

ANALYTICAL DATA INTEGRATION

• Preparing data for analysis

• Can pass data mining metadata to Enterprise Miner (Target etc.)

• Embedding analytical procedures

• Summarisation, esp medians (on all data)

• Time series preparation

• Multi-row data operations

• Creating correlation indexes

• Scoring Data

• Rapid model deployment

• Including managing in-database scoring

• Model monitoring data creation

DI DEMO

DEMO DAY IN THE LIFE OF DI ANALYST

Define data

• SAS data

• External files

Join tables

• Create new fields

• Map data

Test job

• Check errors

• Control order

Impact analysis

• Tables

• Fields

ANALYTICAL DATA MANAGEMENT DETAIL

What SAS DMA provides

Development framework to create SAS job flows including a documentation framework

Inbuilt versioning framework

Inbuilt custom transformation framework to provide re-use of complex processing

Metadata impact analysis and search facility

Deployment of SAS data flows to a scheduling tool

SAS analytical modelling code integration (uses Enterprise Miner)

Data quality integration framework: bring in DQ processing (uses DMA)

Data governance framework: share lineage through a browser (DMA 9.4M2)

Clear evidence of process and development ownership

Historical traceability of DI changes

Simplify DI flows and speed up development

Importance/usage of data items within a SAS data flow

Integration with production processes

Support predictive model factory concept

Ensure trust in results, build defensive process controls

Build business driven definitions of data items

DATA QUALITY

THE DATA PROCESS A DAY IN THE LIFE

[Diagram: the data quality process. Stages: Data Access, Data Assessment, Data Cleansing, Data Monitoring. Tool shown: DM Studio. Other labels: Data Connection, Profiling, Business Rules, Standardise, Data Job, Dashboard.]

DATA GOVERNANCE

Financial organisation

WHY GOVERN DATA?

• Regulation

• Risk

• Efficiency

• Opportunity

CHALLENGE MAP EVERYTHING

• Customers are under increasing pressure to be able to link data in disparate systems at a logical level, that is, to show how metadata is connected

• At the same time, with the advent of Big Data systems and the concept of the “Data Lake”, it is ever more important, from a practical, user-driven point of view, to have a system that tells data users where the data resides

[Diagram: Business Term, Data Item. “Where in the Lake is my data?”]
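
One way to picture the link between a business term and the data items behind it is a simple lookup from term to physical locations. This is a minimal Python sketch under stated assumptions: the term, system, table and column names are invented for illustration, and nothing here reflects how any SAS product stores its glossary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """A physical location of a piece of data (hypothetical model)."""
    system: str
    table: str
    column: str


# Illustrative glossary: one business term linked to two physical data items.
glossary = {
    "Customer Date of Birth": [
        DataItem("crm_db", "customers", "dob"),
        DataItem("data_lake", "raw_customers", "birth_date"),
    ],
}


def locate(term):
    """Answer 'where is my data?' for a business term."""
    return [f"{d.system}.{d.table}.{d.column}" for d in glossary.get(term, [])]


locations = locate("Customer Date of Birth")
```

An unknown term returns an empty list rather than raising, which suits a search-style interface.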

[Diagram labels: Mainframe Data via COBOL; ETL; Data Quality; Database (three instances); Logical Data Model; Physical Data Model; Analytics and Reporting.]

TYPICAL REQUIREMENTS

Metadata Lineage

• Automated collection of metadata

• Services to manage the metadata

• Maintain relationships

• Provide context

• Metadata analysis

Data Monitoring

• Users specify data quality controls/checks

• Proactively monitor data

• Validate data

• Enforce policy

• Alerts

Business Glossary

• Search / Discover data & metadata

• Business terms & technical data attributes

• Ownership, Structure, Usage

• Context

• Secured, governed, and workflow enabled

Consensus, Collaboration, Transparency

METADATA LINEAGE

SAS RELATIONSHIP SERVICE AND LINEAGE VIEWER

SAS RELATIONSHIP (LINEAGE) REPOSITORY

SAS Relationship Repository

One repository to store metadata from multiple environments.

A ROBUST SET OF SERVICES TO MANAGE THE REPOSITORY

[Diagram: Metabridge Loader, Relationship Loader and REST Services on one side of the Relationship Repository; Relationship Reporter, Lineage Viewer and REST Services on the other.]

• Automated metadata collection

• Easy to access

• Many ways to analyze

RESULT: DATASTAGE ETL JOB

CLEAR GOVERNANCE AND OWNERSHIP

SAS BUSINESS DATA NETWORK

DATA GOVERNANCE PEOPLE, PROCESS, TECHNOLOGY

[Diagram: Business Data Network. Labels: Business User, Business Term, Technical User, Rule, Data Stewards, Alerts.]

DEMO

SAS FEDERATION SERVER

DATA FEDERATION WHAT IS IT?

[Diagram: Federation Server between Source 1, Source 2, Source 3 and Applications.]

Federated Views

Caching views

Row and Column Access Control

Logging

Web Administration

Scheduling
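
A federated view with row and column access control can be sketched as combining rows from several sources and then filtering what a given user may see. The Python below is an illustration only, assuming in-memory lists of dictionaries as stand-ins for the sources; the function names, column names and the access rule are hypothetical and do not represent Federation Server syntax.

```python
def federated_view(sources):
    """Present rows from several sources as one logical view."""
    return [row for rows in sources for row in rows]


def apply_access_control(rows, allowed_columns, row_filter):
    """Keep only the rows the user may see, then only the permitted columns."""
    return [
        {col: row[col] for col in allowed_columns if col in row}
        for row in rows
        if row_filter(row)
    ]


# Illustrative sources; in practice these would be separate databases.
source_1 = [{"id": 1, "region": "North", "salary": 30000}]
source_2 = [{"id": 2, "region": "South", "salary": 45000}]

view = federated_view([source_1, source_2])

# Hypothetical rule: this user sees North rows only, and never the salary column.
visible = apply_access_control(
    view,
    allowed_columns=["id", "region"],
    row_filter=lambda row: row["region"] == "North",
)
```

The row filter runs before the column projection, so a rule can depend on a column the user is not allowed to see.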

DATA FEDERATION ENABLING COLLABORATION

Organisation 1 Organisation 2 Organisation 3

Collaboration Environment

Secure data filter

DATA FEDERATION MASKING SENSITIVE INFORMATION

Data Lake

Analysis Environment

Data Masking

DATA FEDERATION AUDITING ANALYTICAL USAGE

EDW

Analysis Environment

Logging

Notifiable Queries

Workflow

SAS FEDERATION SERVER

WHAT’S NEW IN 4.2?

WHAT’S NEW? SUMMARY

• SAS Metadata Server and Web Infrastructure Platform (WIP) integration

• SAS Metadata Server replaces DataFlux Authentication Server for authentication and persistence of users, groups, logins (for example, personal, group, and shared) and domains

• Read/Write access to Hadoop (HIVE) using the new SAS Federation Server Driver for Apache Hive

• Access to SAS data sets secured with metadata bound libraries

• Access to shared data sources across multiple SAS Federation Servers using a new Federation Server Driver

• Enhanced data masking and encryption support

• Embedded data quality and cleansing functions in data views

• Support for SAS DS2

• Cache enhancements that include in-memory data cache

• A new migration guide is available for SAS Federation Server 4.2.

• Proc ASExport

USES SAS METADATA SERVER

SAS Metadata Server replaces Authentication Server for authentication and other permission-based functions

• SAS Metadata Server provides

• access for user and group objects

• other permission-based functions such as shared logins and trusted users.

This refresh icon can be used to show newly created Authentication Domains

ENHANCED DATA MASKING

Enhanced data masking and encryption support

• New data masking features include TRANC, which transliterates characters from the input string to characters in the output string.

• To transliterate is to change (letters, words, etc.) into corresponding characters of another alphabet or language.

• A series of random data masking rules are also available.

The current set of available Data Masking Functions

ENHANCED DATA MASKING

The current set of available Data Masking Functions

• TRANC which transliterates characters

• RANDOM rules are also available

Example of masking a character column

Example of masking a numeric column
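
To make the two masking ideas concrete, here is a minimal Python sketch of character transliteration and of random digit replacement. It illustrates the general techniques only; it is not the TRANC or RANDOM implementation, and the function names, the reversed-alphabet mapping and the sample values are assumptions made for the example.

```python
import random
import string


def transliterate(value, source_chars, target_chars):
    """Map each character in source_chars to the character at the same
    position in target_chars; other characters pass through unchanged."""
    table = str.maketrans(source_chars, target_chars)
    return value.translate(table)


def random_digits(value, seed=None):
    """Replace every digit with a random digit, keeping length and punctuation."""
    rng = random.Random(seed)
    return "".join(
        rng.choice(string.digits) if ch.isdigit() else ch for ch in value
    )


# Character column: a reversed-alphabet mapping chosen purely for illustration.
masked_name = transliterate(
    "smith", string.ascii_lowercase, string.ascii_lowercase[::-1]
)

# Numeric-looking column: the shape is preserved, the digits are not.
masked_account = random_digits("4111-2222", seed=1)
```

A fixed transliteration table is repeatable (the same input always masks to the same output), while the random rule is not unless a seed is supplied.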

CACHE ENHANCEMENTS

Cache enhancements that include cache refresh for data held in memory

• Federation Server now has the capability of refreshing cached data, including MDS, after a server restart.

• In previous releases, cached data that was held in memory was deleted if the server was restarted or shut down.

We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE

CACHE ENHANCEMENTS FED. SERVER 4.2

We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE

After a Fed. Server restart the views are re-run in the background
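
The refresh-after-restart behaviour can be sketched as a cache that keeps its view definitions separately from the in-memory results, so the results can be rebuilt when memory is cleared. This Python sketch is an assumption-laden illustration: the class, the view name and the query are hypothetical, and the re-run here is synchronous rather than in the background.

```python
class ViewCache:
    """In-memory result cache whose view definitions survive a restart."""

    def __init__(self, view_definitions):
        # View name -> zero-argument callable that runs the query.
        self._definitions = dict(view_definitions)
        # Cached results held in memory; these are lost on restart.
        self._memory = {}

    def get(self, name):
        # Run the underlying query only on a cache miss.
        if name not in self._memory:
            self._memory[name] = self._definitions[name]()
        return self._memory[name]

    def restart(self):
        # Simulate a server restart: memory is wiped, then every view is re-run.
        self._memory.clear()
        for name in self._definitions:
            self.get(name)


calls = []


def slow_query():
    # Stand-in for an expensive query against a source system.
    calls.append(1)
    return [("UK", 42)]


cache = ViewCache({"sales_by_country": slow_query})
first = cache.get("sales_by_country")   # runs the query
second = cache.get("sales_by_country")  # served from memory
cache.restart()                         # query is re-run to repopulate the cache
third = cache.get("sales_by_country")   # served from memory again
```

The query runs once before the restart and once during it, so users querying after the restart still get a cached answer.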

EMBEDDED DATA QUALITY FED. SERVER 4.2

Data Quality | Example
Parsing | Mr. Roy G Biv Jr
Extraction | Blue mens long-sleeved button-down collar denim shirt
Pattern Analysis | 999-999-9999
Identification Analysis | John Smith = Name / SAS = Organization
Gender Analysis | Jane Smith = F - Sam Adams = M
Standardization, Casing | 919.6778000 = (919) 677-8000
Matching | John Smith / J. Smith / Mr. Jon Smith

Where the DQ functions live…
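
Two of the examples above, pattern analysis and standardization, are simple enough to sketch directly. The Python below uses regular expressions as a stand-in; the real functions are driven by QKB rules rather than hand-written patterns, and the function names and the ten-digit assumption are hypothetical.

```python
import re


def pattern_of(value):
    """Describe a value's shape: digits become 9, letters become A,
    and punctuation is kept as-is."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def standardize_phone(value):
    """Format a ten-digit phone number as (NNN) NNN-NNNN.
    Values without exactly ten digits are returned unchanged."""
    digits = re.sub(r"\D", "", value)
    if len(digits) != 10:
        return value
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


pattern = pattern_of("919-677-8000")
standard = standardize_phone("919.6778000")
```

Returning unexpected values unchanged, rather than raising, keeps the function safe to apply to a whole column inside a view.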

EMBEDDED DATA QUALITY

Embedded data quality and cleansing functions in data views

• Implemented using SAS Quality Knowledge Base (QKB) with FedSQL and DS2.

• The data quality methods use data quality rules from the SAS QKB in order to cleanse data.

The ‘standardized’ primary_state_code

EMBEDDED DATA QUALITY FED. SERVER 4.2

DATA STEP 2 (DS2) LANGUAGE

• SAS Federation Server now supports the DATA Step 2 (DS2) language.

• includes additional data types

• ANSI SQL types

• programming structure elements

• user-defined methods and packages.

• Processing gets automatically pushed down

• if Code Accelerator is present in the corresponding data platform

• if DS2 code conforms to a ‘pushable’ format (e.g. threads defined, etc.)

To invoke DS2, you must configure a DSN that uses the DS2 dialect

DATA STEP 2 (DS2) LANGUAGE FED. SERVER 4.2

DS2 equivalent

Actually, our DQ functions are DS2 methods invoked from SQL

• Customers can write any DS2 code with if/then/else logic, iterating over column data and producing programmatic results

• This integrates nicely with SQL and is a very useful way to use DS2

READ/WRITE ACCESS TO HADOOP (HIVE)

Read/Write access to Hadoop (HIVE) using the SAS Federation Server Driver for Apache Hive

• The Driver for Hive

• uses FedSQL and also provides limited support for HiveQL.

• supports multiple versions of Hadoop.

• you can use Kerberos

• does not support Write operations such as insert, update, and delete

READ/WRITE ACCESS TO HADOOP (HIVE) FED. SERVER 4.2

Access Hadoop using SAS Studio to Federation Server

Create a table in Hadoop

The configuration of the Hadoop Data Service using the native Apache HIVE driver

QUESTIONS?

CUSTOMER LOYALTY UK USER GROUPS

To register:

www.sas.com/uk/usergroups

USEFUL INFORMATION

THANK YOU FOR YOUR TIME