Copyright © 2016, SAS Institute Inc. All rights reserved.
DATA MANAGEMENT USER GROUP
MANCHESTER 9TH MARCH 2016
AGENDA: SAS DATA MANAGEMENT USER GROUP
9th March 2016, SAS Manchester
Time | Topic | Speaker
9.00-9.30am | Coffee and networking |
9.30-9.35am | Introductions | All
9.35-10.15am | SAS Data Integration | Janice Newell
10.15-11.00am | SAS Data Quality | Rajeeve Narula
11.00am | Coffee Break |
11.20-11.40am | SAS Data Governance | Dave Smith
11.40am-12.20pm | SAS Data Federation | Dave Smith
12.20-12.30pm | Wrap up | Sophie Ainley
12.30-2.00pm | Lunch and networking | All
THE DATA PROCESS: A DAY IN THE LIFE
[Diagram: the stages of the data process (Data Access, Data Assessment, Data Cleansing, Data Transformation, Data Monitoring, Data Governance), each annotated with the SAS tools used: SAS DI, DM Studio, Fed Server, BDN, Lineage.]
DATA INTEGRATION
Copyright © 2013, SAS Institute Inc. All rights reserved.
DATA INTEGRATION
[Diagram: data from Transactional Systems, External Data Feeds and Spreadsheets is reshaped, adjusted for the time dimension, cleaned, analysed and standardised to support Decision Making and Regulatory Data Reporting Requirements. Caption: "How did you get there?"]
SAS DATA INTEGRATION SERVER
• Generates code through GUI
  • SAS code
  • Database code
• Library of transformations and functions
• Manages metadata
  • Records steps taken to build output data
  • Able to trace any table or column forwards and backwards through the process
• Much more efficient than hand coding
  • Usually 50% faster through clarity, re-usability, coding speed
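The idea of tracing a table or column forwards and backwards through a process can be illustrated with a small graph walk. This is a minimal Python sketch, not how SAS Data Integration Server stores or queries its metadata; the class `LineageGraph` and the column names are hypothetical and exist only for illustration.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy lineage store: each recorded step says 'source feeds target'."""

    def __init__(self):
        self.downstream = defaultdict(set)  # node -> nodes it feeds
        self.upstream = defaultdict(set)    # node -> nodes it is built from

    def add_step(self, source, target):
        self.downstream[source].add(target)
        self.upstream[target].add(source)

    def _walk(self, start, edges):
        # Breadth-first walk collecting every node reachable from start.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def trace_forwards(self, node):
        """Everything that would be impacted by a change to node."""
        return self._walk(node, self.downstream)

    def trace_backwards(self, node):
        """Everything node was ultimately built from."""
        return self._walk(node, self.upstream)


# Hypothetical columns, named schema.table.column for readability.
graph = LineageGraph()
graph.add_step("crm.customers.postcode", "staging.customers.postcode")
graph.add_step("staging.customers.postcode", "mart.customer_dim.region")
graph.add_step("ref.regions.region_name", "mart.customer_dim.region")

impacted = sorted(graph.trace_forwards("crm.customers.postcode"))
sources = sorted(graph.trace_backwards("mart.customer_dim.region"))
print(impacted)
print(sources)
```

Keeping both an upstream and a downstream index means impact analysis (forwards) and provenance (backwards) use the same walk.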
ANALYTICAL PROCESS: DATA REQUIREMENT
[Diagram: the analytical process (Source Data, Analysis Ready Data, Visualisation, Modelling) alongside the productionised process (Source Data, Analysis Ready Data, Dashboards, Deployed Models). Requirements shown: Granular Security, Repeatability, Assured Quality, Governance and Clarity. Stakeholders shown: Analysts, Data Managers, Business, IT Support.]
ANALYTICAL DATA INTEGRATION
• Preparing data for analysis
  • Can pass data mining metadata to Enterprise Miner (Target etc.)
• Embedding analytical procedures
  • Summarisation, esp. medians (on all data)
  • Time series preparation
  • Multi-row data operations
  • Creating correlation indexes
• Scoring data
  • Rapid model deployment
  • Including managing in-database scoring
  • Model monitoring data creation
DI DEMO
DEMO: DAY IN THE LIFE OF A DI ANALYST
• Define data: SAS data, external files
• Join tables: create new fields, map data
• Test job: check errors, control order
• Impact analysis: tables, fields
ANALYTICAL DATA MANAGEMENT: DETAIL
What SAS DMA provides | Benefit
Development framework to create SAS job flows including a documentation framework | Clear evidence of process and development ownership
Inbuilt versioning framework | Historical traceability of DI changes
Inbuilt custom transformation framework to provide re-use of complex processing | Simplify DI flows and speed up development
Metadata impact analysis and search facility | Importance/usage of data items within a SAS data flow
Deployment of SAS data flows to a scheduling tool | Integration with production processes
SAS analytical modelling code integration (uses Enterprise Miner) | Support predictive model factory concept
Data quality integration framework – bring in DQ processing (uses DMA) | Ensure trust in results, build defensive process controls
Data governance framework – share lineage through a browser (DMA 9.4M2) | Build business driven definitions of data items
DATA QUALITY
THE DATA PROCESS: A DAY IN THE LIFE
[Diagram: the data quality stages of the data process (Data Access, Data Connection, Data Assessment, Data Cleansing, Data Monitoring), carried out in DM Studio, with the activities Profiling, Business Rules, Standardise, Data Job and Dashboard.]
DATA GOVERNANCE
Financial organisation
WHY GOVERN DATA?
• Regulation
• Risk
• Efficiency
• Opportunity
CHALLENGE: MAP EVERYTHING
• Customers are under increasing pressure to be able to link data in disparate systems at a logical level – that is, to show how metadata is connected
• At the same time, with the advent of Big Data systems and the concept of the “Data Lake”, it is ever more important, from a practical, user-driven point of view, to have a system that tells data users where the data resides
[Diagram: a Business Term linked to Data Items whose locations are unknown. Caption: “Where in the Lake is my data?”]
[Diagram: the systems to be mapped: Mainframe Data via COBOL, ETL, Data Quality, Databases, Logical Data Model, Physical Data Model, Analytics and Reporting.]
TYPICAL REQUIREMENTS
Metadata Lineage
• Automated collection of metadata
• Services to manage the metadata
• Maintain relationships
• Provide context
• Metadata analysis
Data Monitoring
• Users specify data quality controls/checks
• Proactively monitor data
• Validate data
• Enforce policy
• Alerts
Business Glossary
• Search / Discover data & metadata
• Business terms & technical data attributes
• Ownership, Structure, Usage
• Context
• Secured, governed, and workflow enabled
Consensus, Collaboration, Transparency
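The data monitoring requirements (users specify checks, data is validated, failures raise alerts) can be sketched in a few lines of Python. This is only an illustration of the pattern under stated assumptions: the rule names, the record fields and the `monitor` function are hypothetical and are not part of any SAS product.

```python
import re

# Hypothetical user-specified checks: rule name -> predicate over one record.
RULES = {
    "postcode_present": lambda rec: bool(rec.get("postcode", "").strip()),
    "age_in_range": lambda rec: isinstance(rec.get("age"), int) and 0 <= rec["age"] <= 120,
    "email_has_at": lambda rec: re.fullmatch(r"[^@\s]+@[^@\s]+", rec.get("email", "")) is not None,
}


def monitor(records, rules=RULES):
    """Validate every record and return one alert (record index, rule name) per failed check."""
    alerts = []
    for index, record in enumerate(records):
        for name, check in rules.items():
            if not check(record):
                alerts.append((index, name))
    return alerts


records = [
    {"postcode": "M1 1AA", "age": 34, "email": "a@example.com"},
    {"postcode": "", "age": 150, "email": "not-an-email"},
]
alerts = monitor(records)
print(alerts)
```

Keeping the rules in a plain mapping separates what is checked (owned by the business user) from how it is run (owned by the technical user).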
METADATA LINEAGE
SAS RELATIONSHIP SERVICE AND LINEAGE VIEWER
SAS RELATIONSHIP (LINEAGE) REPOSITORY
SAS Relationship Repository
One repository to store metadata from multiple environments.
A ROBUST SET OF SERVICES TO MANAGE THE REPOSITORY
[Diagram: Metabridge Loader, Relationship Loader and REST Services on one side of the Relationship Repository; Relationship Reporter, Lineage Viewer and REST Services on the other.]
• Automated metadata collection
• Easy to access
• Many ways to analyze
RESULT: DATASTAGE ETL JOB
CLEAR GOVERNANCE AND OWNERSHIP
SAS BUSINESS DATA NETWORK
DATA GOVERNANCE: PEOPLE, PROCESS, TECHNOLOGY
[Diagram: the Business Data Network linking Business Users, Technical Users and Data Stewards with Business Terms, Rules and Alerts.]
DEMO
SAS FEDERATION SERVER
DATA FEDERATION: WHAT IS IT?
[Diagram: Federation Server presents Federated Views over Source 1, Source 2 and Source 3 to Applications.]
• Caching views
• Row and Column Access Control
• Logging
• Web Administration
• Scheduling
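A federated view with row and column access control can be sketched with Python's built-in `sqlite3` module, using two in-memory databases to stand in for separate sources. This is an analogy under stated assumptions, not SAS Federation Server syntax: the tables, the `POLICY` mapping and the `query_view` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                     # stands in for Source 1
conn.execute("ATTACH DATABASE ':memory:' AS source2")  # stands in for Source 2

conn.execute("CREATE TABLE main.customers (id INTEGER, name TEXT, region TEXT)")
conn.execute("CREATE TABLE source2.orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO main.customers VALUES (?, ?, ?)",
                 [(1, "Ann", "North"), (2, "Bob", "South")])
conn.executemany("INSERT INTO source2.orders VALUES (?, ?)",
                 [(1, 120.0), (2, 75.5)])

# A TEMP view is used because, in SQLite, only temporary views may
# reference tables that live in more than one attached database.
conn.execute("""
    CREATE TEMP VIEW customer_orders AS
    SELECT c.name, c.region, o.amount
    FROM main.customers AS c
    JOIN source2.orders AS o ON o.customer_id = c.id
""")

# Hypothetical policy: role -> (visible columns, permitted region or None for all rows).
POLICY = {
    "north_analyst": (("region", "amount"), "North"),
    "auditor": (("name", "region", "amount"), None),
}


def query_view(role):
    """Query the federated view, exposing only the columns and rows the role may see."""
    columns, region = POLICY[role]
    # Column names come from the fixed policy above, never from user input.
    sql = f"SELECT {', '.join(columns)} FROM customer_orders"
    params = ()
    if region is not None:
        sql += " WHERE region = ?"
        params = (region,)
    sql += " ORDER BY name"
    return conn.execute(sql, params).fetchall()


print(query_view("north_analyst"))
print(query_view("auditor"))
```

Applications query one view and never need to know which source holds which table; the policy decides what each caller can see.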
DATA FEDERATION: ENABLING COLLABORATION
[Diagram: Organisation 1, Organisation 2 and Organisation 3 share data in a Collaboration Environment through a secure data filter.]
DATA FEDERATION: MASKING SENSITIVE INFORMATION
[Diagram: Data Masking sits between the Data Lake and the Analysis Environment.]
DATA FEDERATION: AUDITING ANALYTICAL USAGE
[Diagram: the EDW and the Analysis Environment, with Logging, Notifiable Queries and Workflow.]
SAS FEDERATION SERVER
WHAT’S NEW IN 4.2?
WHAT’S NEW? SUMMARY
• SAS Metadata Server and Web Infrastructure Platform (WIP) integration
• SAS Metadata Server replaces DataFlux Authentication Server for authentication and persistence of users, groups, logins (for example, personal, group, and shared) and domains
WHAT’S NEW? SUMMARY
• Read/Write access to Hadoop (HIVE) using the new SAS Federation Server Driver for Apache Hive
• Access to SAS data sets secured with metadata bound libraries
• Access to shared data sources across multiple SAS Federation Servers using a new Federation Server Driver
• Enhanced data masking and encryption support
WHAT’S NEW? SUMMARY
• Embedded data quality and cleansing functions in data views
• Support for SAS DS2
• Cache enhancements that include in-memory data cache
• A new migration guide is available for SAS Federation Server 4.2.
• Proc ASExport
USES SAS METADATA SERVER
SAS Metadata Server replaces Authentication Server for authentication and other permission-based functions
• SAS Metadata Server provides
  • access for user and group objects
  • other permission-based functions such as shared logins and trusted users
[Screenshot callout: this refresh icon can be used to show newly created Authentication Domains]
ENHANCED DATA MASKING
Enhanced data masking and encryption support
• New data masking features include TRANC, which transliterates characters from the input string to characters in the output string.
  • To transliterate is to change (letters, words, etc.) into corresponding characters of another alphabet or language.
• A series of random data masking rules is also available.
[Screenshot: the current set of available Data Masking Functions]
ENHANCED DATA MASKING
[Screenshot: the current set of available Data Masking Functions]
• TRANC, which transliterates characters
• RANDOM rules are also available
[Screenshot: example of masking a character column]
[Screenshot: example of masking a numeric column]
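The two masking ideas on this slide, transliterating characters and randomising values, can be sketched in plain Python. This is not the TRANC or RANDOM implementation in SAS Federation Server, whose signatures are not shown in the slides; the function names, the substitution alphabet and the `spread` parameter are assumptions made for illustration.

```python
import random


def mask_transliterate(value, source_chars, target_chars):
    """Replace each character found in source_chars with the character
    at the same position in target_chars; other characters pass through."""
    return value.translate(str.maketrans(source_chars, target_chars))


def mask_random_number(value, spread, rng):
    """Shift a numeric value by a random amount in [-spread, spread]."""
    return value + rng.uniform(-spread, spread)


# Masking a character column: a hypothetical substitution alphabet.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
SUBSTITUTE = "QWERTYUIOPASDFGHJKLZXCVBNM"
masked_name = mask_transliterate("SMITH", ALPHABET, SUBSTITUTE)
print(masked_name)

# Masking a numeric column: the seed is fixed only to make the sketch repeatable.
rng = random.Random(42)
salary = 30000.0
masked_salary = mask_random_number(salary, 500.0, rng)
print(masked_salary)
```

Transliteration keeps the length and shape of the value, which suits joins and pattern checks, while random shifting hides exact numbers but keeps them in a realistic range.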
CACHE ENHANCEMENTS
Cache enhancements that include cache refresh for data held in memory
• Federation Server can now refresh cached data, including the MDS (Memory Data Store), after a server restart.
• In previous releases, cached data that was held in memory was deleted if the server was restarted or shut down.
We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE
CACHE ENHANCEMENTS: FED. SERVER 4.2
We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE
After a Fed. Server restart the views are re-run in the background
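The behaviour described here, where in-memory results are lost on restart but the views are re-run in the background, can be sketched as follows. This is a simplified Python model and not the Federation Server mechanism: `ViewCache`, its methods and the `sales_by_region` view are hypothetical.

```python
import threading


class ViewCache:
    """In-memory result cache whose view definitions outlive a restart."""

    def __init__(self, view_definitions):
        # Definitions are assumed to be persisted configuration; results are memory only.
        self.view_definitions = view_definitions
        self.results = {}
        self._lock = threading.Lock()

    def get(self, name):
        with self._lock:
            if name in self.results:
                return self.results[name]      # cache hit: no query is run
        value = self.view_definitions[name]()  # cache miss: run the view
        with self._lock:
            self.results[name] = value
        return value

    def refresh_all_in_background(self):
        """Re-run every cached view on a worker thread, as after a restart."""
        def refresh():
            for name, run in self.view_definitions.items():
                value = run()
                with self._lock:
                    self.results[name] = value

        worker = threading.Thread(target=refresh, daemon=True)
        worker.start()
        return worker


calls = []


def sales_by_region():
    calls.append("sales_by_region")            # records each real execution
    return {"North": 120.0, "South": 75.5}


definitions = {"sales_by_region": sales_by_region}

cache = ViewCache(definitions)
cache.get("sales_by_region")                   # first query populates the cache

# Simulated restart: a new instance with the same definitions and empty results.
restarted = ViewCache(definitions)
restarted.refresh_all_in_background().join()
warm = restarted.get("sales_by_region")        # served from the refreshed cache
print(len(calls), warm)
```

Because the refresh runs on its own thread, the server can accept queries immediately after restart; early queries simply fall through to the source until the cache is warm.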
EMBEDDED DATA QUALITY: FED. SERVER 4.2
Data Quality function | Example
Parsing | Mr. Roy G Biv Jr
Extraction | Blue mens long-sleeved button-down collar denim shirt
Pattern Analysis | 999-999-9999
Identification Analysis | John Smith = Name / SAS = Organization
Gender Analysis | Jane Smith = F / Sam Adams = M
Standardization, Casing | 919.6778000 = (919) 677-8000
Matching | John Smith / J. Smith / Mr. Jon Smith
[Screenshot callout: where the DQ functions live…]
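Two of the functions listed above, pattern analysis and standardisation, are simple enough to sketch in Python using the phone number from the slide. The real functions are driven by rules in the SAS Quality Knowledge Base; the helper names here and the convention of `9` for a digit and `A` for a letter are assumptions for illustration.

```python
import re


def pattern_of(value):
    """Pattern analysis: digits become 9, letters become A, other characters are kept."""
    return "".join(
        "9" if ch.isdigit() else "A" if ch.isalpha() else ch
        for ch in value
    )


def standardise_phone(value):
    """Standardisation: rewrite any 10-digit number as (999) 999-9999.
    Values that do not contain exactly 10 digits are returned unchanged."""
    digits = re.sub(r"\D", "", value)
    if len(digits) != 10:
        return value
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


print(pattern_of("919-677-8000"))
print(standardise_phone("919.6778000"))
```

Profiling a column by pattern first shows which formats occur in the data, which in turn tells you what the standardisation rule has to handle.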
EMBEDDED DATA QUALITY
Embedded data quality and cleansing functions in data views
• Implemented using SAS Quality Knowledge Base (QKB) with FedSQL and DS2.
• The data quality methods use data quality rules from the SAS QKB in order to cleanse data.
[Screenshot callout: the ‘standardized’ primary_state_code]
EMBEDDED DATA QUALITY: FED. SERVER 4.2
DATA STEP 2 (DS2) LANGUAGE
• SAS Federation Server now supports the DATA Step 2 (DS2) language, which includes
  • additional data types
  • ANSI SQL types
  • programming structure elements
  • user-defined methods and packages
• Processing gets automatically pushed down
  • if Code Accelerator is present in the corresponding data platform
  • if DS2 code conforms to a ‘pushable’ format (e.g. threads defined, etc.)
To invoke DS2, you must configure a DSN that uses the DS2 dialect
DATA STEP 2 (DS2) LANGUAGE: FED. SERVER 4.2
[Screenshot callout: DS2 equivalent]
Actually, our DQ functions are DS2 methods invoked from SQL
• Customers can write any DS2 code with if/then/else logic, iterating over column data and producing programmatic results
• This integrates nicely with SQL and is a very useful way to use DS2
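The pattern of procedural if/then/else logic invoked from SQL can be shown by analogy with Python's built-in `sqlite3` module, which lets a Python function be registered and called from a query. This is not DS2 or FedSQL; the `risk_band` function, its thresholds and the `payments` table are hypothetical.

```python
import sqlite3


def risk_band(amount):
    """Procedural if/then/else logic applied to one column value."""
    if amount is None:
        return "UNKNOWN"
    if amount >= 1000:
        return "HIGH"
    elif amount >= 100:
        return "MEDIUM"
    else:
        return "LOW"


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it by name with one argument.
conn.create_function("risk_band", 1, risk_band)

conn.execute("CREATE TABLE payments (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(1, 50.0), (2, 250.0), (3, 5000.0), (4, None)],
)

rows = conn.execute(
    "SELECT id, risk_band(amount) FROM payments ORDER BY id"
).fetchall()
print(rows)
```

The SQL stays declarative while the branching lives in one reusable function, which is the same division of labour the slide describes for DS2 methods called from SQL.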
READ/WRITE ACCESS TO HADOOP (HIVE)
Read/Write access to Hadoop (HIVE) using the SAS Federation Server Driver for Apache Hive
• The Driver for Hive
  • uses FedSQL and also provides limited support for HiveQL
  • supports multiple versions of Hadoop
  • can use Kerberos
  • does not support Write operations such as insert, update, and delete
READ/WRITE ACCESS TO HADOOP (HIVE): FED. SERVER 4.2
[Screenshot: access Hadoop using SAS Studio to Federation Server]
[Screenshot: create a table in Hadoop]
[Screenshot: the configuration of the Hadoop Data Service using the native Apache HIVE driver]
QUESTIONS?
CUSTOMER LOYALTY: UK USER GROUPS
To register: www.sas.com/uk/usergroups
USEFUL INFORMATION
THANK YOU FOR YOUR TIME