FUTURE BRIGHT
A DATA DRIVEN REALITY
TABLE OF CONTENTS

Foreword by Bert Boers
Preface by Jeroen Dijkxhoorn
HR service provider Securex heads for 100% reliable CRM
Data integration evolves thanks to Big Data and open source
Infographic: Data Management
Rijkswaterstaat gains comprehensive insight into its performance
Ronald Damhof enables enterprise-wide discussion on data with the Data Quadrant Model
Improving insight into Belgium’s economic situation
DSM gains control of Master Data
Who is your data governor?
Jill Dyché blogs about her book The New IT
Credit insurance company integrates quality ratios in risk assessment
Data dependent on quality improvements
Data-driven decisions make the difference
Master Data Management as a foundation of your business
About SAS
FOREWORD
The seventh edition of Future Bright explores the theme of ‘A Data Driven Reality’. The data driven society is evolving more rapidly than many organizations seem to realize. This is fuelled by developments such as the Internet of Things, which generates vast new flows of data, creating new business opportunities. At the same time, market research firm Forrester has declared that we are living in the ‘Age of the Customer’. Customers leave a digital trail and expect their suppliers to use that information to provide a better and more relevant customized offering.
Both developments have a significant impact, as many organizations are now beginning to realize. But as
yet they are taking little structured action to genuinely prepare their organization for this Data Driven
Reality. That’s understandable, because this is uncharted territory. How can you go about it? Where do
you start?
Organizations are aware that they first need to get their foundation in order. At the same time they see a
lot of low-hanging fruit that they can pick with rewarding projects in the field of big data analytics. How do
those two worlds interrelate? Which investment generates the fastest return?
We aim to provide new insights with this book. Drawing on interviews with customers such as Securex,
DSM, Rijkswaterstaat and Crédito y Caución and experts such as Jill Dyché and Ronald Damhof, we show the
steps necessary to give data management a central role in your organization, so that you can get the
basics in order and can fully exploit your data to drive innovation, conversion and satisfaction.
We hope you find it inspiring reading.
Bert Boers
Vice President South-West Europe region
SAS Institute
PREFACE
Jeroen Dijkxhoorn
DIRECTOR ANALYTICAL PLATFORM CENTER OF EXCELLENCE AT SAS
The fact that data was a by-product of the process resulted in databases with a considerable number of
errors and omissions. To cope with this, the data was always validated before anyone used it. If a lot of
data was found to be incorrect, all efforts were suddenly focused on supplying missing data, correcting
incorrect data, and/or cleaning contaminated databases. Human intervention was always required.
Data automatically initiates processes
This operating method is becoming problematic now that data sources are increasingly linked and
processes are likely to start at any time. Whereas the start time of an e-mail marketing campaign
used to be determined by a marketer, it now starts when triggers are received from the customer and
you want to respond. The more you understand the customer journey, the easier it will be to respond
to those triggers and the more relevant you will be as an organization for your customers. This forces
you to set out policies stating how your organization will respond when your customer or prospect
requests certain information or signs up for an e-mail newsletter.
The process then continues entirely automatically, without human intervention and hence also with-
out the validation that used to take place. The correctness of data therefore has to be checked auto-
matically, by means of services which can be deployed throughout the process. Here we can draw a
distinction between data validation (technical correction of data in a physical data stream) and data
quality (verification of functional correctness).
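To make the distinction concrete, the sketch below shows, purely as an illustration and not as SAS functionality, how the two kinds of check could be deployed as small services that run without human intervention: one checks the technical correctness of a record in the physical data stream, the other its functional correctness against business rules. The field names and rules are assumptions.

```python
import re
from datetime import date

# Hypothetical sketch (not SAS functionality): automated checks that replace
# the manual validation step described above.
# "Validation" = technical correctness of the record in the physical data stream.
# "Quality"    = functional correctness against business rules.

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
KNOWN_COUNTRIES = {"NL", "BE", "LU", "FR"}  # assumed reference data

def validate(record):
    """Technical checks: required fields are present and well-formed."""
    errors = []
    for field in ("email", "country", "signup_date"):
        if not record.get(field):
            errors.append("missing field: " + field)
    if record.get("email") and not EMAIL_RE.match(record["email"]):
        errors.append("malformed e-mail address")
    return errors

def quality(record):
    """Functional checks: the values make sense in the business context."""
    issues = []
    if record.get("country") not in KNOWN_COUNTRIES:
        issues.append("country not in reference list")
    if record.get("signup_date") and record["signup_date"] > date.today():
        issues.append("signup date lies in the future")
    return issues

record = {"email": "j.smith@example.com", "country": "DE",
          "signup_date": date(2015, 3, 1)}
print(validate(record))  # []                                -> technically correct
print(quality(record))   # ['country not in reference list'] -> functionally suspect
```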
Data Driven Reality
Organizations used to be driven by processes, but now they are driven by data. This means any failure
to identify an error can have an immediate, major impact. Manual correction is no longer possible, so
the error will show up in multiple locations. That makes data quality monitoring much more impor-
tant. It also explains why compliance rules and regulations are imposing new data quality require-
ments. Supervisors nowadays want data, not reports. That requires a data driven organization. We are
now speeding towards a data driven reality.
The problem is not technology, but the lack of central supervision of the consistency of data defini-
tions – data governance. That is typically the task of a Chief Data Officer, which many organizations
still lack.
Data quality has been an issue ever since the first database was created. It was a subject that for a long time received little attention, for the simple reason that process efficiency was always more important than the completeness and accuracy of data. Data was a by-product of the process. That time is over. We are heading towards a data driven reality.
Age of the Customer and Internet of Things are drivers
It is high time to take action, because in the Age of the Customer you need to respond flexibly to trig-
gers from customers. This requires a 360 degree view of the customer. We have been talking about it
for many years, but still don’t have it because customer data are spread across various systems. The
lack of supervision of data definitions makes it impossible to pull the data together.
Another driver is developments resulting from the Internet of Things. This will generate a new stream
of data that you will want to use to optimize and largely automate your processes. This also requires a
good vision on data management.
Combination of different types of data
Whichever of the two stated realities is your main driver, in both situations it is increasingly important
to combine 100% reliable data with data containing a degree of uncertainty. Examples are weather
forecasts or social media sentiment analyses. How is it possible to combine these unstructured data,
often stored in Hadoop clusters, in an appropriate way with structured data that is 100% accurate,
such as route planning for truck drivers or purchasing behaviour data?
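As a hedged illustration of that combination, the sketch below keeps the reliable, structured records authoritative and attaches the uncertain signal together with an explicit confidence value, rather than mixing the two sources as if they were equally trustworthy. All field names and figures are invented for the example.

```python
# Illustrative sketch only (all field names and figures invented): combine
# authoritative, structured order data with an uncertain social-media
# sentiment signal, keeping the uncertainty explicit instead of hiding it.

orders = [  # 100% reliable source, e.g. the ERP or order system
    {"customer": "C001", "region": "North", "orders_last_quarter": 12},
    {"customer": "C002", "region": "South", "orders_last_quarter": 3},
]

sentiment = {  # uncertain source, e.g. derived from data stored in a Hadoop cluster
    "C001": {"score": -0.6, "confidence": 0.4},
    "C002": {"score": 0.8, "confidence": 0.9},
}

for row in orders:
    s = sentiment.get(row["customer"], {"score": 0.0, "confidence": 0.0})
    # Weight the soft signal by its confidence rather than treating it as a fact.
    row["churn_signal"] = round(max(0.0, -s["score"]) * s["confidence"], 2)
    print(row)
```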
From a cost point of view it is not feasible to store all that data in the same database. But that would
also be highly undesirable from an organizational point of view, as Ronald Damhof explains later in
this book. After all, there is a big difference between data with which you have to account to supervi-
sors and data which you use to experiment, in pursuit of ideas for innovation. And yet those different
ways of using data must be combined, without physically lumping all the data together.
This complexity requires a clear logical data model and clear data definitions. Without these data
definitions and good data stewardship, it is impossible to exploit the opportunities that are arising in
the market and which your competitors will respond to in droves. The question is therefore no longer
whether you will start, but when. Our advice is: act today. Data is your main asset. Act accordingly and
do something with it, before a competitor or a new market player beats you to it. ■
“Organizations used to be driven by processes, now they are driven by data”
Jeroen Dijkxhoorn
CLEANING UP THE CLIENT RELATIONS DATABASE AND THEN KEEPING IT CLEAN
HR SERVICE PROVIDER
SECUREX HEADS FOR
A 100% RELIABLE CRM
CASE
Like many companies, HR service provider Securex was witnessing severe problems with their CRM database. Chief among the problems was that marketing data was poor and becoming increasingly unreliable. They cleaned up and updated the database using a SAS Data Management platform. On top of that, this platform is also being set up as a permanent watchdog to ensure the accuracy and consistency of both batch updates and manual data manipulations. The result has been an improved database with greatly enhanced contact information.
Securex is an HR company active in Belgium, France, and Luxembourg providing services for large busi-
nesses, SMEs, self-employed professionals, and private individuals. Services include Payroll, Staff and
Insurance Management, HR Consulting, and Health and Safety Services. Securex has a staff of approxi-
mately 1,600 throughout their nearly 30 offices, serving more than a quarter of a million clients.
Data inconsistencies lead to frustrations
“We want to make sure that whenever anyone within the organization enters or modifies data, the
changes are automatically processed and rectified,” reports Securex Business Architect Jacky Decoster.
Any data inconsistencies invariably result in considerable frustration for everyone involved. Employees
are constantly updating client data, adding and changing contact information and contract data, all
while marketing teams are uploading foreign data for new campaigns and other client communi-
cations. “Each of these manipulations can produce small data errors or inconsistencies,” observes
Decoster. “Since the data is being manipulated by a multitude of persons and departments, problems
can easily arise such as duplicate entries, client records with incomplete contract information, and
missing contact information such as first name, gender, postal and e-mail address, or phone number.
This is frustrating, especially for marketing teams running a campaign: many e-mails simply bounce,
some mail is sent twice to the same person, and others are based on wrong or missing information.
This sometimes severely damaged our reputation.”
Although a centralized SAP CRM database had been in place since 2004, the problems have been
growing worse in recent years. Decoster noted that complaints about data quality were coming in
from both staff and clients. “Obviously we had to do something about it and do it effectively and con-
vincingly.”
SAS Data Management clean-up successfully launched
The data quality issue was put high on the agenda when Securex launched its comprehensive Client+
project. This change project included the migration of the highly customized SAP CRM database into
the cloud-based, standardized, scalable Salesforce.com solution. Securex decided to deploy SAS Data
Management to facilitate that migration. Decoster explains that their reasoning proved to be spot on.
“SAS Data Management enabled us to meticulously clean the data before uploading it into our new
database. The data were normalized, duplicate entries were merged, and missing information was
automatically added wherever possible. SAS Data Management has built-in tools such as data dictionary defi-
nition, fuzzy matching, full name parsing, reliable gender determination, phone number standardiza-
tion, and e-mail address analysis that comprehensively covered all of our concerns. We have already
completed the migration of our enterprise accounts in record time and the marketing department
tells us they have virtually zero complaints about data quality. It is a huge improvement that would
have been unthinkable without SAS Data Management. We are now finalizing our self-employed and
individual accounts.”
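The sketch below gives a rough impression of the kind of clean-up steps described here: normalizing contact records, standardizing phone numbers and flagging likely duplicates with a simple fuzzy match. It is not the SAS Data Management implementation; the records, threshold and matching logic are assumptions for illustration only.

```python
from difflib import SequenceMatcher

# Rough sketch of the clean-up described above (not the SAS implementation):
# normalize contact records, standardize phone numbers and flag likely
# duplicates with a simple fuzzy match on name plus city.

def normalize(rec):
    digits = "".join(ch for ch in rec["phone"] if ch.isdigit())
    if digits.startswith("00"):
        digits = digits[2:]
    return {
        "name": " ".join(rec["name"].split()).title(),
        "city": rec["city"].strip().title(),
        "phone": "+" + digits,
    }

def similarity(a, b):
    return SequenceMatcher(None, a["name"] + a["city"], b["name"] + b["city"]).ratio()

records = [
    {"name": "jacky  decoster", "city": "brussels", "phone": "+32 2 555 01 23"},
    {"name": "Decoster, Jacky", "city": "Brussels ", "phone": "0032 2 555 01 23"},
]
clean = [normalize(r) for r in records]

if similarity(clean[0], clean[1]) > 0.6:  # threshold is an assumption
    print("probable duplicate:", clean[0]["name"], "/", clean[1]["name"])
```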
A permanent watchdog for data quality
Decoster insists, however, that improving data quality is not a one-shot affair; it must be a continuous
concern within the organization. It is one reason why Securex opted for a comprehensive approach.
Their Client+ project includes the redefinition and streamlining of marketing and sales processes. Part
of this effort is sensitizing staff about the impact of their data manipulations and insisting that they
be both careful and precise. At the same time, SAS Data Management is being set up as a permanent
watchdog for data quality. Decoster explains why: “One can never be 100% sure that every single bit
of data will be entered correctly, even when people are fully trained and sensitized. That is why we
have SAS Data Management make consistency checks and updates on a regular basis, in fact every
week. Our next step will be to implement a near real time check. Whenever someone in the organiza-
tion enters or modifies data, the changed record is automatically processed and corrected by SAS Data
Management. This is a process that takes just a couple of seconds.”
Robust architecture and great flexibility
Decoster and the Securex staff have nothing but praise for the robust architecture and great flexibility
of the SAS Data Management platform. The system can be integrated into any software environment.
For example, SAS Data Management provides direct certified connectors to a variety of systems,
including Salesforce.com. This avoids the development of customized interfaces. Furthermore, all func-
tionality is offered in stored procedures, ensuring that every transaction is safe and reliable.
SAS Data Management is also easy to deploy. “It has a powerful data profiler, which enables us to
examine any available data and assess their reliability along with the risk involved in integrating
them into new applications. We use this profiler in particular to analyze all data we purchase.” The
software also provides a powerful tool to define batch jobs to clean and normalize the data, based on
the profiling statistics and information. Decoster then added a final plus. “The learning curve for SAS
Data Management is very short: after a two day training we were able to define all of the jobs we
needed.” ■
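As an illustration of what such a profiling step reports, the following minimal sketch computes per-field fill rates and distinct counts for a small purchased dataset. It is a stand-in for the idea only; the field names are assumed and a real profiler is considerably richer.

```python
# Hedged sketch of what a simple data profiling step could report (field names
# assumed): per-field fill rate and distinct-value count give a first indication
# of how risky it is to integrate a purchased dataset.

rows = [
    {"company": "Acme NV", "vat": "BE0123456789", "email": "info@acme.be"},
    {"company": "acme nv", "vat": "",             "email": "info@acme.be"},
    {"company": "Globex",  "vat": "BE0987654321", "email": None},
]

def profile(rows):
    report = {}
    for field in rows[0]:
        values = [r[field] for r in rows if r.get(field)]
        report[field] = {
            "fill_rate": round(len(values) / len(rows), 2),
            "distinct": len({str(v).lower() for v in values}),
        }
    return report

for field, stats in profile(rows).items():
    # e.g. vat: fill_rate 0.67 -> a candidate for a cleaning or enrichment job
    print(field, stats)
```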
“Marketing says that the data quality has improved dramatically, an achievement that we previously considered impossible”
Jacky Decoster
INTERVIEW

DATA INTEGRATION EVOLVES THANKS TO BIG DATA AND OPEN SOURCE

How to deal with the explosion of data and the importance of analysis

“For me, Big Data does not exist as a volume concept.” This is a remarkable statement for a data integration expert to make. “Size is relative. It reflects where you come from.” As such, you cannot define a lower threshold for the ‘big’ in Big Data, but the phenomenon does touch on the field of data integration, which itself has practically become a commoditized specialization.
These doubts about the existence of ‘big’ data are voiced by Arturo Salazar. He is Data Management
Advisor Analytical Platform at SAS. Salazar explains how ‘big’ has a whole other meaning for a small
business than it has for a large corporation such as a financial institution. So he argues that there can
be no lower threshold for the ‘big’ in Big Data.
The Big Data trend certainly has major implications for the field of data integration, as this field is now
confronted with more data and more unknown data variables. Salazar explains that data integration
has existed for some time now and is considered almost a commodity today. However, this is not to
say that all organizations feel completely at home with data integration: the importance of using and
efficiently deploying data is not understood by all. But as a specialization it has now reached adult-
hood.
Outside the organization’s comfort zone
The ‘big’ in Big Data is indeed a relative dimension, he agrees. “Big Data is all data that falls outside
the comfort zone of an organization.” It also involves the comprehensiveness of the data sources
and, even more important, the speed with which the information can be integrated in order to derive
new and deployable insights from it. The question arises whether Big Data is related to the degree
of maturity of an organization. If the data in question is only just outside the comfort zone, isn’t it a
question of simply expanding, of growing up? “Yes, it is a question of reaching maturity,” says Salazar.
Monthly reporting is inadequate
He continues: “Businesses are currently going through a technology transformation with regard to
the way information used to be collected and used.” He mentions the striking example of how data
integration was introduced, already some years ago now. “Take the web world, for example, and logs
of web servers.” The machines recorded web page visits and click-throughs in their log files, including
the IP addresses of the origin servers, cookie data, et cetera.
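For readers who want to picture what such a log actually contains, here is a small, self-contained sketch that parses one common-log-format line and extracts the fields mentioned above. The log line and pattern are illustrative assumptions, not tied to any specific web server.

```python
import re

# Illustrative sketch only: extract the fields mentioned above (client IP address,
# requested page, status) from a single line in the common web-server log format.

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

line = '192.0.2.10 - - [01/Mar/2015:10:12:01 +0100] "GET /products/42 HTTP/1.1" 200 5123'

match = LOG_PATTERN.match(line)
if match:
    hit = match.groupdict()
    print(hit["ip"], hit["path"], hit["status"])  # who requested which page
```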
“All those clicks; that is a lot of data.” And it’s all data that could be useful. Salazar puts it into
perspective: “Most log data can actually be thrown out, but you simply don’t know which data you
should keep.” Moreover, the valuation of data on web surfing has shifted. Data that used to be erased
may now actually prove to be valuable. While it used to be of little interest which other pages on the
same website a visitor clicked to, today that data could be crucial. “Take the real-time recommenda-
tions provided by modern online retailers.” Another example is tracking surfing habits while visitors are
logged on to a site. Profiling is now a spearhead in the customer loyalty campaigns of web companies.
Growing data mountains versus storage costs
The example of logs in the web world has brought a wider awareness of the usefulness of data inte-
gration. This has in turn fed a demand to deploy data integration for a wider range of applications.
There is added value to be had from connecting the website data to ‘traditional’ information sources
such as a CRM system or a data warehouse. It has long been recognized that such a connection makes
sound business sense. Continuously evolving insights have since ensured that this is not simply limit-
ed to the one-sided import of website logs in a CRM application, for example. Efficient use of the data
requires two-way traffic and a wider scope. This means the amount of data gets bigger and bigger.
In the first instance, the rapid growth of the data that companies collect, store and correlate may
not appear to be a major problem. The capacity of storage media continues to increase, while the
price per gigabyte is being forced down. It is as if hard drives abide by their own version of Moore’s law,
the exponential growth seen in the performance of processors. However, not only is the curve for storage
capacity increasing less steeply than that of processors, the increase is also insufficient to keep ahead
of the explosive data growth.
Extracting unknown nuggets of data
An additional problem for data integration in the information explosion is the software, and more spe-
cifically database software. A very significant proportion of the looming data mountain cannot simply
be stored in a relatively expensive database or a costly data warehouse configuration. Although these
enormous mountains of data might contain gold, it is as yet unknown how much there is and where
it is. The SAS expert confirms that this is in essence a chicken and egg problem: the as yet unknown
value of the data versus the cost of finding it. But hope looms for this form of data mining. New tech-
nology is relieving the pioneers of the manual task of sieving for nuggets in the streams that flow out
of the data mountain. Nor do they have to laboriously dig mine shafts in the mountain any longer.
Going down the same road as Linux
This is where the Hadoop open source software comes into play, a cheap software solution that runs
on standard hardware and that can store and process petabytes of data. How powerful is it? Hadoop is
based on technology developed by search giant Google to index the internet. “Hadoop is going down
the same road as Linux,” explains Salazar. The market is gradually adopting it for more serious appli-
cations. “No one wants to store their logs in an expensive database.” However, a problem for many
businesses is that Hadoop is still at the beginning of the road that Linux travelled long ago. Both stem
from worlds very different from what regular businesses are used to, and both require considerable
technical knowledge, from users as well.
“Initially people were afraid of Linux too,” says Salazar. Since then, companies like Red Hat have
combined the system’s core software with business applications and offer the results as packages.
Hadoop has just started this packaging process. He points to Cloudera and Hortonworks; he thinks
these companies will do for Hadoop what Red Hat did for the adoption of Linux. “Many businesses still
consider Hadoop intimidating and too complicated,” says Salazar. They normally employ specialists for
such open source software, for installation and configuration as well as maintenance and even every-
day use. What skills are needed? Experienced programmers with coding skills and administrative
talent, alongside the knowledge and expertise normally associated with data analysts. This is a rare
and therefore expensive combination of qualities.
Bringing Hadoop to the masses
Despite its complexity, Hadoop is gaining popularity. “It offers so many advantages,” explains Sala-
zar. Business Intelligence vendor SAS is also responding to this trend. He says that the company uses
technology such as Hadoop “under the hood”. The complexity of this software is hidden within pro-
cesses and programs that the customer is familiar with. Businesses are able to focus on actually using
the tools for data integration, instead of first having to call on special experts with knowledge of the
underlying software.
In February 2015, SAS introduced a new product in its data management range to increase the
user-friendliness of Hadoop under-the-hood. Salazar explains that the new web-based application,
called SAS Data Loader for Hadoop, will make it possible to delve even deeper into the data mountain.
This application can be used to prepare and then mine the data stored in Hadoop and can be used by
data analysts and even ordinary users. Soon we will all be able to mine for gold! ■
“Although these enormous mountains of data might contain gold, it is as yet unknown how much there is and where it is”
Arturo Salazar
Arturo Salazar
DATA MANAGEMENT ADVISOR ANALYTICAL PLATFORM AT SAS
[Infographic: Data Management]
Jacorien Wouters
PROGRAMME MANAGER FOR THE NETWORK MANAGEMENT INFORMATION SYSTEM AT RIJKSWATERSTAAT
CASE
RIJKSWATERSTAAT GAINS COMPREHENSIVE INSIGHT INTO ITS PERFORMANCE
Rijkswaterstaat, the executive agency of the Dutch Ministry of Infrastructure and the Environment, is responsible for the principal highway and waterway networks and the main water system in the Netherlands. To account to the Ministry and to the lower house of the Dutch Parliament, and to manage its own operational processes, Rijkswaterstaat needs the right information at the right time and must be able to access it both internally and externally. For this purpose it developed the Network Management Information System (NIS).
Rijkswaterstaat began developing the NIS a number of years ago. The system was designed to
give an integrated insight into the performance delivered by Rijkswaterstaat, allowing tighter
control and providing a broad view of the overall achievement. In the tendering process,
Rijkswaterstaat selected SAS’ solutions because they were able to support the entire process
from source to browser. Rijkswaterstaat consequently uses SAS Business Intelligence and SAS
Data Management for the NIS.
“The NIS is now one of the most important information sources for the management of Rijkswa-
terstaat,” says Jacorien Wouters, Programme Manager for the NIS. “It has brought together the
two largest flows of information on our organization’s networks: the performances of the net-
works and data on assets such as roads and bridges, but also the Wadden Sea, for example. This
was preceded by an intensive data integration process.”
An integrated and clear view across highway and waterway networks
Better decisions
When the NIS was introduced in 2004, the data from various applications was spread across the
information systems of Rijkswaterstaat’s ten regional departments. Now, the NIS periodically obtains
data from over 40 source systems. The power of the system lies among other things in the possibility
of combining data and presenting it in charts and maps. This gives a fast and clear insight into the
performance of the individual departments and of Rijkswaterstaat as a whole. The figures in the NIS
have official status. That is very important internally, but also externally since Rijkswaterstaat reports
to the Ministry three times a year on the status of specific performance indicators, or PINs. As part of a
service level agreement, targets have been agreed for a four-year budget period.
More complex analyses
In addition to improved control, access to the information has been greatly simplified, as is clear from
the increasing number of NIS users at Rijkswaterstaat. “Information which previously could only be
obtained by a few employees from a specific data source is now available through the NIS portal to all
employees at Rijkswaterstaat,” Wouters explains.
A single version of the truth
“The insight into the underlying data helps us to operate more efficiently and hence ultimately to
save costs,” Wouters continues. “The clear reporting method also saves time. Now there is a single
version of the truth, so discussions on definitions or figures are a thing of the past. The fact that infor-
mation is more readily available in the NIS means we can also make faster, better adjustments. We
used to report on performance three times a year and matters came to light which we would have
preferred to tackle immediately. Now we can do just that.”
Developments
In implementing SAS, Rijkswaterstaat took a step towards improving data quality. It also started to use
SAS Visual Analytics. “As we simply have more insight into our data, our management can take more
forward-looking decisions,” says Wouters. “We’re making constant progress in combining information,
highlighting connections which would not previously have been visible.” ■
“The insight into the underlying data helps us to operate more efficiently and hence ultimately to save costs. The clear reporting method also saves time. Now there is a single version of the truth”
Jacorien Wouters
Ronald Damhof
INDEPENDENT CONSULTANT INFORMATION MANAGEMENT
INTERVIEW
“Make data management a live issue for discussion throughout the organization”
Independent information management consultant Ronald Damhof developed the Data Quadrant Model
The data management field is awash with jargon. Most business managers have no idea what all those terms mean, let alone how to use them in understanding the precise value of particular data and how to handle it. To allow an enterprise-wide discussion on data, Ronald Damhof developed the Data Quadrant Model.
Damhof works as an independent information management consultant for major organizations such
as Ahold, De Nederlandsche Bank, the Dutch tax authorities, Alliander, and organizations in the finan-
cial and healthcare sectors. These are data-intensive organizations which share a growing realization
that the quality of their work is increasingly determined by the quality of their data. But how do you
move from that realization to a good data strategy? A strategy which everyone in the organization
understands, from the director in the boardroom to the engineer in IT? Damhof developed a quadrant
model to make data management a live issue for discussion.
To push or pull?
Damhof starts by explaining a concept which everyone will have encountered in high school: the
‘Push Pull Point’. This concerns the extent to which demand impacts the production process. He takes
as an example the building of a luxury yacht, a process that does not start until the customer’s order
is known. The decoupling point is at the start of the production process. We can take matches as an
opposite example. If a customer wants matches, he or she goes to the supermarket and buys them.
Unless the customer wants black matches, in which case he or she is out of luck. The decoupling point is right at the end of
the production process. The production of a car, however, comprises standard parts and customized
parts. Customers can still state that they want a specific colour, leather upholstery or different wheel
rims. The decoupling point lies somewhere in the middle of the production process. “Similarly, in the
production of a report, dashboard, or analytical environment, the decoupling point lies somewhere in
that middle area,” Damhof explains.
The decoupling point divides the production process into two parts: a push and a pull side, also
referred to as a supply-driven and a demand-driven part.

[Figure: The Data Push Pull Point. Push/Supply/Source driven side: mass deployment, control over agility, repeatable and predictable processes, standardized processes, a high level of automation, relatively high IT/data expertise (all facts, fully temporal). Pull/Demand/Product driven side: piece deployment, agility over control, user-friendliness, relatively low IT expertise, domain expertise essential (truth, interpretation, context), business rules downstream.]

Push systems are aimed at achieving economies of scale as volume and demand increase, while the quality of the product and the associated
data remains guaranteed. On the other hand there are pull systems which are demand-driven. Diffe-
rent types of users want to work the data to produce ‘their’ product, their truth, on the basis of their
own expertise and context.
Opportunistic or systematic development?
On the y-axis Damhof projects the development style dimension. “By that I mean: how do you
develop an information product? You can do so systematically; the user and the developer are then
two different people and you apply defensive governance, aimed at control and compliance. This puts
into practice everything that engineers have learned in order to create software on a sound basis. You
often see this in centralized, enterprise-wide data, such as financial data and data which is reported to
regulators.” You can also use an opportunistic development style. “In that case the developer and the
user are often one and the same person. Take for example the data scientist who wants to innovate
with data, who wants to produce and test analytical models. Or situations in which speed of delivery
is essential. The governance in these cases is offensive, which means the focus is on flexibility and
adaptability.”
“A quote I have stolen from Gartner analyst Frank Buytendijk: in an average organization the car park or art collection is better managed than data”
Ronald Damhof
[Figure: The Development Style. Systematic: user and developer are separated; defensive governance with a focus on control and compliance; strong focus on non-functionals such as auditability, robustness and traceability; centralised, organisation-wide information domain; configured and controlled deployment environments (dev/tst/acc/prod). Opportunistic: user and developer are the same person or closely related; offensive governance with a focus on adaptability and agility; decentralised information domain at personal, workgroup, department or theme level; all deployment is done in production.]
Data Quadrant Model
The combination of these two dimensions produces the following picture.
“Quadrant I is where you find the hard facts,” Damhof explains. “This data can be supplied intelligibly
to quadrants II and IV in its full, raw volume. Data in quadrant I is produced by highly standardized
systems and processes, so it is entirely predictable and repeatable.”
Diagonally opposite, in quadrant IV, is data that is characterized by innovation and prototyping. “This
is the quadrant in which the data scientists work, who actually have only three demands: data,
computer power, and cool software.” Increasingly, separate departments are set up as innovation
labs giving data scientists free rein to use the data for experimentation and analysis, with the aim of
innovation. “You need this type of data management to discover and test good ideas. When a concept
works, it needs to be raised from the fourth to the second quadrant, because you can only achieve
economies of scale with data if you can generate and analyse it systematically. You can then use it
enterprise-wide.”
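One way to make the model tangible is to encode the two dimensions so that the quadrant of a data product can be named explicitly in discussions. The sketch below is a minimal rendering of the logic described above, with summarized labels; it is not an official implementation of the model.

```python
# Minimal rendering of the quadrant logic described above (labels summarized,
# not an official implementation of the model).

def quadrant(development_style, push_pull):
    systematic = development_style == "systematic"  # versus "opportunistic"
    push = push_pull == "push"                      # supply-driven versus demand-driven
    if systematic and push:
        return "I (facts, under governance)"
    if systematic and not push:
        return "II (context, information products under governance)"
    if not systematic and push:
        return "III (ad hoc sources, shadow IT)"
    return "IV (research, innovation and prototyping)"

print(quadrant("systematic", "push"))     # the foundation, e.g. regulatory data
print(quadrant("opportunistic", "pull"))  # the data scientist's sandbox
```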
“I often talk to data scientists who obtain very sound insights in a kind of sandbox environment,”
Damhof continues. “But they forget or are unable to monetize those insights in a production situation.
They cannot bring their insights from quadrant IV to quadrant II. This is where governance comes into
play.” And therein lies the major challenge for many organizations, as Damhof knows only too well.
“If you explain this model to managers and ask where their priority lies, they will all say they first
have to get their foundations in order, the first quadrant.

[Figure: A Data Deployment Quadrant. Axes: Development Style (systematic versus opportunistic) and Data Push/Pull Point (push/supply/source driven versus pull/demand/product driven). Quadrant I: facts. Quadrant II: context. Quadrant III: shadow IT, incubation, ad hoc, once off. Quadrant IV: research, innovation and prototyping/design.]

But if you ask what they are investing their money in right now, where they are innovating, it is often in the fourth quadrant. It is great that they
are engaged in this more experimental and exploratory form of data management, but that is only
possible if your foundations are right. Otherwise it is like having a hypermodern toilet that is not con-
nected to the sewer system, so it turns into a total mess.” Ask the average data scientist what takes
up most of his or her time and he or she will answer getting the data to the right qualitative level: the
aim of quadrant I. “Only a data scientist with powerful analytical software, a lot of computer power,
and high-quality data will genuinely make a difference.”
Reliability versus flexibility
“Managers insist that systems must be reliable and flexible, but these qualities are inversely related.
A highly reliable and robust system is less flexible. And in an extremely flexible system it is necessary
to lower the requirements with regard to reliability,” says Damhof. “The Data Quadrant Model makes
this clear to managers. In quadrant I reliability takes precedence over flexibility and in quadrants II
and IV flexibility takes precedence over reliability.” Quite a few different types of expertise and com-
petence are therefore required in order to make optimum use of data.
Expertise and competences
You often find that organizations require a single person to supply expertise and competences which
cover the entire quadrant. Such people do not exist. Employees in quadrant I have an engineering
profile. They are information and data engineers, trained in data architecture and data modelling.
“Note that this is not the classic IT profile. These are engineers who can carry out model-driven
development and have a solid understanding of the need for conceptual and logical modelling.” This
expertise is very scarce. Quadrants II and IV on the opposite side require people with expertise in the
respective business domain supplemented by Business Intelligence and/or analytical competences.
Facts and truth
Damhof also calls quadrant I of the model ‘the single version of the facts’. Those facts are then made
available to employees in quadrants II and IV. That enables them to create their own truths. Since the
same facts are used to create multiple truths in the right-hand half of the model – depending on the
“With organizations generating ever greater volumes of data, they can no longer be so slapdash in the way they handle it. Now is the time to make sure your data management and the associated governance are properly set up. The Data Quadrant Model helps you to achieve this”
Ronald Damhof
context and background of the data user – Damhof calls this half ‘the multiple version of the truth’.
You should bear in mind that the ‘truth’ quite often changes over time. “You often hear companies
talking about ‘the single version of the truth,’ but there is no such thing. After all, how you interpret
particular facts depends on the context, your outlook, background knowledge, and experiences.”
Quadrant III
So far, Quadrant III has received little mention, even though it is incredibly important. It is the quad-
rant of data sources which are not under governance, like an ad hoc download which you obtain from
an open data provider, a list in Excel that you want to use, or a set of verification data which you have
received on a CD. “You may even want to combine governed data from quadrant I with your own
dataset in quadrant IV, that’s fine,” says Damhof.
The journey through the quadrants
In order to get value from data, you can make various movements in the model. You can move from
fact-based data management towards a model in which the context is also important (from quadrant
I to II). “This actually is the classic journey of ‘unlock data and produce an information product,’” says
Damhof. This is often inefficient, however, because this process is based on known requirements and
wishes on the part of the user. “And the user does not really have that knowledge in advance.” Many
organizations opt for a more agile-driven form, such as from quadrant I to quadrant IV to quadrant II.
Have the employees in quadrant IV produce an information product in an iterative way using the data
in quadrant I/III. You then promote the product to quadrant II only if it is important to bring this under
management.
“People in the business world often talk about ‘the single version of the truth,’ but there is
no such thing. There is a ‘single version of the facts’ and there are multiple ‘truths’. After all, how you interpret facts depends on the
type of organization, your outlook, background knowledge, and experiences”
Ronald Damhof
It is also possible to move from quadrant III to quadrant IV. “You have your own datasets and you
want to try something? Great,” says Damhof. The only movement an organization must never make
is from quadrant III to quadrant II. “Because in that case you use data that you are not entirely sure
of, as it has not been subjected to good governance in the required way. An example is a compliance
report for the regulator which you want to produce using data which is not under governance. You
should not seek to do that.”
Make data management a live issue for discussion
In his day-to-day work Damhof finds that his Data Quadrant Model helps organizations to talk about
data management. “From my current customer, De Nederlandsche Bank, I regularly hear statements
such as, ‘I want to move this data product from quadrant IV to quadrant II;’ or, ‘We must put the data
in quadrant I first, but the submitter is really responsible for the data in quadrant I;’ or, ‘I want some
space to store data temporarily in quadrant III.’ Everyone understands what it means. That is new;
the organization has never thought about data in that way before. And that actually applies to almost
every data-intensive company. Organizations have long spoken of data as an ‘asset,’ but in practice
they handle data in a very unstructured way. As a result they never monetize that asset. With orga-
nizations generating ever greater volumes of data, they can no longer be so slapdash in the way they
handle it. Now is the time to make sure your data management is properly set up. The Data Quadrant
Model will help you to achieve this.” ■
[Figure: How we produce, process variants, plotted on the Data Quadrant Model.]
CASE
Caroline Denil
PROJECT MANAGER, BELGIAN FEDERAL PUBLIC SERVICE
IMPROVING INSIGHT INTO BELGIUM’S ECONOMIC SITUATION
Immediate access to easily comprehensible data
Vincent Vanesse
BUSINESS ANALYST, BELGIAN FEDERAL PUBLIC SERVICE
The Belgian Federal Public Service (FPS) Economy committed itself to creating a more powerful and transparent presentation of the Belgian economic situation for the general public, statisticians, and university students, among many others. Together with SAS, it created a single web portal that offers visitors direct access to the principal indicators of the Belgian economic situation.
All indicators are visualized in graphs for better comprehension and are fully customizable so that
users can immediately consult the indicators in which they are interested. The portal not only created
a user-friendly statistical environment, it also opened up new business opportunities
within other Directorates General of the Belgian federal government.
Scattered information makes research time-consuming
One of the main missions of the FPS Economy is the generation and publication of statistics and
figures characterizing the Belgian economic situation. Until recently, this information was accessible
through various sources: Statbel, be.Stat, Belgostat, and the National Bank of Belgium. In such a situ-
ation, it is difficult for students, researchers, journalists, and the many other users to find the required
information to answer their specific questions and draw accurate conclusions. Hence, the FPS Economy
initiated a project to improve the user-friendliness of economic data.
Multi-departmental collaboration improves statistics
The first goal of the project was to increase the value of information. This proved to be an intense,
but truly indispensable process bringing together FPS Economy business analysts and statisticians. The
process led to the development of graphs depicting economic information, as well as metadata that
users can consult to better understand the information being presented. “As a result, some twenty
graphs were selected and then subdivided into eight categories, including among others, energy,
gross domestic product, and consumer price index,” states Vincent Vanesse, Business Analyst at the
FPS Economy.
A single portal for all economic indicators
Next, the FPS Economy teamed up with SAS in order to make the economic indicators accessible via
a user-friendly tool. “We have been working with SAS for quite a long time now. As a result, we are
thoroughly familiar with their competence. The exceptional statistical capabilities, robustness, and
extensibility of their solutions made our choice of SAS for this particular project obvious,” notes Caro-
line Denil, Project Manager at the FPS Economy.
The collaboration resulted in the launch of a single web portal (Ecozoom) where various users can find
all of the economic indicators they need in just a few mouse clicks. “From now on, finding information
on Belgium’s economic situation is easy,” observes Denil. “The Ecozoom tool on the FPS Economy
“From now on, finding information on Belgium’s economic situation is easy. The Ecozoom tool on the FPS Economy website gives immediate access to the twenty main economic graphs”
Caroline Denil
website gives immediate access to the twenty main economic graphs. Those who want more detailed
information can still click-through to the traditional Statbel, be.Stat, and Belgostat websites.”
Visualization facilitates comprehension
The online portal presents the economic indicators as graphs that make the information much easier
to interpret quickly and accurately. Denil points out that deducing trends based on a graph is far easier
than using a table or a long series of figures.
In addition, the tool is able to visualize four graphs simultaneously. This facilitates comparisons
between various types of data to verify the magnitude of the effect of, for instance, the number of
company failures on the unemployment rate.
The old adage that “a picture is worth a thousand words” certainly holds true for the FPS Economy
confirms Denil. “Our graphs can often convey much more than a lengthy text or series of tables. In our
specific situation, the graphs certainly help users to more easily and precisely evaluate the economic
situation in Belgium.”
Customization enhances user-friendliness
Vanesse is quick to point out that the four graphs that are depicted on the Ecozoom homepage are
fully customizable. “Users can select the indicators they are most interested in and save this infor-
mation. Each time they subsequently consult the tool, they will immediately start with their desired
information.”
Opening up new opportunities
Although the Ecozoom tool has considerably increased the user-friendliness of economic data, the FPS
Economy is already looking into possibilities that will extend its user-friendliness even further. “We are
currently testing geo-visualization in order to visualize data for specific Belgian regions,” illustrates
Denil. “On top of that, we are also planning to make the tool accessible for mobile use on smart-
phones and tablets.”
The Ecozoom tool might potentially even open up new business opportunities. “The tool has gene-
rated interest among other Directorates General, up to and including the top management level. This could
intensify the collaboration between the various FPS, and even create a new type of service,” con-
cludes Denil. ■
“Users can select the indicators they are most interested in and save this information. Each time they subsequently consult the tool, they will immediately start with their desired information”
Vincent Vanesse
CASE
DSM GAINS CONTROL OF MASTER DATA

DSM introduces MDM, building on its successes with data quality

DSM is convinced of the value of good data quality. The global science-based company that operates in the field of health, nutrition and materials has already implemented data quality successfully and is building on that success with SAS Master Data Management (MDM).
MDM is a method of managing business-critical data centrally for decentralized use. Errors and dis-
crepancies in the so-called Master Data are tackled: items such as customer names, material types,
suppliers and other data used across divisions and IT systems. Consistency in that critical business data
plays a vital part in supporting efficient operations. MDM gives DSM control of the ERP systems which
it has absorbed in the wake of major acquisitions over the past few years.
From state-owned mines to chemicals and the environment
“We have 25,000 employees, a substantially higher number than five years ago due to acquisitions,”
says Bart Geurts, Manager Master Data Shared Services at DSM. Geurts cites the acquisition of Roche
Vitamins in 2003 as one of the major purchases. DSM is now the world’s largest vitamin maker, and
that involves different data requirements. “Good data quality is extremely important, for food safety
and health as well as flavour. It is less critical for bulk chemical products.” Geurts alludes to DSM’s
origins in the state-owned mines of the Netherlands. “Old businesses know that they have to reinvent
themselves in order to survive.”
DSM has reinvented itself several times, from mining to petrochemicals, and in recent years from fine
chemicals to life and material sciences. In its current form DSM focuses on emerging markets and cli-
mate & energy. Geurts cites lighter materials such as a replacement for steel in cars that reduce their
weight and make them more economical. The group also develops products that are manufactured
using enzymes rather than oil. These are different activities and different markets, so the company
has different requirements in terms of company data.
More complete organization overview
The many acquisitions involved in this transformation brought not only new activities and people, but
also many new IT systems. Geurts explains that these included a large number of ERP systems. In the
new organization, the many different IT systems were found to contain errors. Not serious errors, but
discrepancies which only came to light as a result of the combined use of data and systems.
Geurts mentions the example of a staff celebration marking the launch of the new company logo.
When sending out invitations to the company-wide event, 800 people were ‘forgotten’. This was due
to an incomplete overview in the HR environment. And he says there were more inconsistencies.
There was contamination of supplier data, for example. The same supplier may use different names in
different countries, with the result that the various systems in use in a multinational may show it as
different businesses.
“Good data quality is extremely important, for both health and safety”
Bart Geurts
Bart Geurts
MANAGER MASTER DATA SHARED SERVICES, DSM
Linking MDM to business processes
Building on previous experiences and results with data quality, DSM moved to a central MDM
approach. Geurts says the business data is good enough for transactions taking place within a par-
ticular silo, such as a country or business division. “But as soon as operations become cross-divisional,
problems are liable to emerge.” Leading ERP suppliers offer MDM solutions, Geurts says, but put too
much focus on the individual silos. That is why DSM chose the MDM solution from SAS.
Geurts stresses the importance of the link between MDM and the business processes. This link highlights
the benefit for the organization as a whole and for the individual divisions which operate efficiently
within their respective silos. Key issues are who in the organization should be the owner of the MDM
process, who plays what role, and which KPIs (key performance indicators) are used. A possible
company-wide KPI for MDM is measuring how long it takes for one customer order to be processed,
delivered and invoiced.
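By way of illustration, such a KPI can be computed directly from order records. The sketch below (field names and dates assumed) measures the elapsed days from order entry to invoicing and averages them.

```python
from datetime import date

# Hedged sketch of the company-wide KPI mentioned above (field names and dates
# assumed): elapsed days from order entry until invoicing, averaged over orders.

orders = [
    {"id": "SO-1001", "ordered": date(2015, 3, 2), "invoiced": date(2015, 3, 9)},
    {"id": "SO-1002", "ordered": date(2015, 3, 5), "invoiced": date(2015, 3, 20)},
]

lead_times = [(o["invoiced"] - o["ordered"]).days for o in orders]
print("average order-to-invoice lead time:",
      sum(lead_times) / len(lead_times), "days")  # 11.0 days
```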
Think big, act small
Establishing the MDM process and addressing the issues involved was the easiest part, according to
Geurts. He describes that as ‘devised on the sofa’. Then came the implementation phase, with the
deliberate choice of a relatively small-scale start. “We conducted a pilot in the sourcing department
based on the think big, act small precept.” The term ‘small’ needs to be put in context, however.
Worldwide, DSM has six sourcing front offices and two back offices. In this small-scale pilot, the incon-
sistencies in supplier data were tackled first. The diverse vendor data, which included duplicates, was
cleaned among other things by using different language algorithms in the SAS MDM product. “The
complexity lies in the details,” says Geurts from experience.
What data is critical for your business?
As well as tackling the contamination of supplier data, steps were taken to deal with other master
data components. The offices concerned were asked to state which data was critical to their business.
“Because we couldn’t analyse all the data in the business.” That would be too large an operation and
place too heavy a burden on those systems. By answering the question of which data was critical, the
gap between the MDM initiative and the involved business units was bridged. After all, they them-
selves specified the selection of data that is crucial for their own processes. Such a selection is neces-
sary due to the range of master data. DSM defines master data as everything that underlies processes
and transactions. At first sight, any error can cause inefficiency, but the extent to which it actually
does so depends on the type of data. “If the telephone number of a supplier’s salesperson is incorrect,
you may be able to send an email instead,” Geurts explains. But no such back-up is available if a bank
account number or a supplier’s address is incorrect.
Preventing errors
On the basis of the data which the units defined as critical, data rules were drawn up. That took
around six months, after which the implementation was completed in around three weeks. A clear
benefit which MDM has delivered for DSM is the avoidance of errors. Geurts cites the example of an
order entered in the name of the wrong department. DSM is also introducing an improvement in the
inputting of supplier data, as people sometimes make errors when searching for an existing supplier
or entering a new one. If the search is unsuccessful, a new entry is created that is essentially a dupli-
cate. An algorithm is now linked to the data input, which checks and then asks the person who enters
the data: “Is this the supplier you are looking for?”
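The sketch below illustrates the idea of such an entry-time check with a simple fuzzy lookup against existing supplier names. It is not DSM's actual algorithm; the supplier names and the matching threshold are assumptions.

```python
from difflib import get_close_matches

# Illustrative sketch (not DSM's actual algorithm): before a new supplier record
# is created, suggest existing entries that resemble the name being entered.

existing_suppliers = ["Acme Packaging GmbH", "Van den Berg Logistics", "Nordic Resins AB"]

def suggest(new_name):
    # get_close_matches performs a simple fuzzy comparison against known names.
    return get_close_matches(new_name, existing_suppliers, n=3, cutoff=0.6)

candidates = suggest("Acme Packaging")
if candidates:
    print("Is this the supplier you are looking for?", candidates)
else:
    print("No similar supplier found; creating a new entry.")
```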
Master Data Management is a continuous process
The above advantages of MDM relate to internal matters such as the staffing overview, supplier data
deduplication and error prevention. But MDM also offers external benefits for DSM. “Suppose there’s
an error in a product or in material. We want to know immediately which products are affected.”
Speed is of the essence in such cases. It is also important to continue the data checks after the initial
MDM implementation. “Keep on checking! Otherwise you’ll have new problems two or three months
down the line,” warns Geurts. After all, MDM is a continuous process that remains active to prevent
new errors that would have to be fixed later. “You don’t want that, because it would disrupt your
business process.” Making sure that all the relevant people in the organization understand this is
instrumental in ensuring success. ■
INTERVIEW
WHO IS YOUR DATA GOVERNOR?
How Data Governance can facilitate future data mining
An outsider could conclude that data quality and data governance amount to the same thing. This is
not the case, however, even though there is a strong relationship between the two data disciplines.
“Data quality involves the implementation of quality rules; data governance goes much further,”
explains Bas Dudink, Data Management Expert at SAS Netherlands. “Who is responsible for the quality
of data? Which responsibilities are involved? What agreements have been made?”
Where does your data come from?
Data quality concerns the accuracy of postal addresses and databases, for example. In order to ensure
lasting quality improvements in this area, agreements will need to be made and enforced. As such,
data quality can be implemented as a component of data governance, but the two are not inextrica-
bly linked. Data governance can come from various directions.
It may be based on a wider need felt by the organization, or it could be required by legislation and
regulations. Dudink gives the Basel agreement and the ensuing regulations for financial institutions as
an example. Banks are now required to answer the question: “Where does your data come from?” In
practice, the same applies to factories, which apply or are required to apply certain standards for the
materials they use.
Metadata, file formats and standards
Data governance goes further than the origin of data. It also encompasses metadata and standards
for the formats in which data are delivered, processed and passed on to other organizations, including
external partners, internal departments, as well as future customers. It could apply to information
applications that are presently unknown or unseen.
As such, data governance transcends departments and business processes. Data management is now
still often encapsulated in the silo of a particular business activity or division. The use of the data is
based on the current state of affairs. “The primary, everyday processes usually work just fine,” Dudink
describes the practical situation. However, modifications have to be made to enable new activities or
company-wide functions such as risk management.
Management consultancy rather than technology
Good data governance mainly concerns non-IT issues such as business processes, organizational pro-
cedures, working agreements and the enforcement thereof. There is a security component too: who
has access to which data and how is it protected? “It’s more like management consultancy: drafting
procedures,” the SAS expert explains. He estimates the technology component to be a modest 10 to
20 percent of the total work of a data governance project.
INTERVIEW
The world of data goes much further than simply collecting data and using it to make money. Only good quality data can make money, and good quality data often entails good management. Not traditional IT management, but data governance. Do you already have a data governance strategy in place? Who is your data governor?
The key question is how an organization treats its data, for example in order to manage risks. This is
a matter that needs to be arranged for the whole organization, not just a single activity or division.
Good data governance takes account of the future, so that any future customers can easily be sup-
plied using consistent data, for example for new partnerships or new business activities.
It would seem logical to link this to data quality and this can indeed produce longer term advantages.
However, solely improving data quality without involving data governance can reduce the project to
an incidental, short-term improvement. It will then turn into an operation that either does not pro-
duce a lasting result, or one that has to be constantly repeated to get a result. A data governor can
prevent this from happening.
Taking shortcuts to achieve KPIs
If an employee, supervisor or department is held accountable for certain targets then they will obvi-
ously focus on these. If these targets involve responsibility for an activity, but no accountability, then
obviously this activity will not be given priority when things are busy or resources are limited. Such
logical business and career decisions are not always good for the company.
Take the example of a call center whose KPI was the customers' waiting time in the queue. In order to keep that score looking good during a particularly busy Christmas period, they decided to skip a number of fields in the CRM system, 'only until things have calmed down again'. Is this a smart shortcut to get
results, or a lax attitude to the company’s business processes?
If no consequences are attached to this behavior, there is a risk that this measure will become perma-
nent sooner or later. After all, it saves time and so boosts the performance of the relevant department
and its manager. Even if this somber scenario does not come to pass, the damage has already been
done. Because who is going to check and possibly enter the missing data afterwards? This means
there is a gap; nuggets are missing from the data treasure chest. Good data governance would have
prevented the organization from making this error in its own call center.
Address errors upstream
Another practical example is the invoicing and payment process of a healthcare institution. There
was a long delay between the receipt of the invoice and payment, and this period was getting even
longer. An investigation revealed the cause: the invoices contained erroneous transactions. The health
insurer exposed these errors, and so the healthcare institution became embroiled in the time-con-
suming task of repairing them. Every invoice had to be double-checked, which substantially delayed
payment.
The institution decided to tackle the problem upstream rather than fixing the erroneous invoices
downstream, after they had already been generated. Now every patient-related medical procedure
is subject to various data quality tests, and the healthcare professional that enters the data is given
direct feedback so they can fix any mistakes immediately. The result is that the payment term has
been reduced considerably. An additional benefit is that management reports, analyses and the
relationship with the health insurer have all improved.
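The principle of giving the data entrant direct feedback can be pictured with a small sketch. The field names, formats and thresholds below are illustrative assumptions, not details from the healthcare case; the point is only that a record is checked against quality rules the moment it is entered, before it can reach the invoicing process.

```python
# Illustrative sketch only: field-level checks applied at the moment a medical
# procedure record is entered, so the person entering the data gets immediate
# feedback. Field names and rules are hypothetical, not taken from the case.
import re
from datetime import date

RULES = {
    "patient_id":     lambda v: bool(re.fullmatch(r"\d{8}", str(v))),
    "procedure_code": lambda v: bool(re.fullmatch(r"[A-Z]{2}\d{4}", str(v))),
    "treatment_date": lambda v: isinstance(v, date) and v <= date.today(),
    "amount_eur":     lambda v: isinstance(v, (int, float)) and 0 < v < 100_000,
}

def validate(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    for field, check in RULES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not check(record[field]):
            problems.append(f"invalid value for {field}: {record[field]!r}")
    return problems

# Example: feedback is given before the record reaches the invoicing process.
entry = {"patient_id": "1234567", "procedure_code": "AB1234",
         "treatment_date": date(2015, 3, 2), "amount_eur": 480.0}
for problem in validate(entry):
    print("Please correct:", problem)   # flags the 7-digit patient_id
```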
Get more than your money back
It's actually a childishly simple, universal principle: if you invest in improvements upstream, you will
get more than your money back downstream. But if this is so obvious, why is it so rarely applied to data
management? Why do we start by building complex data warehouses as purification plants, rather
than purifying the water at the source?
Data governance can prevent gaps occurring in the data treasure chest and the resulting time-con-
suming task of repairing the errors. So how should you implement data governance? The implemen-
tation of data governance is like a tap dance, says Dudink. Step one is generally data quality, because
this is the source of the problem. But even before that first step, awareness needs to be raised within
the organization. If the awareness is there, and data quality is improved, then you still require insight
into the broader perspective. “Data quality is a step-by-step process,” continues the SAS consultant. A
holistic approach to data governance is recommended. It is, after all, about so much more than ‘sim-
ply’ IT and data management.
The timeliness of value
So data governance is not easy to implement. "It's a complex process," says Dudink. It's theory ver-
sus practice. In theory, data is always important; both in terms of quality and the control thereof. In
practice, this importance is not always reflected throughout the organization. It has not always been
identified who is the ‘governor’ of which data. This does not only concern the responsibility for the
data, but also the value that is attached to it.
Or rather: whether the value of the data is recognized in time. “Imagine your roof has a leak while
the sun is shining,” Dudink explains the timeliness aspect. The leak becomes a problem when it starts
raining, but by that time it’s too late. Data governance is a complex affair; a company-wide operation
that transcends IT and affects matters such as business operations, the corporate culture and Human
Resource Management. But in the end, good data governance offers major company-wide advantag-
es in today’s data-driven world. ■
“If you invest in improvements upstream, you will get more than your money back downstream. But if this is so obvious, why do we start by building complex data warehouses as purification plants, rather than purifying the
water at the source?” Bas Dudink
COLUMN
THE NEW IT
Jill Dyché
VICE PRESIDENT, SAS BEST PRACTICES
Someone once classified the world into two types of people: those who like categorizing people into two types and those who don’t. I used to be one of those people, the kind that saw executives as either business-focused or technology-focused.
I’ve noticed other people have this tendency too. It doesn’t matter whether I’m talking to clients about analytics, CRM, data, or digital, the question always comes up: “Who should own that, the business or IT?”
The question of ownership pervades the day-to-day at companies worldwide. It seems everyone is focused on everyone else—who should own it? Who should manage it? Who should take credit for it? Who should fund it?
But in watching the companies that were effective at bridging the proverbial business-IT divide, I noticed three common traits:
» The successful companies had leaders who realized that appointing people or changing organizational structures wasn’t enough. New ways of doing business were key, and new processes needed to be practiced in order to ensure change adoption.
» These companies met their cultures where they were, working
within the strictures of top-down or bottom-up and ensuring that
these new processes and rules of engagement were new enough
to be compelling but not so disruptive that they would encourage
inertia or sabotage.
» Leaders at these companies didn’t embrace these changes for
their own sake. Rather they were (and are) considering how
trends like digital business are forcing fresh approaches to long-
standing business functions.
Using the trend of the digital business and innovation as the key drivers for make-or-break changes to
IT, I wrote about practices that successful leaders have embraced to not only transform IT, but to le-
verage technology in new ways for business benefit. ‘The New IT: How Business Leaders are Enabling
Strategy in the Digital Age’ features change agents who have emerged from the trenches to tell their
stories.
What I’ve learned from these leaders is what I write about in the book, including:
» If your IT only has two speeds, you’re in big trouble.
» The question “What type of CIO are you?” misses the point. The real question is, “What type of
organization are you leading, and what should it look like?”
» Collaborating by getting everyone in a room isn’t good enough anymore. (In fact, it’s dange-
rous.)
» Corporate strategy and IT strategy can be aligned on one page.
» Hierarchy is being replaced with holacracy, homogeneity with diversity.
» Innovation shouldn’t be run by an elite SWAT team in a separate building with sushi lunches and
ergonomic desk chairs. Everyone should be invited to innovate!
» More people are talking about digital than doing it. Except maybe for you, if you can circum-
scribe digital delivery.
» You don’t have to be in Silicon Valley to join the revolution. In fact you might not want to be!
The leaders profiled in ‘The New IT’ - including leaders from Medtronic, Union Bank, Men’s Wear-
house, Swedish Health, Principal Financial, and Brooks Brothers, to name a few - have shown that it’s
no longer about business versus IT. Rather, it’s about business enabling IT. And vice versa. ■
It’s no longer about business versus IT. Rather, it’s about business enabling IT
CASE
CRÉDITO Y CAUCIÓN ADDS DATA QUALITY TO ITS MANAGEMENT MODEL
Credit insurance company integrates quality ratios in risk assessment
The ability to assess the risk that invoices are not paid by the customer is of vital importance to credit insurance companies. But what if you cannot rely on accurate information? At Crédito y Caución, data quality spearheads the implementation of long-term strategies. A look in their kitchen.
Crédito y Caución is the leading domestic and export credit insurance company in Spain and has held this position since its founding in 1929. With a market share in Spain of nearly 60 percent, for over 80 years the company has contributed to the growth of businesses, protecting them from payment risks associated with credit sales of goods and services. Since 2008 Crédito y Caución has been the operator of the Atradius Group in Spain, Portugal and Brazil.
Crédito y Caución’s insurance policies guarantee that its clients will be paid for invoices issued during their trading operations. The corporate risk analysis system offered by Crédito y Caución processes more than 100 million company records, updated on an ongoing basis. It carries out continuous monitoring of the solvency performance of the insureds’ client portfolio. Its systems study more than 10,000 credit transactions every day, setting a solvency limit for the client which is binding on the company. To determine that risk, Crédito y Caución requires comprehensive and accurate information on its clients’ clients. The efficiency of the service that Crédito y Caución provides to its clients largely depends on the quality of the data contained in that information.
Adapting to new regulations
Like all European insurance companies, Crédito y Caución must comply with the risk-based superviso-
ry framework for the insurance industry. The framework consists of the Solvency II Directive and its
Implementing Technical Standards and Guidelines. Besides financial requirements, Solvency II includes
requirements on the quality of the data handled by insurance companies. Under Solvency II, the accura-
cy of the information is no longer optional. Data quality is essential for decision-making and for certify-
ing compliance with the requirements of the new regulatory framework.
Crédito y Caución has approached its Solvency II compliance by pursuing a strategic vision that reaches
far beyond the contents of the EU directive. “Information is our greatest asset,” says Miguel Angel
Serantes, IT Development Manager at Crédito y Caución. “We are experts in locating it, storing it and
analysing it, as well as obtaining business intelligence from this information for our activities. The chal-
lenge posed by Solvency II created the opportunity to incorporate quality ratios into the information
management and to integrate these into our procedures. We do not simply meet the requirements,
but rather we are committed to instilling the highest quality in all our data management operations.
We have transformed Solvency II into a company process, integrating it into our business intelligence
environment.”
The first step: assessing data quality
The first step in taking on this challenge required performing an assessment of the quality of the data
handled by Crédito y Caución. “We took advantage of the assessment option provided by the SAS
solution, to perform an analysis of the quality of our own data,” adds Serantes. “The results showed us
that there was still a way to go. Keep in mind that much of our data such as company names, phone
numbers and tax codes come from public, third-party sources with varying degrees of inaccuracy. From
there we decided to design and implement a data management quality model and opted for SAS,
which we integrated into our management system.”
Miguel Ángel Serantes and his team developed the foundations for the data management policy of the
company, by establishing the essential criteria to be met by the data managed: it had to be accurate,
complete and appropriate to the operations of the company. They determined the various data owner
levels that would be responsible for its content, definition, use and administration. They established
compliance ratios for each category of data, so that it would be possible, through a system of indica-
tors, to obtain an immediate overview of the quality level of each piece of data.
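The idea of compliance ratios and indicators can be sketched in a few lines: for each category of data, count the share of records that pass its quality rule. The fields, rules and sample records below are hypothetical; Crédito y Caución’s actual model is implemented with SAS Data Management.

```python
# A minimal sketch of compliance ratios per data category: for each field,
# the share of records that pass its quality rule gives an immediate indicator
# of that field's quality level. Fields, rules and records are illustrative.
records = [
    {"company_name": "ACME SL", "tax_code": "B1234567H", "phone": "+34911234567"},
    {"company_name": "",        "tax_code": "B1234567H", "phone": None},
    {"company_name": "Foo SA",  "tax_code": "invalid",   "phone": "+34600000000"},
]

rules = {
    "company_name": lambda v: bool(v and v.strip()),
    "tax_code":     lambda v: bool(v) and len(v) == 9 and v[0].isalpha(),
    "phone":        lambda v: bool(v) and v.startswith("+34"),
}

for field, rule in rules.items():
    passed = sum(1 for r in records if rule(r.get(field)))
    ratio = passed / len(records)
    print(f"{field:13s} compliance ratio: {ratio:.0%}")
```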
A constantly evolving process
“We decided on SAS for various reasons,” says Serantes. “SAS has been our information management
solutions provider for many years. The relationship is very smooth. They have a resident team at
Crédito y Caución that works closely with our IT department. All of this has aided us in the efficient
integration of SAS Data Management into our information management system. It is a solution that
fits our needs. It makes it possible to set criteria and attributes to define the quality of data; it has
options for its assessment; it identifies problems with quality and helps to resolve inaccuracies. The
solution aids the implementation of long-term strategies and enables permanent monitoring of the
quality of our data.”
The deployment of the data quality control system at Crédito y Caución took around a year. Although,
as Serantes says, it is a never-ending process. Data quality control is constantly evolving. The advanta-
ges and benefits of this strategy and the technology solution implemented have been obvious from
the outset. “For starters, we have a data policy that is well-defined and known throughout the compa-
ny. We know what to do with the data and who is responsible for each area of data. We know where
the weaknesses are in the data and how to correct them. In addition, SAS gives us information on the
cause of inaccuracies in the data. We expect to obtain further qualitative benefits, such as the definition
of quality objectives for each piece of data. This allows us to focus on those controls that are relevant to
the business.”
The aim of Crédito y Caución is to achieve 100 percent quality in the management of the data gene-
rated by the company itself, over which it is able to establish rigorous controls. For data from external
sources over which Crédito y Caución has no direct control, the aim is to establish standardization crite-
ria in order to achieve maximum quality. ■
Miguel Angel Serantes
IT DEVELOPMENT MANAGER AT CRÉDITO Y CAUCIÓN
“Information is our most important asset”
Miguel Angel Serantes
INTERVIEW
DATA DEPENDENT ON QUALITY IMPROVEMENTS
Data usefulness hinges on quality
Time is money, and nowadays data is money too. Data is worth its weight in gold, so it’s odd that the quality of data is so often overlooked. In light of the importance of data today, including metadata and Big Data, you would think that data quality should no longer be an issue. Surely this is self-explanatory? “You would think so,” replies Bas Dudink, Data Management Expert with SAS Netherlands. “But data quality is poor by definition.”
Dudink knows why the data that businesses collect and use is often of such poor quality. It is because
everybody uses the same data. This may sound contradictory, but it’s not. On the one hand, it means
that poor quality data is continually used and reused, complete with errors, omissions and a lack of
context. Dudink is adamant on the latter point: “Data without a context is useless.”
On the other hand, using the same data for everybody entails that the same data source is used for
different uses and purposes. This may sound like a sensible approach for efficient data administration
and entry, but different applications require different quality levels of different data fields. This concerns IT applications and systems, as well as the use of data in various business activities.
The devil is in the details
If you are sending an order to a customer you will need address details but no bank account number. If you are paying invoices, it’s the other way around. This is a simple example, with data fields that are relatively recognizable and carefully administrated – or, in any case, they should be. It becomes
more complicated when smaller details are involved that can still have large consequences. Think of
deliveries between companies with several divisions, whereby the business units buy and sell services from each other. Subtle sales and/or purchasing discrepancies can result in major variations. Does
the one division or subsidiary offer a special discount, or does it have different delivery and payment
conditions? The data on prices, conditions and terms can vary and, as such, represent poor quality.
The danger of ignorance
Dudink puts his bold statement, that data quality is poor by definition, into perspective: “It’s all in the
eye of the beholder. What’s good enough for one person may not be good enough for someone else.
It’s about awareness.” The fact that the data are not in good order is not the worst part. The problem
is being unaware of this. Dudink quotes a well-known saying: “Ignorance is bliss.” The biggest danger is when you think your data are accurate and up to date and take action and make decisions on this basis. An additional problem is that many companies mistakenly think that they are not data companies. In fact, these days, every company is a data company. This even applies to the seemingly simple case of a production company that processes physical goods and turns them into new products.
Both its manufacturing process and production line design are based on data!
More of a business issue than an IT matter
Design is usually followed by optimization on the basis of practical experience. The data accrued along the way will be important if the company wants to open a new factory or optimize an existing production facility. Repeatability and improvement, summarizes Dudink, are core actions that depend on
good data.
And good data leads to good business processes. Only then is automation at the business level possible. This requires awareness of data quality within the organization. “They have to know that there
is something they don’t know.” IT suppliers that offer data solutions are thus more involved in the
consultancy business than they are in software. “We don’t only provide software, we also provide
advice.”
More than just rubbing out errors
Still, the early stages of data improvement have little to do with tools and systems. It is mainly a
question of analysis: what data is concerned, how did the company acquire it, who provided it and
how? These are quite fundamental questions that could affect established processes and procedures,
or even partners. For who dares to openly complain that a supplier or customer has provided poor
data? Yet this is quite often the case. “If the data is erroneous, then it’s not sufficient to just fix the
errors. You need to comb through the entire process: where was the error generated?” All too often,
data quality is seen as a way to rub out errors. This is sometimes egocentric, stemming from an
organization’s wish to sell itself as a well-oiled machine. Dudink calls this “keeping up appearances”,
but the right approach to data quality goes much further and much deeper.
The need for a champion
Data quality is not very popular in practice. The shining examples in data quality improvement that
Dudink can think of have been forced into it. “Thanks to regulations. If data quality is not compulsory,
then most people would rather leave things as they are.” There is a concrete argument behind the
rules and laws that make better data compulsory: risk management. It will be no surprise that banks
and financial institutions, including insurers, are among the frontrunners.
Without the regulatory stick, an organization needs to have a data quality champion. “Someone who
can see what information is going to waste and who sees the added value in doing something about
it.” The difficulty is that such a person has to have a broad overview of the organization as well as the
insight and power to implement the change. “Preferably a CIO,” Dudink summarizes succinctly.
Basic tip: start small
Despite the necessity of having someone in a senior management position to push through data
quality improvements, it is still advisable to start with a small project. ‘Small’ is, of course, a relative
term, depending on the size of the organization. Dudink recommends that the first project is used to
germinate the idea of data quality improvement; as a breeding ground and learning school. The highly placed manager that initiates the project will need to combine a top-down vision with a bottom-up
approach. This combination is required to achieve the desired improvement in business data quality.
“I think CIOs will see this too.” ■
“What’s good enough for one person may not be good enough for someone else. It’s about awareness”
Bas Dudink
INTERVIEW
DATA-DRIVEN DECISIONS MAKE THE DIFFERENCE
Analysing data streams for operational decision-making
Fast, faster, fastest; that is the motto of these modern times. IT has made things faster, but now the flood of data is threatening to inundate us. Although thorough data analysis can be time-consuming, Event Stream Processing (ESP) and Decision Management provide a way to make it faster and more efficient.
We are generating, storing and combining more and more data and we want to conduct ever more
complex data analyses. “In certain environments, the sheer volumes of data can be overwhelming,”
says Andrew Pease, Principal Business Solutions Manager with SAS Belgium’s Analytical Platform
Center of Excellence. “In some cases, it may not be practical to store it all, despite the declining costs
of data storage.” And there is more than just the storage costs at stake. Analysis costs can be pro-
hibitive as well, especially if the analysis takes place retrospectively. In some cases, the analysis will
come too late to be of any use.
The solution is to filter real-time data by means of Event Stream Processing and then apply an auto-
mated decision-making process using Decision Management functionality. This automation is carried
out on the basis of business processes which have a built-in link to the tactical and strategic deci-
sion-making levels. Decisions that affect the work floor or even direct interfaces with the customer
can now be taken automatically while keeping in line with the organization’s strategic goals.
Trends and anomalies on drilling platforms
Pease provides an example where ESP can be of use: sensor data input on drilling platforms. Some
of these sensors detect vibrations, which can be used to identify anomalies. Anomalies can point to
problems, so it is important to get the results of the data analysis fast.
A key aspect here is focus and scope. “You need to focus on what you want to know beforehand.” The
SAS expert explains that it would be pointless to analyse the vibrations every second or every
minute. A fixed time range per hour may be sufficient. The main thing is that trends and anomalies
are identified, so that prognoses can be made. ESP can process both scoring code generated by powerful predictive analytics as well as expert-written business rules to allow for these kinds of analyses to determine if pre-emptive maintenance is warranted.
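As an illustration of the streaming idea (and not of the SAS Event Stream Processing API itself), the sketch below keeps a sliding window of recent vibration readings and flags a new reading that deviates strongly from the window. The window size, threshold and simulated feed are assumptions chosen for the example.

```python
# Simplified sketch: keep only a sliding window of recent vibration readings,
# score each new reading against the window, and raise a maintenance flag when
# it deviates too far. Window size and threshold are illustrative assumptions.
from collections import deque
from statistics import mean, stdev

window = deque(maxlen=360)          # e.g. the last hour of 10-second readings

def on_reading(value: float) -> None:
    if len(window) >= 30:           # need some history before scoring
        mu, sigma = mean(window), stdev(window)
        if sigma > 0 and abs(value - mu) > 4 * sigma:
            print(f"anomaly: {value:.2f} (window mean {mu:.2f}) -> check pre-emptive maintenance")
    window.append(value)

# Feeding a simulated stream: a spike stands out against normal vibration levels.
for v in [1.0, 1.1, 0.9, 1.05] * 10 + [6.0]:
    on_reading(v)
```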
Mobile plans and extra cell towers
Another example is the use of call data by telecom operators. “Telecom operators sit on huge stock-
piles of data,” explains Pease. Each telephone conversation involves some 20 to 30 call detail records.
“But they aren’t all important for analysis.” However, the analysis of the right pieces of data can
reveal a lot of important information. An obvious example is data on the timing of a customer’s calls, which can be used to offer a better plan. Calling behaviour need not be tracked extremely
closely; it will usually be sufficient to identify the general patterns. “If the customer pays too much for
too long, the risk that he or she switches to a new operator will increase.”
Further, such analysis can indicate gaps in network coverage. The operator may also decide to install
extra cell towers on the basis of the number of dropped calls. By analysing the dropped calls, the
telecom operator can accurately identify the gaps in coverage or capacity. This means they can erect
an extra cell tower in the exact location where the most dropped calls are.
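A much-simplified sketch of that analysis: count dropped calls per cell and rank the results, so the worst locations surface as candidates for an extra tower. The record layout and identifiers are hypothetical; real call detail records contain far more fields.

```python
# Sketch of the coverage-gap idea: count dropped calls per cell; the cells with
# the most drops are candidate locations for an extra tower. Record layout and
# cell identifiers are illustrative assumptions.
from collections import Counter

call_detail_records = [
    {"cell_id": "AMS-101", "dropped": False},
    {"cell_id": "AMS-104", "dropped": True},
    {"cell_id": "AMS-104", "dropped": True},
    {"cell_id": "RTM-202", "dropped": True},
]

dropped = Counter()
for cdr in call_detail_records:
    if cdr["dropped"]:
        dropped[cdr["cell_id"]] += 1

for cell, n in dropped.most_common(3):
    print(f"{cell}: {n} dropped calls")   # AMS-104 tops the list
```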
Event Stream Processing
The added value of Event Stream Processing is the ability to monitor and analyze large quantities of, for example, transaction or sensor data in real time, which results in immediate reactions to specific situations or the required intervention.
Catching stock exchange swindlers red-handed
The financial world works at a faster pace than most other sectors. Stock trading is automated and
fast as lightning. In this fast-paced world, ESP can be used to detect stock market fraud early on, says
Pease. The speed of the trading systems makes it impossible to store all the data that goes through
them, let alone analyse all these data streams.
“The trick is to filter out the anomalies; the suspicious transactions.” Pease mentions a book that was
released this year called ‘Flash Boys’ which is about the so-called flash traders or high-frequency
traders. Author Michael Lewis paints a very sombre picture of the international money markets,
whereby the speed of data is of the essence. The faster the data, the better the price of a purchase or
sale. Although this nonfiction book has also been criticized, it does contain many interesting lessons
and truths.
Pease relates the story of a Canadian trader’s sly trick. He split up his share trading activities and
found a second server that was some two kilometres closer to the stock exchange. He then set up
a perfectly timed delay so that his various orders arrived at different clearing houses at exactly the
same time. This enabled him to trade in bulk while ensuring that his sales did not lead to a lower
stock price and hence lower yields. Cunning and almost undetectable.
Keeping up with fast flowing data streams
“These days, some data streams flow so fast that it is impossible to analyse all the data they contain,”
says Pease. This is because it takes time to process the data. Just as falling storage costs do not com-
pletely compensate for the data explosion, the advances in computing power cannot always keep up
with the analysis requirements. The cunning share trading trick using server timing is almost impos-
sible to detect, unless ESP is deployed. This is because ESP turns fast streams of data into reasonably
bite-size chunks.
There is a rising demand for fast, real-time analysis of data. This is because of the new applications,
increasing numbers of sensors that collect data and an increasing range of business models that all
depend on reliable data analysis. Where the quality of the analysis used to be paramount, speed has
become equally important. The developments affect the financial and telecoms sectors and the manu-
facturing industry the most, Pease explains.
Decision Management
Decision Management aims to improve organizations by enabling faster, smarter (operational) decisions. This can be done by utilizing all available data in combination with automated application of analytical models and derived business rules, weighing the time required and possible risks.
Beer brewing and the right cheese with your wine
Manufacturing companies are also installing more and more sensors in their production environments.
This is obviously an advantage for high-grade manufacturing of technologically complex products, but
other applications are being found as well. Beer brewers, for example, use sensors to detect wet hops
so the processing of the raw material can be timed on the basis of the sensor readings. Moreover, it
also enables them to schedule maintenance of the kettles more efficiently.
ESP is also implemented in other sectors, such as retail. The same applies to Decision Management.
Pease points to the interesting opportunities for scanner use by supermarket customers. If a customer
puts a bottle of wine in his trolley, then why not suggest a cheese to pair with this wine? The super-
market could even offer a discount to make it more attractive. The IT system knows that there is cur-
rently a surplus of cheese that will spoil within a given number of days, so it’s worthwhile selling it at a lower price. This does require a reliable stock inventory system, including information on use-by dates. It is
not the terabytes that count, but the speed: this all has to happen before the customer reaches the
checkout. “You need to make the right offer at the right time.”
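As a minimal sketch of such a real-time offer decision, the example below checks the proposed cheese discount against a small set of centrally administered business rules before the customer reaches the checkout. The rule names, values and stock details are assumptions for illustration, not SAS Decision Management functionality.

```python
# Minimal sketch of the principle that an operational offer is checked against
# centrally administered business rules before it reaches the customer.
# Rule values and the cheese/wine scenario parameters are illustrative assumptions.
from datetime import date, timedelta

BUSINESS_RULES = {
    "max_discount_pct": 15,                       # set centrally, applies everywhere
    "min_margin_pct": 5,
    "only_offer_stock_expiring_within_days": 10,  # push stock that would otherwise spoil
}

def approve_offer(discount_pct: float, margin_pct: float, expiry: date) -> bool:
    """Apply the central rules; operational systems may propose, the rules decide."""
    return (
        discount_pct <= BUSINESS_RULES["max_discount_pct"]
        and margin_pct >= BUSINESS_RULES["min_margin_pct"]
        and expiry <= date.today() + timedelta(days=BUSINESS_RULES["only_offer_stock_expiring_within_days"])
    )

# Customer puts a bottle of wine in the trolley; the system proposes a cheese discount.
print(approve_offer(discount_pct=10, margin_pct=8, expiry=date.today() + timedelta(days=4)))  # True
```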
The business strategy is paramount
The offer to the customer must also harmonize with the organization’s business strategy. The core
of Decision Management is to centralize and standardize this strategy. Every decision must concur
with the business strategy. Is a discount offered by the marketing department in line with the
policy of the risk management department? If a customer buys in bulk, then the discount could be
increased. This also has to do with the longer term value that such a customer can have for the
company. Clearly defined business rules are required if this is to be managed adequately. “The
business rules need to be entered and administered from a single central location.” The rules also
need to be consistently applied across the organization’s systems and activities. This is the only way
to guarantee that the organization’s various silos can work as a whole in order to facilitate effective
Decision Management. ■
“Where the quality of the analysis used to be paramount, speed has become equally important”
Andrew Pease
CASE
MASTER DATA MANAGEMENT AS A FOUNDATION OF YOUR BUSINESS
Getting a grip on your data
Data can seem deceptively simple. Take a basic data entity such as a supplier: the company name may vary per country, business process and/or activity. Master Data Management (MDM) is used to resolve these discrepancies and can prevent a lot of unnecessary work and follow-up.
It seems only logical that supplier data are accurate and up to date. After all, suppliers are important
business partners; goods and services are received from them and payments are made to them. But
still the data on such entities often leaves much to be desired. Alongside simple data entry errors and
data corruption, there is another cause of data disparities: data entry differences, potentially across
different systems.
Supplier A = Supplier 1 (= B2)
One business system may refer to ‘Supplier A’, while a back-end application may use the descriptor
‘Supplier 1’. Automatically finding and replacing ‘A’ with a ‘1’ or vice versa seems the obvious solu-
tion. The problem is that this can lead to unwanted side effects, such as references in the relevant
systems or from other applications which then cease to work.
Then there is the additional risk of historical discrepancies occurring. For example, an invoice sent by
Supplier A has gone missing. The reason is that it has been renamed. In theory, an interlayer could
exist that has monitored the renaming process and is able to trace it back to its origin. This leaves the
question of the impact on the system performance, because Supplier A and 1 is really too simple an
example.
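One common way out of this dilemma, sketched below under assumed system names and identifiers, is a cross-reference: each source system keeps its own label, and a mapping links every (system, local id) pair to one golden record, so nothing has to be renamed and history stays intact. This is an illustrative pattern, not the specific SAS implementation.

```python
# Sketch of a cross-reference pattern: instead of renaming 'Supplier A' to
# 'Supplier 1' in the source systems (which breaks references and history),
# a mapping links each source-system identifier to one master record.
# System names and identifiers are illustrative only.
master_suppliers = {
    "M-0001": {"name": "Acme Industrial Supplies BV"},
}
cross_reference = {
    ("ERP_NL", "Supplier A"): "M-0001",
    ("ERP_US", "Supplier 1"): "M-0001",
    ("ERP_BR", "B2"):         "M-0001",
}

def resolve(system: str, local_id: str) -> dict:
    """Look up the golden record without touching the local identifier."""
    return master_suppliers[cross_reference[(system, local_id)]]

print(resolve("ERP_NL", "Supplier A") is resolve("ERP_US", "Supplier 1"))  # True: same supplier
```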
Discrepancies due to uncontrolled growth
Discrepancies in these so-called Master Data are typically caused by the uncontrolled growth of IT sys-
tems, as Bas Dudink, Data Management Expert at SAS Netherlands, has learned from experience. The
growth is often driven by the ERP systems, the critical business environments for resource planning.
These complex, extensive systems can fulfil their invaluable roles only after a lengthy implementation
project. This means that they will only be expanded, customized or phased out if it is really necessary.
The explosive growth of IT systems can be part of an organic process, but it could also be the conse-
quence of takeovers. Company X buys company Y and Suppliers A, 1, B2, etc. encounter each other
in the combined IT environment of the newly formed division. It is often not possible to completely
integrate all systems, or it may concern a megaproject that is scheduled to happen ‘later on’, after the
impact of the takeover has been absorbed and after the business opportunities have been seized.
As many as 50 ERP systems
“Some companies have as many as 50 ERP systems,” explains Dudink. “For various countries, for
various business processes, et cetera.” This means that the Master Data can be stored in 50 different
ways and used in 50 different ways. These are not theoretical differences: Dudink sees this in practice
all too often.
“Some companies have as many as 50 ERP systems. This means that the Master Data can be stored in 50 different ways and used in 50 different ways”
Bas Dudink
Initial steps: centralization and deduplication
Dudink recommends involving the people and departments who will reap the benefits of MDM at an
early stage. This will help make the value of the project tangible. MDM can have an enormous impact,
because those 50 different forms of data storage in 50 ERP systems cause a great deal of inefficiency.
The first step is centralized implementation, which leads to considerable administrative cost savings.
You would think the logical next step would be to make the data available to the wider organization,
but it is not time for that yet. First the second, essential step in MDM must be taken.
The second step is data deduplication. This entails finding a supplier who has two different labels in
the IT environment and removing one of these, for example. This is how to make sure that Master
Data discrepancies cannot lead to an invoice getting lost for three months. Inconceivable? It actually
happened to a major corporation. An invoice for a considerable amount of money had simply ended
up in the wrong tray – except that no one knew which ‘tray’ in the immense digital environment the
invoice was in.
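Deduplication usually starts by normalizing the descriptive attributes so that differently spelled entries collapse onto the same key. The sketch below uses a deliberately crude normalization (lowercasing, dropping punctuation and legal forms); real matching in a data management suite applies far richer fuzzy rules.

```python
# Minimal sketch of the deduplication step: normalize supplier names and group
# records that collapse onto the same key, so one label can be retired.
# The normalization below is an illustrative assumption, not production matching logic.
import re
from collections import defaultdict

def normalize(name: str) -> str:
    cleaned = re.sub(r"[^a-z0-9\s]", "", name.lower())       # drop punctuation
    legal_forms = {"bv", "nv", "ltd", "inc", "gmbh"}
    return "".join(t for t in cleaned.split() if t not in legal_forms)

suppliers = ["Jansen Staal B.V.", "JANSEN STAAL BV", "Jansen-Staal", "Pietersen NV"]
groups = defaultdict(list)
for s in suppliers:
    groups[normalize(s)].append(s)

for key, members in groups.items():
    if len(members) > 1:
        print("probable duplicates:", members)
```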
Who is allowed to change what?
Discrepancies are a fact of life but they do need to be prevented as much as possible. This could be
considered step three of the MDM implementation: prevention of new data discrepancies. The key
question is: who controls the Master Data? Who has the authority to change a supplier’s bank account
number? The ownership of the data in relation to its use is important. Who owns the data that were
entered in the Netherlands and which are primarily used for business in the United States?
The identification of data ownership and use should lead to a logical review of that ownership. Once
cleaned up, the Master Data will be more deployable for the divisions and business processes that
benefit the most. The data will initially be deployed from the (now clean) central storage location and
can then be made available to the ‘lower’ levels of the organization.
MDM is more than detecting errors
Good Master Data Management should lead to more than just the detection and resolution of errors.
It can also be used to detect and prevent fraud, such as the so-called ‘ghost invoices’. These invoices
are sent by swindlers and made to appear as if they have been sent by bona fide suppliers. MDM pre-
vents these swindlers from slipping through the cracks in the system.
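A simple illustration of how clean master data supports this: before payment, the bank account on an incoming invoice is compared with the account registered on the supplier’s golden record, and any mismatch is flagged. The record structure and values below are assumptions for the example.

```python
# Sketch of how clean master data helps catch 'ghost invoices': before an
# invoice is paid, its bank account is checked against the account registered
# on the supplier's golden record. Data values are illustrative assumptions.
master_suppliers = {
    "M-0001": {"name": "Jansen Staal", "iban": "NL91ABNA0417164300"},
}

def invoice_is_suspect(invoice: dict) -> bool:
    """Flag invoices whose supplier or bank account is unknown in the master data."""
    supplier = master_suppliers.get(invoice["supplier_master_id"])
    return supplier is None or supplier["iban"] != invoice["iban"]

ghost = {"supplier_master_id": "M-0001", "iban": "NL02RABO0123456789", "amount": 18_500}
print(invoice_is_suspect(ghost))   # True: account does not match the golden record
```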
Discrepancies, errors and fraud can occur because companies and their systems have become more
and more complex. When a supplier or customer registers for the first time, the transaction will nor-
mally take place without any problems. This changes after several years and thousands of transac-
tions, processed by hundreds of people across several departments, whereby the organizational struc-
ture may have also changed. The registration of this one supplier or customer has since been subject
to countless changes, actions and exceptions. “It is quite a challenge to ensure that every transaction
is correctly processed,” as Dudink knows from experience. This challenge stands or falls with good
Master Data Management. ■
ABOUT SAS
SAS understands that almost everything is data driven. We want to help you make sure that this takes place
correctly. Is your data easily accessible, clean, integrated and correctly stored? Do you know which types of
data are used in your organization and by whom? And do you have an automated method that validates
incoming data before it is stored in your databases?
Take better decisions
Thousands or maybe even hundreds of thousands of decisions are taken daily in your organization. Everyday
decisions taken as part of a process: Can we grant this loan to this customer? What offer should I make to a
customer who calls our contact centre? Tactical decisions, such as: What is the optimum balance between
preventive and responsive maintenance of machinery? If we want to scrap five of the fifteen flavours we
offer, which should they be? But also strategic decisions on your organization’s direction. For example,
in which product-market combinations do we want to be present? Information plays a role in all these
decisions. The better the quality of the underlying data, the better the decisions you take.
Get control of your data
Making sure data is complete, accurate and timely can be a time-consuming job. Fortunately, the task can
be largely automated. Spend less time gathering and maintaining information and more time running your
business with SAS Data Management. This solution has been built on a unified platform and designed with
both the business and IT in mind. It is the fastest, easiest and most comprehensive way of getting your data
under control. SAS Data Management brings in-memory and in-database performance improvements which
give you real-time access to reliable information.
Proper process set-up
To succeed in today’s data-driven world, you’ll need more than just a robust data management platform.
Processes and behaviour also play an important role when it comes to master data management, data inte-
gration, data quality, data governance and data federation. We can help you to set these up properly too.
Because getting your data in order is not a one-off activity, but a continuous process. Whether you have a
large or not-so-large volume of data, you can transform it into great value and possibilities.
Want to know more? Visit our website www.sas.com/dm, or contact us at [email protected]
COLOPHON
Realization: SAS Nederland
Editor: Ilanite Keijsers
Photography: Eric Fecken
Authors: Jasper Bakker, Mirjam Hulsebos, Chantal Schepers
Cover: Philip van Tol
Design: Alain Cohen
Project management: SAS Nederland
The book, Future Bright – A Data Driven Reality, was commissioned by SAS Netherlands. Content from the book may only be copied or reproduced (in print, photocopy, film, on the Internet or in any other medium) with the explicit permission of SAS Netherlands and its management, and with proper acknowledgement. SAS Netherlands is not responsible for the statements made by the interviewed parties in this book.