Dimitrios Apostolopoulos [email protected] 25/01/2016 @ New York College - Thessalonikis 238, Tavros

Visiting Lecture @ NYC (2016-01-25) - Introduction to DWH and BI

Lecturer’s profile

Dimitrios Apostolopoulos was born on 18/08/1978.

Lives in Athens.

Is the father of two boys, aged six and two and a half.

Studied at the Athens University of Economics and Business (Department of Informatics).

Is a candidate for certification as a Data Scientist (Johns Hopkins University).

Has worked in the banking industry since 2005 (Bank of Cyprus, Piraeus Bank) as a full-stack Business Intelligence Developer, participating in many projects on data warehousing, reporting, OLAP, analytics, credit controls (PIMCO, BlackRock), etc.

https://gr.linkedin.com/in/dapostolop

[email protected]

https://twitter.com/dapostolopoylos

What is a Data Warehouse?

• Late 1980s: introduced by Barry Devlin and Paul Murphy ("the business data warehouse").
• Its founding concept was to create an architectural model for the flow of data from operational systems to decision support environments.
• A system used for reporting and data analysis.
• A central repository of integrated data from one or more disparate sources.
• Holds current and historical data.
• Populated through ETL (Extract, Transform, Load).
• The main goal of creating and maintaining a DWH is to have data that is cleaned, transformed, cataloged and available for use by business professionals for data mining, online analytical processing, market research and decision support.
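
The ETL step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two "operational systems" are hard-coded lists, the table name `fact_payment` and the cleaning rules are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical rows "extracted" from two operational systems.
CRM_ROWS = [
    {"customer": " Alice ", "amount": "120.50", "date": "2016-01-20"},
    {"customer": "bob", "amount": "80", "date": "2016-01-21"},
]
BILLING_ROWS = [
    {"customer": "ALICE", "amount": "30.25", "date": "2016-01-22"},
    {"customer": "carol", "amount": "not available", "date": "2016-01-22"},
]


def extract():
    """Extract: gather raw rows from every source, tagging their origin."""
    for source, rows in (("crm", CRM_ROWS), ("billing", BILLING_ROWS)):
        for row in rows:
            yield source, row


def transform(source, row):
    """Transform: normalize names, parse amounts, reject unparseable rows."""
    try:
        amount = float(row["amount"])
    except ValueError:
        return None  # a real pipeline would log or quarantine this row
    return (row["customer"].strip().title(), amount, row["date"], source)


def load(conn, records):
    """Load: write the cleaned records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_payment "
        "(customer TEXT, amount REAL, paid_on TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO fact_payment VALUES (?, ?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    cleaned = (transform(source, row) for source, row in extract())
    load(conn, [record for record in cleaned if record is not None])


conn = sqlite3.connect(":memory:")
run_etl(conn)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM fact_payment "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

After the load, the integrated table can be queried as a single source even though the rows came from two systems with different formatting.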

What is Business Intelligence, what can you do with it?

• Business intelligence (BI) is often described as "the set of techniques and tools for the transformation of raw data into meaningful and useful information for business analysis purposes".
• BI technologies are capable of handling large amounts of unstructured data to help identify, develop and otherwise create new strategic business opportunities.
• The goal of BI is to allow for the easy interpretation of these large volumes of data. Identifying new opportunities and implementing an effective strategy based on insights can provide businesses with a competitive market advantage and long-term stability.

A few basic concepts: Data, Metadata, Data Cube, Data Marts

• Data is a set of values of qualitative or quantitative variables.
• Data is measured, collected, reported and analyzed, whereupon it can be visualized using graphs or images.
• As a general concept, data refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing.
• Data is closely related to the concepts of information and knowledge, but is not the same.
• Data is collected and analyzed to create information suitable for making decisions, while knowledge is derived from extensive experience dealing with information on a subject.

• Metadata is "data that provides information about other data".
  • Structural metadata is data about the containers of data (file size, location, etc.).
  • Descriptive metadata is about individual instances of application data or the data content (purpose of the data, author of the data, etc.).
• One of the first forms of metadata was the library card catalog.
• The main purpose of metadata is to facilitate the discovery of relevant information.
• Metadata helps organize electronic resources, provide digital identification, and support archiving and preservation of the resource.
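
The two kinds of metadata can be sketched in a few lines. This is only an illustration under stated assumptions: the file name, the catalog layout and the `find_by_purpose` helper are hypothetical, and the descriptive fields are entered by hand because they cannot be derived from the file system.

```python
import os
import tempfile


def structural_metadata(path):
    """Structural metadata: facts about the container, read from the file system."""
    info = os.stat(path)
    return {"location": path, "size_bytes": info.st_size, "modified": info.st_mtime}


def find_by_purpose(catalog, keyword):
    """Discovery: search the catalog by descriptive metadata, not by content."""
    keyword = keyword.lower()
    return [
        entry for entry in catalog
        if keyword in entry["descriptive"]["purpose"].lower()
    ]


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sales_2016.csv")
    with open(path, "wb") as handle:
        handle.write(b"city,amount\nAthens,100\n")

    catalog = [
        {
            "structural": structural_metadata(path),
            # Descriptive metadata: supplied by a person (hypothetical values).
            "descriptive": {
                "purpose": "Monthly sales per city",
                "author": "BI team",
            },
        }
    ]

matches = find_by_purpose(catalog, "sales")
print(matches[0]["structural"]["size_bytes"], matches[0]["descriptive"]["author"])
```

The catalog plays the same role as a library card catalog: the resource is found through its description without opening it.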

• A data cube (or OLAP cube) is a term that typically refers to a multi-dimensional array of data, a multi-dimensional generalization of a spreadsheet.
• For example, a company might wish to summarize financial data by product, by time-period, and by city to compare actual and budget expenses. Product, time, city and scenario (actual and budget) are the data's dimensions.
• Each cell of the cube holds a number that represents some measure of the business, such as sales, profits, expenses, budget and forecast.
• The elements of a dimension can be organized as a hierarchy, a set of parent-child relationships, typically where a parent member summarizes its children.
• Basic operations: slice, dice, drill down, roll up, pivot.
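
The basic cube operations can be tried on a toy cube. The sketch below is a simplification under stated assumptions: the cube is a plain dictionary keyed by (product, month, city, scenario), the figures are made up, and the helper names `dice`, `slice_`, `roll_up` and `quarter` are hypothetical rather than taken from any OLAP product.

```python
from collections import defaultdict

DIMENSIONS = ("product", "month", "city", "scenario")

# Each cell holds one measure (expenses); all values are invented.
CUBE = {
    ("Loans", "2016-01", "Athens", "actual"): 100,
    ("Loans", "2016-01", "Athens", "budget"): 90,
    ("Loans", "2016-02", "Athens", "actual"): 110,
    ("Loans", "2016-02", "Patras", "actual"): 40,
    ("Cards", "2016-01", "Athens", "actual"): 60,
    ("Cards", "2016-02", "Patras", "budget"): 50,
}


def quarter(month):
    """Hierarchy on the time dimension: a month's parent is its quarter."""
    year, number = month.split("-")
    return f"{year}-Q{(int(number) - 1) // 3 + 1}"


def dice(cube, **allowed):
    """Dice: keep cells whose members fall in the allowed sets."""
    wanted = {DIMENSIONS.index(name): set(members) for name, members in allowed.items()}
    return {
        key: value for key, value in cube.items()
        if all(key[i] in members for i, members in wanted.items())
    }


def slice_(cube, dimension, member):
    """Slice: fix one dimension to a single member (kept in the key for simplicity)."""
    return dice(cube, **{dimension: [member]})


def roll_up(cube, keep, parents=None):
    """Roll up: sum the measure over every dimension not listed in `keep`.

    `parents` optionally maps a dimension to a function returning the parent
    member, so a hierarchy level can be summarized as well.
    """
    parents = parents or {}
    totals = defaultdict(int)
    for key, value in cube.items():
        coords = dict(zip(DIMENSIONS, key))
        group = tuple(parents.get(d, lambda m: m)(coords[d]) for d in keep)
        totals[group] += value
    return dict(totals)


actual = slice_(CUBE, "scenario", "actual")
by_product = roll_up(actual, ("product",))                       # roll up
by_quarter = roll_up(actual, ("product", "month"), {"month": quarter})
by_product_city = roll_up(actual, ("product", "city"))           # drill down
by_city_product = roll_up(actual, ("city", "product"))           # pivot
january_athens = dice(CUBE, city=["Athens"], month=["2016-01"])  # dice
print(by_product, by_product_city)
```

In this sketch drill down is simply a roll-up that keeps one more dimension, and pivot is the same aggregation with the axes listed in a different order.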

Let’s see what an OLAP cube looks like! The online OLAP example on RadarSoft’s site will give you a pretty good taste of what terms like dimensions, measures, slice and dice really mean!

• A data mart is a simple form of a data warehouse that is focused on a single subject (or functional area), such as sales, finance or marketing.
• Data marts are often built and controlled by a single department within an organization.
• Given their single-subject focus, data marts usually draw data from only a few sources. The sources could be internal operational systems, a central data warehouse, or external data.
• Data marts improve end-user response time by giving users access to the specific type of data they need to view most often, presented in a way that supports the collective view of a group of users.
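
One way to picture a data mart that draws from a central warehouse is the sketch below. It is an assumption-laden illustration: the `dwh_facts` table, its columns and the `build_mart` helper are hypothetical, and an in-memory SQLite database stands in for both the warehouse and the mart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical central warehouse table covering several subject areas.
conn.execute(
    "CREATE TABLE dwh_facts (subject TEXT, branch TEXT, month TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO dwh_facts VALUES (?, ?, ?, ?)",
    [
        ("sales", "Athens", "2016-01", 100.0),
        ("sales", "Athens", "2016-01", 50.0),
        ("sales", "Patras", "2016-01", 70.0),
        ("finance", "Athens", "2016-01", 900.0),
        ("marketing", "Patras", "2016-01", 30.0),
    ],
)


def build_mart(conn, subject):
    """Build a single-subject mart, pre-aggregated the way its users view it."""
    if not subject.isidentifier():
        raise ValueError("subject must be a simple name")
    table = f"mart_{subject}"
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} (branch TEXT, month TEXT, total REAL)")
    conn.execute(
        f"INSERT INTO {table} "
        "SELECT branch, month, SUM(amount) FROM dwh_facts "
        "WHERE subject = ? GROUP BY branch, month",
        (subject,),
    )
    conn.commit()
    return table


mart = build_mart(conn, "sales")
sales_rows = conn.execute(f"SELECT * FROM {mart} ORDER BY branch").fetchall()
print(sales_rows)
```

The sales department then queries a small, already summarized table instead of scanning the whole warehouse, which is where the response-time benefit comes from.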

What is OLAP?

• The term was introduced in 1993 by Edgar F. Codd.
• OLAP (Online Analytical Processing) performs multidimensional analysis of business data and provides the capability for complex calculations, trend analysis and sophisticated data modeling.
  • MOLAP (Multidimensional OLAP)
  • ROLAP (Relational OLAP)
• It is the foundation for many kinds of business applications for Business Performance Management, Planning, Budgeting, Forecasting, Financial Reporting, Analysis, etc.
• OLAP enables end-users to perform ad hoc analysis of data in multiple dimensions, thereby providing the insight and understanding they need for better decision making.
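
The difference between the two OLAP flavors listed above can be sketched roughly as follows: ROLAP answers a question by aggregating relational rows at query time, while MOLAP precomputes the aggregates into a multidimensional store and looks them up. The table, the data and the function names are hypothetical, and real engines are far more elaborate than this illustration.

```python
import sqlite3

ROWS = [
    ("Loans", "Athens", 100),
    ("Loans", "Patras", 40),
    ("Cards", "Athens", 60),
    ("Cards", "Athens", 25),
]

# ROLAP style: data stays in relational tables, SQL aggregates on demand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, city TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", ROWS)


def rolap_total(product=None, city=None):
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales "
        "WHERE (? IS NULL OR product = ?) AND (? IS NULL OR city = ?)",
        (product, product, city, city),
    ).fetchone()
    return row[0]


# MOLAP style: every combination, including "all members", is precomputed.
ALL = "*"


def build_molap(rows):
    cube = {}
    for product, city, amount in rows:
        for key in ((product, city), (product, ALL), (ALL, city), (ALL, ALL)):
            cube[key] = cube.get(key, 0) + amount
    return cube


MOLAP = build_molap(ROWS)


def molap_total(product=None, city=None):
    # A dictionary lookup replaces the scan-and-sum done by the SQL query.
    return MOLAP.get((product or ALL, city or ALL), 0)


print(rolap_total(product="Cards"), molap_total(product="Cards"))
```

Both functions return the same totals; the trade-off shown here is query-time work (ROLAP) against build-time work and storage (MOLAP).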

Real life project example

DWH migration at Bank of Cyprus

• Business need: daily update of the DWH and data enrichment.
• Existing situation: outdated technology, long loading times, changes difficult to apply because of company policies and technical constraints, low rate of information utilization.
• Solution: ETL and database migration, complete redesign of the architectural model, use of advanced techniques for table partitioning, database mirroring, user security policies, etc., and a new reporting and analytical environment (BI portal).
• Outcome: high end-user satisfaction, short loading times, high query performance, easy and fast ad hoc reporting, ability to create more sophisticated BI projects, better insights and decision making.

Cash Flow Monitoring at Piraeus Bank

• Business need: because of the financial situation, cash flow had to be monitored and analyzed around the clock.
• Existing situation: the information was spread across many different applications and systems, and it was either not analyzed properly or not analyzed at all.
• Solution: creation of a data mart integrating the data from the various sources. Data analysis, trend lines, alerts. Three times a day, a full multi-page analysis with charts and reports is sent automatically to a group of C-level users.
• Outcome: management had a clear view of the cash available, leading to better decision making and better customer service.
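
The slides do not describe how the trend lines and alerts were computed, so the following is only a generic sketch of the idea, not the bank's method. It fits a least-squares line to a series of cash balances and raises alerts against two hypothetical thresholds (`floor` and `max_drop`); all figures are invented.

```python
def trend_slope(values):
    """Least-squares slope of the values against their position 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def cash_alerts(balances, floor, max_drop):
    """Return alert messages for the latest balance and the overall trend."""
    alerts = []
    if balances[-1] < floor:
        alerts.append("balance below floor")
    if trend_slope(balances) < -max_drop:
        alerts.append("outflow trend too steep")
    return alerts


# Hypothetical balances from four consecutive snapshots.
balances = [100, 90, 80, 70]
print(trend_slope(balances), cash_alerts(balances, floor=75, max_drop=5))
```

A scheduled job could run such checks on the data mart before each report is sent, so that the recipients see the alerts together with the charts.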

Career paths in DWH/BI

• There is a broad range of career paths due to the nature of the subject.
• Depending on your preferences, you can be:
  • Database specialist, administrator, developer or architect
  • Data Warehouse / ETL specialist, DWH architect or DWH developer
  • Analytics specialist (Performance Management)
  • Reporting specialist
  • Web specialist
