21
Enterprise Business Processes and Reporting (IS 6214) MBS MIMAS `17 th Feb 2010 Fergal Carton Business Information Systems

Enterprise Business Processes and Reporting (IS 6214) MBS MIMAS `17 th Feb 2010 Fergal Carton Business Information Systems

  • View
    213

  • Download
    0

Embed Size (px)

Citation preview

Enterprise Business Processes and Reporting

(IS 6214)

MBS MIMAS

`17th Feb 2010

Fergal CartonBusiness Information Systems

Last week

• Feedback on presentations

• Data cubes

• Types of data– Dynamic / Static– Cucina example

• Soft information

• Data recording

This week

• Data storage

• Deciding what information to collect

• Exploiting data warehouses

• Cucina and real time information

• Extract, transform, load (ETL)

• Real time data

• Refresh rates and response times

Data storage

• Series of ad-hoc systems manual and computer-based (spreadsheet, filed forms…)

• Dedicated databases for manufacturing data (QC, shipping etc…)

• Process Control Systems (technical parameters)• Other specialised proprietary systems

(integration may not be easy).• ERP system with its own data structure or fed by

existing systems

Deciding what information to collect

• Information cost + overload mean not all data are useful

• Some framework can be used – e.g. Critical Success Factors (CSF)

• Questions that must be answered:– How is it measured and broken down?– How often should it be measured?– Who should know about it?– Where can the data be found?– How should it be presented?

ExtractionCleaningTransformationLoading

Relational Databaseon a dedicated Server

De normalised, data

Static Reporting

Scrutinising

MultidimensionalData CubesOLAP tools

Data Warehouse

Source Systems

Discovering

Data Mining…….

Data StagingArea

Exploiting the DW data

Think about real-time for Cucina

• What information is required real time?

• Can you differentiate between report types– Static– Scrutinising– Discovery

ETL Tools

• Extraction, Transformation, and Loading

• Specification based

• Eliminate custom coding

• Third party and DBMS based tools

Data extraction and transformation

• Getting data out of legacy applications

• Cleaning up the data

• Enriching it with new data

• Converting it to a form suitable for upload

• Staging areas

Data Quality Problems

• Multiple identifiers• Multiple field names• Different units• Missing values• Orphaned values• Multipurpose fields• Conflicting data• Different update times

Example 1 – the supplier file

Sup code Sup name Sup address City Phone4 digits

Sup code Sup name Sup address… Phone Cat3 letters + 1,2,3 depending4 digits on total purchases

last year

OLD

NEW

New supplier code to include city where firm is basedAssignation of category based on amounts purchased

Example 2: merging files

• Complete customer file based on Accounts and Sales and Shipping

OLD (finance)

CustID name address city account number credit limit balance

OLD (sales)

OLD (Shipping)

CustID* name address city discount rates sales_to_date rep_name

CustID** name address city Preferred haulier

Refreshing databases

• Timing

• Criticality of information

• Volume of data

• Response time

• Real-time requirement

• Level of aggregation / granularity

Life cycle of the DW

Operational DatabasesOperational Databases Warehouse DatabaseWarehouse Database

First time loadFirst time load

RefreshRefresh

RefreshRefresh

RefreshRefresh

Purge or ArchivePurge or Archive

Real time information

• Up to date

• On-line

• Actual data

• Live feed

• Decisions made on what basis?

Real time requirement?

• Historical sales or accounting data, not real-time

• Sales as quarter end approaches

• Inventory levels for MRP

• Exchange rates, when is Visa rate calculated?

• Real-time processing: card transactions down

Response times

• Response times are a function of :– response time, – Infrastructure elements, – Database sizing– Transaction processing – Interfaces – Reporting– Other processing demands– Peak times– …

Example

• Revenue reports from EMC Data warehouse

• Report can grow to >1million lines at quarter end

• Should not be run on ERP server

• Poorly designed?

Manager’s view

• Volume has been increasing at a huge pace compared to … like, you go talk to Jonathan, … my answer to it will be, get used to it, it’s not going to go away, I don’t care what you do, it’s not my problem, I want the reports, you deal with the volume of records, it’s not going to go away, you deal with it.

Refresh Optimization

Accounting

UnknownProcesses

ExternalData Sources

InternalData Sources

DataWarehouse

ETLTools

Valid Time Lag

Load Time Lag

Fact andDimensionChanges

PrimarilyDimensionChanges

Determining the Refresh Frequency

• Maximize net refresh benefit

• Value of data timeliness

• Cost of refresh

• Satisfy data warehouse and source system constraints