ruchika-rawat
7/31/2019 finalpresentation-111220200340-phpapp01
Data Warehousing
A data warehouse is a subject-oriented, integrated, time-variant, and nonupdatable collection of data in support of management's decision-making processes.
Subject-Oriented: organized around high-level entities such as customers, patients, students, products, and time.
Integrated: data is gathered from several internal systems of record or from sources external to the organization.
Time-Variant: a time dimension is used in data warehousing to study trends and changes.
Nonupdatable: new data is always added as a supplement to the database rather than as a replacement. The database continually absorbs this new data, incrementally integrating it with previous data.
A data warehouse can consist of more than one database.
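The time-variant and nonupdatable properties can be illustrated with a small sketch. This is only an in-memory stand-in, not a real warehouse; the class name AppendOnlyStore and the customer_id, balance, and as_of fields are hypothetical names chosen for the example.

```python
from datetime import date


class AppendOnlyStore:
    """Hypothetical in-memory stand-in for a nonupdatable warehouse table."""

    def __init__(self):
        self._rows = []  # rows are only ever appended, never changed in place

    def load(self, customer_id, balance, as_of):
        # New data supplements what is already stored; nothing is overwritten.
        self._rows.append(
            {"customer_id": customer_id, "balance": balance, "as_of": as_of}
        )

    def history(self, customer_id):
        # The time dimension (as_of) lets us study how a value changed.
        rows = [r for r in self._rows if r["customer_id"] == customer_id]
        return sorted(rows, key=lambda r: r["as_of"])

    def latest(self, customer_id):
        rows = self.history(customer_id)
        return rows[-1] if rows else None


store = AppendOnlyStore()
store.load(1, 100.0, date(2019, 1, 31))
store.load(1, 140.0, date(2019, 2, 28))  # a second snapshot, not an update
trend = [r["balance"] for r in store.history(1)]
```

Because the second load is stored as a new row, both the current value and the earlier one remain available for trend analysis.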
In Simple Words
A data warehouse is simply a single,
complete, and consistent store of data
obtained from a variety of sources and
made available to end users in a form they
can understand and use in a business
context.
Problem: Heterogeneous Information Sources
Heterogeneities are everywhere
Different interfaces
Different data representations
Duplicate and inconsistent information
Example: combined research results from different bioinformatics repositories
Sources shown in the diagram: Personal Databases, Digital Libraries, Scientific Databases, World Wide Web
Goal: Unified Access to Data
Integration System
Collects and combines information
Provides integrated view, uniform user interface
Supports sharing
Sources shown in the diagram: World Wide Web, Digital Libraries, Scientific Databases, Personal Databases
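As a rough illustration of what an integration system does, the sketch below maps records from two sources with different representations onto one common form and drops duplicates. The two source layouts, the field names, and the sample records are all assumptions made up for this example.

```python
def from_source_a(rec):
    # Hypothetical source A stores "Last, First" and an ISO date string.
    last, first = [part.strip() for part in rec["name"].split(",")]
    return {"first": first, "last": last, "joined": rec["joined"]}


def from_source_b(rec):
    # Hypothetical source B stores separate name fields and DD/MM/YYYY dates.
    day, month, year = rec["join_date"].split("/")
    return {
        "first": rec["fname"].strip().title(),
        "last": rec["lname"].strip().title(),
        "joined": f"{year}-{month}-{day}",
    }


def integrate(source_a, source_b):
    # Collect and combine: translate every record, then keep one copy per person.
    unified = {}
    records = [from_source_a(r) for r in source_a]
    records += [from_source_b(r) for r in source_b]
    for rec in records:
        key = (rec["first"].lower(), rec["last"].lower())
        unified.setdefault(key, rec)  # the first copy wins, duplicates are dropped
    return list(unified.values())


a = [{"name": "Lovelace, Ada", "joined": "2011-12-20"}]
b = [
    {"fname": "ada", "lname": "lovelace", "join_date": "20/12/2011"},
    {"fname": "Alan", "lname": "Turing", "join_date": "01/02/2012"},
]
result = integrate(a, b)
```

Users of the integrated view see a single record layout regardless of which source a record came from.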
The Need for Data Warehousing
1. A business requires an integrated,
company-wide view of high-quality
information.
2. The information systems department
must separate informational systems from
operational systems (systems of record)
to dramatically improve performance in managing company data.
Why a Warehouse
For analysis and decision support, end users
require access to data captured and stored in an
organization's operational or production
systems.
This data is stored in multiple formats, on
multiple platforms, in multiple data structures,
and under multiple names, and was probably created using
different business rules.
Why Should We Consider Data Warehousing Solutions?
When users are requesting access to a large amount of
historical information for reporting purposes, you
should strongly consider a warehouse. The user will
benefit when the information is organized in an
efficient manner for this type of access.
An Example Illustrating the Need for Data Warehousing
Data Warehouse Components
(Diagram labels, grouped by stage.)
Source systems: Mainframe Applications, Midrange, PC Applications, External Sources
Source data stores: IMS, VSAM, DB/2, DB/6000, DB/400, DB2/2, ???
Load tools: Extract Programs, Data Cleansers/Scrubbers, Translators/Transformers, Timing Tools, Data Loading, File Transfer
Combined Data Warehouse: Customers, Policies, Premiums, Claims, Reserves, Rates
Decision Support Tools: Management Reporting, Sales/Marketing, Customer Relations, Reserve Analysis, Risk Analysis
Administration and Management Tools
A data warehouse requires tools to support the administration and management of such a complex environment.
Given the various types of meta-data and the day-to-day operations of the data warehouse, the administration and management tools must be capable of supporting the following tasks:
monitoring data loading from multiple sources
data quality and integrity checks
managing and updating meta-data
monitoring database performance to ensure efficient query response times and resource utilization
auditing data warehouse usage to provide user chargeback information
replicating, subsetting, and distributing data
maintaining efficient data storage management
archiving and backing up data
implementing recovery following failure
security management
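One of the tasks listed above, data quality and integrity checking during loading, could look roughly like the following. This is a minimal sketch and not a description of any particular tool; the function name check_batch, the policy_id and premium fields, and the three rules are assumptions chosen for illustration.

```python
def check_batch(rows, required=("policy_id", "premium")):
    """Split one hypothetical load batch into good rows and reported problems."""
    good, problems, seen = [], [], set()
    for index, row in enumerate(rows):
        missing = [f for f in required if row.get(f) in (None, "")]
        if missing:
            # Completeness check: every required field must have a value.
            problems.append((index, "missing " + ", ".join(missing)))
        elif row["policy_id"] in seen:
            # Integrity check: the key must be unique within the batch.
            problems.append((index, "duplicate policy_id"))
        elif row["premium"] < 0:
            # Validity check: a premium cannot be negative.
            problems.append((index, "negative premium"))
        else:
            seen.add(row["policy_id"])
            good.append(row)
    return good, problems


batch = [
    {"policy_id": "P1", "premium": 120.0},
    {"policy_id": "P1", "premium": 120.0},
    {"policy_id": "P2", "premium": None},
    {"policy_id": "P3", "premium": -5.0},
]
good, problems = check_batch(batch)
```

Only rows that pass every check would be loaded; the problem list gives an administrator the row position and the reason for each rejection.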
In computing, data flow is the path data takes from
source document to data entry to processing to
final reports. Data changes format and
sequence (within a file) as it moves from program to program.
Data Flow
Inflow: the processes associated with the extraction, cleansing, and loading of data from the source systems into the data warehouse.
Upflow: the processes associated with adding value to the data in the warehouse through summarizing and distributing the data.
Downflow: the processes associated with archiving and backing up data in the warehouse.
Outflow: the processes associated with making the data available to end users.
Meta-flow: the processes associated with the management of the meta-data.
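The first four flows can be sketched as small functions over an in-memory list. This is a toy illustration under stated assumptions: the function bodies, the region and amount fields, and the sample rows are hypothetical, and meta-flow is not modeled.

```python
from collections import defaultdict


def inflow(source_rows):
    # Extract, cleanse, load: skip rows with no amount, normalize region names.
    return [
        {"region": r["region"].strip().title(), "amount": float(r["amount"])}
        for r in source_rows
        if r.get("amount") not in (None, "")
    ]


def upflow(warehouse):
    # Add value by summarizing detailed rows into totals per region.
    totals = defaultdict(float)
    for row in warehouse:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def downflow(warehouse):
    # Archive: here simply an independent copy of the stored rows.
    return [dict(row) for row in warehouse]


def outflow(summary, region):
    # Make data available to an end user asking about one region.
    return summary.get(region, 0.0)


source = [
    {"region": " north ", "amount": "10.5"},
    {"region": "NORTH", "amount": "4.5"},
    {"region": "south", "amount": ""},
    {"region": "south", "amount": "3"},
]
warehouse = inflow(source)
summary = upflow(warehouse)
archive = downflow(warehouse)
north_total = outflow(summary, "North")
```

Note that the cleansing step is what allows the two differently spelled "north" rows to be summarized together.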
Architectures
Many database architectures have been implemented.
Two architectures are worth noting:
1. OLTP (Online Transaction Processing)
2. Data Warehouse (OLAP, Online Analytical Processing)
OLTP is used to store current data and query it frequently, and is based on normalized schemas.
A data warehouse is used to store historical data and is
based on fact tables and dimension tables.
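A fact table surrounded by dimension tables can be sketched with Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are assumptions invented for this example, and SQLite is used only because it needs no setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the "who/what/when" of each fact.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    -- The fact table holds the measurements, keyed by the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_date VALUES (1, 2019, 6), (2, 2019, 7);
    INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 7.5);
    """
)

# A typical analytical query: total sales per product per month.
rows = conn.execute(
    """
    SELECT p.name, d.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.name, d.month
    ORDER BY p.name, d.month
    """
).fetchall()
conn.close()
```

The query joins the fact table to its dimensions and aggregates, which is the access pattern a warehouse schema is designed around.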
Difference between OLTP and Data Warehouse (OLAP)
                     OLTP                                  OLAP
users                clerk, IT professional                knowledge worker
function             day-to-day operations                 decision support
DB design            application-oriented                  subject-oriented
data                 current, up-to-date, detailed         historical, summarized, multidimensional, integrated
access               read/write, index/hash on prim. key   lots of scans
unit of work         short, simple transaction             complex query
# records accessed   tens                                  millions
# users              thousands                             hundreds
DB size              100MB-GB                              100GB-TB
Special thanks to Google.com and other sites.