Data Warehousing and Data Mining
J. G. Zheng
May 20th 2008
MIS Chapter 3
2
Types of Information Processing
Transactional processing: focuses on data collection, updates, and simple calculations
Analytical processing: focuses on data analysis and decision support
3
Data Warehouse
A data warehouse is a special kind of database that stores data from many operational (or transactional) databases and supports analytical processing and decision making
4
Why Data Warehouse?
Traditional databases facilitate data management and transaction processing
Two limitations of databases in practice:
  They are transaction oriented and not optimized for complex data analysis
  Individual databases usually manage data in very different ways, even within the same organization (heterogeneity)
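The heterogeneity problem can be illustrated with a minimal Python sketch. The two source layouts, the field names, and the helper functions `from_store` and `from_web` are all hypothetical, invented for this example only; they show how records kept in different ways by two operational systems might be mapped to one common layout before being loaded into a warehouse.

```python
from datetime import datetime

# Two hypothetical operational systems record the same kind of sale differently.
store_db_row = {"sale_date": "05/20/2008", "amt_usd": "19.99", "cust": "C-17"}
web_db_row = {"ts": "2008-05-20T14:03:00", "total_cents": 1999, "customer_id": 17}


def from_store(row):
    """Map a store-system record to the common (assumed) warehouse layout."""
    return {
        "date": datetime.strptime(row["sale_date"], "%m/%d/%Y").date().isoformat(),
        "amount": float(row["amt_usd"]),
        "customer_id": int(row["cust"].split("-")[1]),
        "source": "store",
    }


def from_web(row):
    """Map a web-system record to the same common layout."""
    return {
        "date": datetime.fromisoformat(row["ts"]).date().isoformat(),
        "amount": row["total_cents"] / 100,
        "customer_id": row["customer_id"],
        "source": "web",
    }


# After conversion, both records share one schema and can be analyzed together.
warehouse_rows = [from_store(store_db_row), from_web(web_db_row)]
```

Once both rows use the same field names, date format, and currency unit, a single analytical query can cover data from both systems.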
5
Data Warehouse for OLAP
Data warehousing is an approach to satisfy the need for knowledge generation
[Figure: Transaction Processing and Analytical Processing]
6
Data Warehousing: a Complete View
Figure 3.1 on page 127
Should we invest more in our e-business? (a fuzzy question that needs high-level analysis for decision making)
How do advertising activities affect sales of different products bought by different types of customers in different regions? (synthesizing)
What is the reason for the decrease in total sales this year? (reasoning)
7
What’s the Difference?
A data warehouse is (often) multi-dimensional
Figure 3.10 on page 145
8
Multi-dimensionality in Depth
Star structure
Fact table: Sales Data
Dimensions: Time, Customer, Product, Location
9
An Example in Relational Model
Time (TimeKey, Hour, Date, Week, Month, Quarter, Year)
Product (ProductKey, Product, Brand, Category, ManufacturerCategory)
Location (LocationKey, Store, City, State, Region, Country)
Customer (CustomerKey, Customer, AgeGroup, Gender, CareerGroup)
Sales (TimeKey, CustomerKey, ProductKey, LocationKey, Amount, Quantity, AveUnitPrice)
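As a rough illustration of how such a star schema is queried, the sketch below builds a trimmed-down version in an in-memory SQLite database using Python's standard `sqlite3` module. Only the Time and Location dimensions are included, with a subset of their columns, and all row values are made up for the example. The query joins the Sales fact table to its dimensions and totals Amount by quarter and region.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two of the dimension tables plus the fact table, trimmed to a few columns.
# "Time" and "Date" are quoted so they cannot clash with SQL type names.
cur.executescript("""
CREATE TABLE "Time" (TimeKey INTEGER PRIMARY KEY, "Date" TEXT, Quarter TEXT, Year INTEGER);
CREATE TABLE Location (LocationKey INTEGER PRIMARY KEY, Store TEXT, Region TEXT);
CREATE TABLE Sales (
    TimeKey     INTEGER REFERENCES "Time"(TimeKey),
    LocationKey INTEGER REFERENCES Location(LocationKey),
    Amount      REAL,
    Quantity    INTEGER
);
""")

# Made-up sample data.
cur.executemany('INSERT INTO "Time" VALUES (?, ?, ?, ?)',
                [(1, "2008-01-15", "Q1", 2008), (2, "2008-04-10", "Q2", 2008)])
cur.executemany("INSERT INTO Location VALUES (?, ?, ?)",
                [(1, "Store A", "East"), (2, "Store B", "West")])
cur.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                [(1, 1, 100.0, 2), (1, 2, 50.0, 1), (2, 1, 75.0, 3), (2, 2, 25.0, 1)])

# Analytical query: join the fact table to its dimensions, then aggregate.
totals = cur.execute("""
SELECT t.Quarter, l.Region, SUM(s.Amount)
FROM Sales s
JOIN "Time" t   ON s.TimeKey = t.TimeKey
JOIN Location l ON s.LocationKey = l.LocationKey
GROUP BY t.Quarter, l.Region
ORDER BY t.Quarter, l.Region
""").fetchall()

for quarter, region, amount in totals:
    print(quarter, region, amount)
```

The fact table holds only keys and measures, so adding another dimension to the analysis means adding one more join rather than changing the stored sales data.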
10
Data Mining
Data mining (also called knowledge discovery in databases, KDD): the process and techniques for seeking knowledge (relationships, trends, patterns, etc.) from a large amount of data
  Non-trivial, non-obvious, implicit knowledge
  Extremely large datasets
11
Data Mining Tasks
What does data mining do?
  Estimation/prediction
  Classification/clustering
  Association/affinity grouping
    Market basket analysis in retail
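Market basket analysis can be sketched in a few lines of Python. The baskets below are invented sample data, and the `support` and `confidence` helpers are illustrative names; the two measures themselves are the ones commonly used for association rules, although the slides do not name them. Support is the share of baskets containing both items, and confidence is the share of baskets with the first item that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Invented sample data: each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same (alphabetical) order.
    pair_counts.update(combinations(sorted(basket), 2))


def support(item_a, item_b):
    """Fraction of all baskets that contain both items."""
    pair = tuple(sorted((item_a, item_b)))
    return pair_counts[pair] / len(baskets)


def confidence(antecedent, consequent):
    """Fraction of baskets with `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(support("bread", "milk"))      # 3 of 5 baskets
print(confidence("bread", "milk"))   # 3 of the 4 bread baskets
```

Counting every pair works for a handful of products; real retail datasets are large enough that dedicated algorithms are normally used to prune the candidate pairs.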
12
Data Mining Techniques
Multidimensional analysis (MDA) tools
  OLAP (online analytical processing)
  Slice-and-dice
Statistical tools
  Apply mathematical and statistical models, for example, time series analysis for trends
Artificial intelligence (more in chapter 4)
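Slice-and-dice can be illustrated with a small Python sketch over a toy set of sales facts. The data and the function names `slice_cube`, `dice_cube`, and `roll_up` are hypothetical and do not come from any OLAP product. A slice fixes one dimension to a single value, a dice restricts several dimensions to sets of values, and a roll-up totals the measure along one dimension.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, amount).
DIMENSIONS = ("year", "region", "product")
facts = [
    (2007, "East", "Laptop", 120.0),
    (2007, "West", "Laptop", 80.0),
    (2007, "East", "Phone", 60.0),
    (2008, "East", "Laptop", 150.0),
    (2008, "West", "Phone", 90.0),
    (2008, "West", "Laptop", 70.0),
]


def slice_cube(rows, dimension, value):
    """Slice: keep rows where one dimension equals a single value."""
    i = DIMENSIONS.index(dimension)
    return [r for r in rows if r[i] == value]


def dice_cube(rows, **allowed):
    """Dice: keep rows whose values fall in the allowed set for each named dimension."""
    result = rows
    for dimension, values in allowed.items():
        i = DIMENSIONS.index(dimension)
        result = [r for r in result if r[i] in values]
    return result


def roll_up(rows, dimension):
    """Total the amount (last field of each row) for each value of one dimension."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(float)
    for r in rows:
        totals[r[i]] += r[-1]
    return dict(totals)


sales_2008 = slice_cube(facts, "year", 2008)
east_laptops = dice_cube(facts, region={"East"}, product={"Laptop"})
by_region_2008 = roll_up(sales_2008, "region")
print(by_region_2008)
```

Each operation returns plain rows, so they can be chained: slicing to one year and then rolling up by region answers "how did each region do in 2008?".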
13
Summary
Business intelligence/knowledge comes from data and information
Data warehousing is a popular approach to support OLAP and data mining
Data mining is the process of seeking knowledge from large amounts of data
14
Good Resources
A practitioner's views on data warehousing http://www.dwinfocenter.org/