7/30/2019 Data Mining & Warehousing- Presentation
NADEEM . A 111A1012
JAYADURGA .S 111A1009
RAGINI .S 111A1013
WHICH ARE OUR LOWEST/HIGHEST MARGIN CUSTOMERS?
WHO ARE MY CUSTOMERS AND WHAT PRODUCTS ARE THEY BUYING?
WHICH CUSTOMERS ARE MOST LIKELY TO GO TO THE COMPETITORS?
WHAT IMPACT WILL NEW PRODUCTS OR SERVICES HAVE ON REVENUE AND MARGINS?
WHAT PRODUCT PROMOTIONS HAVE THE BIGGEST IMPACT ON REVENUE?
WHAT IS THE MOST EFFECTIVE DISTRIBUTION CHANNEL?
I CAN'T GET THE DATA I NEED: NEED AN EXPERT TO GET THE DATA
I CAN'T UNDERSTAND THE DATA I FOUND: AVAILABLE DATA POORLY DOCUMENTED
I CAN'T USE THE DATA I FOUND: RESULTS ARE UNEXPECTED; DATA NEEDS TO BE TRANSFORMED FROM ONE FORM TO ANOTHER
I CAN'T FIND THE DATA I NEED: DATA IS SCATTERED OVER THE NETWORK; MANY VERSIONS, SUBTLE DIFFERENCES
1960S: DATA COLLECTION, DATABASE CREATION, IMS AND NETWORK DBMS
1970S: RELATIONAL DATA MODEL, RELATIONAL DBMS IMPLEMENTATION
1980S: RDBMS, ADVANCED DATA MODELS (EXTENDED-RELATIONAL, OO, DEDUCTIVE, ETC.) AND APPLICATION-ORIENTED DBMS (SPATIAL, SCIENTIFIC, ENGINEERING, ETC.)
1990S-2000S: DATA MINING AND DATA WAREHOUSING, MULTIMEDIA DATABASES, AND WEB DATABASES
The data warehouse is that portion of an overall Architected Data Environment that serves as the single integrated source of data for processing information.
Data explosion problem
Automated data collection tools and mature database
technology lead to tremendous amounts of data stored
in databases, data warehouses and other information
repositories
We are drowning in data, but starving for knowledge!
Solution: Data warehousing and data mining
Extraction of interesting knowledge (rules, regularities,
patterns, constraints) from data in large databases
DATA WAREHOUSE
Subject Oriented
Integrated
Non-Volatile
Time Variant
Accessible
Process Oriented
DATA MART
STAGING AREA
OLAP
OLAP TOOLS
[Figure: diagram with the labels Operational, External & other Databases; Data Acquisition; Warehouse Design; Data Management; Metadata Management; Enterprise Warehouse; Data Marts; Analytical Data Store; Metadata Directory; Metadata Repository; Data Analysis; Web Information Systems]
The data warehouse is distinctly different from the operational data used and maintained by day-to-day operational systems. Data warehousing is not simply an access wrapper for operational data, where data is simply dumped into tables for direct access.
OPERATIONAL DATA
Application oriented
Detailed
Accurate, as of the moment of access
Serves the clerical community
Performance sensitive (immediate response required when entering a transaction)
Flexible structure; variable contents
Small amount of data used in a process
DATA WAREHOUSE
Subject oriented
Summarized
Represents values over time
Serves the managerial community
Performance relaxed (immediacy not required)
Static structure
Large amount of data used in a process
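The contrast between detailed operational records and summarized warehouse values over time can be sketched in a few lines of Python. This is only an illustration: the transaction records, the field names and the `summarize_by_month` helper are assumptions made for the example, not part of the presentation.

```python
from collections import defaultdict

# Hypothetical operational records: one detailed row per transaction.
transactions = [
    {"date": "2019-01-05", "product": "tea", "amount": 4.0},
    {"date": "2019-01-19", "product": "tea", "amount": 6.0},
    {"date": "2019-02-02", "product": "tea", "amount": 5.0},
    {"date": "2019-02-10", "product": "coffee", "amount": 8.0},
]


def summarize_by_month(rows):
    """Roll detailed rows up into (month, product) totals."""
    totals = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # keep only "YYYY-MM"
        totals[(month, row["product"])] += row["amount"]
    return dict(totals)


# Summarized, time-oriented view of the kind a warehouse would hold.
monthly_sales = summarize_by_month(transactions)
for (month, product), total in sorted(monthly_sales.items()):
    print(month, product, total)
```

The operational list answers "what happened in this transaction?", while the summary answers "how did sales of a product change from month to month?".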
Data mining (knowledge discovery in databases): Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases
Process of finding different patterns or correlations among the data in large relational databases.
Popular and widely used in the INFORMATION INDUSTRY
Market Analysis And Management
Corporate Analysis And Risk Management
Fraud Detection And Management
Where are the data sources for analysis? Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies
Target marketing: Find clusters of model customers who share the same characteristics (interest, income level, spending habits, etc.)
Determine customer purchasing patterns over time, e.g., conversion of a single to a joint bank account (marriage, etc.)
Cross-market analysis: Associations/correlations between product sales; prediction based on the association information
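As a rough illustration of associations between product sales, the sketch below computes support and confidence for pairs of products bought together. The basket data and the `pair_rules` helper are hypothetical, and real association mining (for example with the Apriori algorithm) handles larger itemsets and applies minimum thresholds.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets: the set of products in each sale.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]


def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for the rule 'a implies b'."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        pair_counts.update(combinations(sorted(basket), 2))
    n = len(baskets)
    rules = {}
    for (a, b), count in pair_counts.items():
        # support: share of all baskets containing both items
        # confidence: share of baskets with the left-hand item that also
        # contain the right-hand item
        rules[(a, b)] = (count / n, count / item_counts[a])
        rules[(b, a)] = (count / n, count / item_counts[b])
    return rules


rules = pair_rules(baskets)
support, confidence = rules[("milk", "butter")]
print(f"milk -> butter: support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence, such as "customers who buy milk also buy butter" in this toy data, is the kind of association information that a prediction could be based on.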
Customer profiling
Data mining can tell you what types of customers buy what products (clustering or classification)
Identifying customer requirements
Identifying the best products for different customers
Use prediction to find what factors will attract new customers
Provides summary information
Various multidimensional summary reports
Statistical summary information (data central tendency and variation)
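A statistical summary of central tendency and variation per customer type might look like the following sketch. The customer segments, the amounts and the `summarize` helper are invented for illustration; the population standard deviation is used here, although a sample standard deviation would be an equally reasonable choice.

```python
import statistics
from collections import defaultdict

# Hypothetical purchase records: (customer segment, purchase amount).
purchases = [
    ("student", 10.0), ("student", 14.0), ("student", 12.0),
    ("retiree", 30.0), ("retiree", 50.0),
]


def summarize(purchases):
    """Compute mean, median and standard deviation per segment."""
    by_segment = defaultdict(list)
    for segment, amount in purchases:
        by_segment[segment].append(amount)
    return {
        segment: {
            "mean": statistics.mean(amounts),      # central tendency
            "median": statistics.median(amounts),  # central tendency
            "stdev": statistics.pstdev(amounts),   # variation
        }
        for segment, amounts in by_segment.items()
    }


summary = summarize(purchases)
for segment, stats in summary.items():
    print(segment, stats)
```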
Finance planning and asset evaluation: Cash flow analysis and prediction; analysis of trends, financial ratios and market value
Resource planning: Summarize and compare the resources and spending
Competition: Monitor competitors and market directions; group customers into classes and apply a class-based pricing procedure; set pricing strategy in a highly competitive market
Applications: Widely used in health care, retail, credit card services, telecommunications (phone card fraud), etc.
Approach: Use historical data to build models of fraudulent behavior and use data mining to help identify similar instances
Examples:
Auto insurance: detect a group of people who stage accidents to collect on insurance
Money laundering: detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network)
Medical insurance: detect professional patients and rings of doctors and rings of references
Detecting inappropriate medical treatment: Australian Health Insurance Commission identified that in many cases blanket screening tests were requested (saving Australian $1m/yr.).
Detecting telephone fraud: Telephone call model: destination of the call, duration, time of day or week. Analyze patterns that deviate from an expected norm.
British Telecom identified discrete groups of callers with frequent intra-group calls, especially mobile phones, and broke a multimillion dollar fraud.
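One simple way to look for calls that deviate from an expected norm is a z-score test against historical call durations. The sketch below is an assumption for illustration, not the method used by any of the organizations named above; the durations, the threshold and the `flag_unusual_calls` helper are hypothetical, and it assumes the history contains some variation.

```python
import statistics


def flag_unusual_calls(history, new_calls, threshold=3.0):
    """Flag durations more than `threshold` standard deviations from the norm."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)  # assumes the history is not constant
    flagged = []
    for duration in new_calls:
        z_score = (duration - mean) / spread
        if abs(z_score) > threshold:
            flagged.append(duration)
    return flagged


# Hypothetical call durations in minutes for one subscriber.
history_minutes = [3, 4, 5, 4, 3, 5, 4, 4]
unusual = flag_unusual_calls(history_minutes, [4, 6, 45])
print(unusual)
```

A fuller call model would also use the destination and the time of day or week, as the slide describes.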
Sports: IBM Advanced Scout analyzed NBA game statistics (shots blocked, assists, and fouls) to gain competitive advantage for New York Knicks and Miami Heat
Astronomy: JPL and the Palomar Observatory discovered 22 quasars with the help of data mining
Internet Web Surf-Aid: IBM Surf-Aid applies data mining algorithms to Web access logs for market-related pages to discover customer preference and behavior pages, analyzing effectiveness of Web marketing, improving Web site organization, etc.
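To give a flavor of mining Web access logs, the sketch below counts how many distinct visitors reached each page. It does not reproduce Surf-Aid; the log format ("visitor path"), the sample lines and the `page_popularity` helper are assumptions made for this example.

```python
from collections import Counter

# Hypothetical access log: "<visitor id> <requested path>" per line.
log_lines = [
    "u1 /products/tea",
    "u2 /products/tea",
    "u1 /products/coffee",
    "u3 /about",
    "u2 /products/tea",
]


def page_popularity(lines):
    """Count distinct visitors per page from simple access-log lines."""
    visitors_per_page = {}
    for line in lines:
        visitor, path = line.split()
        visitors_per_page.setdefault(path, set()).add(visitor)
    return Counter({path: len(v) for path, v in visitors_per_page.items()})


popularity = page_popularity(log_lines)
print(popularity.most_common(1))
```

Counting distinct visitors rather than raw hits keeps one visitor's repeated requests from inflating a page's apparent popularity.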
Other Applications
Text mining (news group, email, documents)
Stream data mining
Web mining.
DNA data analysis