shekar-chandra
8/6/2019 14 Data Mining
What is Data Mining?
Data mining is the process of extracting hidden knowledge from large
volumes of raw data.
The importance of collecting data that reflect your business or scientific
activities in order to gain an advantage is now widely recognized. Powerful
systems for collecting data and managing it in large databases are in place in all
large and mid-range companies. However, the bottleneck in turning this data
into success is the difficulty of extracting knowledge about the system you
study from the collected data.
Human analysts with no special tools can no longer make sense of
enormous volumes of data that require processing in order to make informed
business decisions.
Data Mining automates the process of finding relationships and patterns in
raw data and delivers results that can be either utilized in an automated decision
support system or assessed by a human analyst.
Some of the Questions
What goods should be promoted to this customer?
What is the probability that a certain customer will respond to a planned
promotion?
Can one predict the most profitable securities to buy or sell during the next
trading session?
Will this customer default on a loan or pay it back on schedule?
What medical diagnosis should be assigned to this patient?
How large will the peak loads of a telephone or energy network be?
Why did the facility suddenly start to produce defective goods?
These are all questions that can probably be answered
if the information hidden among megabytes of data in your database can be found
explicitly and utilized. Modelling the investigated system and discovering
relations that connect variables in a database are the subject of data mining.
Modern data mining systems learn from the previous
history of the investigated system, formulating and testing hypotheses about
the rules which this system obeys. When concise and valuable knowledge about
the system of interest has been discovered, it can and should be incorporated
into a decision support system that helps the manager make wise and
informed business decisions.
Why use Data Mining?
Data might be one of the most valuable assets of your corporation, but only if you
know how to reveal the knowledge hidden in the raw data. Data mining allows you to extract
diamonds of knowledge from your historical data and predict outcomes of future
situations.
It will help you optimize your business decisions, increase the value of each
customer and communication, and improve customer satisfaction with
your services.
Data that require analysis differ for companies in different industries.
Examples include:
Sales and contacts histories.
Call support data.
Demographic data on your customers and prospects.
Patient diagnoses and prescribed drugs data.
Overall, we use data mining:
To improve customer acquisition and retention.
To reduce fraud.
To identify internal inefficiencies and then revamp operations.
To map the unexplored terrain of the Internet.
In all these cases data mining can help you
reveal knowledge hidden in data and turn this knowledge into a crucial
competitive advantage. Today more and more companies
acknowledge the value of this new opportunity and turn to Megaputer
for leading-edge data mining tools and solutions that help optimize
their operations and increase their bottom line.
Tools used in Data Mining
Neural networks.
Decision trees.
Rule Induction.
Data Visualization.
What Data Mining Yields
Data mining usually yields five types of information:
Associations.
Sequences.
Classifications.
Clusters.
Forecasting.
Association
Associations happen when occurrences are linked in a single event.
For example, a study of supermarket baskets might reveal that when corn chips are purchased, cola is also purchased 65% of the time, unless there is a promotion, in which case cola is purchased 85% of the time.
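A figure such as "65% of the time" is usually called the confidence of a rule. The sketch below shows one way it could be computed; the baskets are invented for illustration and are not the study mentioned above.

```python
# Toy baskets; the items and counts are invented for illustration only.
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "salsa"},
    {"corn chips", "bread"},
    {"cola", "bread"},
    {"corn chips", "cola"},
]

def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also contain `consequent`."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0
    return support(baskets, set(antecedent) | set(consequent)) / base

chips_to_cola = confidence(baskets, {"corn chips"}, {"cola"})
print(f"confidence(corn chips -> cola) = {chips_to_cola:.2f}")
```

To compare promoted and non-promoted periods, the same function could be run separately on the baskets from each period.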
Sequences
In sequences, events are linked over time.
For example, if a house is bought, then 45% of the time a new oven will be bought within one month and 60% of the time a new refrigerator will be bought within two weeks.
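As a rough sketch of how such a time-linked rate might be measured, the code below counts how many customers made a follow-up purchase within a given window. The purchase log, the customer ids, and the function name `follow_rate` are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical purchase log: customer id -> list of (item, purchase date).
purchases = {
    "c1": [("house", date(2019, 1, 10)), ("refrigerator", date(2019, 1, 20))],
    "c2": [("house", date(2019, 2, 1)), ("oven", date(2019, 2, 25))],
    "c3": [("house", date(2019, 3, 5))],
    "c4": [("house", date(2019, 4, 1)), ("refrigerator", date(2019, 6, 1))],
}

def follow_rate(purchases, first, then, within_days):
    """Share of customers who bought `first` and then bought `then` within the window."""
    window = timedelta(days=within_days)
    triggered = 0
    followed = 0
    for events in purchases.values():
        first_dates = [d for item, d in events if item == first]
        if not first_dates:
            continue  # this customer never made the triggering purchase
        triggered += 1
        start = min(first_dates)
        if any(item == then and start <= d <= start + window for item, d in events):
            followed += 1
    return followed / triggered if triggered else 0.0

fridge_rate = follow_rate(purchases, "house", "refrigerator", within_days=14)
oven_rate = follow_rate(purchases, "house", "oven", within_days=30)
print(fridge_rate, oven_rate)
```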
Classification
Classification is probably the most common data mining activity today. It can help you discover the characteristics of customers who are likely to leave and provides a model that can be used to predict who they are. It can also help you determine which kinds of promotions have been effective in keeping which types of customers, so that you spend only as much money as necessary to retain a customer.
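To make the idea of a predictive model concrete, here is a minimal sketch of a one-rule classifier (a decision stump, the simplest form of decision tree). The customer records, field layout, and function names are assumptions made up for this illustration; a real tool would consider many variables at once.

```python
# Hypothetical training records: (months_as_customer, support_calls, left_company).
history = [
    (2, 5, True), (3, 4, True), (5, 6, True), (4, 1, False),
    (24, 1, False), (36, 0, False), (18, 2, False), (30, 5, True),
]

def fit_stump(records, feature_index):
    """Find the threshold on one feature that best separates leavers from stayers."""
    best = None
    for threshold in sorted({r[feature_index] for r in records}):
        for leave_if_above in (True, False):
            # Count records where the rule's prediction matches the known outcome.
            correct = sum(
                1 for r in records
                if ((r[feature_index] > threshold) == leave_if_above) == r[2]
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, leave_if_above)
    correct, threshold, leave_if_above = best
    return {
        "feature": feature_index,
        "threshold": threshold,
        "leave_if_above": leave_if_above,
        "accuracy": correct / len(records),
    }

def predict(stump, record):
    """Return True if the customer is predicted to leave."""
    return (record[stump["feature"]] > stump["threshold"]) == stump["leave_if_above"]

stump = fit_stump(history, feature_index=1)  # learn a rule on support_calls
print(stump)
print(predict(stump, (12, 4, None)))  # a new customer with 4 support calls
```

The accuracy reported here is measured on the training records themselves, so it is optimistic; checking the rule on data it has not seen is a separate step.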
Clustering
Using clustering, the data mining tool discovers different groupings within the data. This can be applied to problems as diverse as detecting defects in manufacturing or finding affinity groups for bank cards.
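One common way to discover groupings is k-means; the slides do not name a specific method, so this choice is an assumption. The sketch below runs a plain one-dimensional k-means on invented monthly card-spend figures, with hand-picked starting centers.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on one-dimensional data, starting from the given centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the group of its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group (unchanged if empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical monthly spend per cardholder.
spend = [20, 25, 30, 200, 210, 220]
centers, groups = kmeans_1d(spend, centers=[0, 100])
print(centers)
print(groups)
```

The result depends on the starting centers and on the number of groups asked for, both of which are choices the analyst has to make.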
Potential Applications
Retail / Marketing.
Banking.
Insurance.
Health Care.
Transportation.
Medicine.
Process of Data Mining
The process consists of three stages:
Initial exploration.
Model building or pattern identification, with validation/verification.
Deployment (i.e., the application of the model to new data in order
to generate predictions).
Exploration
This stage usually starts with data preparation, which may involve cleaning data, data transformations, selecting subsets of records
and, in the case of data sets with large numbers of variables, performing some preliminary feature selection operations to bring the number of variables to a manageable range (depending on the statistical methods which are being considered).
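A minimal sketch of two such preparation steps, assuming records are held as dictionaries and missing values are marked with None. The records and function names are hypothetical, and dropping single-valued variables is only one very crude form of feature selection.

```python
# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 34, "income": 52000, "country": "US"},
    {"age": None, "income": 48000, "country": "US"},
    {"age": 45, "income": 61000, "country": "US"},
    {"age": 29, "income": 39000, "country": "US"},
]

def drop_incomplete(records):
    """Cleaning: keep only records with no missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def drop_constant_features(records):
    """Preliminary feature selection: remove variables that take a single value."""
    if not records:
        return []
    keep = [k for k in records[0] if len({r[k] for r in records}) > 1]
    return [{k: r[k] for k in keep} for r in records]

clean = drop_constant_features(drop_incomplete(raw))
print(clean)
```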
Model building and validation
This stage involves considering various models and choosing the best one based on their predictive performance (i.e., explaining the variability in question and producing stable results across samples).
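The sketch below illustrates one simple way to choose between candidate models: fit each on part of the data and compare errors on a held-out part. The two candidates (a constant baseline and a least-squares line), the data, and the split are assumptions chosen for illustration.

```python
def fit_mean(train):
    """Baseline model: always predict the average target seen in training."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg

def fit_line(train):
    """Least-squares straight line y = a + b * x."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    b = sxy / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def mse(model, data):
    """Mean squared prediction error of `model` on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical (x, y) records, split into a training part and a held-out part.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1), (6, 12.0), (7, 13.9), (8, 16.2)]
train, holdout = data[::2], data[1::2]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: mse(fit(train), holdout) for name, fit in candidates.items()}
best_name = min(scores, key=scores.get)
print(scores, best_name)
```

Repeating the comparison over several different splits would give a better sense of whether the winning model is stable across samples.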
Deployment
This final stage involves using the model selected as best in
the previous stage and applying it to new data in order to generate predictions or estimates of the expected outcome.