14 Data Mining

  • 8/6/2019 14 Data Mining

    What is Data Mining?

    Data mining is the process of extracting hidden knowledge from large volumes of raw data.

    The importance of collecting data that reflect your business or scientific activities in order to gain an advantage is now widely recognized. Powerful systems for collecting data and managing it in large databases are in place in all large and mid-range companies. However, the bottleneck in turning this data into success is the difficulty of extracting knowledge about the system you study from the collected data.

    Human analysts with no special tools can no longer make sense of

    enormous volumes of data that require processing in order to make informed

    business decisions.

    Data Mining automates the process of finding relationships and patterns in

    raw data and delivers results that can be either utilized in an automated decision

    support system or assessed by a human analyst.

    Some of the Questions

    What goods should be promoted to this customer?

    What is the probability that a certain customer will respond to a planned promotion?

    Can one predict the most profitable securities to buy or sell during the next trading session?

    Will this customer default on a loan or pay it back on schedule?

    What medical diagnosis should be assigned to this patient?

    How large are the peak loads of a telephone or energy network going to be?

    Why does the facility suddenly start to produce defective goods?

    These are all questions that can probably be answered if the information hidden among megabytes of data in your database can be found explicitly and utilized. Modelling the investigated system and discovering relations that connect variables in a database are the subject of data mining.

    Modern data mining systems learn by themselves from the history of the investigated system, formulating and testing hypotheses about the rules that this system obeys. When concise and valuable knowledge about the system of interest has been discovered, it can and should be incorporated into a decision support system that helps the manager make wise and informed business decisions.

    Why Use Data Mining?

    Data might be one of the most valuable assets of your corporation, but only if you know how to reveal the knowledge hidden in raw data. Data mining allows you to extract diamonds of knowledge from your historical data and predict outcomes of future situations.

    It will help you optimize your business decisions, increase the value of each customer and communication, and improve customers' satisfaction with your services.

    Data that require analysis differ for companies in different industries.

    Examples include:

    Sales and contacts histories.

    Call support data.

    Demographic data on your customers and prospects.

    Patient diagnoses and prescribed drugs data.

    Overall, we use data mining:

    To improve customer acquisition and retention.

    To reduce fraud.

    To identify internal inefficiencies and then revamp operations.

    To map the unexplored terrain of the Internet.

    In all these cases data mining can help you reveal knowledge hidden in data and turn this knowledge into a crucial competitive advantage. Today, more and more companies acknowledge the value of this new opportunity and turn to Megaputer for leading-edge data mining tools and solutions that help optimize their operations and increase their bottom line.

    Tools used in Data Mining

    Neural networks.

    Decision trees.

    Rule Induction.

    Data Visualization.

    What Data Mining Yields

    Data mining usually yields five types of information:

    Associations.

    Sequences.

    Classifications.

    Clusters.

    Forecasting.

    Association

    Associations happen when occurrences are linked in a single event.

    For example, a study of supermarket baskets might reveal that when corn chips are purchased, cola is also purchased 65% of the time, unless there is a promotion, in which case cola is purchased 85% of the time.
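
    A percentage like the one in the basket example is usually called the confidence of a rule. The following is a minimal sketch of how it could be measured, assuming a small made-up list of baskets; the data and the helper names `support` and `confidence` are illustrative assumptions, not figures or tools from the study mentioned above.

```python
def support(baskets, items):
    """Fraction of all baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= set(b)) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    with_antecedent = [b for b in baskets if set(antecedent) <= set(b)]
    if not with_antecedent:
        return 0.0
    both = set(antecedent) | set(consequent)
    return sum(1 for b in with_antecedent if both <= set(b)) / len(with_antecedent)


# Hypothetical toy data, not real supermarket figures.
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "bread"},
    {"corn chips", "milk"},
    {"bread", "milk"},
    {"cola"},
]

rule_support = support(baskets, {"corn chips", "cola"})
rule_confidence = confidence(baskets, {"corn chips"}, {"cola"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

    Running the same calculation separately on baskets bought during a promotion and outside one would give the two different percentages described in the example.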

    Sequences

    In sequences, events are linked over time.

    For example, if a house is bought, then 45% of the time a new oven will be bought within one month, and 60% of the time a new refrigerator will be bought within two weeks.
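
    A sequence rule of this kind can be checked by counting how often one event follows another within a time window. The sketch below assumes a hypothetical event log of (customer, event, day) tuples; the log, the customer names and the function `followed_within` are invented for illustration.

```python
def followed_within(events, first, then, max_days):
    """Share of customers whose `first` event is followed by a `then` event
    no more than `max_days` later."""
    first_day = {}
    for customer, event, day in events:
        if event == first:
            # Keep the earliest occurrence of the triggering event.
            first_day[customer] = min(day, first_day.get(customer, day))
    if not first_day:
        return 0.0
    hits = set()
    for customer, event, day in events:
        if event == then and customer in first_day:
            if 0 <= day - first_day[customer] <= max_days:
                hits.add(customer)
    return len(hits) / len(first_day)


# Hypothetical log: (customer, event, day number).
events = [
    ("a", "house", 0), ("a", "oven", 20),
    ("b", "house", 5), ("b", "oven", 90),
    ("c", "house", 10), ("c", "refrigerator", 15),
    ("d", "oven", 3),
]

oven_rate = followed_within(events, "house", "oven", max_days=30)
fridge_rate = followed_within(events, "house", "refrigerator", max_days=14)
print(f"oven within 30 days: {oven_rate:.0%}, refrigerator within 14 days: {fridge_rate:.0%}")
```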

    Classification

    Classification is probably the most common data mining activity today. It can help you discover the characteristics of customers who are likely to leave and provides a model that can be used to predict who they are. It can also help you determine which kinds of promotions have been effective in keeping which types of customers, so that you spend only as much money as necessary to retain a customer.
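
    As a rough illustration of predicting which customers are likely to leave, the sketch below fits a one-split decision tree (a "decision stump") to a handful of made-up customer records. The features (months as a customer, number of support calls), the labels and the function names are assumptions for this example; a real project would use a full decision-tree or other classification tool.

```python
def fit_stump(rows, labels):
    """Find the (feature index, threshold) split with the fewest training errors.
    Simplification: the stump always predicts True when value <= threshold."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            errors = sum((r[f] <= threshold) != y for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, f, threshold)
    _, feature, threshold = best
    return feature, threshold


def predict(stump, row):
    """Apply the learned split to one record."""
    feature, threshold = stump
    return row[feature] <= threshold


# Hypothetical records: (months as a customer, support calls); label = customer left.
rows = [(2, 5), (4, 1), (6, 4), (12, 0), (24, 2), (36, 1)]
labels = [True, True, True, False, False, False]

stump = fit_stump(rows, labels)
print("split on feature", stump[0], "at threshold", stump[1])
print("new customer (3 months, 2 calls) likely to leave:", predict(stump, (3, 2)))
```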

    Clustering

    Using clustering, the data mining tool discovers different groupings within the data. This can be applied to problems as diverse as detecting defects in manufacturing or finding affinity groups for bank cards.
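
    One common way to discover such groupings is k-means clustering, used here only as an example of the idea and not necessarily the method the tools above rely on. The sketch works on single numbers, such as a hypothetical monthly card spend; the values and starting centers are made up.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on numbers: assign each value to its nearest center,
    then move each center to the mean of its group."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical monthly card spend, in thousands.
values = [1.0, 1.2, 0.8, 9.8, 10.0, 10.2]
centers, groups = kmeans_1d(values, centers=[0.0, 5.0])
print("centers:", [round(c, 2) for c in centers])
print("groups:", groups)
```

    No labels are supplied here: the two groups emerge from the data alone, which is what distinguishes clustering from classification.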

    Potential Applications

    Retail / Marketing.

    Banking.

    Insurance.

    Health Care.

    Transportation.

    Medicine.

    Process of Data Mining

    It consists of three stages:

    Initial exploration.

    Model building or pattern identification, with validation/verification.

    Deployment (i.e., the application of the model to new data in order to generate predictions).

    Exploration

    This stage usually starts with data preparation, which may involve cleaning data, data transformations, selecting subsets of records and, in the case of data sets with large numbers of variables, performing some preliminary feature selection operations to bring the number of variables to a manageable range (depending on the statistical methods that are being considered).
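
    A minimal sketch of two such preparation steps follows: dropping records with missing values, then discarding variables that never vary. The variance filter is just one simple stand-in for feature selection, and the records and field names are hypothetical.

```python
from statistics import pvariance


def clean(records):
    """Drop records that contain a missing value (None)."""
    return [r for r in records if all(v is not None for v in r.values())]


def select_features(records, min_variance):
    """Keep only the variables whose variance exceeds `min_variance`."""
    names = records[0].keys()
    return [n for n in names if pvariance([r[n] for r in records]) > min_variance]


# Hypothetical raw records; one has a missing income, and region_code never changes.
records = [
    {"age": 25, "income": 30.0, "region_code": 1},
    {"age": 40, "income": None, "region_code": 1},
    {"age": 33, "income": 52.0, "region_code": 1},
    {"age": 58, "income": 41.0, "region_code": 1},
]

cleaned = clean(records)
kept = select_features(cleaned, min_variance=0.0)
print(len(cleaned), "records kept; variables kept:", kept)
```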

    Model building and validation

    This stage involves considering various models and choosing the best one based on their predictive performance (i.e., explaining the variability in question and producing stable results across samples).
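
    The idea of choosing between candidate models can be sketched with a hold-out sample: each model is fitted on one part of the data and judged on another part it has not seen. The two candidates (a constant baseline and a least-squares line), the error measure and the numbers are assumptions chosen for brevity.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Least-squares straight line y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x


def mse(model, xs, ys):
    """Mean squared prediction error of `model` on the given sample."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, split into a training sample and a validation sample.
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
valid_x, valid_y = [5, 6], [10.1, 11.9]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: mse(fit(train_x, train_y), valid_x, valid_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print("validation error per model:", scores)
print("selected model:", best)
```

    Repeating the comparison on several different splits is one way to check that the chosen model produces stable results across samples.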

    Deployment

    This final stage involves taking the model selected as best in the previous stage and applying it to new data in order to generate predictions or estimates of the expected outcome.
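
    In practice this often means saving the chosen model's parameters and loading them wherever new data arrives. The sketch below assumes a straight-line model whose parameters are stored as JSON; the parameter values, the storage format and the function `load_model` are illustrative assumptions.

```python
import json

# Hypothetical parameters of a previously selected model: y = intercept + slope * x.
saved = json.dumps({"intercept": 0.15, "slope": 1.94})


def load_model(text):
    """Rebuild the prediction function from its stored parameters."""
    params = json.loads(text)
    return lambda x: params["intercept"] + params["slope"] * x


model = load_model(saved)

# New, previously unseen inputs to score.
new_data = [7, 8, 9]
predictions = [round(model(x), 2) for x in new_data]
print(predictions)
```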
