

Boğaziçi University
Department of Management Information Systems

MIS 463 // Decision Support Systems for Business // Fall ’15

PROJECT FINAL REPORT

B2B Customer Clustering “Taking right action to the right customer”

Project Team No: 10
Akın Güldere, Mustafa Kırca, Veysel Kurhan, Numan Soysal, İrem Belkıs Topçu

Instructor : Aslı Sencer

İstanbul - December, 2015


Table of Contents

I. INTRODUCTION
  I.1 The Decision Environment
  I.2 Mission of Project
  I.3 Scope of Project
  I.4 Methodology

II. LITERATURE SURVEY

III. DEVELOPMENT OF THE DSS
  III.1. DSS Architecture
  III.2 Technical Issues
  III.3 Data Source and Flow Mechanisms
    III.3.1 Data Handling and Data Preparation
    III.3.2 ER Diagram
    III.3.3 Data Flow Diagram- Level 0
    III.3.4 Flow Mechanism
  III.4. Model and Algorithms
  III.5. User Interface and Reports

IV. ASSESSMENT
  IV. 1. Project Plan
  IV. 2. Master Plan

V. CONCLUSION

REFERENCES

APPENDIX
  New Customer Datamart
  Customer Data taken from Oyepa
  Transaction Data taken from Oyepa
  Invoice Data taken from Oyepa
  Products Data taken from Oyepa
  Clustering Results-1
  Clustering Results-2
  Clustering Results-3
  Certain SQL codes
  Certain Algorithm Codes


I. INTRODUCTION

I.1 The Decision Environment

The spare parts industry was established and has grown with the technical and economic support of the main manufacturers, and today it operates with great potential. The Turkish spare parts industry has several strengths: it exports successfully to about 100 countries, it has many workers with strong know-how, and Turkey is home to many globalized firms. Nevertheless, there are some negative aspects. The most significant deficiencies are the lack of a strategy for developing the sector and the lack of communication between firms that could otherwise cooperate. The main reason for these two weaknesses is that the characteristics of the firms and customers doing business in the spare parts industry are not well known. Therefore, a firm that can describe its customers and uncover the hidden knowledge of this industry gains a great advantage. Moreover, for a company in this sector, as in others, taking the right actions toward its customers provides a competitive advantage over other firms in the same sector.

Oyepa is one of the leading companies in the automotive spare parts business, with more than 45 years of import and export experience. With a total area of 24,000 sqm, it currently operates with more than 160 qualified staff at 4 different locations in Turkey. With its long experience, Oyepa now does business with more than 90 globally well-known brands and holds a considerably wide stock range. In parallel with the range of vehicles in the market, the company has extended its business to supply spare parts for key brands, including Mercedes Benz, MAN, Volvo, Scania, DAF, RVI, Iveco and KHD for heavy-duty vehicles, and Mercedes Benz, BMW, Audi, Volkswagen, Seat, Skoda, Porsche, Opel and Land Rover for passenger cars and light commercial vehicles. Oyepa is a firm with B2B relations; that is, its customers are retailers or wholesalers.

Its major product range includes body parts, engine parts, suspension parts, electrical parts and accessories. Its aim is to provide a wide range of products worldwide and to meet its long-term customers' expectations with high-quality products and service. To achieve these goals, Oyepa wants to identify its customers and their behaviors in order to make certain decisions. Within the company, decisions such as giving promotions and deciding how much product to sell to a certain customer are already being made. However, these decisions are not very effective, since the company does not properly know its customers and their current segmentation. It is well known that knowing customers and their needs helps a firm sell more to those customers. As a result, the firm cannot easily decide which customers are more profitable and which are risky to sell to in large amounts. On the other hand, the firm checks the sales amounts and the types of spare parts sold to customers and makes decisions, such as the amount of product to offer in a future sale, according to that information. It also uses this information to decide which kinds of products to invest in or buy from suppliers in the future. Certain related decisions are also made regarding sales abroad and within the country. All of these decisions are made primarily by the heads of different departments and by C-level managers, namely the CEO and CFO; secondarily, some salesmen make them.


Companies attach great importance to their business relations with both customers and suppliers. For intermediary companies like Oyepa, these relations matter even more than for other firms, since much depends on them. Relations are key to survival in their business area: without strong relations, such companies cannot obtain products from suppliers or find customers to sell to. For instance, if a company's relationship or credibility with its supplier is strong, the supplier can provide products immediately or in the quantity the company wants, the company can serve its customers better, and good service may even bring in new customers. Although the firm analyzes its customers yearly to gain some data about their behaviors, it needs to know and take advantage of data in every activity affecting customer relations. To illustrate, when limited-edition products are bought, they are first offered to customers who pay regularly. Therefore, the collected data should be up to date.

As mentioned above, Oyepa has national and international customers, and it has to deal with hundreds of customers together with their information. Moreover, there are 3 main categories and many sub-categories for classifying customers, and choosing the right action for each customer is intricate. This makes decisions more complex. Besides this complexity, even when the firm finds the most suitable way to reach its customers and sell more, it may face constraints such as the customers' own opinions. For instance, a customer may not like the salesman, which can break the relationship, or, as a drawback of word of mouth, a customer may hear something about the firm and stop buying from it.

In the system the firm currently uses, a new column has been added to the customer database, and employees label each customer manually. Moreover, these manually entered data are not updated frequently, so the current level of a customer can be missed. In some cases the data were entered years ago and do not reflect the customer's new position, which leads to wrong decisions about customers. For example, mismatching customers or suppliers, or tagging them with the wrong categories, can result in inappropriate strategies, weaker B2B relations and, even worse, the loss of an important customer. Business relations are the lifeblood of the firm, so these decisions are extremely sensitive, and acting improperly on them is costly.

All in all, in today's world almost all firms make customer-based investments, since relations with customers are highly significant and identifying customers is becoming more and more important. However, firms hold large and growing amounts of data, and using and analyzing such data cannot be done manually. That is why a computerized system is required.

I.2 Mission of Project

Companies want to know more about their customers in order to make better strategic sales plans, because if a company does not know its customers inside out, there is always a chance that its competitors will. And if they do, the company will not stand a chance. Therefore, the main mission of the project is to enable the company to know its customers: where they stand and which group they belong to. The sub-goals are to recognize which types of customers are the most profitable or potentially profitable, and to identify the group of customers who might churn.

“Taking right action to the right customer”


I.3 Scope of Project

This project requires some input from the end user, such as choosing attributes, assigning weights to them and filtering the data to be clustered. Some programs will run in the background; however, the user will only see the clustered customer data, certain graphics and some notes about each customer and its cluster. For this project, the transaction data of Oyepa, about 1.5 million rows including some missing values, are used. The data show which customer buys which products, the total amount of expenditure, which salesman sells to which customers, the dates, and the discounts given to selected customers on selected amounts (Appendix 1). This transaction data, which is related to the customer and invoice data, is used to create a new customer data-mart with selected attributes. These attributes include invoice dates, net sales, total discounts and total expenditure, and they are transformed to create new attributes, namely frequency, recency, monetary and discount rate. As mentioned, there will be checkboxes on the screen that allow the user to choose which attributes to cluster with. The project is limited to the segmentation of customers. The program will connect to the database, gather the data, process it and show the outcome.

I.4 Methodology

Customer data is a gold mine of insights. By mining the data, useful information can be found and used to increase customer loyalty, uncover hidden properties and reduce customer churn. That is why this project uses data mining techniques to deduce meaningful information. Specifically, among data mining techniques, the clustering method is used. Clustering is a means of grouping a set of data in such a way that intra-group similarities and inter-group differences increase. By using this method, customers can be identified easily, their expectations and needs can be estimated, customer profiles can be built, and decisions can be made accordingly. For this model, customer, sales and supplier data were needed as input. It was planned that the data required for this project would be obtained from the company mentioned above. The required data were collected in order to apply the method.

II. LITERATURE SURVEY

Customer segmentation is certainly not a new business practice; however, it has been developed and renewed as newly introduced technologies have made analyzing huge amounts of related data more effective, quicker and easier (Stuntebeck, 2012). It is an undeniable fact that companies rely on their customers; without them, companies lose their main source of profit (Cheng and Chen, 2009). Satisfying customers profitably is at the heart of every good marketing strategy. That means not wasting time or resources on less profitable customers, and treating the key targets not as one homogeneous group but as distinct groups with distinct needs (Hague and Harrison, 2015). It is also a good way to gain a strategic edge over competitors that may be treating customers in a more "one size fits all" fashion. To satisfy customer needs profitably, companies should identify their customers by means of segmenting them in terms of having similar


behaviors and treating customers according to their group classifications. Knowing its customers and how they differ gives a company a clear proposition to light its path. Even though business-to-business (B2B) companies often do not face quite the same avalanche of customer data as their business-to-consumer (B2C) counterparts, the process of partitioning business customers into groups with similar needs or characteristics continues to represent an important way to generate additional value for their customers (Stuntebeck, 2012), since personal relations are more important in B2B markets than in B2C markets. Moreover, the main benefit of segmenting business customers is that it cuts across many functional domains within the company, on both the customer side and the supply side. For instance, a company can make reasonable assumptions about the market at large by using key insights from customer segmentation studies. On the supply side, customer segmentation can support decision-making within fulfillment (Stuntebeck, 2012) and other operational processes, such as the company's further buying decisions.

It can clearly be seen that B2B segmentation is at least as significant as B2C segmentation. B2B segmentation is used here because the main idea of this project arises from the problems of a real company, Oyepa, which has B2B relations and needs to identify its customers better in order to make more profit.

In general, all segmentation work and analytical projects start with data. Data stored in various systems, such as transactions, sales interactions and product supply, were gathered from Oyepa. The literature offers several ways to segment customers into specific classes using such data. The explosive growth in databases has created a need to develop technologies that use information and knowledge intelligently. Therefore, data mining techniques have become an increasingly important research area (Fayyad, Djorgovski, & Weir, 1996). These techniques are the foremost ones when handling huge amounts of data about customers and transactions, and they help to find hidden treasures of information in large databases (Rygielski, Wang, and Yen, 2002). That is why data mining fits this project's aim.

Even with data mining, dealing with massive data is not easy. Before analysis, the data should be cleaned and prepared for use. According to Cheng and Chen (2009), the computing process should start with data preprocessing, which means deleting records that include missing or inaccurate values and eliminating redundant attributes so that clustering can proceed effectively. Birant (2011) also states that data preparation operations such as reducing the number of attributes, outlier detection, normalization, discretization and concept hierarchy generation significantly improve the model, further increasing prediction accuracy and saving elapsed time.

Among data mining techniques such as decision trees, classification, association and regression analysis, clustering is chosen because it is the method used in the literature for grouping both B2B and B2C customers (Kashwan and Velu, 2013; Dhandayudam and Krishnamurthi, 2012; Hosseini, Maleki and Gholamian, 2012). Cluster analysis is a statistical technique used to identify a set of groups that both minimize within-group variation and maximize between-group variation based on a distance or dissimilarity function, and its aim is to find an optimal set of clusters (Witten & Frank, 2005). That is, this exploratory data-analysis technique attempts to reveal natural groupings within a dataset by partitioning the data observations (in this case, business customers) into homogeneous groups based on their proximity to each other. To put it simply, how similar are customers to each other given the behaviors and characteristics available in the dataset (Stuntebeck, 2012)? There are several clustering techniques, and various clustering algorithms are available in existing software packages such as IBM SPSS, Rapid Miner and SAS Enterprise


Miner, and new ones frequently appear in the literature. These methods and algorithms vary depending on how the similarity between observations is defined and on other assumptions about the shapes of clusters, the distributions of variables, etc. (Garla, Chakraborty, Gaeth and Iowa, 2011). In this project, the K-means algorithm is used, since Stuntebeck (2012) indicates that the nonhierarchical method called K-means clustering makes sense for very large datasets, and it has been used extensively in various fields including data mining, statistical data analysis and other business applications (Cheng and Chen, 2009). In this iterative method, the user specifies the number of clusters beforehand, and each observation belongs to one and only one cluster; however, during the iteration process, an observation can shift from one cluster to another. The iteration ends when no observation changes its cluster. The advantages of this method, as mentioned before, are its capability to handle large data sets and to work with compact clusters. Its major limitation is the requirement to specify the number of clusters beforehand (Garla, Chakraborty, Gaeth and Iowa, 2011); the method is also sensitive to outliers and noise in the data. Therefore, to reduce the drawbacks of the k-means algorithm, the data are preprocessed in this project and noise is eliminated.

The next step is deciding which attributes to use. A search of the literature shows that data mining applications based on RFM concepts are widespread. RFM has been proposed not only for business but also for other areas such as computer security (Kim, Im and Park, 2010), the automobile industry (Chan, 2008) and the electronics industry (Chiu, Kuo and Chen, 2009). RFM analysis has been used in marketing for several decades, since it identifies customer behavior and represents its characteristics by three variables: recency, frequency and monetary. RFM analysis is defined as follows (Cheng & Chen, 2009):

- Recency is the interval between the time of the latest consuming behavior and the present; shorter intervals receive bigger values. Therefore, for this project the reverse of recency is used so that it points in the same direction as frequency and monetary.

- Frequency refers to the number of transactions in a particular period, for example, two times per year, two times per quarter or two times per month.

- Monetary value means the amount of money spent in a particular period.
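The three definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout `(customer_id, invoice_date, net_amount)`, the function name `rfm` and the sample rows are hypothetical and are not taken from the Oyepa tables, and each row is assumed to be one transaction.

```python
from collections import defaultdict
from datetime import date


def rfm(transactions, today):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, invoice_date, net_amount).
    Assumption: every row represents exactly one transaction.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, invoice_date, net_amount in transactions:
        # Keep the most recent purchase date seen for this customer.
        if customer_id not in last_purchase or invoice_date > last_purchase[customer_id]:
            last_purchase[customer_id] = invoice_date
        frequency[customer_id] += 1          # number of transactions
        monetary[customer_id] += net_amount  # total amount spent
    return {
        cid: ((today - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# Hypothetical sample rows, not Oyepa data.
rows = [
    ("C1", date(2015, 11, 20), 1200.0),
    ("C1", date(2015, 12, 1), 300.0),
    ("C2", date(2015, 6, 15), 5000.0),
]
scores = rfm(rows, today=date(2015, 12, 31))
```

Here recency is still expressed as days since the last purchase, so a smaller number means a more recent customer; the reversal mentioned above would be applied afterwards.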


III. DEVELOPMENT OF THE DSS

III.1. DSS Architecture

Figure-1: Context Diagram

In our system there is a customer data-mart, which has attributes such as the customer's total expenditure, the amounts of the customer's invoices, location information and so on. The user, who is a sales manager or general manager, enters the system to learn the segments of his/her customers. The system offers five attributes: recency, frequency, net total expense, discount rate and average spending per transaction. The manager chooses two, three or all of these attributes. After selecting attributes, s/he can optionally filter customers by country or city. Then the manager enters the number of clusters, that is, how many segments s/he wants to see. The manager can also assign weights to attributes. For instance, the user might think that the monetary value of a customer is more significant than the other attributes and increase its weight. After gathering this information, the system returns detailed reports and information about the customer clusters to the manager.

III.2 Technical Issues

This project will use several technologies to develop a web-based decision support system. HTML will be used for the main structure of the web part, and CSS will be used to give the website an attractive appearance. For the main part of the system, C# on the .NET platform will be used, since these technologies work together. Microsoft Visual Studio 2013 will be used as the IDE for developing the DSS; this environment allows graphical user interface applications to be built. Microsoft SQL Server will be used to manage the database. Instead of using existing programs and


tools, the clustering algorithm will be implemented directly in the application code.

III.3 Data Source and Flow Mechanisms

III.3.1 Data Handling and Data Preparation

The data received from Oyepa consist of customer information, product information and customer transactions, held in different tables. These tables contain many rows of raw data; the transaction table, for example, has 1.5 million. The ER diagram in Figure-2 shows the tables, their attributes and their relationships with each other.

To apply the clustering method of data mining, a new customer data-mart is created (see Appendix) based on the raw data tables. This new table needs attributes that describe customers' purchasing behaviors so that customers can be clustered. First, columns are created for the customers' monetary value, that is, how much they spent in the last quarter, in the last year and over the whole period, retrieved from the transaction data. Columns are also created for the customers' frequency value, that is, how often they purchased in the last quarter, in the last year and over the whole period, again retrieved from the transaction data. The recency value is set by calculating how many days have passed since the customer's last purchase. This recency column is reversed, because in it low values indicate high recency, whereas for the other attributes the situation is the opposite. For instance, high frequency values indicate high frequency, and for the attributes to point in the same direction during clustering, high recency values should indicate high recency. For each product type (heavy vehicle, light commercial and passenger car), frequency and monetary values are recalculated and new columns are created. Since the total discount column does not make sense on its own, the discount rate is calculated by dividing the total discount value taken from the transaction table by the monetary value. For each customer, the average expense per transaction is calculated by dividing total spending by the total number of transactions.

All in all, to prepare the data, several processes are carried out: data cleaning, to get rid of missing values and correct inconsistencies; data integration, that is, merging data from multiple data sources; data transformation, that is, transforming data into an appropriate format for the given data mining algorithm; and data reduction.
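The derived columns described in this section can be illustrated with a short Python sketch. The field names, the function `derive_attributes` and the sample records are hypothetical. The report does not state how the recency column is reversed, so subtracting from the largest observed value is only one plausible assumption.

```python
def derive_attributes(customers):
    """Build derived clustering attributes from per-customer aggregates.

    customers: list of dicts with the (hypothetical) keys
    id, days_since_last, total_spend, total_discount, transactions.
    """
    # Data cleaning: drop records with missing values or no purchases.
    clean = [
        c for c in customers
        if None not in c.values() and c["transactions"] > 0 and c["total_spend"] > 0
    ]
    if not clean:
        return []
    max_days = max(c["days_since_last"] for c in clean)
    derived = []
    for c in clean:
        derived.append({
            "id": c["id"],
            # Assumed reversal: the most recent buyer gets the highest value.
            "recency": max_days - c["days_since_last"],
            # Total discount divided by monetary value.
            "discount_rate": c["total_discount"] / c["total_spend"],
            # Total spending divided by the number of transactions.
            "avg_spend": c["total_spend"] / c["transactions"],
        })
    return derived


# Hypothetical sample records, not Oyepa data.
sample = [
    {"id": "C1", "days_since_last": 30, "total_spend": 1500.0,
     "total_discount": 150.0, "transactions": 2},
    {"id": "C2", "days_since_last": 199, "total_spend": 5000.0,
     "total_discount": 250.0, "transactions": 1},
    {"id": "C3", "days_since_last": 10, "total_spend": None,
     "total_discount": 0.0, "transactions": 0},
]
prepared = derive_attributes(sample)
```

In this sketch the record with a missing value is simply dropped; the report only says that missing values are removed, so other cleaning strategies are equally possible.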


III.3.2 ER Diagram

Figure-2: ER Diagram

This ER diagram shows the relations between the different data tables. By relating them, the new customer datamart was created, part of which is shown in Table-1.

Table-1: Part of Customer Datamart


III.3.3 Data Flow Diagram- Level 0

Figure-3: DFD Level 0

The general structure of the system is shown in the context diagram; the data flow diagram (DFD) shows the detailed flow of the data. Under the B2B clustering system there are three processes (the system for gathering information from the user, the clustering system and the report system), one entity, the manager, and the customer datamart. As seen in Figure-3, the customer datamart has no incoming flow from any process or entity, since the datamart is created once and data transformation is done only at the beginning. Because there is no second data entry, the customer datamart in the system is read-only.


III.3.4 Flow Mechanism

Figure-4: Flow Chart

The B2B clustering system starts when the manager enters. On the home page, the manager is informed about the system and how to use it. The manager then goes to the ‘Try’ page. S/he filters customers according to certain criteria and selects the attributes to cluster with. Then s/he assigns weights to the selected attributes, enters the iteration number and selects the number of clusters. By clicking the ‘Cluster’ button, s/he runs the system and views the reports. For the clusters s/he chooses, the manager can get detailed reports. The system lets the manager print and save those reports, and then the flow ends.

III.4. Model and Algorithms

When clustering the data, it was noticed that the data distribution is right-skewed and that attributes measured on different scales do not contribute equally to the analysis. Therefore, we applied standardization to equalize the range and/or variability of the data. The standardized value (z-value) of each attribute is calculated by subtracting the attribute's mean from its value and dividing by the standard deviation, that is, z = (x - mean) / standard deviation. For clustering with these attributes, as mentioned before, the k-means technique is chosen. The algorithmic steps for k-means clustering are as follows.

Let X={x1,x2,x3,……..,xn} be the set of data points and V= {v1,v2,…….,vc} be the set of centers.

1) Randomly select ‘c’ cluster centers.2) Calculate the distance between each data point and cluster centers.3) Assign the data point to the cluster center whose distance from the cluster center is minimum of all the cluster centers..4) Recalculate the new cluster center using:  

12

Page 13: Project Format, mis463 - boun.edu.trmisprivate.boun.edu.tr/sencer/mis463/PreviousProjects/…  · Web viewPROJECT FINAL REPORT . ... Volvo, Scania, Daf, Rvi, Iveco ... customer may

where, ‘ci’ represents the number of data points in ith cluster.

5) Recalculate the distance between each data point and new obtained cluster centers.6) If no data point was reassigned then stop, otherwise repeat from step 3). The advantages of k-means clustering; - Fast, robust and easier to understand.- Relatively efficient- Gives best result when data set are distinct or well separated from each other.

The disadvantages of k-means clustering:
- The learning algorithm requires a priori specification of the number of cluster centers.
- The use of exclusive assignment: if there are two highly overlapping groups of data, k-means will not be able to resolve that there are two clusters.
- The learning algorithm is not invariant to non-linear transformations, i.e. different representations of the data give different results (data represented in Cartesian coordinates and in polar coordinates will give different results).
- Unable to handle noisy data and outliers.
- The algorithm fails for non-linear data sets.
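
The standardization and the k-means steps listed above can be sketched as a short program. This is a minimal illustration and not the project's actual code: the function names (standardize, weighted_distance, kmeans), the sample customers, the use of the population standard deviation, and the way attribute weights enter the distance are all assumptions made for this example. The default of 30 iterations mirrors the default value on the ‘Try’ page.

```python
import math
import random


def standardize(rows):
    """Return column-wise z-scores: (value - mean) / standard deviation."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    # Population standard deviation (an assumption); use 1.0 for constant columns.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (the weighting scheme is an assumption)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def kmeans(points, k, max_iter=30, weights=None, seed=0):
    """Plain k-means following steps 1-6; returns (labels, centers)."""
    weights = weights or [1.0] * len(points[0])
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # step 1
    labels = None
    for _ in range(max_iter):
        # Steps 2-3 (and step 5 on later passes): nearest center per point.
        new_labels = [
            min(range(k), key=lambda i: weighted_distance(p, centers[i], weights))
            for p in points
        ]
        if new_labels == labels:  # step 6: nothing was reassigned
            break
        labels = new_labels
        for i in range(k):  # step 4: center = mean of the cluster members
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical customers: [order count, total amount]
customers = [[1, 100], [2, 110], [1.5, 105], [10, 900], [11, 950], [10.5, 920]]
labels, centers = kmeans(standardize(customers), k=2)
print(labels)
```

Standardizing first keeps the amount column, which has the larger raw range, from dominating the distance, which is the reason given above for using z-values.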

III.5. User Interface and Reports

Home Page: When the web site is opened, the page shown below appears. The home page of B2B Clustering gives general information about the system. The header of the home page offers four options: ‘HOME PAGE’, ‘HOW DOES IT WORK’, ‘TRY’ and ‘ABOUT US’. At the end of the explanation, clicking the ‘See How It Works’ link directs the user to the ‘HOW DOES IT WORK’ page.

Figure-5: Home Page

How Does It Work: This page is prepared to guide the user through the B2B Clustering System. Every step is explained to the user on this page.

Figure-6: How Does It Work

Try: This page opens when you click ‘TRY’. This screen lets you filter customers before clustering, select customer attributes and put weights on them, and change the number of iterations and the number of clusters. You can choose some or all of the filters, or skip this part by not selecting any filter; in that case, the system uses all customers in the dataset. As seen in the picture, the default number of iterations is 30, the default number of clusters is 3, and every selected attribute has a default weight of 1.

Figure-7: Try Page

When Region is selected in the Filter part, the user can choose a specific region from the combo box.

Figure-8: Region Filtering

When Time is selected in the Filter part, the user can choose a specific time period from the combo box.

Figure-9: Time Period Filtering

If no attribute is selected and the user tries to cluster, the system returns the error message shown in Figure-10.

Figure-10: Error Message
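
The behaviour of the ‘Try’ page described above (default values, optional filters, and an error when no attribute is selected) can be sketched as follows. This is only an illustration under stated assumptions: the names ClusterRequest and build_request, the attribute names, and the wording of the error message are hypothetical and are not taken from the project's code. Only the defaults (30 iterations, 3 clusters, weight 1) come from the description above.

```python
from dataclasses import dataclass, field

DEFAULT_ITERATIONS = 30  # default number of iterations on the 'Try' page
DEFAULT_CLUSTERS = 3     # default number of clusters
DEFAULT_WEIGHT = 1.0     # default weight of a selected attribute


@dataclass
class ClusterRequest:
    attributes: dict                              # attribute name -> weight
    iterations: int = DEFAULT_ITERATIONS
    clusters: int = DEFAULT_CLUSTERS
    filters: dict = field(default_factory=dict)   # empty means all customers


def build_request(selected, weights=None, iterations=None, clusters=None, filters=None):
    """Validate the user's choices and fill in the default values."""
    if not selected:
        # Hypothetical message text; the real one is shown in Figure-10.
        raise ValueError("Select at least one attribute before clustering.")
    weights = weights or {}
    return ClusterRequest(
        attributes={name: weights.get(name, DEFAULT_WEIGHT) for name in selected},
        iterations=DEFAULT_ITERATIONS if iterations is None else iterations,
        clusters=DEFAULT_CLUSTERS if clusters is None else clusters,
        filters=filters or {},
    )


# Hypothetical attribute names; only one of them gets a custom weight.
request = build_request(["total_amount", "total_quantity"], weights={"total_amount": 2})
print(request)
```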

Results and Reports: For the filters and attributes selected in Figure-11, the results start with Figure-12:

Figure-11: Clustering with Selected Attributes and Filters

The figure below shows how the customers are distributed among the clusters. For instance, cluster 1 contains 81.45% of all customers.

Figure-12: Distribution of Clusters

After scrolling down, the general results are shown. For every cluster, this report shows the number of members and the averages of the selected attributes.

Figure-13: Results Table
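
The figures reported per cluster (number of members, share of all customers, and averages of the selected attributes) can be computed from the cluster labels with a few lines of code. The sketch below is a hedged illustration: the function name summarize_clusters, the attribute names, and the sample values are hypothetical, and it assumes that the averages are taken over the original (non-standardized) attribute values.

```python
from collections import defaultdict


def summarize_clusters(rows, labels, attribute_names):
    """Per cluster: member count, share of all customers (%), attribute averages."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    total = len(rows)
    summary = {}
    for label, members in sorted(groups.items()):
        summary[label] = {
            "members": len(members),
            "share_pct": round(100.0 * len(members) / total, 2),
            "averages": {
                name: sum(col) / len(members)
                for name, col in zip(attribute_names, zip(*members))
            },
        }
    return summary


# Hypothetical data: [total amount, total quantity] per customer and a cluster label each.
rows = [[10, 1], [20, 3], [100, 5], [200, 7]]
labels = [0, 0, 0, 1]
report = summarize_clusters(rows, labels, ["total_amount", "total_quantity"])
for cluster, stats in report.items():
    print(cluster, stats)
```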

For detailed information about a cluster, click on ‘Cluster x’. When ‘Cluster 1’ is chosen, as in this example, the following reports are shown:

Figure-14: Detailed Result-1

On the left-hand side of the picture, you will see the members of the selected cluster. Members are shown with their Customer ID and unique Code. On the right-hand side of the picture, the top 3 cities of these customers are shown in a 3D bar chart. In this case, because the customers are filtered by city (Istanbul), only Istanbul is seen.

Figure-15: Detailed Result-2

This picture is a continuation of the previous picture. On the right-hand side, the percentage of quantity is shown by vehicle type, namely passenger cars, light commercial vehicles, and heavy-duty vehicles. For instance, in Cluster 1, spare parts for passenger cars account for the largest share of the quantity bought, at 61.01%.

Figure-16: Detailed Result-3

This picture is a continuation of the previous picture. On the right-hand side, the percentage of money spent on each vehicle type is shown. For instance, the previous picture shows that customers mostly buy spare parts for passenger cars, but in this picture the largest share of the money they pay goes to heavy-duty vehicles. This may be because spare parts for heavy-duty vehicles are more expensive than the others. As seen in the picture, by clicking the ‘Back’ button, you can go back to the general results page and see the detailed reports of other clusters.

About Us: This page describes the Kulustur Team and its members, and at the end of the page there is information about Oyepa.

IV. ASSESSMENT

Akın Güldere, M. Mustafa Kırca, Veysel Kurhan, Numan Soysal, İrem Belkıs Topçu

Team Creation X X X X
Project Definition X X
Project Proposal
1st Presentation X
Literature Review
Development of Model X X
Mid Report X X X X
2nd Presentation X
Data Collection X
Data Handling X X X X
Data Preparation X X X X
Data Integration X X X X
Database Development X X X X
GUI Design X X
Website Design X X
Web Contents Preparation X X X
System Development X
Final Report X X X X
Final Presentation X X X X

Handling the data took much more time than we expected. Since data preparation has huge significance, during this period Mr. Bertan Badur and Miss Gonca Gülser led us to the right path. Thanks to them, we were able to see what should be done for certain data mining and coding methods. The steps were carried out in the order mentioned in the project plan. Some delays occurred because of the situation of team members who were working and because of some unexpected situations; however, the project continued step by step to the end. Finally, the algorithms were coded and the web page was built: in that phase, the clustering algorithm was coded and an interface was constructed for this clustering tool. Designing the web page was an important issue for us because we wanted it to be attractive and user-friendly. We stayed connected with each other more than ever because this was the final step. Our WhatsApp group and Google Drive folder were really helpful throughout. Whenever something changed regarding the project, we kept in touch. The meeting place and time stayed the same except for one or two meetings. Our team was strong in terms of motivating each other. We always tried to do our best, and whenever someone got down, the others helped him/her to stand up. Our biggest challenge was living in different parts of Istanbul. Therefore, when an extra meeting was needed, it was hard for us to gather at an early time. Managing the project was not a problem since every one of us knew what should be done in a structured way. For upcoming projects, it would be better to live in a close neighborhood and to have the same classes. If we had a new DSS project, we would allocate much more time to data handling.

IV. 1. Project Plan

Meeting Place: Ağaç Ev
Meeting Time: Every Tuesday at 10.00 am
Coordinator: İrem Belkıs Topçu
Task Allocation:

IV. 2. Master Plan

Project Group No: 10
Project Title: B2B Customer Clustering
Team Members: Akın Güldere, M. Mustafa Kırca, Veysel Kurhan, Numan Soysal, İrem Belkıs Topçu

Phase | Planned Start | Planned Finish | Actual Start | Actual Finish | Complete % | Problems
Team Formation | 1 Oct | 9 Oct | 1 Oct | 7 Oct | 100% |
Project Proposal | 21 Oct | 24 Oct | 21 Oct | 25 Oct | 100% |
Presentation | 25 Oct | 27-28 Oct | 25 Oct | 26-28 Oct | 100% |
Literature Review (Library, Web, former studies) | 2 Nov | 10 Nov | 8 Nov | 18 Nov | 100% |
Data Collection and Organization | 5 Nov | 17 Nov | 5 Nov | 20 Nov | 100% |
Development of the model | 10 Nov | 23 Nov | 13 Nov | 21 Nov | 100% |
Midreport | 7 Nov | 23 Nov | 15 Nov | 22 Nov | 100% |
Presentation | 17 Nov | 23-25 Nov | 23 Nov | 24 Nov | 100% |
Coding interfaces | 29 Nov | 15 Dec | 29 Nov | 19 Dec | 100% |
Final Report | 24 Nov | 18 Dec | 10 Dec | 20 Dec | 100% |
Presentation | 10 Dec | 18 Dec | 19 Dec | 22 Dec | 100% |

V. CONCLUSION

In this project, ‘B2B Customer Clustering’, a Decision Support System is developed to segment Oyepa’s customers by using k-means clustering, which is one of the most common data mining techniques. After a literature review was done in order to deeply understand clustering together with other data mining techniques, as well as other research on the B2B market and customer segmentation, data were gathered from Oyepa and prepared for use. Since this project uses data mining techniques, handling the data has huge significance. Data pre-processing took time and effort because the dataset was large, with almost 1.5 million rows. The data were processed starting with data cleaning and continuing with data integration, data transformation and data reduction. Once the data were prepared, a new datamart was created and made ready for clustering. First, RapidMiner was used to process the data and to understand the clustering logic, because two of the group members wrote the code and algorithms themselves; the results of the RapidMiner clustering can be seen in the Appendix. With the path almost clear, website preparation started. The web site was shaped by trying different templates. An appropriate template related to the business sector was selected and customized for this project. In order to have a user-friendly web site, the ‘How Does It Work’ part was created and instructions were added to it. Moreover, filters and some options in the ‘Try’ part enable the user to make changes to the clustering. Although it has default values, the clustering system in this project does not work with constant choices: the user can filter customers, choose attributes and put weights on them, and change the number of iterations and the number of clusters. Among other programs and systems, some apply clustering methods, like IBM SPSS and RapidMiner; however, they are not specific to one area and do not give detailed reports as this system does. That is why this project can be considered unique. That is, dedicated systems should be created in order to take strategic decisions easily in specific business areas. This B2B Clustering System enables Oyepa to see the clusters of its customers and detailed information about them. Besides all these benefits, the deficiency of this project is that, because Oyepa does not allow connecting the customer dataset directly to the system, customer data must be entered manually and updated at specific time intervals. That is why, for future systems, specific programs for specific areas will be beneficial.

REFERENCES

Birant, D. (2011). Data Mining Using RFM Analysis. INTECH Open Access Publisher.

Chan, C.C.H. (2008). Intelligent value-based customer segmentation method for campaign management: A case study of automobile retailer, Expert Systems with Applications.

Cheng, C. H., & Chen, Y. S. (2009). Classifying the segmentation of customer value via RFM model and RS theory. Expert systems with applications.

Chiu, C-Y.; Kuo, I-T. & Chen, P-C. (2009). A market segmentation system for consumer electronics industry using particle swarm optimization and honey bee mating optimization, Global Perspective for Competitive Enterprise, Economy and Ecology, Springer London.

Dhandayudam, P., & Krishnamurthi, D. I. (2012). An improved Clustering Algorithm for customer segmentation. International Journal of Engineering Science and Technology.

Fayyad, U., Djorgovski, S. G., & Weir, N. (1996). Automating the analysis and cataloging of sky surveys. In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining (pp. 471–494). Cambridge, MA: MIT Press.

Garla, S., Chakraborty, G., Gaeth, U. G., & Iowa, U. S. (2011). Comparison of K-means, Normal Mixtures and Probabilistic-D Clustering for B2B Segmentation using Customers’ Perceptions. In Conference Proceedings of the SAS Global Forum.

Hague, P. & Harrison, M. (2015). Segmentation in B2B Markets.

Hosseini, S. M. S., Maleki, A., & Gholamian, M. R. (2010). Cluster analysis using data mining approach to develop CRM methodology to assess the customer loyalty. Expert Systems with Applications.

Kashwan, K. R., & Velu, C. M. (2013). Customer Segmentation Using Clustering and Data Mining Techniques. International Journal of Computer Theory and Engineering.

Kim, H. K.; Im, K. H. & Park, S. C. (2010). DSS for computer security incident response applying CBR and collaborative response, Expert Systems with Applications.

Rygielski, C., Wang, J. C., & Yen, D. C. (2002). Data mining techniques for customer relationship management. Technology in society.

Stuntebeck, V. A. (2012). B2B customer segmentation: Important considerations when segmenting business customers. IBM Corporation.

Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques (2nd ed.). USA: Morgan Kaufmann Publishers.

APPENDIX

New Customer Datamart

Customer Data taken from Oyepa

Transaction Data taken from Oyepa

Invoice Data taken from Oyepa

Products Data taken from Oyepa

Clustering Results-1
Before data handling, with 5 clusters selected, using RapidMiner

Clustering Results-2
After data handling, BEFORE standardization, with 5 clusters selected, using RapidMiner

Clustering Results-3
After data handling, AFTER standardization, with 5 clusters selected, using RapidMiner
