ISI 2009: August 16-22, Durban, South Africa
Consumer-based market segmentation based on Association Rule and RFM
JongHoo Choi, ChunKyung Cha*
(Korea University)
1. Background and Purpose
2. Analyzed Dataset
3. RFM
4. Association Rule
5. Association Rule based on the RFM
6. Conclusion and discussion
1. Background and Purpose
Statistical approaches are increasingly required to set up marketing strategies founded on databases of buying history
Customer segmentation is necessary for CRM (Customer Relationship Management)
Association rules based on the RFM method can yield differentiated marketing strategies tailored to each segment’s characteristics
The purpose of this study is to propose a new method for customer segmentation marketing based on RFM (Recency-Frequency-Monetary) and association rules
2. Analyzed dataset
The dataset used for this study is the buying history of solution products manufactured by company ‘I’ from 2003 to 2005
The dataset is composed of ‘information of buying company’ and ‘product list of purchase’
Information of buying company consists of ‘recent buying quarter’, ‘sales’ and ‘frequency of buying item’
The total size of the dataset is 3,886 companies
The 6 variables, the buying items list of each company, and the 12 buying items used for analysis are presented in Tables 2.1, 2.2 and 2.3, respectively
Variables    Description
ID           Identification number
Emp_count    Number of employees of a company
Zip          Region of a company (1: Seoul; 2: KangWon; 3: Daejeon, ChungCheong; 4: Incheon, KyungGi; 5: Kwangju, JeonLa; 6: Pusan, Ulsan, KyungNam, JeJu; 7: Daegu, KyungBuk)
Recency      Buying quarter of 12 items
Revenue      Buying amount from 2003 to 2005
Product      Buying frequency of 12 items
Table 2.1: 6 Variables used for analysis
Table 2.2: Buying items list of each company (part of a list)
S/W
  Information Management   Data management and integration
  Lotus                    Portal/Cooperation/Business Messaging
  Rational                 Software development
  Tivoli                   Secure/System/Storage management
  WebSphere                Application server and business integration
  DB2                      Database management
  ITS                      System environment optimization service
  AIM                      Infra software for e-Business
H/W
  pSeries                  Unix server
  iSeries                  Integration server
  xSeries                  Intel based server
  Storage                  Storage
Table 2.3: 12 Buying items
3. RFM
RFM is the most commonly used method for customer segmentation
Scoring for customer segmentation in RFM proceeds by a linear combination of three indicators: ‘Recency’, ‘Frequency’ and ‘Monetary’
We quantify ‘Recency’, ‘Frequency’ and ‘Monetary’ individually and then add up the three indicators, weighting the R, F and M values:
RFM = A×Recency + B×Frequency + C×Monetary
where A, B and C are the weights
Assigning the weights is the critical issue in RFM
3.1 Scoring
We use Pareto’s rule 1) to solve the problem of assigning weights in RFM
We obtain the weights from the R, F and M ratios of the upper 20% of customers ranked by total sales amount
1) Pareto’s rule: in theory, the upper 20% of customers account for 80% of total sales
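The ratio for the upper 20% of customers can be sketched as follows. This is a minimal illustration only: the function name `upper_share`, the rounding of the group size and the toy numbers are assumptions, not taken from the study.

```python
def upper_share(amounts, values, top_fraction=0.2):
    """Share (%) of `values` held by the customers ranked in the top
    `top_fraction` by buying amount (hypothetical helper)."""
    if not amounts or len(amounts) != len(values):
        raise ValueError("amounts and values must be non-empty and equally long")
    # Rank customers by buying amount, largest first.
    order = sorted(range(len(amounts)), key=lambda i: amounts[i], reverse=True)
    # Assumption: the group size is the rounded fraction, with at least 1 customer.
    n_top = max(1, round(len(amounts) * top_fraction))
    top = order[:n_top]
    return 100.0 * sum(values[i] for i in top) / sum(values)


# Toy data for five hypothetical customers (not from the dataset).
amounts = [900, 50, 20, 10, 20]   # total buying amount per customer
periods = [12, 3, 6, 1, 2]        # buying period per customer

share_amount = upper_share(amounts, amounts)  # share of the total buying amount
share_period = upper_share(amounts, periods)  # share of the total buying period
```

The same helper can be applied to buying frequency to obtain the F ratio.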
Customer’s score = 0.3×Recency + 0.2×Frequency + 0.5×Monetary
The weights are the normalized ratios of the R, F and M values held by the upper 20% of customers, ranked by buying amount (the remainder 80% of customers hold the rest of each total)
R ratio = (The buying period of upper 20% customers) / (Total buying period) = 53.6%
F ratio = (The buying frequency of upper 20% customers) / (Total buying frequency) = 37.2%
M ratio = (The buying amount of upper 20% customers) / (Total buying amount) = 90.2%
• Weight of R value = 53.6 / (53.6 + 37.2 + 90.2) ≈ 0.3
• Weight of F value = 37.2 / (53.6 + 37.2 + 90.2) ≈ 0.2
• Weight of M value = 90.2 / (53.6 + 37.2 + 90.2) ≈ 0.5
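The normalization of the three ratios into weights can be checked with a short sketch. The ratios 53.6, 37.2 and 90.2 are the values from the slide; the function name `normalize_weights` is only illustrative.

```python
def normalize_weights(ratios):
    """Divide each ratio by the sum of all ratios so the weights add up to 1."""
    total = sum(ratios.values())
    return {name: value / total for name, value in ratios.items()}


# Ratios of the upper 20% customers as given on the slide.
ratios = {"R": 53.6, "F": 37.2, "M": 90.2}

weights = normalize_weights(ratios)
# Rounded to one decimal place, as used in the customer's score.
rounded = {name: round(w, 1) for name, w in weights.items()}
```

Rounding to one decimal keeps the weights easy to read while their sum stays at 1.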
The RFM model is not meant to attract new customers but to operate efficiently by segmenting existing customers
RFM supports concentrated marketing actions aimed at loyal customers
Basically, the RFM method is more useful for creating profits than for increasing sales amounts
3.2 Set up R,F and M values
Recent buying period   R-value   Freq    %       Cum.%
1, 2, 3Q               1         203     5.2     5.2
4, 5, 6Q               2         349     9.0     14.2
7, 8Q                  3         408     10.5    24.7
9, 10Q                 4         894     23.0    47.7
11Q                    5         551     14.2    61.9
12Q                    6         1,481   38.1    100.0
Total                            3,886   100.0
Table 4.2: Set up R-value and distribution based on the recent buying period (Q: Quarter)
Total buying frequency   F-value   Freq    %       Cum.%
1 time                   1         1,344   34.6    34.6
2 times                  2         821     21.1    55.7
3 times                  3         502     12.9    68.6
4 times                  4         347     8.9     77.6
5~7 times                5         642     16.5    94.1
8 times or more          6         230     5.9     100.0
Total                              3,886   100.0
Table 4.3: Set up F-value and distribution based on the total buying frequency
Total buying amount     M-value   Freq    %       Cum.%
~ $10,000               1         824     21.2    21.2
$10,000 ~ $30,000       2         583     15.0    36.2
$30,000 ~ $70,000       3         554     14.3    50.5
$70,000 ~ $150,000      4         536     13.8    64.3
$150,000 ~ $500,000     5         706     18.2    82.4
$500,000 ~              6         683     17.6    100.0
Total                             3,886   100.0
Table 4.4: Set up M-value and distribution based on the total buying amount
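The class boundaries of Tables 4.2 to 4.4 can be sketched as lookup functions. The function names are hypothetical, and two points are assumptions because the tables do not state them: a frequency of exactly 8 is placed in F-value 6, and an amount lying exactly on a boundary (for example $10,000) is placed in the higher class.

```python
from bisect import bisect_left, bisect_right

R_UPPER = [3, 6, 8, 10, 11, 12]                     # last quarter of each R class
F_UPPER = [1, 2, 3, 4, 7]                           # largest frequency of F classes 1..5
M_LOWER = [10_000, 30_000, 70_000, 150_000, 500_000]  # lower bounds of M classes 2..6


def r_value(recent_quarter):
    """Map the most recent buying quarter (1..12) to an R-value (1..6)."""
    if not 1 <= recent_quarter <= 12:
        raise ValueError("quarter must be between 1 and 12")
    return bisect_left(R_UPPER, recent_quarter) + 1


def f_value(frequency):
    """Map the total buying frequency (>= 1) to an F-value (1..6)."""
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    return bisect_left(F_UPPER, frequency) + 1


def m_value(amount):
    """Map the total buying amount in dollars to an M-value (1..6).
    Assumption: an amount equal to a boundary falls in the higher class."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return bisect_right(M_LOWER, amount) + 1
```

For instance, a company whose latest purchase was in quarter 9, with 5 purchases worth $200,000 in total, would be coded as (R,F,M) = (4,5,5) under these assumptions.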
As Tables 4.2 to 4.4 show, the R, F and M values are each classified into 6 segments based on the recency of the customer’s latest purchase, the customer’s total buying frequency and the customer’s total buying amount, respectively
Consequently, the R, F and M values are quantified as integers from 1 to 6
They can be converted into a standardized value and finally expressed as the RFM score.
The general equation is presented as follows.
3.3 Customer segmentation using RFM
Customer’s score = 0.3×Recency + 0.2×Frequency + 0.5×Monetary, (R,F,M) = (1,1,1) ~ (6,6,6)
RFM score = (Customer’s score×100)/6
ex> If (R,F,M) = (6,6,6), then
Customer’s score = 0.3×6 + 0.2×6 + 0.5×6 = 6
RFM score = (6×100)/6 = 100
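The two equations above can be expressed as a small sketch. The weights 0.3, 0.2 and 0.5 and the divisor 6 come from the slide; the function names and the input check are illustrative assumptions.

```python
WEIGHT_R, WEIGHT_F, WEIGHT_M = 0.3, 0.2, 0.5  # weights from the slide
MAX_VALUE = 6                                  # R, F and M each range from 1 to 6


def customer_score(r, f, m):
    """Weighted sum of the R, F and M values (each an integer from 1 to 6)."""
    for value in (r, f, m):
        if not 1 <= value <= MAX_VALUE:
            raise ValueError("R, F and M values must be between 1 and 6")
    return WEIGHT_R * r + WEIGHT_F * f + WEIGHT_M * m


def rfm_score(r, f, m):
    """Customer's score rescaled so that (6,6,6) corresponds to 100."""
    return customer_score(r, f, m) * 100 / MAX_VALUE


best = rfm_score(6, 6, 6)    # the example from the slide
lowest = rfm_score(1, 1, 1)  # lower end of the range
```

Because floating-point sums can differ from the exact value in the last digits, scores are best compared with a small tolerance or after rounding.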
RFM score   Frequency   %      Cum.%
42          77          2.0    17.7
:           :           :      :
50          120         3.1    23.7
52          21          0.5    23.8
:           :           :      :
60          90          2.3    49.2
62          85          2.2    49.4
:           :           :      :
70          92          2.4    64.3
72          86          2.2    64.5
:           :           :      :
80          84          2.2    79.5
82          13          0.3    79.6
:           :           :      :
90          28          0.7    84.9
92          42          1.1    86.0
:           :           :      :
98          14          0.4    97.1
100         112         2.9    100
Marked on the table: the cutoff point of the best group
Marked on the table: the cutoff point of the better group of the upper 20% customers
4. Association Rule
Support(%)   Confidence(%)   Frequency   Association rule
23.91        71.89           931         ITS → ITS → ITS
15.79        75.37           615         AIM → ITS → ITS
13.59        76.56           529         Storage → ITS → ITS
12.33        76.92           480         ITS & AIM → ITS → ITS
Table 4.1: Output of the association rule
Table 4.1 shows the first 4 association rules
They are selected by the ‘support’ value, which is an evaluation criterion for association rules (Jiawei Han and Micheline Kamber, 2006)
Association rule mining is a data mining technique for finding interesting associations, patterns and/or relationships among sequential and repeated events
Therefore, it is useful for discovering relationships that inform decisions such as product arrangement and promotions
Support : Pr(A∩B)
Support of 'A→B’ = (The number of transactions including A and B) / (The number of total transactions)
Confidence : Pr(B|A)
Confidence of 'A→B’ = (The number of transactions including A and B) / (The number of transactions including A)
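The two definitions can be illustrated with a minimal sketch. The transactions below are toy data invented for illustration, and the function names are not part of the study.

```python
def support(transactions, a, b):
    """Pr(A∩B): share of all transactions that include both a and b."""
    both = sum(1 for t in transactions if a in t and b in t)
    return both / len(transactions)


def confidence(transactions, a, b):
    """Pr(B|A): share of the transactions including a that also include b."""
    with_a = sum(1 for t in transactions if a in t)
    both = sum(1 for t in transactions if a in t and b in t)
    return both / with_a if with_a else 0.0


# Toy transactions, each a set of purchased items (hypothetical).
transactions = [
    {"ITS", "AIM"},
    {"ITS", "Storage"},
    {"AIM"},
    {"ITS", "AIM", "pSeries"},
]

sup = support(transactions, "AIM", "ITS")      # 2 of 4 transactions
conf = confidence(transactions, "AIM", "ITS")  # 2 of the 3 transactions with AIM
```

Support is symmetric in A and B, whereas confidence depends on the direction of the rule.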
5. Association rule based on the RFM method
Support(%)   Confidence(%)   Frequency   Association rule
59.48        75.26           295         ITS → ITS → pSeries
56.05        82.74           278         AIM → ITS → ITS
54.84        80.00           272         Storage → ITS → pSeries
Table 5.1: Association rule for best group
Table 5.1 shows the first 3 association rules for the best group, selected by ‘Support’
From Table 5.1, we find that a company that buys an ‘ITS’ tends to purchase an ‘ITS’ again and then buy a ‘pSeries’
A company that buys an ‘AIM’ tends to purchase an ‘ITS’ and then buy an ‘ITS’ again, and so on…
Support(%)   Confidence(%)   Frequency   Association rule
23.38        48.70           94          ITS → ITS → xSeries
23.13        71.54           93          Storage → ITS → ITS
20.90        67.74           84          AIM → AIM → ITS
Table 5.2: Association rule for better group
Table 5.2 shows the first 3 association rules for the better group
From Table 5.2, we find that a company that buys an ‘ITS’ tends to purchase an ‘ITS’ again and then buy an ‘xSeries’
A company that buys a ‘Storage’ tends to purchase an ‘ITS’ and then buy an ‘ITS’ again
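How a sequential rule such as ITS → ITS → pSeries might be counted within one RFM segment is sketched below. The slides do not state how support and confidence are computed for three-step rules, so the definitions here are assumptions: support is the share of companies in the segment whose ordered buying history contains the pattern, and confidence is that count divided by the count of companies containing the pattern without its last step. The histories are toy data.

```python
def contains_sequence(history, pattern):
    """True if the items of `pattern` occur in `history` in the same order
    (other purchases may lie in between)."""
    remaining = iter(history)
    # `item in remaining` advances the iterator, so order is respected.
    return all(item in remaining for item in pattern)


def sequence_rule_stats(histories, pattern):
    """Return (frequency, support %, confidence %) of a sequential rule
    within one segment, under the assumed definitions in the lead-in."""
    frequency = sum(contains_sequence(h, pattern) for h in histories)
    prefix_count = sum(contains_sequence(h, pattern[:-1]) for h in histories)
    support_pct = 100.0 * frequency / len(histories)
    confidence_pct = 100.0 * frequency / prefix_count if prefix_count else 0.0
    return frequency, support_pct, confidence_pct


# Ordered buying histories of four hypothetical companies in one segment.
segment = [
    ["ITS", "ITS", "pSeries"],
    ["ITS", "AIM", "ITS"],
    ["AIM", "ITS", "ITS", "pSeries"],
    ["Storage", "pSeries"],
]

freq, sup_pct, conf_pct = sequence_rule_stats(segment, ["ITS", "ITS", "pSeries"])
```

Running the same function separately on the best group and the better group would give per-segment tables of the kind shown in Tables 5.1 and 5.2.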
6. Conclusion and discussion
Customer segmentation using the RFM method and association rules helps to construct differentiated marketing strategies for segmented customers
As seen in the previous section, buying patterns can be characterized for each customer segment
This is useful information for establishing marketing strategies and customer relationship management
In further research, we can develop segment-specific marketing strategies and selectively target new customers with similar buying patterns
References
[1] Don Peppers and Martha Rogers (1999). Enterprise One to One, First Edition, New York: Currency Doubleday.
[2] Jiawei Han and Micheline Kamber (2006). Data Mining: Concepts and Techniques, Second Edition, San Francisco: Morgan Kaufmann.
Figure: customers sorted by total buying amount (columns: Total buying amount, Freq., %, Cum.%; Total 3,886), with the cutoff point of the upper 20% marked at 420,000 (Cum.% 20.0)
The upper 20% customers number 799 companies among the 3,886 companies and account for 90.2% of the total buying amount
Figure: buying period measured in quarters 1 to 12 (2003.01 ~ 2005.12, 36 months, 12 quarters)
ID      Buying period   Upper 20% customer
0001    12              O
0002    1               O
0003    4               X
0004    6               O
0005    3               O
0006    6               X
…       …               …
3886    9               O
Total   23312           O = 799
• Ratio for the weight of R value = (Sum of the period of upper 20% customers) / (Total buying period)