Marketing Analytics

“FGV Italy Cross-product Sales Analysis: a B2B case”

Student: Lorenzo Nicolò Formenti

ID Number: 12-216-354

Course name: Marketing analytics and data mining with inductive fuzzy classification (Seminar)

Home institution: Università di Pavia

Faculty of Economics

via San Felice 7

27100 Pavia, Italy

Host institution: Université de Fribourg

Faculté des sciences économiques et sociales

Bd de Pérolles 90

1700 Fribourg, CH

Examiner: Prof. Andreas Meier

Supervisor: Dr. Michael Kaufmann

Date of submission: 6th December 2012

TABLE OF CONTENTS

1. Partner
1.1 History
1.2 Mission
1.3 Relevant Market
1.4 Network
1.5 Sales Strategy

2. Research path
2.1 Raw data
2.2 Data improving and selection
2.3 Research purposes
2.4 Methodology
2.4.1 Data mining
2.4.2 Correlation analysis
2.4.3 Cluster Analysis

3. Data Analysis
3.1 Cross-product Correlation analysis
3.1.1 Preliminary test
3.1.2 Pearson correlation matrix
3.2 Cluster analysis

LITERATURE
INDEX OF FIGURES
ACKNOWLEDGMENTS

1.  Partner  

1.1  History  

The business of Formenti & Giovenzana SpA (FGV) was started in 1949 in Veduggio con Colzano, Italy, 20 kilometers south of Lake Como and 30 kilometers north of Milan, by an opportunity-seeking team of young workers trying to move forward after being directly involved in the catastrophic events of the Second World War.

Producing budget-friendly furniture hardware was the company’s primary orientation: shiny but scarce raw materials like gold and silver were in high demand among European consumers, as were solid-brass drawer knobs, which represented high fashion in the upper-class houses of the time.

To meet the growing mass demand for such fittings with a valuable but more affordable product, FGV launched a cleverly made hollow-brass version. This enabled the company to exploit cost savings while offering a competitive, consumer-oriented product, thanks to its pragmatism and ingenuity.

Constantly looking for new opportunities to offer clever solutions, the company extended its product range step by step into a portfolio that today covers a wide range of furniture-system solutions: hinges, slides, drawer systems, wall hanging brackets and wireworks, all meant to make quality products affordable to the majority. Over the years, the company has consolidated its image and become synonymous with cost innovation, at both the product and process level.

In the late 1970s, the company began a gradual penetration of foreign markets: other European countries (France, Spain, Germany, UK), North and South America, the Middle East and eventually Asia. In a growing market characterized by strong import tariffs, its first foreign manufacturing unit was opened in Portugal; afterwards, two sales branches were started in Germany and in the USA (South Carolina), in 1986 and 1993 respectively.

In the 1990s, increasing domestic production costs compelled the company to look further afield, towards low labour-cost countries: it set up an Equity Joint Venture (E.J.V.) in Slovakia for the production of drawer slides and acquired a Brazilian slides producer.

The first years of the 21st century saw the emergence of Chinese competitors in the furniture fittings business. These Asian businesses were also cost leaders: low prices (often attracting accusations of dumping) and acceptable quality levels were the weapons of manufacturers that began to make their way into Europe by relying upon local dealers.

Even large distributors, the most relevant customers for FGV, could not ignore the temptation to buy from China, often setting up purchasing departments on site. FGV therefore decided to open its own representative office in Hong Kong to trade directly in the area. Customers pushed for more, and the next step was the acquisition of a Chinese company which, as in the cases of Brazil and Slovakia, was a drawer slides producer.

By 2004, after only two years, significant investments in technology had made the company a reference point for the local industry. Besides drawer slides, other products such as steel drawers are now manufactured there, part of which are exported. In the international context, FGV Asia, acting in synergy with the Hong Kong representative office, is today able to supply customers with a wide range of competitive products, backed by a well-known and established brand.

1.2  Mission  

The approach behind the first products has persisted over time and has become the core of the current company mission: “Value for many”, delivered through a portfolio based on what the company calls Good-Better-Best solutions. “Value” is what customers are looking for today, from the entry level to the top of the product range, each with their own willingness to pay for it.

Matching such a variety of customer needs by providing added value at the lowest price, through constant research and innovation, has also become a standard part of the company’s mission, which relies upon mass-volume manufacturing and economies of scale.

1.3  Relevant  Market  

Until the 1990s, the market was dominated by large-scale German and Austrian players (turnover in the order of €1,000 million), with a relevant share taken by smaller but efficient Italian enterprises (€100 million).

The recent entry of the Asian cost leaders discussed above has split this scenario in two: today German and Austrian companies find themselves dominating only the furniture manufacturers’ side of the market.

Italian firms such as FGV face harsh competition from Chinese manufacturers for the remaining side of the market, the supply of distributors. Within this scenario, large-volume British and French distributors, which represent a huge share of overall demand, appear to favour European suppliers but sometimes deal on untenable terms, leaving fertile ground for this competition to intensify.

Innovation and investment choices are mostly driven by the company strategy, which is evaluated and revised periodically in light of evolving market conditions and the company’s tradition of flexibility. Innovation therefore takes place at both the product and process level: incrementally, by improving existing products and manufacturing lines, and more radically, through new investments along the market’s dominant technological trajectories and the introduction of new product ranges.

Exit barriers of modest size are present at the manufacturing level: if a line fails, machinery can be converted to a new use with more or less expensive alterations and improvements. In other cases, machinery is reassigned to the company’s delocalised production units abroad or even to contractors.

On the entry side, barriers consist of huge start-up capital requirements and the economies of scale exploited by the market leaders. These are out of reach for new players, at least in the first business phase, and thus strongly limit competition.

Looking at margins, the product mark-up varies across products and according to the customer’s business type and traded volumes. In general, for premium (top-range) products the average mark-up is around 30-35%, while for entry-level and more widespread items (commodities) it drops to 20-25% because of harsh Asian price competition. The strong bargaining power of large distributors often obliges the company to keep margins very low, but this is counterbalanced by large, regular demand flows and guaranteed on-time payments.

On the product side, the company’s traditional flexibility results in a broad range of products (see section 1.5), both standardized and custom. In general, big DIY (Do It Yourself) groups like Ikea, Brico, B&Q and Conforama are supplied mostly with customized solutions, such as tailor-made packaged kits ready for resale. Distributors, on the other hand, are more likely to buy the more standardized items of the range, which come in several variants. Finally, smaller enterprises and furniture manufacturers behave more specifically, according to their product ranges and final targets.

1.4  Network  

Today FGV is an Italian-based company active in the furniture hardware industry, facing modern market challenges through a worldwide network. It trades as a B2B player in several countries, operating through exclusive outsourced distribution and a broad range of wholly owned activities. Among the latter, the most relevant today are:

- Italy: headquarters, manufacturing, R&D and sales

- Germany: wholly owned subsidiary, distribution

- Poland: wholly owned subsidiary, distribution

- Spain: branch office, sales

- China Dongguan: wholly owned subsidiary, manufacturing

- China Canton: branch office, sales

- Hong Kong: branch office, sales

- Brazil: wholly owned subsidiary, manufacturing and sales

- Slovakia: wholly owned subsidiary, manufacturing

For my purposes, the focus will be on FGV Italy, the Italian division and headquarters of the group, which provided me with data on the 2011 sales period as well as some operational support.

1.5  Sales  Strategy  

The sales strategy takes most of its relevant features from the company mission presented above: cost advantages and budget-friendly solutions under a “value for many” perspective are the basic elements taken into account when addressing sales operations.

The products are all designed at the R&D labs of the Italian headquarters, manufactured in Italy, China, Slovakia and Brazil, and sold all over the world in a standardized way, since final customers’ needs and uses are assumed to be similar across countries.

The difference lies instead in customers’ willingness to pay for such products, since purchasers operate in different economic environments, manufacture or supply a wide range of products and sell them to markets with different purchasing power.

For these reasons, the company pursues its aim of offering price-competitive products of good quality through what a microeconomist would call price discrimination. Following the company’s cost structures, the same product is sold at different prices depending on its origin: customers are charged different prices when buying the same product from China or from Italy.

In this way, a quality-price trade-off becomes the main point in customers’ purchasing choices, and so the key input of this analysis.

Even if products are sold in a standardized way, elements of diversification are present within the product range and must be taken into account. As specified in section 2.2, our focus will be on three main product categories: hinges, slides and drawers, manufactured and supplied in different variants. For these, the sales strategy takes the following pattern:

1. HINGES

- Hinges – This section is composed of various types of equipment, depending on the fastening system and the box size. Among them, Genios, a damped-closing hinge, is positioned as the premium product of the line. It is sold at a higher price and designed around innovative concepts. It represents the most profitable opportunity for future market developments.

- China-manufactured hinges – Chinese contractors are today able to offer the whole product range at competitive prices and reasonable quality levels. The main differences from “made in EU” items relate only to small product components and less efficient assembly methods. Outsourced items manufactured in China are sold to customers by FGV Asia, after product quality has been assessed by a quality control team within the firm.

2. DRAWERS

- “Ten” drawers – considered the premium product of this category, it is 100% manufactured on the group’s Italian premises through a “lean” manufacturing process. Able to compete even with top-level European drawers, it is the product on which the company is betting to increase revenues in 2013.

- “Prime” drawers – manufactured in-house in Slovakia and mainly sold in huge volumes to large French and British distributors. They offer a good price-to-quality ratio.

- “Prime” China-manufactured drawers – same variant, manufactured in China (outsourced).

- China drawers – manufactured in-house in China by FGV Dongguan. This is an entry-level product at a highly competitive price. It is either imported into Italy to serve the European market or sold locally by FGV Asia. It accounts for a large proportion of the company’s overall sales.

- Drawers – entry-level drawer sold at a competitive price. Same as “China drawers” but manufactured in Italy.

3. SLIDES

- “Excel” slides – manufactured in-house in Italy; top-level product combining high quality standards with damped-closing technology.

- Slides – manufactured in-house in China (FGV Dongguan). This is an entry-level product (commodity). It is either imported into Italy to be sold in Europe or supplied locally by FGV Asia. It is known as a budget-friendly product.

- China slides – outsourced to Chinese contractors; even more basic than the previous one.

In general, the company positions itself at a middle-to-low market level, where profitability comes mainly from large business volumes rather than high unit margins. The recent global economic downturn has further expanded this target area, which is why the company, offering competitive but valuable products, holds up well today, serving its traditional targets while attracting the attention of new segments.

In addition, through its premium products (Ten and Genios), FGV is today able to compete with the top market players and attempt to win part of their market share.

2.  Research  path  

2.1  Raw  data  

The data set provided by the company’s marketing and sales management covers 2011 sales to individual customers (rows, 255 observations), broken down by product (columns, 48 variables) and expressed as percentage shares of the overall sales generated by each product. Business type (manufacturing, distribution, agent) and nationality are also included for each customer.
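One possible reading of the “percentage shares” described above is that each cell holds a customer’s share of the total sales of that product. The sketch below illustrates this reading only; the function name, the nested-dictionary layout and the figures are hypothetical and are not taken from the FGV data set.

```python
def percentage_shares(sales):
    """Convert absolute sales into percentage shares of each product's total.

    sales: {customer: {product: amount}} (hypothetical layout).
    Returns a structure of the same shape with values between 0 and 100.
    """
    # Total sales of each product over all customers (column totals).
    totals = {}
    for row in sales.values():
        for product, amount in row.items():
            totals[product] = totals.get(product, 0.0) + amount

    # Each cell becomes the customer's share of that product's total.
    return {
        customer: {
            product: (100.0 * amount / totals[product] if totals[product] else 0.0)
            for product, amount in row.items()
        }
        for customer, row in sales.items()
    }


# Made-up figures for two customers and two product codes.
sales = {
    "customer_A": {"51": 30.0, "54": 0.0},
    "customer_B": {"51": 70.0, "54": 50.0},
}
shares = percentage_shares(sales)
```

Under this reading, the shares of each product sum to 100 over all customers, so a row describes how important a customer is for each product rather than how the customer splits its own purchases.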

2.2  Data  improving  and  selection  

For my purposes, I found it useful both to enrich the provided data set with additional information and to select among the variables provided, so as to focus on the relevant dimensions only.

It is reasonable to assume that distributors behave similarly around the world. By definition, a distributor serves a broad range of customers with a more or less wide array of products, so adding details about its final target and looking for significant patterns would be of little use. Furniture manufacturers were investigated further: by searching the internet, six new attributes were defined under the label “final”: “bedroom”, “living”, “kitchen”, “bathroom”, “office” and “other”. Each variable takes the value 1 if the manufacturer is active in that particular business and 0 otherwise, where “other” stands for any business area not covered by the main ones, in particular: wood or rustic items, hotel furniture, hardware, construction, or aluminium.
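The 0/1 coding of the six “final” attributes can be illustrated with a short sketch. The attribute names come from the text above; the function name and the example manufacturer are hypothetical.

```python
# The six "final" attributes defined in the text.
AREAS = ("bedroom", "living", "kitchen", "bathroom", "office", "other")


def encode_final(active_areas):
    """Return a 0/1 indicator for each business area.

    active_areas: iterable with the areas in which a manufacturer is active.
    """
    active = set(active_areas)
    unknown = active - set(AREAS)
    if unknown:
        raise ValueError(f"unknown business area(s): {sorted(unknown)}")
    # 1 = active in that business, 0 = not active.
    return {area: int(area in active) for area in AREAS}


# Hypothetical manufacturer active in kitchens and office furniture.
example = encode_final(["kitchen", "office"])
```

Each manufacturer is thus described by six binary columns that can be placed next to its sales variables.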

As for the selection of raw data, since the provided data set was very large and covered the whole of the company’s activities, I undertook a first manual “mining” phase to select the information and variables relevant to my purpose and give the research a rational direction. In particular, I focused on the three main product categories described above: hinges, slides and drawers, which account for 30%, 20% and 30% of overall turnover respectively.

Moreover, I took into account sideburns, a product whose use is strictly complementary to hinges, as an operational benchmark: given their “natural” sales correlation with hinges, this relation could be used to test the methodology and the consistency of the data.

2.3  Research  purposes    

The main objective of this research is an in-depth analysis of these data, “mining” out hidden relevant information and looking for similarities between different customers’ behaviours and willingness to buy specific products. In particular, the purpose is to see how these change across business types and categories, in order to identify classes (“clusters”) of customers with similar behaviours, with the initial support of cross-product correlations. The research findings will then help the Head of Market Intelligence to double-check his current knowledge of the market and to target future sales more consciously.

2.4  Methodology    

2.4.1  Data  mining  

The methods used in this research are mainly based on data mining techniques in marketing analytics. Huge amounts of data on any kind of phenomenon, whether structured as panel, time series or cross-sectional data, carry a large volume of information, of which people reasonably understand only a small part. Behind these data lie potentially useful information and patterns that are rarely evident, or exploited, at first glance. The extraordinary growth of databases in recent years, covering basic everyday activities such as customer choices and behaviours, has brought data mining to the forefront of new business technologies.

As stated in [Witten & Frank 2005, p.5], as the world grows in complexity, overwhelming us with the data it

generates, data mining becomes our only hope for elucidating the patterns that underlie it. Intelligently

analyzed data is a valuable resource. It can lead to new insights and work as a brilliant link between

statistics and computer science.

Data mining is a term used to describe knowledge discovery in databases. Formerly “the term was used to

describe the process through which undiscovered patterns in data were identified. However, over time, the

original definition has been enlarged to include most types of (automated) data analysis…the process of

finding mathematical patterns from usually large sets of data. These patterns can be rules, affinities,

correlations, trends or prediction models” [Turban, Aronson, Liang & Sharda 2007, p.305].

For our purposes such techniques will be exploited by a combination of correlation and cluster analysis, in

order to identify cross-product sales similarities (and differences), customer behaviours and market

peculiarities over a set of selected relevant variables.

2.4.2  Correlation  analysis  

In statistics, a correlation coefficient is a single number that describes the degree of relationship between two variables. To give a general overview of customers’ cross-product purchasing behaviours and the resulting sales patterns, the relations between the sales of n = 10 products will be evaluated by computing Pearson’s correlation coefficient for each possible pair. Intuitive and easy to evaluate, this is the correlation measure that best fits our goals.

Given two variables, the sales of product X and the sales of product Y, Pearson’s product-moment correlation coefficient (ρ) is defined as:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y}$$

where:

$\operatorname{cov}(X,Y)$: covariance of X and Y,

$\sigma_X$, $\sigma_Y$: standard deviations of X and Y

For the evaluation of cross-product sales relations, notice that the coefficient takes values in the interval [-1, +1], where:

$\rho_{X,Y} > 0$: positive correlation

$\rho_{X,Y} = 0$: no linear correlation (X and Y uncorrelated)

$\rho_{X,Y} < 0$: negative correlation

and, in terms of strength:

$0 < |\rho_{X,Y}| < 0.3$: weak correlation

$0.3 < |\rho_{X,Y}| < 0.7$: moderate correlation

$|\rho_{X,Y}| > 0.7$: strong correlation

Looking at the coefficients, the reader will be able to identify the degree of relation between the sales of different products for both manufacturer and distributor customers, getting a first immediate picture of these dynamics. Afterwards, the correlations will be used to refine the findings of the cluster analysis.
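As an illustration of the coefficient and of the strength classes defined above, the following minimal sketch computes Pearson’s ρ for two sales columns and builds a small correlation matrix. It uses only the Python standard library; the function names and the sample figures are hypothetical, and the treatment of the boundary values 0.3 and 0.7 (assigned here to the higher class) is an assumption, since the text leaves them open.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    # The 1/n factors of the covariance and of the standard deviations cancel.
    return sxy / math.sqrt(sxx * syy)


def strength(rho):
    """Label |rho| with the weak / moderate / strong classes of the text."""
    a = abs(rho)
    if a == 0:
        return "none"
    if a < 0.3:
        return "weak"
    if a < 0.7:
        return "moderate"
    return "strong"


def correlation_matrix(columns):
    """columns: {product: [sales per customer]} -> nested dict of coefficients."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up percentage sales of two products for five customers.
hinges = [12.0, 3.5, 0.0, 7.2, 1.1]
sideburns = [10.5, 4.0, 0.3, 6.8, 0.9]
rho = pearson(hinges, sideburns)
matrix = correlation_matrix({"51": hinges, "52": sideburns})
print(round(rho, 3), strength(rho))
```

The matrix is symmetric with ones on the diagonal, so in practice only the upper or lower triangle needs to be read.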

2.4.3  Cluster  Analysis  

As one of the best-known data mining techniques, clustering aims to partition a (more or less) large database into smaller subsets (“clusters”) whose members share similar features. Cluster algorithms classify objects based only on information found in the data that describes the objects and their relationships. Since such methods aim to “mine out” relevant hidden patterns and information in data, as stated in [Tan, Steinbach & Kumar 2005, p.490], the final goal is that the objects within a group be similar (or related) to one another and different from (or unrelated to) the objects in other groups. The greater the similarity (or homogeneity) within a group and the greater the difference between groups, the better or more distinct the clustering, defined as a collection of clusters. This suggests that there is no a priori best way to cluster a set of data: the choice is the outcome of an iterative process of trial and error that leads the researcher to the best solution for the matter at hand and the type of data available.

In this analysis, the focus will be on exclusive (non-overlapping) cluster algorithms.

Within this group, the main choice in deciding how to undertake the analysis is between hierarchical and non-hierarchical (partitional) techniques. A partitional clustering is simply a division of the set of data objects into non-overlapping subsets (clusters) such that each data object belongs to exactly one subset. If instead we permit clusters to have sub-clusters, then we obtain a hierarchical clustering, which is a set of nested clusters that are organized as a tree [Tan, Steinbach & Kumar 2005, p.492]. In both cases, the goal is to minimize the distance within the clusters and maximize the distance between them.

Figure 1: Different ways of clustering the same set of points (Tan, Steinbach & Kumar 2005)

Hierarchical algorithms, in their agglomerative (associative) mode, work in the following way:

1. First, each unit, described by its values on the selected variables, is assigned to its own cluster, so that for “n” items, “n” clusters are created. Let the distances (similarities) between the clusters be the same as the distances (similarities) between the items they contain.

2. The closest (most similar) pair of clusters is merged into a single cluster, so that there is now one cluster fewer.

3. Distances (similarities) between the new cluster and each of the old clusters are computed.

4. Steps 2 and 3 are repeated iteratively until all items are grouped into a single cluster of size n.

Figure 2: Hierarchical algorithm, the dendrogram

It is important to notice that, when using such algorithms, the number of final clusters is not defined a priori. Instead, after choosing a suitable distance measure, an n × n distance matrix containing the pairwise distances between the “n” observations of the sample is generated. The closest objects are then merged according to the four steps above.

Afterwards, a particular kind of hierarchical tree, called a dendrogram, is generated to illustrate the iterative merging steps at a glance (figure 2). Each node of the tree is the union of its “children” (sub-clusters), and the root of the tree, lying at the “x” axis level, is the cluster containing all of the objects.

The researcher then decides where to “cut” the tree, looking for a good compromise between the number of groups and the homogeneity within them, depending on how the data are distributed and on the problem at hand.

For our purposes, it is enough to know that the number of clusters at the cut-off level, corresponding to the

red straight line in the graph, is the one adopted to define the final clusters, composed accordingly by the

associated observations.
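The four steps, the n × n distance matrix and the cut-off can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the distance between two clusters is computed, so single linkage (the distance between their closest members) is assumed here, together with the Euclidean distance; the function names and the sample points are hypothetical. Stopping the merging when k clusters remain corresponds to cutting the tree at that level.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def agglomerate(points, k=1):
    """Agglomerative clustering with single linkage (an assumption).

    Returns (clusters, merges): clusters are tuples of point indices,
    merges records every merging step as (cluster_a, cluster_b, distance).
    """
    n = len(points)
    # n x n matrix with the pairwise distances between observations.
    dist = [[euclidean(p, q) for q in points] for p in points]
    # Step 1: every observation starts in its own cluster.
    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > k:
        # Step 2: find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Step 3 (single linkage): distance between closest members.
                d = min(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        # Step 4: repeat until only k clusters are left.
    return clusters, merges


# Made-up two-dimensional observations.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
clusters, merges = agglomerate(points, k=2)
```

The list of merges, read from first to last, contains the same information a dendrogram displays: which groups were joined and at what distance.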

By contrast, when using non-hierarchical algorithms, the number of final clusters does not come out as a result but has to be chosen, somewhat arbitrarily, at the start. The multidimensional units are grouped into a fixed number of clusters, built around the same number of centers, called centroids. After setting, as before, a suitable distance measure, an iterative algorithm that optimizes a chosen criterion assigns each unit to its closest centroid.

Starting from an initial classification, units are moved from one group to another, or swapped with units from other groups, until no further improvement can be made; the resulting partition is then optimal, at least locally.

Advantages and disadvantages of the two approaches are summarized in figure 3:

Hierarchical:

- more informative hierarchical structure of output

- no decision about the number of clusters

- complex algorithms (computations may take long)

- great influence of initial decisions

- great distortions due to error components

- easy to read and interpret

Non-hierarchical:

- more reliable but less informative

- number of clusters to be specified ex ante

- iterative adjustments give more flexibility

- not easy to read and interpret

- simple and intuitive algorithms

- more defined final clusters

Figure 3: Hierarchical vs. Non hierarchical clustering.

Within non-hierarchical criteria, the present analysis will be carried out by means of the so-called K-means

algorithm, which is broadly used, simple and reliable.

Figure 4: Cluster building process by K-means algorithm [Tan, Steinbach & Kumar 2005]

Basically, k initial centroids are chosen, where k is a user-specified parameter, namely, the number of

clusters desired. Each point is then assigned to the closest centroid, and each collection of points assigned

to a centroid is a cluster. The centroid of each cluster is then iteratively updated based on the points

assigned to the cluster. The assignment and update steps are repeated until no point changes clusters, or

equivalently, until the centroids remain the same [Tan, Steinbach & Kumar 2005, p.498]. Disjoint subsets are thereby formed.

In practice, the following steps have to be followed to carry out a K-means cluster analysis:

1. Choose a valid proximity (distance) measure. For the sake of simplicity, we choose the Euclidean distance, defined as:

$$D_{ij} = \sqrt{\sum_{k=1}^{n} (x_{ki} - x_{kj})^2}$$

where:

$D_{ij}$: distance between cases i and j

$x_{ki}$: value of variable $X_k$ for case i

$x_{kj}$: value of variable $X_k$ for case j

and the sum runs over the variables included in the analysis. Intuitively, it corresponds to the length of the straight segment connecting two arbitrary points “i” and “j” in the plane, and it is commonly used for data points in Euclidean space.

2. Select the relevant variables (data matrix columns) over which the clusters will be defined.

3. Set the number of desired clusters, approximated, for n observations, as:

$$k \approx \sqrt{n/2}$$

4. Run the analysis with different values of “k”. Check which one maximizes the distance between the centroids (mean values) and minimizes the distance between the points and the centroids (SSE: sum of squared errors, i.e. squared Euclidean distances). Choose accordingly.

5. Interpret the results of the final output, where a final cluster is represented by the mean value that each selected column variable takes within that particular cluster. These values have to be compared with the mean value that the variable takes over the whole distribution, to describe the elements belonging to the cluster.

6. Describe the output to form classes of homogeneous elements.
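The steps above can be sketched with a basic K-means routine. It is a minimal illustration rather than the procedure actually run in this study: it relies only on the Python standard library, the function names and sample points are hypothetical, and the way initial centroids are chosen (k observations drawn at random unless given explicitly) is an assumption.

```python
import math
import random


def squared_euclidean(p, q):
    """Squared Euclidean distance between two points (the terms of the SSE)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def rule_of_thumb_k(n):
    """Starting value for the number of clusters: k ~ sqrt(n / 2)."""
    return max(1, round(math.sqrt(n / 2)))


def k_means(points, k, initial=None, seed=0, max_iter=100):
    """Basic K-means. Returns (labels, centroids, sse)."""
    if initial is None:
        # Assumption: initial centroids are k observations drawn at random.
        centroids = random.Random(seed).sample(points, k)
    else:
        centroids = list(initial)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    # SSE: sum of squared distances between each point and its centroid.
    sse = sum(squared_euclidean(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse


# Made-up two-dimensional observations forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids, sse = k_means(points, k=2, initial=[(0, 0), (10, 10)])

# Step 5: compare each cluster mean (centroid) with the overall mean.
overall_mean = tuple(sum(col) / len(points) for col in zip(*points))
```

When comparing runs with different values of k, note that the SSE can only decrease as k grows, so it has to be weighed against the number of clusters and the separation between centroids rather than minimized on its own.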

The most common marketing applications of cluster analysis take a defined set of customer attributes (age, gender, religion, level of education, etc.), group customers into similar classes and build, according to their preferences, a coherent market segmentation. Others are devoted to the search for new product opportunities, clustering similar brands to describe the competitive environment and identify possible unserved segments.

As a business-to-business case, FGV has its own peculiarities, and this is what makes the matter at hand attractive. Since attributes of the type described above do not exist for enterprises, we take customers’ purchasing behaviours as their “attributes”, using 2011 percentage sales over 10 products as a proxy. From this perspective, customers with similar behaviours and characteristics are clustered to identify common purchasing patterns and, finally, homogeneous customer classes. This will be carried out separately for two categories, furniture manufacturers and distributors, whose purchasing baskets differ by definition.

3. Data Analysis

3.1 Cross-product Correlation analysis

Following the methodology explained above, the cross-product correlation coefficients are computed and summarized in a matrix, giving the reader a first general view of cross-product sales patterns. The coefficients are then used as a benchmark for the final cluster results, to check their consistency and define the picture more narrowly.

3.1.1 Preliminary test

Before starting, we run a simple check on the coherence of this method and the quality of our data: we compute the Pearson coefficient for Hinges and Sideburns. It is necessary to point out that sideburns are a fundamental component of hinges, and that the company records their sales separately for accounting purposes only. For this reason, we expect a strong positive correlation between the two.


Hinges/Sideburns test

                 51        52
51 Hinges        1         0,824**
52 Sideburns     0,824**   1

**. Correlation is significant at the 0.01 level (2-tailed).

Table 1: Correlation analysis. A preliminary test.

As expected, the coefficient takes value 0,824 and shows that a strong positive sales correlation exists.
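The coefficient used in this test can be computed by hand as the covariance of the two series divided by the product of their standard deviations. The sketch below assumes two short, invented series of percentage sales (the names `hinges` and `sideburns` and all values are hypothetical), so it does not reproduce the 0,824 reported in Table 1.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented percentage sales of five customers for the two products.
hinges = [0.10, 0.40, 0.35, 0.80, 0.05]
sideburns = [0.12, 0.30, 0.45, 0.70, 0.10]

r = pearson(hinges, sideburns)
print(f"r = {r:.3f}")
```

A value close to 1 means that customers with a high share of one product also tend to have a high share of the other, which is what the test expects for a product and its component.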

3.1.2 Pearson correlation matrix

As in the preliminary test, the percentage sales per customer of the products considered are analysed here with respect to their cross relations. A 10x10 Pearson correlation matrix reports the coefficient for each possible pair of products. Here, no distinction is made between manufacturers and distributors.

Pearson correlation matrix: global (n=255)

Products                 51       51C      54       54B      54C      58       58C      58E      58F      58X
51 Hinges                1        0,128*   0,236**  0,629**  0,452**  0,426**  0,284**  0,097    0,017    0,324**
51C China Hinges         0,128*   1        0,003    0,003    0,022    0,006    -0,003   -0,015   -0,011   -0,024
54 Slides                0,236**  0,003    1        0,292**  0,412**  0,161**  0,093    0,046    0,009    0,013
54B Excel Slides         0,629**  0,003    0,292**  1        0,593**  0,164**  0,211**  0,338**  0,043    0,537**
54C China Slides         0,452**  0,022    0,412**  0,593**  1        0,107    0,112    0,054    0,077    0,157**
58 Drawers               0,426**  0,006    0,161**  0,164**  0,107    1        0,547**  0,348**  -0,003   0,014
58C China Drawers        0,284**  -0,003   0,093    0,211**  0,112    0,547**  1        0,588**  0,013    0,088
58E Prime Drawers        0,097    -0,015   0,046    0,338**  0,054    0,348**  0,588**  1        0,544**  0,099
58F China Prime Drawers  0,017    -0,011   0,009    0,043    0,077    -0,003   0,013    0,544**  1        0,052
58X Ten Drawers          0,324**  -0,024   0,013    0,537**  0,157**  0,014    0,088    0,099    0,052    1

*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).

Table 2: Pearson's cross-product correlation coefficients: global.
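A matrix of this kind is obtained by applying the same coefficient to every pair of product columns. The following sketch assumes a small hypothetical data set (four customers, three of the ten product codes, invented values); the helper name `correlation_matrix` is an assumption of this example and not part of the original analysis.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """columns maps a product code to its list of per-customer values."""
    codes = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in codes} for a in codes}

# Invented percentage sales for four customers (one list entry per customer).
shares = {
    "51":  [0.10, 0.40, 0.35, 0.80],
    "54B": [0.05, 0.30, 0.40, 0.90],
    "58X": [0.50, 0.10, 0.20, 0.15],
}

matrix = correlation_matrix(shares)
for code, row in matrix.items():
    print(f"{code:4s}", " ".join(f"{row[other]:7.3f}" for other in shares))
```

The result is symmetric with ones on the diagonal, which is why each coefficient appears twice in the tables of this section.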

Globally, a consistent level of intra-category correlation is evident: for instance, customers who buy Prime drawers, a product with a good quality-to-price ratio, are also likely to diversify their choices within the same product category and buy other drawers with different features. The same holds for slides, and is particularly strong for China slides, the most basic and price-competitive model.

Moreover, some inter-category correlation is easy to notice, even if it is weaker than the intra-category one: customers who purchase Excel slides seem likely to also buy drawers, especially the “Ten” model (0,537), both being premium-level products. This may be because slides and drawers are, to a certain extent, complementary products; surprisingly, however, this does not seem to hold for lower-level products. Finally, Hinges follow a very peculiar pattern: they show consistent degrees of correlation with almost all the other items of the range, and a strong one with the top-range model of slides (0,629). Surprisingly, the sales of the Chinese variant behave almost independently instead.

Considering distributors, who account for the majority of FGV customers, the same patterns are expected to repeat with some differences. The data in fact confirm the dynamics described at the general level, with intra-category correlations that look stronger than before.

Pearson correlation matrix: distributors (n=171)

Products                 51       51C      54       54B      54C      58       58C      58E      58F      58X
51 Hinges                1        0,137    0,348**  0,774**  0,493**  0,392**  0,288**  0,126    0,135    0,338**
51C China Hinges         0,137    1        0,006    0,002    0,018    -0,002   -0,007   -0,013   -0,026   -0,034
54 Slides                0,348**  0,006    1        0,535**  0,614**  0,269**  0,154*   0,105    0,136    0,006
54B Excel Slides         0,774**  0,002    0,535**  1        0,717**  0,131    0,179*   0,078    0,198**  0,573**
54C China Slides         0,493**  0,018    0,614**  0,717**  1        0,111    0,119    0,069    0,187*   0,154*
58 Drawers               0,392**  -0,002   0,269**  0,131    0,111    1        0,590**  0,520**  -0,002   -0,003
58C China Drawers        0,288**  -0,007   0,154*   0,179*   0,119    0,590**  1        0,953**  0,078    0,070
58E Prime Drawers        0,126    -0,013   0,105    0,078    0,069    0,520**  0,953**  1        0,155*   0,059
58F China Prime Drawers  0,135    -0,026   0,136    0,198**  0,187*   -0,002   0,078    0,155*   1        0,241**
58X Ten Drawers          0,338**  -0,034   0,006    0,573**  0,154*   -0,003   0,070    0,059    0,241**  1

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

Table 3: Pearson's cross-product correlation coefficients: distributors

Inter-category correlations are still present, but with new features. If we look at drawers/slides, for instance, product sales that were less correlated before are now more correlated. The correlation of China Prime drawers with the whole slides category is emblematic. As before, Hinges look like a product that is purchased along with all the others, now with even stronger correlations.

Generally, such an increase in correlations, especially intra-family ones, may be due to the particular purchasing needs of distributors: they face a broadly defined target market with multiple needs, which have to be served with a broader range of products at different price levels.

On the manufacturers’ side, sales correlations of a lower intensity are expected, describing different patterns due to different customers’ approaches to the market. Looking at the data, this is exactly what occurs: the sales correlations that existed before generally become lower and in some cases disappear. This is mostly true for the intra-family correlation patterns identified above; the slides family is highly representative, since two of its three previously positive correlations turn slightly negative (Slides/Excel slides -0,024, Excel slides/China slides -0,031). It is less evident for inter-category correlations, such as drawers/slides: positive correlations are lower in some cases, as part of a broad trend, but here many peculiarities exist at the product level.

Pearson correlation matrix: manufacturers (n=84)

Products                 51       51C      54       54B      54C      58       58C      58E      58F      58X
51 Hinges                1        -0,026   0,093    0,116    -0,046   0,732**  0,221*   0,086    -0,028   0,145
51C China Hinges         -0,026   1        -0,017   -0,018   0,012    0,117    -0,010   -0,022   -0,014   -0,014
54 Slides                0,093    -0,017   1        -0,024   0,219*   -0,023   -0,024   -0,024   -0,015   -0,020
54B Excel Slides         0,116    -0,018   -0,024   1        -0,031   0,397**  0,322**  0,636**  0,010    0,866**
54C China Slides         -0,046   0,012    0,219*   -0,031   1        -0,020   -0,033   0,150    0,264**  -0,040
58 Drawers               0,732**  0,117    -0,023   0,397**  -0,020   1        0,155    0,311**  -0,005   0,416**
58C China Drawers        0,221*   -0,010   -0,024   0,322**  -0,033   0,155    1        0,255**  -0,008   0,337**
58E Prime Drawers        0,086    -0,022   -0,024   0,636**  0,150    0,311**  0,255**  1        0,682**  0,654**
58F China Prime Drawers  -0,028   -0,014   -0,015   0,010    0,264**  -0,005   -0,008   0,682**  1        0,012
58X Ten Drawers          0,145    -0,014   -0,020   0,866**  -0,040   0,416**  0,337**  0,654**  0,012    1

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

Table 4: Pearson's cross-product correlation coefficients: manufacturers

This is the most relevant difference that we notice when comparing the behaviour of distributors and manufacturers: strong correlations show up at the product level and, in particular, higher values emerge for products holding a similar line position (entry with entry, premium with premium). The strongest correlations in the table (ρ > 0,7) are between Drawers and Hinges, basic-line products, and between Excel slides and Ten drawers, premium levels of the range.

This suggests a lower level of diversification in furniture manufacturers’ purchasing behaviour: they are called to reach narrowly defined segments of final customers, whose specific needs and purchasing power must be met by more specific products, using elements of specialization. This is reflected in their buying decisions, where products of a similar level and perception are likely to be bought together, and the intra-family trends identified for distributors, symptoms of their will to provide a broad array of items, almost disappear.

Finally, and not surprisingly, the behaviour of Hinges is typical of this pattern: positively correlated with almost all the items before, they are now correlated only with products of a similar level. This suggests that the Chinese variant is more appreciated by manufacturers or small distributors, since its behaviour is always in line with such trends.

Behavioural differences among manufacturers of different furniture types and among distributors are considered in the following section, devoted to clustering.


3.2 Cluster analysis

In this section, building on the findings of the correlation analysis, homogeneous groups of customers are identified using the K-means clustering technique, with the aim of giving a coherent picture of the behaviour of the company’s customers.

For distributors, who represent the majority of the company’s global sales, a 9-cluster setting was arbitrarily chosen. Values for clusters 2 to 7 are not reported, as only one observation belongs to each of them: since the goal is to identify similar classes and patterns, these are statistically negligible. Mean values computed over the distributors’ data subset are reported in the last column (Mean) and are used, from now on, as the main benchmark for comparison.
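As a rough cross-check, offered only as an illustration: the text does not say whether the rule of thumb from the methodology was applied, but evaluating k ≈ √(n/2) with n taken as the number of distributors (n = 171, Table 3) gives a value close to the chosen setting. The same arithmetic for the manufacturers (n = 84, Table 4) is shown for comparison.

```latex
k \approx \sqrt{\frac{171}{2}} = \sqrt{85.5} \approx 9.25
\qquad
k \approx \sqrt{\frac{84}{2}} = \sqrt{42} \approx 6.48
```

Rounded to the nearest integer, these give 9 and 6, the numbers of clusters used in this section for distributors and manufacturers respectively.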

Considering the output, it is easy to notice that distributors’ behaviours across countries are very homogeneous: aside from clusters 8 and 9, almost all of the observations in the 10-dimensional space belong to a single common group, cluster 1, described by under-average sales equally distributed across products. Within this group, basic Slides, Ten drawers and China Prime drawers seem to be more appreciated by the market. This is coherent with the high level of cross-product correlation identified in the first part of the analysis, which represents the widely distributed purchasing schemes typical of distributors.

Nevertheless, more defined preferences also show up among distributors, even if for only 6 operators: clusters 8 and 9 represent more specific behaviours, apparently carried out by large distribution groups. In cluster 8, sales shares for some items increase dramatically and the firms’ orientation emerges accordingly, described by above-average sales for products of different line levels, particularly for Hinges (basic), “Excel” slides (premium) and China drawers (entry). Cluster 9, instead, is characterized by sales close to zero for almost every item, with all the “efforts” devoted to a budget-friendly product, China Hinges.

Cluster Centers (final): distributors

                            Cluster
Product (column attributes) 1         2→7       8         9         Mean
51 Hinges                   0.1326%   …         9.7516%   0.3534%   0.4244%
51C China Hinges            0.0782%   …         1.1486%   7.6863%   0.5324%
54 Slides                   0.2741%   …         2.3604%   e-10      0.3823%
54B Excel Slides            0.1494%   …         5.8286%   0.0059%   0.4137%
54C China Slides            0.1484%   …         1.1132%   e-10      0.4834%
58 Drawers                  0.1043%   …         1.5428%   e-10      0.4530%
58C China Drawers           0.1246%   …         4.0050%   e-10      0.4409%
58E Prime Drawers           0.1076%   …         0.6010%   e-10      0.2574%
58F China Prime Drawers     0.2566%   …         0.5554%   e-10      0.2745%
58X Ten Drawers             0.3340%   …         0.0449%   e-10      0.5410%
Cluster membership (n=171)  159       6         2         4         _

Table 5: K-means clustering: distributors.
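The comparison between a cluster centre and the subset mean, which drives the reading of Table 5, can be made explicit by dividing each centre value by the corresponding mean. The sketch below copies the cluster 8 and Mean columns from Table 5; the ratio itself and the helper name `ratio_to_mean` are assumptions introduced here for illustration, not a measure used in the original analysis.

```python
# Cluster 8 centre and subset mean, copied from Table 5 (values in %).
cluster_8 = {
    "Hinges": 9.7516, "China Hinges": 1.1486, "Slides": 2.3604,
    "Excel Slides": 5.8286, "China Slides": 1.1132, "Drawers": 1.5428,
    "China Drawers": 4.0050, "Prime Drawers": 0.6010,
    "China Prime Drawers": 0.5554, "Ten Drawers": 0.0449,
}
subset_mean = {
    "Hinges": 0.4244, "China Hinges": 0.5324, "Slides": 0.3823,
    "Excel Slides": 0.4137, "China Slides": 0.4834, "Drawers": 0.4530,
    "China Drawers": 0.4409, "Prime Drawers": 0.2574,
    "China Prime Drawers": 0.2745, "Ten Drawers": 0.5410,
}

def ratio_to_mean(centre, mean):
    """Centre value divided by the subset mean, per product (above 1 = above average)."""
    return {product: centre[product] / mean[product] for product in centre}

ratios = ratio_to_mean(cluster_8, subset_mean)
# Products ordered from the most to the least over-represented in the cluster.
ranked = sorted(ratios, key=ratios.get, reverse=True)
below = [product for product in ranked if ratios[product] < 1]

for product in ranked:
    print(f"{product:20s} {ratios[product]:6.2f} x mean")
```

On these figures the three highest ratios belong to Hinges, Excel Slides and China Drawers, the products singled out in the text for cluster 8, while Ten Drawers is the only product below the mean.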


On the furniture manufacturers’ side, (0,1) variables are introduced to identify the preferences of manufacturers producing different kinds of furniture, divided into six classes: “bedroom”, “kitchen”, “living-room”, “bathroom”, “office” and “other”, where “other” includes every business area that cannot be related to the five main categories.

As detected by the correlation analysis, less uniform purchasing patterns emerge here, shown by the distributions of both sales shares and cluster memberships; they signal the well-defined preferences of manufacturers, who are specialized in their own products and focused on their narrowly defined targets. Not surprisingly, looking at the values reported in the Mean column, sales volumes decrease as well.

Cluster centers (final): manufacturers

Business area/Product       Cluster
(column attributes)         1         2         3         4         5         6         Mean
Bedroom*                    1         1         0         0         1         0
Kitchen*                    1         0         0         0         1         1
Living*                     0         1         0         1         1         0
Bathroom*                   1         0         0         0         1         0
Office*                     0         0         0         1         1         0
Other*                      0         0         1         0         0         0
51 Hinges                   0.3696%   0.2855%   0.0427%   0.0415%   0.4826%   0.2191%   0.2554%
51C China Hinges            e-10      e-10      0.4601%   e-10      e-10      e-10      0.0870%
54 Slides                   e-10      1.1339%   0.0191%   e-10      0.4438%   0.2336%   0.3062%
54B Excel Slides            1.4351%   0.0706%   0.3097%   0.1684%   e-10      0.0221%   0.2378%
54C China Slides            0.0059%   0.2138%   0.1837%   0.6941%   0.0041%   0.0536%   0.1626%
58 Drawers                  0.6751%   e-10      0.1629%   0.0287%   0.0029%   0.0678%   0.2172%
58C China Drawers           0.5494%   0.0907%   0.0822%   0.0058%   0.0157%   0.6187%   0.2123%
58E Prime Drawers           2.1013%   0.0078%   0.9894%   e-10      0.0392%   0.3359%   0.4582%
58F China Prime Drawers     0.1536%   e-10      2.5971%   e-10      0.0503%   0.1207%   0.5060%
58X Ten Drawers             0.3176%   0.0269%   0.0433%   e-10      e-10      0.0102%   0.0449%
Cluster membership (n=84)   9         20        18        8         8         21

* variable “business area” (0,1): 1 = manufactures; 0 = does not manufacture

Table 6: K-means clustering: manufacturers.
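The (0,1) business-area variables can be pictured as six indicator columns placed in front of the ten sales shares of each manufacturer. The sketch below is a hypothetical encoding written for illustration: the names `AREAS`, `encode_areas` and `feature_vector`, the order of the areas and the sample shares are assumptions, and the source does not describe how the variables were actually coded or scaled.

```python
# Assumed order of the six business areas, following the rows of Table 6.
AREAS = ["bedroom", "kitchen", "living", "bathroom", "office", "other"]

def encode_areas(active_areas):
    """Return one 0/1 indicator per business area (1 = manufactures)."""
    unknown = set(active_areas) - set(AREAS)
    if unknown:
        raise ValueError(f"unknown business area(s): {sorted(unknown)}")
    return [1 if area in active_areas else 0 for area in AREAS]

def feature_vector(active_areas, shares):
    """Indicators followed by the percentage sales of the ten products."""
    return encode_areas(active_areas) + list(shares)

# Hypothetical manufacturer of bedroom and living-room furniture with invented shares.
shares = [0.29, 0.0, 1.13, 0.07, 0.21, 0.0, 0.09, 0.01, 0.0, 0.03]
vector = feature_vector({"bedroom", "living"}, shares)
print(vector)
```

One point to keep in mind with such a mix: indicators taking the values 0 and 1 are on a larger scale than shares of a fraction of a percent, so they weigh heavily on a Euclidean distance unless the variables are rescaled. This is consistent with the centres in Table 6 showing exact 0/1 business-area profiles, although the text does not state whether any rescaling was applied.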


The output of the arbitrarily chosen 6-cluster setting can be presented as follows. In order of size, six homogeneous classes of customers can be identified:

- Kitchen manufacturers (cluster 6): with 21 manufacturers, this is the largest group of customers, specialized in kitchen furniture manufacturing. They purchase items along the whole product range, mostly in under-average quantities, with the notable exception of China Drawers, an in-sourced entry-level product that seems to fit their purposes well.

- Bedroom/Living manufacturers (cluster 2): this class, almost as large as the first, is composed of manufacturers who combine the production of living-room and bedroom furniture. Their preferences are mostly non-homogeneous: very small purchasing volumes, clearly under average for most products, come together with sharply defined preferences for mid-line solutions. Purchases of Hinges and Slides are noteworthy.

- Outside-oriented (cluster 3): surprisingly, a large share of manufacturers are active in business areas outside the company’s usual target: wood or rustic furniture, hotel apparel, general hardware, construction, aluminium. Even more interesting, the dominant purchasing strategy consists of above-average purchases of a variety of items at different range positions: China Hinges, Excel slides and Prime drawers, in both the European and Chinese variants, are the most evident.

- Broadly oriented (cluster 1): a smaller class of nine customers is devoted to a wider set of activities, namely bedroom, kitchen and bathroom furniture manufacturing. Even though it is smaller, this group looks relevant in terms of profitability: purchasing consistently above average for most products, this class is mostly oriented towards the medium-to-higher positions of the range. Relevant here are the shares for Excel slides, Prime drawers and Ten drawers. By contrast, the “Chinese” line is less appreciated, with the important exception of China Drawers.

- Day-light lovers (cluster 4): within this cluster of eight customers, almost as large as the previous one, the leitmotif is the combination of living-room and office furniture. Very small purchasing shares are typical of this class, which looks, at least in theory, less profitable than all the previous ones. However, the exception of China Slides, purchased in large volumes, has to be noted as a signal of a low-budget orientation.

- All-rounders (cluster 5): active in all of the relevant target businesses, this group of eight customers shows, not surprisingly, a strong orientation towards basic products. In particular, their above-average purchases, which are quite consistent, are all concentrated on Hinges and Slides. Purchases of other items are only marginal.

As part of a coherent framework for directing sales strategy, the results of both the correlation and the cluster analysis are summarized in the following section.

3.3 Findings

First, the correlation analysis, when carried out at a global level, shows a consistent amount of intra-family correlation. This is particularly true for drawers and slides, where customers are likely to diversify their choices within the same product family. Moreover, some inter-family correlation is easy to notice, even if it is weaker than the intra-family one. Once again with respect to drawers and slides, this is particularly intense for premium-level products. This may be because slides and drawers are perceived by customers as complementary in use, but this does not seem to hold for the lower-level products. Hinges, instead, follow a very peculiar pattern: for the European variants, consistent degrees of correlation are found with almost all of the other items of the range. Surprisingly, the sales of the Chinese variant behave almost independently.

Considering distributors, the dynamics described at the general level are confirmed and reinforced by a stronger magnitude of intra-family correlations. Generally, such an increase in correlations, especially intra-family ones, may be due to the particular purchasing needs of distributors: they face a broadly defined target market with multiple needs, which have to be served with a broader range of products at different price levels.

Inter-family correlations are also present, but with new features. For instance, product sales that were less correlated before are now more correlated. The correlation of China Prime drawers with the whole slides family is emblematic. As before, Hinges look like a product that is purchased along with all the others.

On the manufacturers’ side, as expected, the sales correlations that existed before generally decrease and in some cases even disappear. This is mostly true for the intra-family correlation patterns identified above. The slides category is highly representative, since most of its previously positive correlations turn slightly negative. It is less evident for inter-family dynamics, such as drawers/slides: positive correlations are lower in some cases, as part of a broad trend, but many peculiarities exist at the product level. These new dynamics draw heterogeneous purchasing patterns, due to the manufacturers’ different approaches to the market when considering their narrowly defined target segments and their features.

Cluster analysis, on the other hand, shows that distributors’ behaviours across countries are quite homogeneous: aside from clusters 8 and 9, almost all of the observations in the 10-dimensional space belong to a single large common group, cluster 1, described by under-average sales equally distributed across product categories. Within this group, basic Slides, Ten drawers and China Prime drawers seem to be more appreciated. This is coherent with the high level of cross-product correlation identified in the first part of the analysis, which represents the widely distributed purchasing schemes typical of distributors.

Nevertheless, some degree of specialization appears for a small number of players: clusters 8 and 9 represent more specific behaviours, apparently carried out by the largest distribution groups.

On the furniture manufacturers’ side, as previously detected by the correlation analysis, more specific purchasing patterns emerge, shown by the distributions of both sales shares and cluster memberships; they signal the well-defined preferences of manufacturers specialized in their own products and focused on their narrowly defined final targets. In this respect, six customer groups were identified, with different levels of specialization and preferences, according to the business areas in which they are active: Kitchen manufacturers, Bedroom/Living manufacturers, Outside-oriented, Broadly oriented, Day-light lovers and All-rounders. Specific purchasing behaviours, volumes and levels of profitability for the company emerge accordingly.


LITERATURE  

[Witten & Frank 2005] Witten, I.H.; Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edition, Elsevier/Morgan Kaufmann, San Francisco 2005, USA

[Turban, Aronson, Liang & Sharda 2007] Turban, E.; Aronson, J.E.; Liang, T.P.; Sharda, R.: Decision Support and Business Intelligence Systems, Pearson Education, New Jersey 2007, USA

[Tan, Steinbach & Kumar 2005] Tan, P.N.; Steinbach, M.; Kumar, V.: Introduction to Data Mining, US edition, Addison Wesley, Boston 2005, USA

INDEX  OF  FIGURES  

Figure 1: Different ways of clustering the same set of points [Tan, Steinbach & Kumar 2005]

Figure 2: Hierarchical algorithm, the Dendrogram

Figure 3: Hierarchical vs. Non hierarchical clustering

Figure 4: Cluster building process by K-means algorithm [Tan, Steinbach & Kumar 2005]

Table 1: Correlation analysis. A preliminary test

Table 2: Pearson's cross-product correlation coefficients: global

Table 3: Pearson's cross-product correlation coefficients: distributors

Table 4: Pearson's cross-product correlation coefficients: manufacturers

Table 5: K-means clustering: distributors

Table 6: K-means clustering: manufacturers

ACKNOWLEDGMENTS  

I would like to give special thanks to Ms. Silvana Riboldi, Head of the FGV market intelligence department, for her constant support during the research, and to Jules Kingery, my new American friend, for final reviews and suggestions.