Indian School of Business
Forecasting Sales for Dairy Products
Contents
EXECUTIVE SUMMARY
Data Analysis
Forecast Horizon:
Forecasting Models:
Fresh milk - AmulTaaza (500 ml)
Dahi/Yogurt - Saras (200 gm)
Dahi/Yogurt - Yakult Probiotic (350 ml)
Assumptions
Conclusions and Recommendations
TECHNICAL ANALYSIS
Data Preparation:
Class: Fresh Milk; Item Description: AmulTaaza 500 ml
Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm
Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink
Appendix 1: Technical Details - Class: Fresh Milk; Item Description: AmulTaaza 500 ml
Appendix 2: Technical Details - Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm
Appendix 3: Technical Details - Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink
EXECUTIVE SUMMARY
Retailers face a formidable challenge of ensuring that they have optimum levels of inventory for goods
that are perishable. This is because these goods have short shelf life without any salvage value and can
hurt the profitability of the retailers significantly. It therefore becomes critical for the retailers to know
accurate forecasts for perishable items such as fresh milk and yoghurt. These subclasses also drive
footfall into the retail stores and hence it is important to maintain high levels of service for these
products.
We have chosen to forecast unit sales for two product classes - Fresh Milk and Dahi-Yoghurt. The
rationale behind choosing these two product classes is that their sales are typically correlated,
which can provide additional insights.
The sales data available from retail store for “Yakult Probiotic drink 325 ml” and “SarasDahi 200gm”
from class “Dahi and Yoghurt” and “AmulTaaza 500 ml” from class “Fresh Milk” is aggregated at a daily,
weekly and monthly level, to visualize any trends or recognizable patterns.
Data Analysis
Daily sales for AmulTaaza go up to 100 units, for SarasDahi up to 80 units and for Yakult Probiotic drink
up to 120 units. There is no consistent increase/decrease in sales figures or seasonal effects observable
during the 13 months of data. Very sharp increases in sales were observed on a few days, with no
apparent explanation.
The weekly sales pattern shows that sales rise over the weekends and come down during the
weekdays, most probably because more people come to the store on weekends. Among the weekdays,
sales also rise on Wednesdays.
Monthly sales for AmulTaaza show a decrease in variability as we move from Aug 2011 to Aug 2012. For
Yakult and SarasDahi, monthly sales show jumps up and down without any regular pattern.
Forecast Horizon:
In light of our business proposition for perishable goods, forecasting sales on a daily basis would be most
useful.
Forecasting Models:
Fresh milk - AmulTaaza (500 ml)
We have chosen Moving Average (7) as our final forecasting model. After evaluating various methods
such as moving averages with other window lengths and naive forecasts, we decided to go for the MA (7)
method, with an MSE of 351.25 on training and 304.24 on validation.
Dahi/Yogurt - Saras (200 gm)
We have chosen Holt Winters No Trend (Alpha =0, Gamma = 0.03) as the final forecasting method. The
MSE of 222.66 produced by this model was superior to the MSE of 270.44 produced by the multiple
linear regression model.
Dahi/Yogurt - Yakult Probiotic (350 ml)
We have tried Holt Winters No Trend (Alpha = 0.2, Gamma = 0.09) as well as a multiple regression
model. While the validation MSEs of the two methods are comparable, the training MSE is much better
for linear regression than for Holt Winters (96.79 vs 140.65). We have therefore chosen linear
regression as the final model.
Assumptions
We have assumed that the historical purchasing pattern will be an indicator of the future purchasing
pattern as well and that there will be no drastic change in it.
The correlation that has been identified among products will continue to exist in the future.
The forecasting model is based on daily sales, which are assumed to represent demand for the products; it does not consider the possibility of stock-outs that might have happened over the sales period.
Conclusions and Recommendations
We suggest that the chosen models for each of the SKUs be used to forecast the daily level demand for
the month of September 2012.
As seen from its forecasting model, sales of AmulTaaza do not exhibit any seasonality although there are
several random spikes. Hence, the implication for managers is that a sudden increase or decrease
in sales should not be read as an increasing or decreasing trend. Average sales over the past week are
the best predictor of the next day’s sales. Since sales go up and down drastically, managers should
consider holding high stock only if stock-out costs are considerably higher than the cost of overstocking.
For Saras (200 gm), we propose that the model be used with care. There are occasional peaks which the
model has not captured. While making decisions about inventory levels and safety stock, these
occasional peaks should be borne in mind. Since only data up to May 2012 has been used, we strongly
recommend refitting with the latest data as soon as appropriate data is available for the later months. The data
also suggest occasional supply disruptions, so the retailer may want to thoroughly check the data and
the corresponding forecasts for such anomalies.
As shown by the model for Yakult Probiotic drink, the daily sales exhibit seasonality. Hence, managers
should consider seasonality during the week while doing stock planning. The model provides reasonably
accurate forecasts (as seen from low values of MSE). Hence, managers can have good confidence in the
forecast.
TECHNICAL ANALYSIS
Data Preparation:
As part of the data preparation, total Quantity sold was aggregated by Date for all the classes and
SKUs. Quantity sold was also checked per customer to identify customers buying in bulk. The
mean of sales was calculated for each item description, and any outliers more than two
standard deviations away from the mean were replaced with the mean value. Replacing
outliers with the mean was chosen over deleting such records because this is time series data, where
continuity is required for forecasting.
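The preparation steps above can be sketched as follows. This is a minimal illustration only: the function names, the (date, quantity) row layout and the use of the population standard deviation are assumptions, since the report does not state which tool or which standard-deviation formula was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_daily(transactions):
    """Sum quantity sold per date from (date, quantity) rows.

    Dates are assumed to be ISO strings (YYYY-MM-DD) so that sorting
    the keys also sorts them chronologically.
    """
    totals = defaultdict(float)
    for date, quantity in transactions:
        totals[date] += quantity
    return dict(sorted(totals.items()))


def replace_outliers_with_mean(values, n_std=2.0):
    """Replace values more than n_std standard deviations from the mean with the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; the report does not specify
    return [mu if abs(v - mu) > n_std * sigma else v for v in values]


# Hypothetical transaction rows, for illustration only.
rows = [("2012-07-02", 3), ("2012-07-01", 5), ("2012-07-02", 4)]
daily = aggregate_daily(rows)
cleaned = replace_outliers_with_mean(list(daily.values()))
```

Keeping one value per day, rather than dropping the outlier rows, preserves the unbroken daily index that the smoothing methods below rely on.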
Class: Fresh Milk; Item Description: AmulTaaza 500 ml
As the data doesn’t have any trend or seasonality, the appropriate methods to capture the randomness
in data are the following:
Moving average smoothing
Neural Network
Charts: As observable from the charts below, the neural network is not able to fit the data; hence the MA
method was chosen for further optimization.
The best fitting MA model was obtained for MA(7). Please see Technical details in Appendix 1.
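A minimal sketch of a trailing moving-average forecast, assuming the forecast for a day is the mean of the previous seven days' actual sales. That convention is consistent with the Appendix 1 output, where the prediction of 33 for 8-Jul-12 equals the average of the actuals for 1-Jul-12 to 7-Jul-12. The function names are illustrative, not the report's.

```python
def moving_average_forecast(series, window=7):
    """One-step-ahead forecasts: day t is predicted by the mean of the previous `window` days."""
    forecasts = [None] * len(series)  # no forecast until a full window is available
    for t in range(window, len(series)):
        forecasts[t] = sum(series[t - window:t]) / window
    return forecasts


def mse(actual, forecast):
    """Mean squared error over the days that have a forecast."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if f is not None]
    return sum((a - f) ** 2 for a, f in pairs) / len(pairs)


# AmulTaaza daily sales for 1-Jul-12 to 10-Jul-12, copied from Appendix 1.
sales = [48, 18, 15, 54, 9, 36, 51, 48, 33, 30]
forecasts = moving_average_forecast(sales, window=7)
print(forecasts[7:])          # forecasts for 8, 9 and 10 July
print(mse(sales, forecasts))  # error over those three days only
```

Running the same function for window lengths 1 to 9 and comparing the training and validation MSE is one way to reproduce the kind of comparison shown in the RMSE/MSE table of Appendix 1.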
Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm
Data Preparation: We realized that data for many of the months had a series of 0 sales. The time series
shows that, except for the period from Dec '11 to May '12, all months have several 0 values, making those
months inappropriate for analysis. We believe that there was probably a supply disruption in those months.
We have therefore used only the months from December '11 to May '12, for which proper values are available.
As the data has no trend but considerable seasonality, the following forecasting models were used:
Multiple Linear Regression with days of the week as dummy variables
Holt Winters with no trend
Charts:
Based on RMSE values and the plotted chart, we decided to proceed with the Holt Winters No Trend
method (Alpha = 0, Gamma = 0.03) with season length = 7. Please see details in Appendix 2.
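For readers unfamiliar with the method, the following is a sketch of level-plus-seasonal exponential smoothing without a trend term. The report does not say whether seasonality was additive or multiplicative, or how the components were initialised, so the additive form and the first-week initialisation used here are assumptions, as are the function names.

```python
def holt_winters_no_trend(series, season_length=7, alpha=0.2, gamma=0.05):
    """Additive level + seasonal smoothing with no trend term.

    Returns (level, seasonals) after processing the whole series.
    """
    m = season_length
    if len(series) < m:
        raise ValueError("need at least one full season of data")
    # Assumed initialisation: level is the first season's mean,
    # seasonal indices are the deviations from that mean.
    level = sum(series[:m]) / m
    seasonals = [y - level for y in series[:m]]
    for t in range(m, len(series)):
        y = series[t]
        s_old = seasonals[t % m]
        level = alpha * (y - s_old) + (1 - alpha) * level
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s_old
    return level, seasonals


def forecast(level, seasonals, start_index, horizon):
    """Forecast `horizon` days, where `start_index` is the position of the first forecast day."""
    m = len(seasonals)
    return [level + seasonals[(start_index + h) % m] for h in range(horizon)]


# Hypothetical weekly pattern repeated for four weeks, for illustration only.
pattern = [10, 11, 10, 13, 26, 25, 10]
history = pattern * 4
level, seasonals = holt_winters_no_trend(history, season_length=7, alpha=0.0, gamma=0.03)
next_week = forecast(level, seasonals, start_index=len(history), horizon=7)
```

With Alpha = 0 the level is never updated after initialisation, so the forecast is a fixed weekly profile that Gamma adjusts only slowly. This is in line with the Appendix 2 output, where the same seven forecast values repeat every week.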
Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink
The models that can be proposed for this data are the following:
Linear Regression Model
Holt Winters No Trend
Charts: As observable from the chart below, Linear Regression gives an accurate forecast.
The best fitting model was obtained for Linear Regression. Please see Technical details in Appendix 3.
Appendix 1: Technical Details - Class: Fresh Milk; Item Description: AmulTaaza 500 ml
Data Availability: 1-Aug-2011 to 31-Aug-2012
RMSE/MSE Table:
Model          MSE on training   MSE on validation
MA 1 (naïve)   630.25            525.19
MA 2           501.43            420.46
MA 3           429.11            360.77
MA 4           408.27            324.15
MA 5           384.35            319.66
MA 6           374.67            315.38
MA 7           351.25            304.24
MA 8           351.81            328.11
MA 9           355.50            310.84
As MA (7) gives the lowest error on both training and validation data, MA (7) is chosen to forecast daily sales of
AmulTaaza.
Forecasted Values:
Output of MA (7) on AmulTaaza Daily Sales
Transaction Date Actual Predicted Residual
1-Jul-12 48 28.28571 19.71429
2-Jul-12 18 26.57143 -8.57143
3-Jul-12 15 26.57143 -11.5714
4-Jul-12 54 24.42857 29.57143
5-Jul-12 9 29.14286 -20.1429
6-Jul-12 36 29.14286 6.857143
7-Jul-12 51 30.42857 20.57143
8-Jul-12 48 33 15
9-Jul-12 33 33 0
10-Jul-12 30 35.14286 -5.14286
11-Jul-12 9 37.28571 -28.2857
12-Jul-12 21 30.85714 -9.85714
13-Jul-12 24 32.57143 -8.57143
14-Jul-12 33 30.85714 2.142857
15-Jul-12 66 28.28571 37.71429
16-Jul-12 33 30.85714 2.142857
17-Jul-12 30 30.85714 -0.85714
18-Jul-12 3 30.85714 -27.8571
19-Jul-12 3 30 -27
20-Jul-12 6 27.42857 -21.4286
21-Jul-12 24 24.85714 -0.85714
22-Jul-12 0 23.57143 -23.5714
23-Jul-12 27 14.14286 12.85714
24-Jul-12 3 13.28571 -10.2857
25-Jul-12 3 9.428571 -6.42857
26-Jul-12 0 9.428571 -9.42857
27-Jul-12 24 9 15
28-Jul-12 30 11.57143 18.42857
29-Jul-12 51 12.42857 38.57143
30-Jul-12 30 19.71429 10.28571
31-Jul-12 12 20.14286 -8.14286
1-Aug-12 66 21.42857 44.57143
2-Aug-12 45 30.42857 14.57143
3-Aug-12 9 36.85714 -27.8571
4-Aug-12 27 34.71429 -7.71429
5-Aug-12 45 34.28571 10.71429
6-Aug-12 18 33.42857 -15.4286
7-Aug-12 3 31.71429 -28.7143
8-Aug-12 30 30.42857 -0.42857
9-Aug-12 6 25.28571 -19.2857
10-Aug-12 51 19.71429 31.28571
11-Aug-12 51 25.71429 25.28571
12-Aug-12 30 29.14286 0.857143
13-Aug-12 12 27 -15
14-Aug-12 9 26.14286 -17.1429
15-Aug-12 39 27 12
16-Aug-12 18 28.28571 -10.2857
17-Aug-12 36 30 6
18-Aug-12 12 27.85714 -15.8571
19-Aug-12 9 22.28571 -13.2857
20-Aug-12 12 19.28571 -7.28571
21-Aug-12 12 19.28571 -7.28571
22-Aug-12 27 19.71429 7.285714
23-Aug-12 9 18 -9
24-Aug-12 42 16.71429 25.28571
25-Aug-12 18 17.57143 0.428571
26-Aug-12 33 18.42857 14.57143
27-Aug-12 12 21.85714 -9.85714
28-Aug-12 15 21.85714 -6.85714
29-Aug-12 18 22.28571 -4.28571
30-Aug-12 45 21 24
31-Aug-12 39 26.14286 12.85714
Appendix 2: Technical Details - Class: Dahi and Yoghurt; Item Description: SarasDahi 200gm
Data available from 1-Aug-2011 to 31-Aug-2012
Trend Analysis
Forecasting method
The data we used, from December '11 to May '12, had no trend and a constant seasonality. The seasonality was
weekly, with sales jumping over the weekend. Based on this we considered two methods for this data -
1) Multiple Linear Regression with days of the week as dummy variables 2) Holt Winters with no trend.
For Holt Winters, we fine-tuned the values of alpha & gamma to come up with the best fitted model. For
linear regression, we performed a multi-step process to check whether there is correlation with other products
such as AmulTaaza or Yakult; this is explained in the subsequent section.
Charts
The chart below shows Actual vs Forecast for the validation data for both linear regression and
the Holt Winters method. For both models, we have used the best-tuned version for the comparison.
Multiple linear regression was initially applied to Saras sales using the previous day's sales of AmulTaaza,
the previous day's sales of Yakult, time, and dummy variables for weekday.
As shown in the regression output below, the p-values for the previous day's sales of AmulTaaza and for the time
variable were well above the chosen cutoff value of 0.05. On removing the previous day's sales
of AmulTaaza as a predictor, we saw a slight increase in MSE (from 297.65 to 298.97, as
shown in the RMSE/MSE table). Hence, we added the previous day's sales of AmulTaaza back to the model. The next
step was to remove the time variable from the model. The fact that the p-value of the time variable was 0.41
reinforces the fact that the quantity sold of Saras dahi does not display any trend. However, the
statistically significant p-values for the weekday dummies show weekday-wise seasonality in the data. On
removing the time variable from the model, MSE reduced from 297.65 to 270.44.
The output of the final regression model run is as follows:
Linear Regression Iteration 1
Linear Regression Iteration 2
Linear Regression Iteration 3
The output for the Holt Winters Method is shown below
Holt Winters Output and Errors
Transaction Date Actual Forecast Error LCI UCI
1-May-12 9 10.2128329 -1.21283291 -17.5798437 38.0055095
2-May-12 15 10.6351448 4.36485518 -17.1575317 38.4278214
3-May-12 3 10.4789002 -7.47890016 -17.3137764 38.2715767
4-May-12 0 12.6415702 -12.6415702 -15.1511063 40.4342468
5-May-12 24 25.9181642 -1.91816422 -1.87451235 53.7108408
6-May-12 12 24.5459421 -12.5459421 -3.24673442 52.3386187
7-May-12 0 10.0710076 -10.0710076 -17.721669 37.8636842
8-May-12 24 10.2128329 13.7871671 -17.5798437 38.0055095
9-May-12 30 10.6351448 19.3648552 -17.1575317 38.4278214
10-May-12 6 10.4789002 -4.47890016 -17.3137764 38.2715767
11-May-12 36 12.6415702 23.3584298 -15.1511063 40.4342468
12-May-12 18 25.9181642 -7.91816422 -1.87451235 53.7108408
13-May-12 18 24.5459421 -6.54594215 -3.24673442 52.3386187
14-May-12 0 10.0710076 -10.0710076 -17.721669 37.8636842
15-May-12 6 10.2128329 -4.21283291 -17.5798437 38.0055095
16-May-12 24 10.6351448 13.3648552 -17.1575317 38.4278214
17-May-12 6 10.4789002 -4.47890016 -17.3137764 38.2715767
18-May-12 0 12.6415702 -12.6415702 -15.1511063 40.4342468
19-May-12 21 25.9181642 -4.91816422 -1.87451235 53.7108408
20-May-12 78 24.5459421 53.4540579 -3.24673442 52.3386187
21-May-12 3 10.0710076 -7.0710076 -17.721669 37.8636842
22-May-12 0 10.2128329 -10.2128329 -17.5798437 38.0055095
23-May-12 0 10.6351448 -10.6351448 -17.1575317 38.4278214
24-May-12 0 10.4789002 -10.4789002 -17.3137764 38.2715767
25-May-12 27 12.6415702 14.3584298 -15.1511063 40.4342468
26-May-12 12 25.9181642 -13.9181642 -1.87451235 53.7108408
27-May-12 0 24.5459421 -24.5459421 -3.24673442 52.3386187
28-May-12 0 10.0710076 -10.0710076 -17.721669 37.8636842
29-May-12 0 10.2128329 -10.2128329 -17.5798437 38.0055095
30-May-12 0 10.6351448 -10.6351448 -17.1575317 38.4278214
31-May-12 0 10.4789002 -10.4789002 -17.3137764 38.2715767
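The LCI and UCI columns above are numerically consistent with forecast ± 1.96 × sqrt(training MSE), using the training MSE of 201.07061 reported in this appendix. The report does not state how the interval was computed, so the formula below is an inference from the numbers rather than a documented method, and the function name is illustrative.

```python
import math


def prediction_interval(forecast, training_mse, z=1.96):
    """Return (lower, upper) as forecast -/+ z * sqrt(training MSE)."""
    half_width = z * math.sqrt(training_mse)
    return forecast - half_width, forecast + half_width


# Forecast for 1-May-12 and the training MSE, both taken from this appendix.
lower, upper = prediction_interval(10.2128329, 201.07061)
print(lower, upper)
```

For the 1-May-12 row this agrees with the tabulated LCI and UCI to about four decimal places. Because unit sales cannot be negative, a planner would in practice treat the lower bound as zero.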
[Figure: Time Plot of Actual vs Forecast (Validation Data). x-axis: Transaction Date; y-axis: Saras Sales; series: Actual, Forecast]
Error Measures (Training)
MAPE 58.859906
MAD 10.823147
MSE 201.07061
Error Measures (Validation)
MAPE 75.982246
MAD 11.659562
MSE 222.66142
RMSE/MSE tables:
The table below shows the MSE values for each method we tried in order to arrive at the best-performing
one.
Method MSE Values
Linear Regression [Time, Dummy Variables (Weekday), Yakult(t-1), AmulTaaza(t-1)] 297.65
Linear Regression [Time, Dummy Variables (Weekday), Yakult(t-1)] 298.97
Linear Regression [Dummy Variables (Weekday), Yakult(t-1), AmulTaaza(t-1)] 270.44
Holt Winters No Trend (Alpha = 0.2, Gamma = 0.05) 245.38
Holt Winters No Trend (Alpha = 0.00, Gamma = 0.03) 222.66
Appendix 3: Technical Details - Class: Dahi & Yogurt; Item Description: Yakult Probiotic Drink
Seasonality was captured using a linear regression model by creating a categorical variable for Day of
the week. This categorical variable was then used to create dummy variables, which are then used as
predictors in the Model.
Linear Regression:
Regression Model
Regression Equation
Quantity = -957.2644
  + 0.02372543 * t
  - 2.74370599 * Day of the week_Monday
  + 10.47619915 * Day of the week_Saturday
  + 27.12547493 * Day of the week_Sunday
  - 3.09158349 * Day of the week_Thursday
  - 1.90358925 * Day of the week_Tuesday
  + 1.60529459 * Day of the week_Wednesday
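The regression equation above can be evaluated with a few lines of code. In this sketch the coefficients are copied from the equation. Treating Friday as the baseline day is an inference from the absence of a Friday dummy. The report does not define the time index t, so it is left as a plain input, and the function and constant names are illustrative.

```python
# Coefficients copied from the regression equation in this appendix.
INTERCEPT = -957.2644
TIME_COEF = 0.02372543
DAY_COEF = {
    "Monday": -2.74370599,
    "Tuesday": -1.90358925,
    "Wednesday": 1.60529459,
    "Thursday": -3.09158349,
    "Friday": 0.0,  # assumed baseline: no Friday dummy appears in the equation
    "Saturday": 10.47619915,
    "Sunday": 27.12547493,
}


def predict_quantity(t, day_of_week):
    """Evaluate the fitted equation for time index t and a weekday name."""
    return INTERCEPT + TIME_COEF * t + DAY_COEF[day_of_week]


def weekend_uplift(day_of_week):
    """Difference in predicted quantity versus the baseline day at the same t."""
    return DAY_COEF[day_of_week] - DAY_COEF["Friday"]
```

Each dummy coefficient is read as the expected difference in units sold relative to the baseline day, which is why the Saturday and Sunday coefficients carry the weekly seasonality in this model.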
Forecasting
Date | Actual | Forecast (Regression) | Forecast (Holt Winters) | Residual (Regression) | Residual (Holt Winters)
8/1/2012 9 19.972351 19.57667224 -10.97235067 -10.57667224
8/2/2012 36 15.299274 17.10501376 20.70072628 18.89498624
8/3/2012 6 18.414658 16.5764155 -12.41465834 -10.5764155
8/4/2012 27 28.914659 31.22367656 -1.91465862 -4.223676558
8/5/2012 75 45.587736 57.30486095 29.41226447 17.69513905
8/6/2012 21 15.742356 32.24542733 5.25764426 -11.24542733
8/7/2012 9 16.606274 20.78549845 -7.60627361 -11.78549845
8/8/2012 6 20.138959 19.57667224 -14.13895858 -13.57667224
8/9/2012 15 15.465882 17.10501376 -0.46588163 -2.105013762
8/10/2012 24 18.581266 16.5764155 5.41873375 7.423584498
8/11/2012 21 29.081267 31.22367656 -8.08126653 -10.22367656
8/12/2012 45 45.754343 57.30486095 -0.75434344 -12.30486095
8/13/2012 15 15.908964 32.24542733 -0.90896365 -17.24542733
8/14/2012 12 16.772882 20.78549845 -4.77288152 -8.785498445
8/15/2012 48 20.305566 19.57667224 27.69443351 28.42332776
8/16/2012 6 15.63249 17.10501376 -9.63248954 -11.10501376
8/17/2012 21 18.747874 16.5764155 2.25212584 4.423584498
8/18/2012 12 29.247874 31.22367656 -17.24787444 -19.22367656
8/19/2012 72 45.920951 57.30486095 26.07904865 14.69513905
8/20/2012 24 16.075572 32.24542733 7.92442844 -8.245427327
8/21/2012 6 16.939489 20.78549845 -10.93948943 -14.78549845
8/22/2012 9 20.472174 19.57667224 -11.4721744 -10.57667224
8/23/2012 24 15.799097 17.10501376 8.20090255 6.894986238
8/24/2012 27 18.914482 16.5764155 8.08551793 10.4235845
8/25/2012 6 29.414482 31.22367656 -23.41448235 -25.22367656
8/26/2012 63 46.087559 57.30486095 16.91244074 5.69513905
8/27/2012 12 16.242179 32.24542733 -4.24217947 -20.24542733
8/28/2012 9 17.106097 20.78549845 -8.10609734 -11.78549845
8/29/2012 6 20.638782 19.57667224 -14.63878231 -13.57667224
8/30/2012 3 15.965705 17.10501376 -12.96570536 -14.10501376
8/31/2012 12 19.08109 16.5764155 -7.08108998 -4.576415502
Holt Winters No Trend: Since the data has no trend but a weekly seasonality, the Holt Winters No Trend model can be used for forecasting.
Model Performance Comparison
Sensitivity Report
Both models perform about equally well; however, linear regression scores slightly better on training and
validation data.
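A comparison of this kind rests on the MSE, MAD and MAPE error measures used throughout the report. The sketch below shows one way to compute them, using the first week of August 2012 from the forecasting table in this appendix. Skipping zero-sales days in MAPE is an assumption, since the report does not say how such days were handled, and the function names are illustrative.

```python
def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, skipping days with zero actual sales (assumption)."""
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast) if a != 0]
    return 100 * sum(terms) / len(terms)


# 8/1/2012 to 8/7/2012, copied from the forecasting table in this appendix.
actual = [9, 36, 6, 27, 75, 21, 9]
regression = [19.972351, 15.299274, 18.414658, 28.914659, 45.587736, 15.742356, 16.606274]
holt_winters = [19.57667224, 17.10501376, 16.5764155, 31.22367656,
                57.30486095, 32.24542733, 20.78549845]

for name, fc in [("Regression", regression), ("Holt Winters", holt_winters)]:
    print(name, mse(actual, fc), mad(actual, fc), mape(actual, fc))
```

A single week is too short to rank the models. The figures are meant only to show how each measure is derived from the Actual and Forecast columns.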
Residuals
ACF Plot for Linear Regression and Holt Winters model