2002/4/10 IDSL seminar Estimating Business Targets Advisor: Dr. Hsu Graduate: Yung-Chu Lin Data Source: Datta et al., KDD01, pp. 420-425.

Page 1: Estimating Business Targets

2002/4/10 IDSL seminar

Estimating Business Targets

Advisor: Dr. Hsu

Graduate: Yung-Chu Lin

Data Source: Datta et al., KDD01, pp. 420-425.

Page 2: Estimating Business Targets

Abstract

Propose a new solution to the classical econometric task of frontier analysis

Combine nearest neighbor methods and classical statistical methods

Identify under-marketed customers

Benchmark regional directory divisions

Page 3: Estimating Business Targets

Outline

Motivation

Objective

Historical approaches

Target estimation methodology

Case study

Conclusion

Personal opinion

Page 4: Estimating Business Targets

Motivation

Setting targets is a critical task

Traditionally, the target of each entity is set to the average among the entities

Two challenges

– The characteristics of the entities will have a heavy influence on the outcome

– The inherent unsupervised nature of the problem

Page 5: Estimating Business Targets

Objective

Provide a methodology for estimating unsupervised maximal or minimal targets

Setting revenue target expectations for individual customers

Revenue target setting for regional yellow page directories

Page 6: Estimating Business Targets

Historical Approaches

Mathematical programming

Economics

Page 7: Estimating Business Targets

Mathematical Programming

The target for xi, a vector for the ith observation, is expressed through a function g(xi)

Sensitive to errors or outliers, since it assumes that all observed targets define the possible space

Page 8: Estimating Business Targets

Economics

The target is expressed through g(xi) together with a non-negative error term

Requires a model for the error term and for g

Page 9: Estimating Business Targets

Target Estimation Methodology

Nearest neighbor vs. clustering

The neighborhoods

The distance function

Target estimation from the neighborhoods

A heuristic for comparing neighborhoods

Page 10: Estimating Business Targets

Nearest Neighbor vs. Clustering

Time complexity

– Clustering is cheaper than nearest neighbor

Problem of clustering

– Two similar entities can fall into different clusters

– The problem becomes more serious as the dimension increases

– Nearest neighbor does not have this problem

Page 11: Estimating Business Targets

The Neighborhoods

xi: the ith observation

yi: the variable containing its target value

ni: the neighborhood for xi, where ni is a set of observations {xi, xj, …}

Page 12: Estimating Business Targets

The Distance Function

Continuous variables are standardized

e.g. Continuous: (2,1) and (3,4)

$(1 \cdot w_1)^2 + (3 \cdot w_2)^2$

Nominal: (a,b) and (a,c)

$0 + w_2^2$
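
The two examples above can be reproduced with a short sketch. It assumes the distance is a weighted sum of squared per-variable differences, with a nominal mismatch counted as a difference of 1. Whether the slide also took a square root is not recoverable, and the function name `weighted_sq_distance` is hypothetical.

```python
def weighted_sq_distance(a, b, weights):
    """Weighted squared distance between two mixed-type observations.

    Assumption: numeric values contribute (difference * weight)^2, and
    nominal values contribute 0 on a match and weight^2 on a mismatch.
    """
    total = 0.0
    for u, v, w in zip(a, b, weights):
        if isinstance(u, (int, float)) and isinstance(v, (int, float)):
            diff = u - v
        else:
            diff = 0.0 if u == v else 1.0
        total += (diff * w) ** 2
    return total


# Continuous example: (1*w1)^2 + (3*w2)^2 with w1 = w2 = 1
print(weighted_sq_distance((2, 1), (3, 4), (1.0, 1.0)))
# Nominal example: 0 + w2^2 with w2 = 2
print(weighted_sq_distance(("a", "b"), ("a", "c"), (1.0, 2.0)))
```

Setting a weight to 0 removes a variable from the comparison, and a weight below 1 reduces its influence.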

Page 13: Estimating Business Targets

Target Estimation From the Neighborhoods

Let yi(1), yi(2), …, yi(k) be the order statistics, so that yi(1) is the largest

Page 14: Estimating Business Targets

A Heuristic for Comparing Neighborhoods

Maximal frontier: E(xi) ranges from 0 to 1

Minimal frontier: E(xi) >= 1
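
A minimal sketch of these ranges, under two assumptions that the slides do not state: the target for xi is taken as the extreme order statistic of the target values in its neighborhood (which includes xi itself), and E(xi) is the ratio of the observed value yi to that target. The function name `efficiency` is hypothetical.

```python
def efficiency(y_i, neighborhood_y, frontier="max"):
    """Ratio of an observed target value to its neighborhood-based target.

    Assumptions: neighborhood_y holds the positive target values of the
    neighborhood of xi, including yi itself; the estimated target is the
    largest value for a maximal frontier, the smallest for a minimal one.
    """
    ordered = sorted(neighborhood_y, reverse=True)  # ordered[0] is the largest
    target = ordered[0] if frontier == "max" else ordered[-1]
    return y_i / target


neighborhood = [40.0, 100.0, 80.0, 50.0]  # yi = 40.0 plus three neighbors
print(efficiency(40.0, neighborhood, "max"))  # at most 1 for a maximal frontier
print(efficiency(40.0, neighborhood, "min"))  # at least 1 for a minimal frontier
```

Because yi belongs to its own neighborhood, the ratio cannot exceed 1 for a maximal frontier or fall below 1 for a minimal one, which matches the ranges on the slide.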

Page 15: Estimating Business Targets

Case Study

Target revenues for directory book advertisers

Target revenue for regional directories

Page 16: Estimating Business Targets

(1) Target Revenues for Directory Book Advertisers

Goal

– Find businesses that have low spending relative to those with otherwise similar characteristics

Three categories of data available

– Advertiser: e.g. number of employees

– Directory: e.g. distribution size

– Market: e.g. median household income

Page 17: Estimating Business Targets

Calculating Nearest Neighbors

Standardize continuous data: natural log

K=4

Weight the variables equally

– But decrease the weights for many of the directory and market variables
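
The setup above can be sketched as follows. The sample values, the column choice, and the specific reduced weight of 0.5 are invented for illustration; only the natural-log transform, K=4, and the idea of down-weighting some variables come from the slide. The helper name `k_nearest` is hypothetical.

```python
import math


def k_nearest(index, rows, weights, k=4):
    """Return indices of the k rows closest to rows[index] after a log transform."""
    logged = [[math.log(v) for v in row] for row in rows]  # values must be > 0

    def sq_dist(a, b):
        return sum(((u - v) * w) ** 2 for u, v, w in zip(a, b, weights))

    others = [j for j in range(len(rows)) if j != index]
    others.sort(key=lambda j: sq_dist(logged[index], logged[j]))
    return others[:k]


# Hypothetical columns: employees, distribution size, median household income
rows = [
    [10, 50000, 40000],
    [12, 52000, 41000],
    [11, 48000, 39000],
    [200, 500000, 90000],
    [9, 51000, 42000],
    [15, 60000, 45000],
]
weights = [1.0, 0.5, 0.5]  # assumed: directory and market variables down-weighted
print(k_nearest(0, rows, weights))
```

The log transform keeps a few very large advertisers or directories from dominating the distance, so the much larger fourth row is not selected as a neighbor of the first.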

Page 18: Estimating Business Targets

Distribution for E(x) for Advertisers

Page 19: Estimating Business Targets

A Decision Tree to Predict phi -xi

Page 20: Estimating Business Targets

(2) Target Revenue for Regional Directories

Goal

– Benchmark regional directory divisions

Separate the data into two sets

– Training set: 80%

– Test set: 20%

K=4
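
How the 80/20 split was drawn is not stated. A minimal sketch, assuming a simple random split, with the hypothetical helper `split_train_test` and a fixed seed for repeatability:

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and split it into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


train, test = split_train_test(range(50))
print(len(train), len(test))  # 40 10
```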

Page 21: Estimating Business Targets

Book Type

System book

– An entire serving area

System-neighborhood book

– A smaller number of geographic areas in the franchise area

Neighborhood book

– Areas outside of the telephone company’s franchise area

Page 22: Estimating Business Targets

Four Different Distributions labeled according to the legend

Page 23: Estimating Business Targets

Panels: Neighborhood books, System books, Non-system books

The x-axis shows log(distribution) and the y-axis shows E(x)

Page 24: Estimating Business Targets

Conclusion

Present a general data mining methodology for estimating business targets by frontier analysis

First case

– Increase sales focus on the under-marketed customers

– Increase the potential revenue by several million

Second case

– Estimate optimal revenue performance targets for directory divisions

– The increase for directory books is a minimum of several million dollars

Page 25: Estimating Business Targets

Personal opinion

Combining several existing methodologies or disciplines can produce a powerful new one