Acquisition scorecard: Reject inference methodologies
An EXL whitepaper
Written by
Nupur Verma, Decision Analytics
[email protected]
© 2016 ExlService Holdings, Inc.
Acquisition scorecards are built on approved applications but applied to all applicants, so the development sample is biased. Reject inference methods address this issue by combining the actual performance of approved applications with the derived performance of rejected applications. This makes the model more accurate, robust and unbiased. If applied correctly, reject inference methods help to increase the approval rate without compromising credit quality. This paper outlines the steps involved in several reject inference methods that can be used in the development of an acquisition scorecard. The methods have been empirically tested, and the most suitable method for different scenarios is suggested.
Introduction
In any statistical model, the key assumption
made is that the characteristics of the
development sample are similar to the
population on which the model will
be applied. This holds for collection models, because they are built on approved applications and applied to the same type of accounts - the approved applications. Acquisition models are
built on approved applications, but have
to be used for the entire through-the-
door population – approved as well as
rejected. Here, the development sample
is biased because it is not representative
of the entire population. Reject Inference
methodologies are a way to account for
and correct this sample bias.
One of the basic problems a bank faces is that an application scorecard has to be applied to the through-the-door population, while the models are built only on approved applications, because only their performance is available to the lender. The performance of rejected applications is not readily available, and hence developing a model on both rejected and approved applications is a challenge.
Reject Inference (RI) becomes a very important tool when the creditor wants to relax the approval criteria to improve the approval rate. By inferring the performance of the rejected applications, the impact of such a decision can be studied. RI is most valuable when the approval rate is low; if the approval rate is already very high and there is limited scope for increasing it, RI may not be very helpful.

Literature Review
There have been a number of studies that address the sample bias in an application scorecard using reject inference methods. "Reject Inference Methodologies in Credit Risk Modeling" by Derek Montrichard, Canadian Imperial Bank of Commerce, Toronto, Canada, discusses manual estimation of reject inference, where the modeler uses preexisting knowledge to manually simulate performance on the declined population. The paper also discusses the augmentation and extrapolation techniques. A paper by Dennis Ash and Steve Meester, May 2002, discusses the Group Parceling, Heckman's bias correction and Supplemental Bureau Data methods of reject inference. John Banasik and Jonathan Crook, in their paper "Reject inference, augmentation, and sample selection", June 2006, discuss the augmentation and banded data techniques of reject inference.

Methodology
Various methods of RI are available to include the performance of rejected applications and make the model applicable to the entire through-the-door (TTD) population. There are two broad categories of RI:

i. When the performance of rejected applications is available - in-house or with some other lender through a credit bureau

ii. When the performance of rejected applications is not available to the lender

When the performance of the rejected applications is not available in-house or from a competitor, their performance has to be mathematically derived, or some assumptions have to be made about it. This paper focuses on four methods of RI for the case where the bureau performance of rejected applications is not available.
These are:
• Hard cutoff method
• Single weighted approach
• Double weighted approach
• Augmentation
For all four methods above, logistic regression models were built on real data from the personal loan portfolio of an MNC bank.
The data was carefully sanitized to ensure
security of confidential information. The
approval rate for this portfolio is 40%, with
a sample size of 90,000. The modeling
dataset had only application data and no
bureau data for accepts or rejects. Logistic
regression model results were studied.
These methods are discussed below.
1. Hard cutoff method
This is one of the simplest reject inference
methods. All rejects above a score cutoff
are tagged as “bads” and a model is then
built on actual performance of accepts and
inferred performance of rejects.
i. A logistic regression model is built on the approved population, using their internal performance, to predict the probability of default. Rejects are then scored using this model.
ii. Approved applications are ordered by score and divided into 10 equal bands. An expected
bad rate level is set above which rejected
applications are tagged “bad”; all rejected
applications below this level are classified
as “good.” This cutoff score can be
determined through an iterative process.
iii. Inferred goods and bads for the rejected
applications are then added to the actual
good and actual bad of the approved
applications. A final logistic regression
model is built on this dataset to predict the
probability of default. This model can be
used in the new underwriting strategy.
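The tagging logic in steps ii-iii can be sketched as follows. This is a minimal illustration with made-up scores: only the 1.99% cutoff comes from the Figure 1 example, and the KGB-model scoring of step i is assumed to have already produced the `pd_score` column.

```python
import pandas as pd

# Hypothetical predicted PDs; accepts also carry actual performance.
accepts = pd.DataFrame({
    "pd_score": [0.01, 0.02, 0.05, 0.10],
    "bad":      [0,    0,    1,    0],      # actual performance
})
rejects = pd.DataFrame({
    "pd_score": [0.008, 0.03, 0.25, 0.40],  # scored on the KGB model
})

CUTOFF = 0.0199  # the 1.99% score cutoff from the Figure 1 example

# Step ii: tag rejects above the cutoff as bad, below as good.
rejects["bad"] = (rejects["pd_score"] > CUTOFF).astype(int)

# Step iii: pool actual (accepts) and inferred (rejects) performance;
# the final scorecard would then be refit on this combined dataset.
combined = pd.concat([accepts, rejects], ignore_index=True)
```

The final logistic regression of step iii would be fit on `combined` in place of the accepts-only sample.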
In Figure 1, approved applications are distributed into 10 scorebands of equal
size. From scoreband 1 to scoreband 9 the
actual bad rate is increasing at a steady
pace. The bad rate jumps from 5.5% to
11% in the 10th bin. Here, 1.99% has been
taken as a cut-off. Rejected applications
with scores less than 1.99% are classified as
good and those with score more than 1.99%
are classified as bad. Actual performance for the approved applications is already available; hence there is no need to apply this cutoff rule to approved applications.
Note that as high as 47% of the rejected
applications would be classified as bad
in this example. This is intuitive because
rejected applications would have a higher
bad rate than approved applications,
otherwise they would not have been
rejected.
In this method, the final modeling dataset was heavily concentrated with inferred bads from the rejected applications. The method
performed well in terms of Gini, KS and
concordance.
Band  Min Score  Max Score  Actual Bad Rate  %Approved  %Rejected
1 0.00% 0.06% 0.2% 10% 0%
2 0.06% 0.12% 0.3% 10% 1%
3 0.12% 0.18% 0.4% 10% 2%
4 0.18% 0.25% 1.0% 10% 2%
5 0.25% 0.35% 1.9% 10% 2%
6 0.35% 0.50% 2.8% 10% 12%
7 0.50% 0.77% 3.7% 10% 9%
8 0.77% 1.21% 4.6% 10% 13%
9 1.21% 1.99% 5.5% 10% 13%
10 1.99% 40.00% 11.0% 10% 47%
Figure 1
2. Single weighted approach
This is a two-step modeling approach.
Here, instead of classifying a reject into
good or bad, each reject is classified as a
partial good and partial bad, with a weight
assigned to it. The score from the known good/bad (KGB) model is used to weight rejects in the final model. The steps in this approach are outlined below:
i. A logistic regression model is built on
approved applications to estimate the
probability of default – P(default).
ii. All rejected applications are scored on
this model to get P(default).
iii. A weight is assigned for approved
and rejected applications. Weight for
approved application=1. Weight for each
reject is applied such that there are 2
observations for each rejected application:
Wt_1=P(default), Wt_2=1-P(default).
Figure 2 shows a sample of the modeling
dataset.
iv. A final logistic regression model is built
with approved appearing once in the
modeling dataset and rejects appearing
twice in the modeling dataset.
In this method there is no ambiguity in assigning classes to the rejects, unlike in the hard cutoff method. Each reject enters the modeling dataset once as a bad, with weight wt_1 = P(default), and again as a good, with weight wt_2 = 1 - P(default). This method assumes that all rejected applications have the same probability of getting approved, which might not be the case.
Application_id Decision PD Weight Bad_Var Weight_description
1 Approve 0.01 1.00 1 =1
2 Approve 0.02 1.00 0 =1
3 Reject 0.20 0.20 1 =PD
3 Reject 0.20 0.80 0 =1-PD
4 Reject 0.30 0.30 1 =PD
4 Reject 0.30 0.70 0 =1-PD
5 Reject 0.12 0.12 1 =PD
5 Reject 0.12 0.88 0 =1-PD
Figure 2
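The weighting scheme in steps iii-iv can be sketched with pandas; the application IDs and PDs mirror the illustrative values in Figure 2 and are not real data:

```python
import pandas as pd

# Illustrative applications; PD comes from the KGB model built on accepts.
apps = pd.DataFrame({
    "application_id": [1, 2, 3, 4],
    "decision": ["Approve", "Approve", "Reject", "Reject"],
    "pd": [0.01, 0.02, 0.20, 0.30],
})

accepts = apps[apps["decision"] == "Approve"].copy()
accepts["weight"] = 1.0            # step iii: weight 1 for approved
# Actual performance would be attached here; 0/1 shown for illustration.
accepts["bad_var"] = [1, 0]

rejects = apps[apps["decision"] == "Reject"]
as_bad = rejects.assign(weight=rejects["pd"], bad_var=1)        # wt_1 = PD
as_good = rejects.assign(weight=1 - rejects["pd"], bad_var=0)   # wt_2 = 1 - PD

# Step iv: final dataset - accepts appear once, each reject twice.
modeling = pd.concat([accepts, as_bad, as_good], ignore_index=True)
```

The final logistic regression would be fit on `modeling`, passing the `weight` column as the observation weight.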
3. Double weighted approach
This method is similar to the single weighted approach, but is a three-step modeling approach. The steps for this technique are as follows:
i. A logistic regression model is first built
on approved applications to estimate
the probability of default – P(default). All
rejected applications are scored on this
model to get P(default).
ii. An approve/reject model is built on
approved as well as rejected applications.
This will give P(approved).
iii. Weight is assigned to all applications.
For approved applications weight=1. Each
rejected application appears twice in the
modeling dataset: once as a bad with a
weight wt_1 and then as a good with a
weight wt_2.
Weights for each reject are calculated as below:
• Wt_1 = P(default) × P(Approved)
• Wt_2 = (1 − P(default)) × P(Approved)
iv. A final model is built on approved and rejected applications, using the actual performance of approved applications and the inferred performance of rejected applications.

A sample of the modeling dataset is shown below:
Application_id Decision PD P_Approved Weight Bad_Var Weight_description
1 Approve 0.01 0.80 1.00 1 =1
2 Approve 0.02 0.70 1.00 0 =1
3 Reject 0.20 0.01 0.00 1 =P_Approved*PD
3 Reject 0.20 0.01 0.01 0 =P_Approved*(1-PD)
4 Reject 0.30 0.02 0.01 1 =P_Approved*PD
4 Reject 0.30 0.02 0.01 0 =P_Approved*(1-PD)
5 Reject 0.12 0.04 0.00 1 =P_Approved*PD
5 Reject 0.12 0.04 0.04 0 =P_Approved*(1-PD)
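The double-weight calculation in step iii can be sketched as below, using the illustrative PD and P(approved) values from the table above:

```python
import pandas as pd

# Illustrative rejects with scores from the two auxiliary models:
# pd         - P(default), from the KGB model built on accepts (step i)
# p_approved - P(approved), from the approve/reject model (step ii)
rejects = pd.DataFrame({
    "application_id": [3, 4, 5],
    "pd": [0.20, 0.30, 0.12],
    "p_approved": [0.01, 0.02, 0.04],
})

# Step iii: each reject appears twice, with the double weights.
as_bad = rejects.assign(
    weight=rejects["pd"] * rejects["p_approved"], bad_var=1)        # Wt_1
as_good = rejects.assign(
    weight=(1 - rejects["pd"]) * rejects["p_approved"], bad_var=0)  # Wt_2
modeling_rejects = pd.concat([as_bad, as_good], ignore_index=True)
```

These rows would then be pooled with the accepts (weight 1) before fitting the final model of step iv.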
This method incorporates not just the
probability of a default for a reject, but also
the probability of getting approved. The
method is more complicated than the other methods, because a set of three models has to be built, which is a cumbersome exercise.
4. Augmentation
In this method, rejects are not tagged
as goods or bads. Instead, approved
applications are reweighted or ‘weighted
up’ to represent the rejected applications
with similar chances of getting approved.
The methodology is outlined below:
i. An approve/reject model is built to predict the probability of an application getting approved.
ii. The scored dataset from the above step
is divided into 10 equal bands. Approved
applications of each band are assigned
a weight which is inversely proportional
to the average approval score of that
band. Instead of assigning weights on the
basis of bins, each approved application can also be assigned an individual weight = 1/p(approval).
iii. A final KGB model is built on this augmented dataset, with the weight assigned to each accepted application.
Scorebands with low p(approval) are
assigned a higher weight and accepts in
these score bands have characteristics
similar to the rejected population. The
augmented dataset is expected to
represent the behavior of both approved
and declined applications. In this method, the performance of the model on the rejects cannot be gauged, which makes it difficult to compare with the other methods of RI.
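The reweighting in steps ii-iii (the per-application variant) can be sketched as follows; the approval scores are invented for illustration:

```python
import pandas as pd

# Illustrative approval scores for accepted applications, from the
# approve/reject model of step i (values are made up).
accepts = pd.DataFrame({"p_approval": [0.9, 0.8, 0.5, 0.2]})

# Step ii (per-application variant): weight each accept by 1/p(approval),
# so accepts that resemble rejects (low approval odds) are weighted up.
accepts["weight"] = 1.0 / accepts["p_approval"]

# Step iii: the final KGB model is then fit on accepts only, using
# these weights; no inferred performance for rejects is required.
```

An accept with only a 20% chance of approval stands in for roughly five through-the-door applicants like it, which is how the augmented dataset approximates the full population.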
Results
A preferred method of RI was expected to satisfy the following criteria:
• Bad rate should be higher for rejects than
accepts
• Method should be able to increase the
approval rate without compromising on
the credit quality of the portfolio
• Method with higher Gini, KS and AUC would be considered more predictive than the others
• Method with higher accuracy would be
preferred
Model discriminatory power
The performance of the four RI methods, in terms of separation power between the goods and the bads, is shown in the table below:
Method           Population  Actual %Bad  Predicted %Bad  Gini  KS   AUC
Hard cut-off     Overall     40%          40%             85%   73%  92.5%
Hard cut-off     Accepts     2%           6%              50%   38%  75.0%
Hard cut-off     Rejects     69%          65%             89%   81%  94.5%
Single Weighted  Overall     4%           4%              55%   41%  77.5%
Single Weighted  Accepts     2%           2%              59%   47%  79.5%
Single Weighted  Rejects     5%           5%              44%   32%  72.0%
Double Weighted  Overall     3%           3%              60%   47%  80.0%
Double Weighted  Accepts     2%           2%              61%   47%  80.5%
Double Weighted  Rejects     4%           4%              46%   34%  73.0%
Augmentation     Overall     -*           4%              -*    -*   -*
Augmentation     Accepts     2%           2%              63%   50%  81.5%
Augmentation     Rejects     -*           6%              -*    -*   -*
*Inferred performance of rejects is not available for the augmentation technique.
At the overall level (accepts + rejects), the hard cutoff approach gives very good separation power in terms of Gini, KS and AUC.
The separation is good for the rejects, but
not so good for the accepted applications.
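For reference, the separation measures in the table are related by Gini = 2·AUC − 1, and KS is the maximum gap between the score distributions of goods and bads. A minimal numpy sketch on toy data (not the paper's portfolio):

```python
import numpy as np

def auc_gini_ks(y, scores):
    """Rank-based AUC, Gini = 2*AUC - 1, and KS statistic for a
    binary outcome y (1 = bad) scored by a predicted PD."""
    y = np.asarray(y)
    scores = np.asarray(scores, dtype=float)
    bads, goods = scores[y == 1], scores[y == 0]
    # AUC: probability a random bad scores above a random good.
    wins = (bads[:, None] > goods[None, :]).sum()
    ties = (bads[:, None] == goods[None, :]).sum()
    auc = (wins + 0.5 * ties) / (len(bads) * len(goods))
    # KS: maximum gap between the bad and good score CDFs.
    thresholds = np.unique(scores)
    cdf_bad = np.array([(bads <= t).mean() for t in thresholds])
    cdf_good = np.array([(goods <= t).mean() for t in thresholds])
    ks = np.abs(cdf_bad - cdf_good).max()
    return auc, 2 * auc - 1, ks

# Toy check: a perfectly separating score gives AUC = Gini = KS = 1.
auc, gini, ks = auc_gini_ks([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
```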
Accuracy
Table 2 shows how accurately bads and goods are classified by the four methods at score cutoffs of 2%, 3%, 4%, 5% and 6%. Table 2 shows accuracy for accepts and rejects taken together, while Table 3 shows the accuracy for accepts alone.

From Table 2 we can see that the accuracy at the overall level is highest for the double weighted approach at all score cutoffs. Accuracy for accepts alone is lowest for the hard cutoff approach at all score
cutoffs.

Accuracy for Accepts and Rejects
Score-cutoff  Hard Cutoff  Single Weighted  Double Weighted  Augmentation
600           74%          76%              88%              -
500           71%          64%              83%              -
400           70%          60%              80%              -
300           69%          55%              76%              -
200           68%          49%              73%              -
Table 2

Accuracy for Accepts
Score-cutoff  Hard Cutoff  Single Weighted  Double Weighted  Augmentation
600           70%          91%              90%              91%
500           68%          88%              87%              88%
400           65%          86%              85%              85%
300           64%          79%              81%              80%
200           56%          75%              79%              75%
Table 3
Approval Rate
Table 4 shows the approval rate for all four methods of RI at various score cutoffs.

Score-cutoff  Hard Cutoff  Single Weighted  Double Weighted  Augmentation
600           36%          76%              84%              45%
500           33%          65%              72%              42%
400           31%          60%              66%              40%
300           30%          53%              60%              37%
200           28%          48%              56%              34%
Table 4

At any given score cutoff, the double weighted approach gives the highest approval rate and the hard cutoff technique the lowest. This implies that, keeping the bad rate constant, the double weighted approach allows the bank to approve more applications than the other methods.

In summary, the hard cutoff approach performs well in discriminating goods from bads, but does not give better accuracy or a higher approval rate than the other methods. The augmentation technique has the highest separation power after the hard cutoff approach and also gives high accuracy, but its approval rate is very low. The double weighted approach performs better than the other methods in terms of accuracy and approval rate, but its discriminatory power is lower than that of the hard cutoff and augmentation techniques.
Conclusion
As discussed earlier, the two key objectives of RI are to correct for sample bias and to identify rejects with good performance which should otherwise have been approved. All four methods of RI correct for sample bias by including the performance of rejects in the final modeling population. The double weighted and single weighted approaches are able to increase the approval rate. If the business objective is to increase the approval rate without compromising credit quality, then the double weighted approach should be used: it significantly increases the approval rate and accurately predicts both rejects and accepts. If the approval rate is already very high and the objective is to identify bads among the through-the-door population, then the hard cutoff method should be used.

References
• Derek Montrichard, Reject Inference Methodologies in Credit Risk Modeling
• Naeem Siddiqi, Credit Risk Scorecards, Chapter 6
• John Banasik and Jonathan Crook, Reject Inference, Augmentation and Sample Selection, 25th Jan 2007
• Billie Anderson, Susan Haller, and Naeem Siddiqi, Reject Inference Techniques Implemented in Credit Scoring for SAS® Enterprise Miner™, SAS Institute, Cary, NC

Acknowledgement
I would like to thank Alok Rustagi and Varun Aggrawal for their valuable inputs, which greatly helped in making this paper more informative.
GLOBAL HEADQUARTERS
280 Park Avenue, 38th Floor, New York, NY 10017
T: +1.212.277.7100 • F: +1.212.277.7111
United States • United Kingdom • Czech Republic • Romania • Bulgaria • India • Philippines • Colombia • South Africa
Email us: [email protected] On the web: EXLservice.com
EXL (NASDAQ: EXLS) is a leading operations management and analytics company that helps
businesses enhance growth and profitability in the face of relentless competition and continuous
disruption. Using our proprietary, award-winning Business EXLerator Framework®, which
integrates analytics, automation, benchmarking, BPO, consulting, industry best practices and
technology platforms, we look deeper to help companies improve global operations, enhance
data-driven insights, increase customer satisfaction, and manage risk and compliance. EXL
serves the insurance, healthcare, banking and financial services, utilities, travel, transportation
and logistics industries. Headquartered in New York, EXL has more than 24,000 professionals in
locations throughout the United States, Europe, Asia, Latin America, Australia and South Africa.
© 2016 ExlService Holdings, Inc. All Rights Reserved.
For more information, see www.exlservice.com/legal-disclaimer