Credit Scoring and Data Mining



    Mang 6054 Credit Scoring and Data Mining
    Dr. Bart Baesens

    School of Management

    University of Southampton

    [email protected]


    Lecturer: Bart Baesens
    Studied at the Catholic University of Leuven (Belgium): Business Engineer in Management Informatics, 1998; Ph.D. in Applied Economic Sciences, 2003
    Ph.D. title: Developing Intelligent Systems for Credit Scoring Using Machine Learning Techniques
    Associate professor at the K.U.Leuven, Belgium, and Vlerick Leuven Ghent Management School
    Lecturer at the School of Management at the University of Southampton, United Kingdom
    Research: data mining, Web intelligence, neural networks, SVMs, rule extraction, survival analysis, CRM, credit scoring, Basel II
    Twitter: www.twitter.com/DataMiningApps
    Facebook: Data Mining with Bart
    www.dataminingapps.com


    Software Support

    SAS

    SAS Enterprise Miner

    SAS Credit Scoring Nodes

    interactive grouping, scorecard, reject inference, credit exchange

    Weka

    SPSS

    Microsoft Excel


    Relevant Background: Credit Scoring Books

    Credit Risk Management: Basic Concepts, Van Gestel and Baesens, Oxford University Press, 2008.
    Credit Scoring and Its Applications, Thomas, Edelman, and Crook, SIAM Monographs on Mathematical Modeling and Computation, 2002.
    The Credit Scoring Toolkit, Anderson, Oxford University Press, 2007.
    Consumer Credit Models: Pricing, Profit and Portfolios, Thomas, Oxford University Press, 2009.
    The Basel Handbook, Ong, 2004.
    Basel II Implementation: A Guide to Developing and Validating a Compliant Internal Risk Rating System, Ozdemir and Miu, McGraw-Hill, 2008.
    The Basel II Risk Parameters: Estimation, Validation and Stress Testing, Engelmann and Rauhmeier, Springer, 2006.
    Recovery Risk: The Next Challenge in Credit Risk Management, Edward Altman, Andrea Resti, Andrea Sironi (editors), 2005.
    Developing Intelligent Systems for Credit Scoring, Bart Baesens, Ph.D. thesis, K.U.Leuven, 2003.


    Relevant Background: Data mining books

    Hastie T., Tibshirani R., Friedman J., Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer-Verlag, 2001.
    Hand D.J., Mannila H., Smyth P., Principles of Data Mining, the MIT Press, 2001.
    Han J., Kamber M., Data Mining: Concepts and Techniques, 2nd edition, Morgan Kaufmann, 2006.
    Bishop C.M., Neural Networks for Pattern Recognition, Oxford University Press, 1995.
    Cristianini N., Shawe-Taylor J., An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, 2000.
    Cox D.R., Oakes D., Analysis of Survival Data, Chapman and Hall, 1984.
    Allison P.D., Survival Analysis Using the SAS System, SAS Institute Inc., 1995.


    Relevant Background: Credit scoring journals

    Journal of Credit Risk

    Risk Magazine

    Journal of Banking and Finance

    Journal of the Operational Research Society

    European Journal of Operational Research

    Management Science

    IMA Journal of Management Mathematics


    Relevant Background: Data mining journals

    Data Mining and Knowledge Discovery

    Machine Learning

    Expert Systems with Applications

    Intelligent Data Analysis

    Applied Soft Computing

    European Journal of Operational Research

    IEEE Transactions on Neural Networks

    Journal of Machine Learning Research

    Neural Computation


    Relevant Background: Credit scoring Web sites

    www.defaultrisk.com

    http://www.bis.org/

    Websites of regulators:

    www.fsa.gov.uk (United Kingdom)

    www.hkma.gov.hk (Hong Kong)

    www.apra.gov.au (Australia)

    www.mas.gov.sg (Singapore)

    In the U.S., 4 agencies: OCC, Federal Reserve, FDIC, OTS

    Proposed Supervisory Guidance for Internal Ratings Based Systems for Credit Risk, Advanced Measurement Approaches for Operational Risk, and the Supervisory Review Process (Pillar 2), Federal Register, Vol. 72, No. 39, February 2007.

    Risk-based capital standards: Advanced Capital Adequacy Framework-Basel II; Final rule, Federal Register, December 2007.


    Relevant Background: Data mining web sites

    www.kdnuggets.com (data mining)

    www.kernel-machines.org (SVMs)

    www.sas.com

    See support.sas.com for interesting macros

    www.twitter.com/DataMiningApps


    Course Overview

    Credit Scoring

    Basel I and Basel II

    Data preprocessing for PD modeling

    Classification techniques for PD modeling

    Measuring performance of PD models

    Input selection

    Reject inference

    Behavioural scoring

    Other applications


    Section 1

    An Introduction to Credit Scoring


    Credit Scoring

    Estimate whether an applicant will successfully repay his/her loan based on various information:
    Applicant characteristics (e.g. age, income, ...)
    Credit bureau information
    Application information of other applicants
    Repayment behavior of other applicants
    Develop models (also called scorecards) estimating the probability of default of a customer
    Typically, assign points to each piece of information, add all the points, and compare the total with a threshold (cut-off)


    Judgmental versus Statistical Approach

    Judgmental

    Based on experience

    Five Cs: character, capital, collateral, capacity, condition

    Statistical

    Based on multivariate correlations between inputs and risk of default

    Both assume that the future will resemble the past.


    Types of Credit Scoring

    Application scoring

    Behavioral scoring

    Profit scoring

    Bankruptcy prediction


    Application Scoring

    Estimate probability of default at the time the applicant applies for the loan!

    Use predetermined definition of default

    For example, three months of payment arrears
    Has now been fixed in Basel II

    Use application variables (age, income, marital status, years at address, years with employer, known client, ...)

    Use bureau variables

    Bureau score, raw bureau data (for example, number of credit checks, total amount of credits, delinquency history), ...

    In the US

    FICO scores: between 300 and 850

    Experian, Equifax, TransUnion

    Others

    Baycorp Advantage (Australia and New Zealand)

    Schufa (Germany)

    BKR (the Netherlands)

    CKP (Belgium)

    Dun & Bradstreet (MidCorp, SME)


    Example Application Scorecard


    So, a new customer applies for credit:

    AGE 32          120 points
    GENDER Female   180 points
    SALARY $1,150   160 points
    Total           460 points

    Let cut-off = 500: REFUSE CREDIT (460 < 500)
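    A minimal SAS sketch of this scoring logic; the data set and variable names are assumptions, and the point values and the 500 cut-off are the illustrative numbers from the slide:

    data applicant;
       input name $ age_pts gender_pts salary_pts;
       datalines;
    NewCust 120 180 160
    ;

    data decision;
       set applicant;
       total = age_pts + gender_pts + salary_pts;   /* add all the points   */
       if total >= 500 then decision = 'ACCEPT';    /* compare with cut-off */
       else decision = 'REFUSE';                    /* here: 460 < 500      */
    run;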

  • 8/4/2019 Credit Scoring and Data Mining

    17/170

    Example Variables Used in Application Scorecard Development

    Age

    Time at residence

    Time at employment

    Time in industry

    First digit of postal code

    Geographical: Urban/Rural/Regional/Provincial

    Residential Status

    Employment status

    Lifestyle code

    Existing client (Y/N)

    Years as client

    Number of products internally

    Internal bankruptcy/write-off

    Number of delinquencies

    Beacon/Empirical

    Time at bureau

    Total Inquiries

    Time since last Inquiry

    Inquiries in the last 3/6/12 mths

    Inquiries in the last 3/6/12 mths as % of total

    Income

    Total Liabilities

    Total Debt

    Total Debt service $

    Total Debt Service Ratio

    Gross Debt Service Ratio

    Revolving Debt/Total Debt

    Other Bank card

    Number of other bank cards

    Total Trades

    Trades opened in last six months

    Trades opened in last six months / Total Trades

    Total Inactive Trades

    Total Revolving trades

    Total term trades

    % revolving trades

    % term trades

    Number of trades current

    Number of trades R1, R2..R9

    % trades R1-R2

    % Trades R3 or worse

    % trades O1..O9

    % trades O1-O2

    % Trades O3 or worse

    Total credit lines - revolving

    Total balance - revolving

    Total Utilization

    Total balance - all


    Application Scoring: Summary

    New customers
    Two snapshots of the state of the customer
    Second snapshot typically taken around 18 months after loan origination
    Static procedure
    Scorecards are out-of-date, even before they are implemented (Hand 2003).
    Application scoring aims at ranking customers: no calibrated probabilities of default needed!


    Behavioral Scoring

    Study risk behavior of existing customers
    Update risk assessment taking into account recent behavior
    Examples of behavior:
    Average/max/min/trend in checking account balance, bureau score, ...
    Delinquency history (payment arrears, ...)
    Job changes, home address changes, ...
    Dynamic
    Video clip to snapshot (snapshot typically taken after 12 months)

    [Timeline: the performance period ("video clip") runs from the observation point to the outcome point/snapshot]


    Behavioral Scoring

    Behavioral scoring can be used for:
    Debt provisioning and profit scoring
    Authorizing accounts to go in excess
    Setting credit limits
    Renewals/reviews
    Collection strategies
    Characteristics:
    Many, many variables (input selection needed, cf. infra)
    Purpose is again scoring = ranking customers with respect to their default likelihood


    Profit Scoring

    Calculate profit over the customer's lifetime
    Need loss estimates for bad accounts
    Decide on time horizon
    Survival analysis
    Dynamic models: incorporate changes in the economic climate
    Score the profitability of a customer for a particular product (product profit score)
    Score the profitability of a customer over all products (customer profit score)
    Video clip to video clip
    Not being adopted by many financial institutions (yet!)


    Corporate Default Risk Modeling

    Prediction approach
    Predict bankrupt versus non-bankrupt given ratios (solvency, liquidity, ...) describing the financial status of companies
    Assumes the availability of data (and defaults!)
    For example, Altman's z-model (1968, linear discriminant analysis)
    For public industrial companies: Z = 1.2X1 + 1.4X2 + 3.3X3 + 0.6X4 + 1.0X5 (healthy if Z > 2.99, unhealthy if Z < 1.81)
    For private industrial companies: Z = 6.56X1 + 3.26X2 + 6.72X3 + 1.05X4 (healthy if Z > 2.60, unhealthy if Z < 1.1)
    X1: Working Capital/Total Assets, X2: Retained Earnings/Total Assets, X3: EBIT/Total Assets, X4: Market (Book) Value of Equity/Total Liabilities, X5: Net Sales/Total Assets
    Expert based approach
    Expert or expert committee decides upon the criteria
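    A small SAS sketch of the public-firm z-model above; the five input ratios are made-up illustrative values, and firms between the two thresholds are labeled as falling in the grey zone:

    data zscore;
       length status $ 12;
       input x1 x2 x3 x4 x5;   /* the five ratios defined above */
       z = 1.2*x1 + 1.4*x2 + 3.3*x3 + 0.6*x4 + 1.0*x5;
       if z > 2.99 then status = 'healthy';
       else if z < 1.81 then status = 'unhealthy';
       else status = 'grey zone';
       datalines;
    0.25 0.30 0.10 1.50 1.20
    ;

    proc print data=zscore; run;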


    Example expert based scorecard (Ozdemir and Miu, 2009)

    Business Risk                          Score
    Industry Position                      6
    Market Share Trends and Prospects      2
    Geographical Diversity of Operations   2
    Diversity of Products and Services     6
    Customer Mix                           1
    Management Quality and Depth           4
    Executive Board oversight              2
    ...                                    ...


    Corporate default risk modeling

    Agency rating approach
    In the absence of default data (for example, sovereign exposures, low default portfolios)
    Purchase ratings (for example, AAA, AA, A, BBB) to reflect creditworthiness
    Each rating corresponds to a default probability (PD)

    Shadow rating approach
    Purchase ratings from a ratings agency
    Estimate/mimic the ratings using statistical models and data


    The Altman Model: Enron Case

    [Figure] Source: Saunders and Allen (2002)


    Rating Agencies

    Moody's Investors Service, Standard & Poor's, Fitch IBCA
    Assign ratings to debt and fixed income securities at the borrower's request to reflect the financial strength (creditworthiness) of the economic entity
    Ratings for:
    Companies (private and public)
    Countries and governments (sovereign ratings)
    Local authorities
    Banks
    Models are based on quantitative and qualitative (judgmental) analysis (black box)


    Rating Agencies

    [Figure] Source: Van Gestel, Baesens, 2008


    Section 2

    Basel I, Basel II and Basel III


    Regulatory versus Economic Capital

    Role of capital
    Banks should have sufficient capital to protect themselves and their depositors from the risks (credit/market/operational) they are taking
    Regulatory capital
    Amount of capital a bank should have according to a regulation (e.g. Basel I or Basel II)
    Economic capital
    Amount of capital a bank has based on internal modeling strategy and policy
    Types of capital:
    Tier 1: common stock, preferred stock, retained earnings
    Tier 2: revaluation reserves, undisclosed reserves, general provisions, subordinated debt (can vary from country to country)

    Tier 3: specific to market risk


    The Basel I and II Capital Accords

    Basel Committee
    Central banks/bank regulators of the major (G10) industrial countries
    Meet every 3 months at the Bank for International Settlements (BIS) in Basel
    Basel I Capital Accord, 1988
    Aim is to set up minimum regulatory capital requirements in order to ensure that banks are able, at all times, to give back depositors' funds
    Capital ratio = available capital/risk-weighted assets
    Capital ratio should be > 8%
    Both Tier 1 and Tier 2 capital
    Need for a new Accord
    Regulatory arbitrage
    Not sufficient recognition of collateral guarantees
    Solvency of debtor not taken into account

    Market risk later introduced in 1996

    No operational risk


    Basel I: Example

    Minimum capital = 8% of risk-weighted assets
    Fixed risk rates. Retail:
    Cash: 0%
    Mortgages: 50%
    Other commercial: 100%
    Example:
    A $100 mortgage at a 50% risk weight gives $50 risk-weighted assets, so 8% x $50 = $4 capital needed
    Risk weights are independent of obligor and facility characteristics!


    Three Pillars of Basel II

    Pillar 1: Minimum Capital Requirement
    Credit Risk: Standard Approach, or Internal Ratings Based Approach (Foundation or Advanced)
    Operational Risk
    Market Risk

    Pillar 2: Supervisory Review Process
    Sound internal processes to evaluate risk (ICAAP)
    Supervisory monitoring

    Pillar 3: Market Discipline and Public Disclosure
    Semi-annual disclosure of: the bank's risk profile, qualitative and quantitative information, risk management processes, risk management strategy
    Objective is to inform potential investors


    Pillar 1: Minimum Capital Requirements

    Credit Risk: key components
    Probability of default (PD) (decimal)
    Loss given default (LGD) (decimal)
    Exposure at default (EAD) (currency)
    Maturity (M)
    Expected Loss (EL) = PD x LGD x EAD
    For example, PD = 1%, LGD = 20%, EAD = 1000 Euros gives EL = 2 Euros
    Standardized Approach
    Internal Ratings Based Approach
    In the U.S., initially, in the notice of proposed rule making, a difference was made between ELGD (expected LGD) and LGD (based on economic downturn). This was later abandoned.


    The Standardized Approach

    Risk assessments (for example, AAA, AA, BBB, ...) are based on external credit assessment institutions (ECAIs) (for example, Standard & Poor's, Moody's, ...)
    Eligibility criteria for ECAIs given in the Accord (objectivity, independence, ...)
    Risk weights given in the Accord to compute Risk Weighted Assets (RWA)
    Risk weights for sovereigns, banks, corporates, ...
    Minimum Capital = 0.08 x RWA


    The Standardized Approach

    Example:
    Corporate / 1 mio dollars / maturity 5 years / unsecured / external rating AA
    RW = 20%, RWA = 0.2 mio, Reg. Cap. = 0.016 mio
    Credit risk mitigation (collateral)
    Guidelines provided in the Accord (simple versus comprehensive approach)
    For retail:
    Risk weight = 75% for non-mortgage; 35% for mortgage

    But:

    Inconsistencies

    Sufficient coverage?

    Need for individual risk profile!

    No LGD, EAD


    Internal Ratings Based (IRB) Approach

    Internal Ratings Based Approach: Foundation versus Advanced

                          PD                  LGD                    EAD
    Foundation approach   Internal estimate   Regulator's estimate   Regulator's estimate
    Advanced approach     Internal estimate   Internal estimate      Internal estimate


    IRB Approach

    Split exposure into 5 categories:
    corporate (5 subclasses)
    sovereign
    bank
    retail (residential mortgage, revolving, other retail exposures)
    equity
    Foundation IRB approach not allowed for retail
    Risk weight functions to derive capital requirements
    Phased rollout of the IRB approach across the banking group using an implementation plan
    Transition period of 3 years


    Basel II: Retail Specifics

    Retail exposures can be pooled
    Estimate PD/LGD/EAD per rating/pool/segment! Note: pool = segment = grade = rating
    Default definition
    Obligor is past due more than 90 days (can be relaxed during the transition period)
    Default can be applied at the level of the credit facility
    Varies per country (e.g. with materiality thresholds)!
    PD is the greater of the one-year estimated PD or 0.03%
    Length of the underlying historical observation period to estimate loss characteristics must be at least 5 years (can be relaxed during the transition period, minimum 2 years at the start)

    No maturity (M) adjustment for retail!


    Developing a Rating Based System

    Application and behavioral scoring models provide a ranking of customers according to risk
    This was OK in the past (for example, for loan approval), but Basel II requires well-calibrated default probabilities
    Map the scores or probabilities to a number of distinct borrower ratings/pools
    Decide on the number of ratings and their definition
    Typically around 15 ratings
    Can be defined by mapping onto the Moody's or S&P scale, or expert based!
    Impact on regulatory capital!

    [Figure: a score axis with cut-offs at 520, 570, 580 and 600 dividing the scores into rating grades A to E]


    Developing a Rating System

    For corporate, sovereign, and bank exposures:
    "To meet this objective, a bank must have a minimum of seven borrower grades for non-defaulted borrowers and one for those that have defaulted", paragraph 404 of the Basel II Capital Accord.

    For retail:
    "For each pool identified, the bank must be able to provide quantitative measures of loss characteristics (PD, LGD, and EAD) for that pool. The level of differentiation for IRB purposes must ensure that the number of exposures in a given pool is sufficient so as to allow for meaningful quantification and validation of the loss characteristics at the pool level. There must be a meaningful distribution of borrowers and exposures across pools. A single pool must not include an undue concentration of the bank's total retail exposure", paragraph 409 of the Basel II Capital Accord.


    A Basel II Project Plan

    New customers: application scoring
    1-year PD (needed for Basel II)
    Loan term/18-month PD (needed for loan approval)
    Customers having credit: behavioral scoring
    1-year PD
    Other existing customers applying for credit: mix of application/behavioural scoring
    1-year PD
    Loan term/18-month PD


    Estimating Well-calibrated PDs

    Calculate empirical 1-year realized default rates for borrowers in a specific grade/pool
    Use 5 years of data to estimate the future PD (retail)
    How to calibrate the PD?
    Average
    Maximum
    Upper percentile (stressed/conservative PD)
    Dynamic model (time series, e.g. ARIMA)

    1-year PD for rating grade X = (Number of obligors with rating X at the beginning of the given time period that defaulted during the time period) / (Number of obligors with rating X at the beginning of the given time period)
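    A sketch of this ratio in SAS; COHORT is an assumed data set with one row per obligor, its rating grade at the beginning of the period, and a 0/1 flag DEFAULTED for the period:

    proc sql;
       create table grade_pd as
       select rating,
              count(*)                                      as n_obligors,
              sum(defaulted)                                as n_defaults,
              calculated n_defaults / calculated n_obligors as pd_1yr
       from cohort
       group by rating;
    quit;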


    Estimating Well-calibrated PDs

    [Figure: empirical 12-month PD for rating grade B per month over roughly 60 months, fluctuating between 0% and 5%]


    Basel II Model Architecture

    Level 0: Data (Internal Data, External Data, Expert Judgment)
    Level 1: Model (Scorecard, Logistic Regression)
    Level 2: Risk ratings definition, PD calibration

    Example scorecard (Level 1):

    Characteristic Name   Attribute    Scorecard Points
    AGE 1                 Up to 26     100
    AGE 2                 26 - 35      120
    AGE 3                 35 - 37      185
    AGE 4                 37+          225
    GENDER 1              Male         90
    GENDER 2              Female       180
    SALARY 1              Up to 500    120
    SALARY 2              501-1000     140
    SALARY 3              1001-1500    160
    SALARY 4              1501-2000    200
    SALARY 5              2001+        240

    [Figure (Level 2): empirical 12-month PD for rating grade B per month over roughly 60 months, fluctuating between 0% and 5%]


    Risk Weight Functions for Retail

    K = LGD x N( (N^-1(PD) + sqrt(rho) x N^-1(0.999)) / sqrt(1 - rho) ) - PD x LGD

    N(.) is the cumulative standard normal distribution, N^-1(.) is the inverse cumulative standard normal distribution, and rho is the asset correlation
    Regulatory capital = K x EAD

    Residential mortgage exposures: rho = 0.15
    Qualifying revolving exposures: rho = 0.04
    Other retail exposures:
    rho = 0.03 x (1 - e^(-35 x PD)) / (1 - e^(-35)) + 0.16 x (1 - (1 - e^(-35 x PD)) / (1 - e^(-35)))
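    A SAS sketch of the retail formula, using PROBIT for N^-1(.) and PROBNORM for N(.); the PD/LGD/EAD inputs are illustrative and match the credit card example used later in this deck:

    data retail_k;
       pd = 0.03; lgd = 0.5; ead = 10000;

       /* asset correlation: qualifying revolving exposures */
       rho = 0.04;

       /* K = LGD x N((N^-1(PD) + sqrt(rho) x N^-1(0.999))/sqrt(1-rho)) - PD x LGD */
       k = lgd*probnorm((probit(pd) + sqrt(rho)*probit(0.999))/sqrt(1 - rho))
           - pd*lgd;

       regcap = k*ead;   /* regulatory capital, here roughly $344 */
    run;

    proc print data=retail_k; var rho k regcap; run;

    For other retail exposures, rho would be replaced by the PD-dependent correlation formula above.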


    Risk Weight Functions for Corporates/Sovereigns/Banks

    K = ( LGD x N( (N^-1(PD) + sqrt(rho) x N^-1(0.999)) / sqrt(1 - rho) ) - PD x LGD ) x (1 + (M - 2.5) x b) / (1 - 1.5 x b)

    with b = (0.11852 - 0.05478 x ln(PD))^2

    M represents the nominal or effective maturity (between 1 and 5 years)

    rho = 0.12 x (1 - e^(-50 x PD)) / (1 - e^(-50)) + 0.24 x (1 - (1 - e^(-50 x PD)) / (1 - e^(-50)))

    Firm size adjustment for SMEs:

    rho = 0.12 x (1 - e^(-50 x PD)) / (1 - e^(-50)) + 0.24 x (1 - (1 - e^(-50 x PD)) / (1 - e^(-50))) - 0.04 x (1 - (S - 5)/45)

    S represents total annual sales in millions of euros (between 5 and 50 million)
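    The same kind of sketch for the corporate function, maturity adjustment included; the PD, LGD and M inputs are illustrative:

    data corp_k;
       pd = 0.01; lgd = 0.45; m = 2.5;   /* illustrative inputs */

       w   = (1 - exp(-50*pd)) / (1 - exp(-50));
       rho = 0.12*w + 0.24*(1 - w);            /* corporate asset correlation */
       b   = (0.11852 - 0.05478*log(pd))**2;   /* maturity adjustment factor  */

       k = (lgd*probnorm((probit(pd) + sqrt(rho)*probit(0.999))/sqrt(1 - rho))
            - pd*lgd) * (1 + (m - 2.5)*b) / (1 - 1.5*b);
    run;

    proc print data=corp_k; var rho b k; run;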


    Risk Weight Functions

    Regulatory Capital = K(PD, LGD) x EAD
    Only covers unexpected loss (UL)!
    Expected loss = PD x LGD (per unit of EAD)
    Expected loss covered by provisions!
    RWA not explicitly calculated; it can be backed out as follows: since Regulatory Capital = 8% x RWA, RWA = 12.50 x Regulatory Capital = 12.50 x K x EAD
    Note: scaling factor of 1.06 for credit-risk-weighted assets (based on QIS 3, 4 and 5)
    The asset correlations are chosen depending on the business segment (corporate, sovereign, bank, residential mortgages, revolving, other retail) using some empirical but not published procedure!
    Based on reverse engineering of economic capital models from large banks: the asset value correlations "reflect a combination of supervisory judgment and empirical evidence" (Federal Register)
    Note: asset correlations also measure how the asset class depends on the state of the economy


    LGD/EAD Errors More Expensive Than PD Errors

    Consider a credit card portfolio where PD = 0.03; LGD = 0.5; EAD = $10,000
    The Basel formulae give a capital requirement of K(PD, LGD) x EAD:
    K(0.03, 0.50) x 10,000 = $343.7
    A 10% over-estimate on PD means the capital required is K(0.033, 0.50) x 10,000 = $367.3
    A 10% over-estimate on LGD means the capital required is K(0.03, 0.55) x 10,000 = $378.0
    A 10% over-estimate on EAD means the capital required is K(0.03, 0.50) x 11,000 = $378.0


    The Basel II Risk Weight Functions

    [Figure: the Basel II risk weight functions plotted against PD]


    Basel III

    Objective: fundamentally strengthen global capital standards
    Key attention points:
    Focus on tangible equity capital since this is the component with the greatest loss-absorbing capacity
    Reduced reliance on banks' internal models
    Reduced reliance on external ratings
    Greater focus on stress testing
    Loss absorbing capacity beyond common standards for systemically important banks
    Tier 3 capital eliminated
    Address model risk by e.g. introducing a non-risk-based leverage ratio as a backstop
    Liquidity Coverage Ratio and the Net Stable Funding Ratio
    No major impact on the underlying credit risk models!
    The new standards will take effect on 1 January 2013 and for the most part will become fully effective by January 2019


    Basel II versus Basel III

    Tier 1 capital ratio: Basel II 4% of RWA; Basel III 6% of RWA (by 2015)
    Core Tier 1 capital ratio (common equity = shareholder capital + retained earnings): Basel II 2% of RWA; Basel III 4.5% of RWA (by 2015)
    Capital conservation buffer (common equity): Basel II none; Basel III 2.5% of RWA (by 2019)
    Countercyclical buffer: Basel II none; Basel III 0% - 2.5% of RWA (by 2019)
    Non-risk-based leverage ratio (Tier 1 capital): Basel II none; Basel III 3% of assets (by 2018). Note: assets include off-balance-sheet exposures and derivatives.
    Total capital ratio: Basel II = [Tier 1 Capital Ratio] + [Tier 2 Capital Ratio] + [Tier 3 Capital Ratio]; Basel III = [Tier 1 Capital Ratio] + [Capital Conservation Buffer] + [Countercyclical Capital Buffer] + [Capital for Systemically Important Banks]


    Section 3

    Preprocessing Data for Credit Scoring and PD Modeling


    Motivation

    Dirty, noisy data
    For example, Age = -2003
    Inconsistent data
    Value 0 means an actual zero or a missing value
    Incomplete data
    Income = ?
    Data integration and data merging problems
    Amounts in euros versus amounts in dollars
    Duplicate data
    Salary versus Professional Income
    Success of the preprocessing steps is crucial to the success of the following steps
    Garbage in, garbage out (GIGO)
    Very time consuming (80% rule)


    Preprocessing Data for Credit Scoring

    Types of variables

    Sampling

    Missing values

    Outlier detection and treatment

    Coarse classification of attributes

    Weights of Evidence and Information Value

    Segmentation

    Definition of target variable


    Types of Variables

    Continuous
    Defined on a continuous interval
    For example, income, amount on savings account, ...
    In SAS: Interval
    Discrete
    Nominal: no ordering between values; for example, purpose of loan, marital status
    Ordinal: implicit ordering between values; for example, credit rating (AAA is better than AA, AA is better than A, ...)
    Binary: for example, gender


    Sampling

    Take a sample of past applicants to build the scoring model
    Think carefully about the population on which the model that is going to be built using the sample will be used
    Timing of sample
    How far back do I go to get my sample?
    Trade-off: much data versus recent data
    Number of bads versus number of goods
    Undersampling or oversampling may be needed (dependent on the classification algorithm, cf. infra)
    The sample taken must be from a normal business period, to get as accurate a picture as possible of the target population
    Make sure the performance window is long enough to stabilize the bad rate (e.g. 18 months!)
    Example sampling problems
    Application scoring: reject inference
    Behavioural scoring: seasonality depending upon the choice of the observation point


    Missing Values

    Keep or Delete or Replace
    Keep
    The fact that a variable is missing can be important information!
    Encode the variable in a special way (e.g. a separate category during coarse classification)
    Delete
    When there are too many missing values, removing the variable or the observation might be an option
    Horizontally versus vertically missing values
    Replace
    Estimate the missing value using imputation procedures


    Imputation Procedures for Missing Values

    For continuous attributes

    Replace with median/mean (median more robust to outliers)
    Replace with median/mean of all instances of the same class

    For ordinal/nominal attributes

    Replace with modal value (= most frequent category)

    Replace with modal value of all instances of the same class

    Advanced schemes

    Regression or tree-based imputation

    Predict missing value using other variables

    Nearest neighbour imputation

    Look at nearest neighbours (e.g. in the Euclidean sense) for imputation

    Markov Chain Monte Carlo methods

    Not recommended!


    Dealing with Missing Values in SAS

    data credit;
    input income savings;
    datalines;
    1300 400
    1000 .
    2000 800
    . 200
    1700 600
    2500 1000
    2200 900
    . 1000
    1500 .
    ;

    /* replace substitutes missing values with the variable mean */
    proc standard data=credit replace out=creditnomissing;
    run;

    Impute node in Enterprise Miner!


    Outliers

    E.g. due to recording or data entry errors, or noise
    Types of outliers:
    Valid observation: e.g. salary of the boss, ratio variables
    Invalid observation: age = -2003
    Outliers can be hidden in one-dimensional views of the data (multidimensional nature of data)
    Univariate outliers versus multivariate outliers

    Detection versus treatment


    Univariate Outlier Detection Methods

    Visually
    Histogram based
    Box plots
    z-score
    Measures how many standard deviations an observation is away from the mean for a specific variable:

    z_i = (x_i - m) / s

    where m is the mean of variable x and s its standard deviation
    Outliers are observations having |z_i| > 3 (or 2.5)
    Note: the mean of the z-scores is 0, their standard deviation is 1
    Calculate the z-score in SAS using PROC STANDARD


    Univariate Outlier Detection Methods


    Box Plot

    A box plot is a visual representation of five numbers:
    Median M: P(X <= M) = 0.50
    First Quartile Q1: P(X <= Q1) = 0.25
    Third Quartile Q3: P(X <= Q3) = 0.75
    Minimum
    Maximum
    IQR = Q3 - Q1

    [Figure: box plot with the box from Q1 to Q3, the median M inside, whiskers extending 1.5 x IQR beyond the box, and points beyond the whiskers marked as outliers]


    Multivariate Outlier Detection Methods

    Multivariate outliers!


    Treatment of Outliers

    For invalid outliers:
    E.g. age = 300 years
    Treat as a missing value (keep, delete, replace)
    For valid outliers:
    Truncation based on z-scores:
    Replace all variable values having z-scores > 3 by the mean + 3 times the standard deviation
    Replace all variable values having z-scores < -3 by the mean - 3 times the standard deviation
    Truncation based on the IQR (more robust than z-scores)
    Truncate to M +/- 3s, with M = median and s = IQR/(2 x 0.6745)
    See Van Gestel, Baesens et al., 2007
    Truncation using a sigmoid
    Use a sigmoid transform, f(x) = 1/(1 + e^(-x))
    Also referred to as winsorizing
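    A data step sketch of the z-score truncation just described; the mean and standard deviation are hard-coded illustrative values, where in practice they would come from PROC MEANS:

    data truncated;
       set credit;
       m = 1700; s = 550;                       /* illustrative mean and std */
       z = (income - m)/s;
       if z > 3 then income = m + 3*s;          /* truncate the right tail   */
       else if z < -3 then income = m - 3*s;    /* truncate the left tail    */
    run;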


    Computing the z-scores in SAS

    data credit;
    input age income;
    datalines;
    34 1300
    24 1000
    20 2000
    40 2100
    54 1700
    39 2500
    23 2200
    34 700
    56 1500
    ;

    proc standard data=credit mean=0 std=1 out=creditnorm;
    run;


    Discretization and Grouping of Attribute Values

    Motivation
    Group values of categorical variables for robust analysis
    Introduce non-linear effects for continuous variables
    Classification technique requires discrete inputs (for example, Bayesian network classifiers)
    Create concept hierarchies (group low level concepts, for example raw age data, into higher level concepts, for example young, middle aged, old)
    Also called coarse classification, classing, categorization, binning, grouping, ...
    Methods
    Equal interval binning
    Equal frequency binning (histogram equalization)
    Chi-squared analysis
    Entropy based discretization (using e.g. decision trees)


    Coarse Classifying Continuous Variables

    Lyn Thomas, A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers, International Journal of Forecasting, 16, 149-172, 2000


    The Binning Method

    For example, consider the attribute income: 1000, 1200, 1300, 2000, 1800, 1400
    Equal interval binning
    Bin width = 500
    Bin 1, [1000, 1500[: 1000, 1200, 1300, 1400
    Bin 2, [1500, 2000]: 1800, 2000
    Equal frequency binning (histogram equalization)
    2 bins
    Bin 1: 1000, 1200, 1300
    Bin 2: 1400, 1800, 2000
    However, both these methods do not take the default risk into account!
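    Equal frequency binning can be sketched in SAS with PROC RANK, whose GROUPS= option assigns observations to equally sized bins:

    data income;
       input income @@;
       datalines;
    1000 1200 1300 2000 1800 1400
    ;

    /* groups=2 gives 2 equal-frequency bins (0 and 1) in income_bin */
    proc rank data=income groups=2 out=binned;
       var income;
       ranks income_bin;
    run;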


    The Chi-squared Method

    Consider the following example (taken from the book Credit Scoring and Its Applications, by Thomas, Edelman and Crook, 2002)
    Suppose we want three categories
    Should we take option 1 (owners, renters, and others) or option 2 (owners, with parents, and others)?

    Attribute       Owner   Rent Unfurnished   Rent Furnished   With Parents   Other   No Answer   Total
    Goods           6000    1600               350              950            90      10          9000
    Bads            300     400                140              100            50      10          1000
    Good:bad odds   20:1    4:1                2.5:1            9.5:1          1.8:1   1:1         9:1


    The Chi-squared Method

    The number of good owners, given that the odds are the same as in the whole population, is (6000+300) x 9000/10000 = 5670.
    Likewise, the number of bad renters, given that the odds are the same as in the whole population, is (1600+400+350+140) x 1000/10000 = 249.
    Compare the observed frequencies with the theoretical frequencies assuming equal odds using a chi-squared test statistic.

    Option 1: chi-squared = (6000-5670)^2/5670 + (300-630)^2/630 + (1950-2241)^2/2241 + (540-249)^2/249 + (1050-1089)^2/1089 + (160-121)^2/121 = 583

    Option 2: chi-squared = (6000-5670)^2/5670 + (300-630)^2/630 + (2050-2385)^2/2385 + (600-265)^2/265 + (950-945)^2/945 + (100-105)^2/105 = 662

    The higher the test statistic, the better the split (formally, compare with a chi-squared distribution with k-1 degrees of freedom for k classes of the characteristic).
    Option 2 is the better split!


    Coarse Classification in SAS

    data residence;
    input default$ resstatus$ count;
    datalines;
    good owner 6000
    good rentunf 1600
    good rentfurn 350
    good withpar 950
    good other 90
    good noanswer 10
    bad owner 300
    bad rentunf 400
    bad rentfurn 140
    bad withpar 100
    bad other 50
    bad noanswer 10
    ;


    data coarse1;
    input default$ resstatus$ count;
    datalines;
    good owner 6000
    good renter 1950
    good other 1050
    bad owner 300
    bad renter 540
    bad other 160
    ;

    data coarse2;   /* option 2 grouping: owners, with parents, others */
    input default$ resstatus$ count;
    datalines;
    good owner 6000
    good withpar 950
    good other 2050
    bad owner 300
    bad withpar 100
    bad other 600
    ;

    proc freq data=coarse1;
    weight count;
    tables default*resstatus / chisq;
    run;

    proc freq data=coarse2;
    weight count;
    tables default*resstatus / chisq;
    run;


    Recoding Nominal/Ordinal Variables

    Nominal variables
    Dummy encoding
    Ordinal variables
    Thermometer encoding
    Weights of evidence coding


    Dummy and Thermometer Encoding

    Dummy encoding:

                            Input 1   Input 2   Input 3
    Purpose = car           1         0         0
    Purpose = real estate   0         1         0
    Purpose = other         0         0         1

    Thermometer encoding:

    Original Input                    Categorical Input   Input 1   Input 2   Input 3
    Income <= 800 Euro                1                   0         0         0
    Income > 800 and <= 1200 Euro     2                   0         0         1
    Income > 1200 and <= 10000 Euro   3                   0         1         1
    Income > 10000 Euro               4                   1         1         1

    Many inputs: explosion of the data set!


    Weights of Evidence

    Measures the risk represented by each (grouped) attribute category (for example, age 23-26)
    The higher the weight of evidence (in favor of being good), the lower the risk for that category

    WOE_category = ln(p_good_category / p_bad_category), where
    p_good_category = number of goods_category / number of goods_total
    p_bad_category = number of bads_category / number of bads_total

    If p_good_category > p_bad_category then WOE > 0
    If p_good_category < p_bad_category then WOE < 0


    Weights of Evidence

    Age       Count   Distr. Count   Goods   Distr. Goods   Bads   Distr. Bads   WOE
    Missing   50      2.50%          42      2.33%          8      4.12%         -57.28%
    18-22     200     10.00%         152     8.42%          48     24.74%        -107.83%
    23-26     300     15.00%         246     13.62%         54     27.84%        -71.47%
    27-29     450     22.50%         405     22.43%         45     23.20%        -3.38%
    30-35     500     25.00%         475     26.30%         25     12.89%        71.34%
    35-44     350     17.50%         339     18.77%         11     5.67%         119.71%
    44+       150     7.50%          147     8.14%          3      1.55%         166.08%
    Total     2000                   1806                   194

    WOE = ln(Distr. Good / Distr. Bad) x 100


    Information Value

    Information Value (IV) is a measure of predictive power used to:
    assess the appropriateness of the classing
    select predictive variables
    IV is similar to an entropy (cf. infra):

    IV = sum over categories of (p_good_category - p_bad_category) x WOE_category

    Rule of thumb:
    < 0.02: unpredictive
    0.02 - 0.1: weak
    0.1 - 0.3: medium
    0.3+: strong

    Age       Distr. Goods   Distr. Bads   WOE        IV
    Missing   2.33%          4.12%         -57.28%    0.0103
    18-22     8.42%          24.74%        -107.83%   0.1760
    23-26     13.62%         27.84%        -71.47%    0.1016
    27-29     22.43%         23.20%        -3.38%     0.0003
    30-35     26.30%         12.89%        71.34%     0.0957
    35-44     18.77%         5.67%         119.71%    0.1568
    44+       8.14%          1.55%         166.08%    0.1095
    Information Value: 0.6502
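    A PROC SQL sketch of these two computations; AGEGRP, with one row per category and columns GOODS and BADS, is an assumed input table, and the WOE is scaled by 100 as in the tables above:

    proc sql;
       create table woe_iv as
       select category,
              goods / (select sum(goods) from agegrp) as p_good,
              bads  / (select sum(bads)  from agegrp) as p_bad,
              log(calculated p_good / calculated p_bad) * 100 as woe,
              (calculated p_good - calculated p_bad)
                 * log(calculated p_good / calculated p_bad) as iv_part
       from agegrp;

       /* the information value is the sum of the category contributions */
       select sum(iv_part) as information_value from woe_iv;
    quit;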


    WOE: Logical Trend?

    [Figure: bar chart of the WOE ("predictive strength") per age group, rising from about -108 for 18-22 to +166 for 44+]

    Younger people tend to represent a higher risk than the older population


    Segmentation

    When more than one scorecard is required
    Build a scorecard for each segment separately
    Three reasons (Thomas, Jo, and Scherer, 2001):
    Strategic
    Banks might want to adopt special strategies for specific segments of customers (for example, a lower cut-off score depending on age or residential status).
    Operational
    New customers must have a separate scorecard because the characteristics in the standard scorecard do not make sense operationally for them.
    Variable Interactions
    If one variable interacts strongly with a number of others, it might be sensible to segment according to this variable.


    Segmentation

    Two ways
    Experience based
    In cooperation with a financial expert
    Statistically based
    Use clustering algorithms (k-means, decision trees, ...)
    Let the data speak
    Better predictive power than a single model if successful!


    Decision Trees for Clustering

    All Goods/Bads (bad rate = 3.8%)
      Existing Customer (bad rate = 1.2%)
        Customer > 2 yrs (bad rate = 0.3%)
        Customer < 2 yrs (bad rate = 2.8%)
      New Applicant (bad rate = 6.3%)
        Age < 30 (bad rate = 11.7%)
        Age > 30 (bad rate = 3.2%)

    Segmentation: the 4 leaf nodes give 4 segments


    Definitions of Bad

    30/60/90 days past due
    Charge off/write off
    Bankrupt
    Claim over $1000
    Profit based
    Negative NPV
    Less than x% of the amount owed collected
    Fraud over $500
    Basel II: 90 days, but can be changed by national regulators (for example, U.S., UK, Belgium, ...)


    Section 4

    Classification Techniques for Credit Scoring and PD Modeling

    The Classification Problem

    The classification problem can be stated as follows: given an observation (also called a pattern) with characteristics x = (x1, ..., xn), determine its class c from a predetermined set of classes {c1, ..., cm}
    The classes {c1, ..., cm} are known beforehand
    Supervised learning!
    Binary (2 classes) versus multiclass (> 2 classes) classification


    Regression for classification

    Linear regression gives: Y = b0 + b1 x Age + b2 x Income + b3 x Gender
    Can be estimated using OLS (in SAS use proc reg or proc glm)
    Two problems:
    No guarantee that Y is between 0 and 1 (i.e. a probability)
    Target/errors not normally distributed
    Use a bounding function to limit the outcome between 0 and 1:

    f(z) = 1 / (1 + e^(-z))

    Code the target as Y = 1 for goods and Y = 0 for bads:

    Customer   Age   Income   Gender   G/B   Y
    John       30    1200     M        B     0
    Sarah      25    800      F        G     1
    Sophie     52    2200     F        G     1
    David      48    2000     M        B     0
    Peter      34    1800     M        G     1

    [Figure: the sigmoid f(z) = 1/(1 + e^(-z)), rising from 0 to 1 as z goes from -7 to 7]


    Logistic Regression

    Linear regression with a transformation such that the output is always between 0 and 1 can thus be interpreted as a probability (e.g. the probability of being a good customer):

    P(customer = good | age, income, ..., gender) = 1 / (1 + e^-(b0 + b1 x age + b2 x income + b3 x gender))

    Or, alternatively:

    ln( P(customer = good | age, income, ..., gender) / P(customer = bad | age, income, ..., gender) ) = b0 + b1 x age + b2 x income + b3 x gender

    The parameters can be estimated in SAS using proc logistic
    Once the model has been estimated using historical data, we can use it to score or assign probabilities to new data

    Logistic Regression in SAS

    Historical data:

    Customer   Age   Income   Gender   Response
    John       30    1200     M        No
    Sarah      25    800      F        Yes
    Sophie     52    2200     F        Yes
    David      48    2000     M        No
    Peter      34    1800     M        Yes

    proc logistic data=responsedata;
    class Gender;
    model Response=Age Income Gender;
    run;

    New data, scored with the estimated model
    P(response = yes | Age, Income, Gender) = 1 / (1 + e^-(0.10 + 0.22 x Age + 0.050 x Income + 0.80 x Gender)):

    Customer   Age   Income   Gender   Response score
    Emma       28    1000     F        0.44
    Will       44    1500     M        0.76
    Dan        30    1200     M        0.18
    Bob        58    2400     M        0.88
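    Scoring the new applicants can also be sketched directly with the SCORE statement of PROC LOGISTIC; RESPONSEDATA and NEWDATA are the assumed data set names:

    proc logistic data=responsedata;
       class Gender;
       model Response(event='Yes') = Age Income Gender;
       score data=newdata out=scored;   /* adds predicted probabilities */
    run;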


    Logistic Regression

    The logistic regression model is formulated as follows:

    P(Y = 1 | X1, ..., Xn) = 1 / (1 + e^-(b0 + b1X1 + ... + bnXn)) = e^(b0 + b1X1 + ... + bnXn) / (1 + e^(b0 + b1X1 + ... + bnXn))

    Hence,

    P(Y = 0 | X1, ..., Xn) = 1 - P(Y = 1 | X1, ..., Xn) = 1 / (1 + e^(b0 + b1X1 + ... + bnXn))

    so that P(Y = 1 | X1, ..., Xn) + P(Y = 0 | X1, ..., Xn) = 1.

    Model reformulation:

    P(Y = 1 | X1, ..., Xn) / P(Y = 0 | X1, ..., Xn) = e^(b0 + b1X1 + ... + bnXn)


    Logistic Regression

    ln( P(Y = 1 | X1, ..., Xn) / P(Y = 0 | X1, ..., Xn) ) = b0 + b1X1 + ... + bnXn

    P(Y = 1 | X1, ..., Xn) / (1 - P(Y = 1 | X1, ..., Xn)) is the odds in favor of Y = 1

    ln( P(Y = 1 | X1, ..., Xn) / (1 - P(Y = 1 | X1, ..., Xn)) ) is called the logit


    Logistic Regression

    If Xi increases by 1 (other variables remaining constant/ceteris paribus), the odds are multiplied by e^bi
    e^bi is the odds ratio: the multiplicative increase in the odds when Xi increases by one
    bi > 0 means e^bi > 1: odds and probability increase with Xi
    bi < 0 means e^bi < 1: odds and probability decrease with Xi


    Logistic Regression and Weight of Evidence Coding

    CustID   Age   Age Group         Age WOE
    1        20    1: until 22       -1.1
    2        31    2: 22 until 35     0.2
    3        49    3: 35+             0.9

    (Actual age, after classing, after re-coding)


    Logistic Regression and Weight of Evidence Coding

    P(Y = 1 | X1, ..., Xn) = 1 / (1 + e^-(b0 + b1 x age_woe + b2 x purpose_woe + ...))

    No dummy variables!
    More robust


    Decision Trees

    Recursively partition the training data
    Various tree induction algorithms:
    C4.5 (Quinlan 1993)
    CART: Classification and Regression Trees (Breiman, Friedman, Olshen, and Stone 1984)
    CHAID: Chi-squared Automatic Interaction Detection (Hartigan 1975)
    Classification tree: target is categorical
    Regression tree: target is continuous (interval)


    Decision Tree Example

    Income > $50,000?
      No: Job > 3 Years?
        No: Bad Risk
        Yes: Good Risk
      Yes: High Debt?
        Yes: Bad Risk
        No: Good Risk


    Decision Trees Terminology

    [Same tree as above: the top node (Income > $50,000) is the root node, the intermediate test nodes (Job > 3 Years, High Debt) are internal nodes, the connections between nodes are branches, and the terminal Good Risk/Bad Risk nodes are leaf nodes]


    Estimating decision trees from data

    Splitting decision
    Which variable to split at what value (e.g. age < 30 or not, income < 1000 or not, marital status = married or not, ...)?
    Stopping decision
    When to stop growing the tree?
    When to stop adding nodes to the tree?
    Assignment decision
    Which class (e.g. good or bad customer) to assign to a leaf node?
    Usually the majority class!


    Splitting decision: entropy

    Think of green dots as good customers and red dots as bad customers
    Impurity measures the disorder/chaos in a data set
    The impurity of a data sample S can be measured as follows:
    Entropy: E(S) = -pG log2(pG) - pB log2(pB) (C4.5)
    Gini: Gini(S) = 2 pG pB (CART)

    [Figure: three dot clouds, two with minimal impurity (all dots of one class) and one with maximal impurity (half good, half bad)]


    Measuring Impurity using Entropy

    Entropy is minimal when pG = 1 or pB = 1 and maximal when pG = pB = 0.50
    Consider different candidate splits and measure for each the decrease in entropy, or gain

    [Figure: entropy as a function of pG, equal to 0 at pG = 0 and pG = 1 and reaching its maximum of 1 at pG = 0.5]

    Example: Gain Calculation

    Top node: 400 goods, 400 bads. Candidate split on Age < 30: the left node (Age < 30) gets 200 goods and 400 bads, the right node (Age >= 30) gets 200 goods and 0 bads.

    Entropy top node = -1/2 x log2(1/2) - 1/2 x log2(1/2) = 1
    Entropy left node = -1/3 x log2(1/3) - 2/3 x log2(2/3) = 0.91
    Entropy right node = -1 x log2(1) - 0 x log2(0) = 0

    Gain = 1 - (600/800) x 0.91 - (200/800) x 0 = 0.32

    Consider different alternative splits and pick the one with the biggest gain:

    Split                      Gain
    Age < 30                   0.32
    Marital status = married   0.05
    Income < 2000              0.48
    Savings amount < 50000     0.12
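    The same arithmetic as a SAS data step, with the counts taken from the split above:

    data gain;
       /* top node: 400 G, 400 B; left (age < 30): 200 G, 400 B; right: 200 G, 0 B */
       e_top   = -0.5*log2(0.5) - 0.5*log2(0.5);
       e_left  = -(1/3)*log2(1/3) - (2/3)*log2(2/3);
       e_right = 0;   /* pure node */
       gain    = e_top - (600/800)*e_left - (200/800)*e_right;
    run;

    proc print data=gain; run;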

    Stopping decision: early stopping

    As the tree is grown, the data becomes more and more segmented and splits are based on fewer and fewer observations
    Need to avoid the tree becoming too specific and fitting the noise of the data (aka overfitting)
    Avoid overfitting by partitioning the data into a training set (used for the splitting decision) and a validation set (used for the stopping decision)
    The optimal tree is found where the validation set error is minimal

    [Figure: misclassification error versus number of tree nodes; the training set error keeps decreasing while the validation set error reaches a minimum and then rises again: stop growing the tree at that minimum]

    Advantages/Disadvantages of Trees

    Advantages

    Ease of interpretation
    Non-parametric
    No need for transformations of variables
    Robust to outliers

    Disadvantages
    Sensitive to changes in the training data (unstable)
    Combinations of decision trees: bagging, boosting, random forests


    Section 5

    Measuring the Performance of Credit Scoring Classification Models

    How to Measure Performance?

    How well does the trained model perform in predicting new, unseen (!) instances?
    Decide on a performance measure
    Classification: Percentage Correctly Classified (PCC), Sensitivity, Specificity, Area Under the ROC curve (AUROC), ...
    Regression: Mean Absolute Deviation (MAD), Mean Squared Error (MSE), ...
    Methods
    Split sample method
    Single sample method
    N-fold cross-validation

    Split Sample Method

    For large data sets

    Large = more than 1000 observations, with more than 50 bad customers
    Set aside a test set (typically one-third of the data) which is NOT used during training!
    Estimate the performance of the trained classifier on the test set
    Note: for decision trees, the validation set is part of the training set
    Stratification
    Same class distribution in training set and test set

    Performance Measures for Classification

    Classification accuracy, confusion matrix, sensitivity, specificity

    Area under the ROC curve

    The Lorenz (Power) curve

    The Cumulative lift curve

    Kolmogorov-Smirnov distance

    Classification performance: confusion matrix

    Cut-off = 0.50

    Apply the cut-off to the model scores to get the predicted class:

               G/B   Score   Predicted
    John       G     0.72    G
    Sophie     B     0.56    G
    David      G     0.44    B
    Emma       B     0.18    B
    Bob        B     0.36    B

    Confusion matrix:

                                Actual Positive (Good)   Actual Negative (Bad)
    Predicted Positive (Good)   True Positive (John)     False Positive (Sophie)
    Predicted Negative (Bad)    False Negative (David)   True Negative (Emma, Bob)

    Classification accuracy = (TP+TN)/(TP+FP+FN+TN) = 3/5
    Classification error = (FP+FN)/(TP+FP+FN+TN) = 2/5
    Sensitivity = TP/(TP+FN) = 1/2
    Specificity = TN/(FP+TN) = 2/3

    All these measures depend upon the cut-off!
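    A SAS sketch of these computations: derive the predicted class from the cut-off and cross-tabulate it against the actual class (IFC returns one of two character values depending on the condition):

    data preds;
       input name $ actual $ score;
       predicted = ifc(score >= 0.5, 'G', 'B');   /* cut-off = 0.50 */
       datalines;
    John G 0.72
    Sophie B 0.56
    David G 0.44
    Emma B 0.18
    Bob B 0.36
    ;

    proc freq data=preds;
       tables actual*predicted;   /* the confusion matrix */
    run;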

    Classification Performance: ROC analysis


    Make a table with sensitivity and specificity for each possible cut-off

    The ROC curve plots sensitivity versus 1-specificity for each possible cut-off

    A perfect model has sensitivity of 1 and specificity of 1 (i.e. upper left corner)

    Scorecard A is better than B in the figure below

    The ROC curve can be summarized by the area underneath (AUC); the bigger the better!

      Cut-off   Sensitivity   Specificity   1-Specificity
      0         1             0             1
      0.01      ...           ...           ...
      0.02      ...           ...           ...
      ...
      0.99      ...           ...           ...
      1         0             1             0

    [Figure: ROC curves (sensitivity versus 1-specificity) for Scorecard A, Scorecard B, and a random scorecard; Scorecard A lies closest to the upper left corner]
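    The cut-off table underlying the ROC curve can be generated automatically; a sketch, reusing the assumed bart.dmagecr data. The OUTROC= option writes one row per cut-off with the variables _SENSIT_ and _1MSPEC_:

    proc logistic data=bart.dmagecr;
       model good_bad = duration amount / outroc=rocdata;
    run;

    proc sgplot data=rocdata;
       series x=_1mspec_ y=_sensit_;       /* ROC curve */
       lineparm x=0 y=0 slope=1;           /* diagonal = random scorecard */
    run;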


    The Area Under the ROC Curve


    How to compare intersecting ROC curves?

    The area under the ROC curve (AUC)

    The AUC provides a simple figure-of-merit for the performance of the constructed classifier

    An intuitive interpretation of the AUC is that it provides an estimate of the probability that a randomly chosen instance of class 1 is correctly ranked higher than a randomly chosen instance of class 0 (Hanley and McNeil, 1983); this is the Wilcoxon or Mann-Whitney U statistic

    The higher the better

    A good classifier should have an AUC larger than 0.5
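    In symbols (a standard identity, stated here for completeness): writing S_G and S_B for the scores of a randomly drawn good and a randomly drawn bad,

    \[
    \mathrm{AUC} = P(S_G > S_B) + \tfrac{1}{2} P(S_G = S_B),
    \]

    which is the normalised Mann-Whitney U (Wilcoxon rank-sum) statistic; for a random scorecard both orderings are equally likely, giving AUC = 0.5.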

    Cumulative Accuracy Profile (CAP)


    Sort the population from high model score to low model score

    Measure the (cumulative) percentage of bads for each score decile

    E.g. the top 30% most likely bads according to the model captures 65% of the true bads

    Often summarised as top-decile lift: how many bads are in the top 10%

    Accuracy Ratio


    The accuracy ratio (AR) is defined as:
    (area below the power curve for the current model - area below the power curve for the random model) /
    (area below the power curve for the perfect model - area below the power curve for the random model)

    The perfect model has an AR of 1

    A random model has an AR of 0

    AR is sometimes also called the Gini coefficient

    AR = 2*AUC - 1

    [Figure: CAP curves for the current model and the perfect model against the random diagonal; with B the area between the current model and the random model, and A the area between the perfect model and the current model, AR = B/(A+B)]
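    Why AR = 2*AUC - 1 (a one-line argument in ROC space): the area between the model's ROC curve and the diagonal is AUC - 1/2, while the perfect model achieves 1 - 1/2 = 1/2, so

    \[
    \mathrm{AR} = \frac{\mathrm{AUC} - \tfrac{1}{2}}{1 - \tfrac{1}{2}} = 2\,\mathrm{AUC} - 1 .
    \]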

    The Kolmogorov-Smirnov (KS) Distance


    Separation measure

    Measures the distance between the cumulative score distributions P(s|B) and P(s|G)

    KS = max_s |P(s|G) - P(s|B)|, where:
      P(s|G) = \(\sum_{x \le s} p(x|G)\) (equals 1 - sensitivity)
      P(s|B) = \(\sum_{x \le s} p(x|B)\) (equals the specificity)

    The KS distance is the maximum vertical distance between both curves

    The KS distance can also be measured on the ROC graph: it is the maximum vertical distance between the ROC curve and the diagonal

    The Kolmogorov-Smirnov Distance


    [Figure: cumulative score distributions P(s|B) and P(s|G) plotted against the score; the KS distance is the maximum vertical gap between the two curves]
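    The KS statistic is a standard two-sample test and can be computed directly; a minimal sketch on the hypothetical scores data set used earlier:

    proc npar1way data=scores edf;
       class good_bad;                     /* goods versus bads */
       var score;
    run;
    /* The reported Kolmogorov-Smirnov D is the maximum vertical distance */
    /* between the two empirical cumulative score distributions.          */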


    Section 6

    Setting the Classification Cut-off

    Setting the Classification Cut-off


    The scorecard outputs P(customer is good payer | x); a cut-off turns this into an accept/reject decision

    Depends upon the risk preference strategy

    No input provided by the Basel accord; capital should be in line with the risks taken

    Strategy curve
      Choose the cut-off that produces the same acceptance rate as the previous scorecard but improves the bad rate
      Choose the cut-off that produces the same bad rate as the previous scorecard but improves the acceptance rate

    Strategy Curve


    [Figure: strategy curve plotting bad rate (%) against accept rate (%); each point on the curve corresponds to a cut-off]

    Setting the Cut-off Based on Marginal Good-Bad Rates


    Usually one chooses a marginal good-bad rate of about 5:1 or 3:1

    Dependent upon the granularity of the cut-off change

      Cut-off   Cum. goods above score   Cum. bads above score   Marginal good-bad rate
      0.9       1400                     50                      -
      0.8       2000                     100                     12:1
      0.7       2700                     170                     10:1
      0.6       2950                     220                     5:1
      0.5       3130                     280                     3:1
      0.4       3170                     300                     2:1
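    As a worked example from the table above: lowering the cut-off from 0.7 to 0.6 accepts an extra 2950 - 2700 = 250 goods and 220 - 170 = 50 bads, so the marginal good-bad rate at that step is

    \[
    \frac{2950 - 2700}{220 - 170} = \frac{250}{50} = 5{:}1 .
    \]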

    Including a Refer Option


    Refer option
      Gray zone: requires further (human) inspection

    [Figure: score scale from 0 to 400; reject below a score of 190, refer between 190 and 210, accept above 210]


    Baesens B., Van Gestel T., Viaene S., Stepanova M., Suykens J., Vanthienen J., Benchmarking state-of-the-art classification algorithms for credit scoring, Journal of the Operational Research Society, 54(6), 627-635, 2003.

    Case Study 1

    Benchmarking Study (Baesens et al., 2003)


    8 real-life credit scoring data sets (UK, Benelux, Internet)

    17 algorithms or variations of algorithms

    Various cut-off setting schemes

    Classification accuracy + area under the receiver operating characteristic curve

    McNemar test + DeLong, DeLong and Clarke-Pearson test

    Benchmarking Study


    Benchmarking Study: Conclusions


    NN and RBF LS-SVM classifiers consistently yield good performance in terms of both PCC and AUC

    Simple linear classifiers such as LDA and logistic regression also gave very good performance

    Most credit scoring data sets are only weakly non-linear

    Flat maximum effect

    Benchmarking Study: Conclusions


    Although the differences are small, they may be large enough to have commercial implications. (Henley and Hand, 1997)

    Credit is an industry where improvements, even when small, can represent vast profits if such improvement can be sustained. (Kelly, 1998)

    For mortgage portfolios, a small increase in PD scorecard discrimination will significantly lower the minimum capital requirements in the context of Basel II. (Experian, 2003)

    The best way to augment the performance of a scorecard is to look for better predictors!

    Role of credit bureaus!

    Credit Bureaus


    Credit reference agencies or credit bureaus

    UK: information typically accessed through name and address

    Types of information:
      Publicly available information (for example, time at address, electoral rolls, court judgments)
      Previous searches from other financial institutions
      Shared contributed information by several financial institutions (check whether the applicant has a loan elsewhere and how he pays)
      Aggregated information (for example, data at postcode level)
      Fraud warnings (check whether there are prior fraud warnings at the given address)
      Bureau added value (for example, build a generic scorecard for small populations or a new product)

    In the UK: Experian, Equifax, Call Credit

    In the US: Equifax, Experian, TransUnion


    Section 7

    Input Selection for Classification

    Input Selection


    Inputs = features = attributes = characteristics = variables

    Also called attribute selection or feature selection

    If n features are present, 2^n - 1 possible feature sets can be considered

    Heuristic search methods are needed!

    Good feature subsets contain features highly correlated with (predictive of) the class, yet uncorrelated with (not predictive of) each other (Hall and Smith, 1998)

    Can improve the performance and estimation time of the classifier

    Input Selection


    Curse of dimensionality
      The number of training examples required increases exponentially with the number of inputs

    Remove irrelevant and redundant inputs

    Interaction and correlation effects

    Correlation implies redundancy

    Difference between a redundant input and an irrelevant input: an input can be redundant (due to, e.g., a high correlation with another input already in the model) without being irrelevant

    Input selection procedure


    Step 1: Use a filter procedure
      For quick filtering of inputs
      Inputs are selected independently of the classification algorithm

    Step 2: Wrapper/stepwise regression
      Use the p-value of the regression for input selection

    Step 3: AUC-based pruning
      Iterative procedure based on AUC

    Filter Methods for Input Selection


                          Continuous target (e.g. LGD)       Discrete target (e.g. PD)
      Continuous input    Pearson correlation,               Fisher score
                          Spearman rank-order correlation
      Discrete input      Fisher score                       Chi-squared analysis,
                                                             Cramer's V,
                                                             Information Value,
                                                             Gain/Entropy

    Chi-squared Based Filter


    Under the independence assumption,
    P(married and good payer) = P(married) * P(good payer) = 0.6 * 0.8

    The expected number of good payers that are married is then 0.6 * 0.8 * 1000 = 480

      Observed        Good Payer   Bad Payer   Total
      Married         500          100         600
      Not Married     300          100         400
      Total           800          200         1000

    Chi-squared Based Filter


    If both criteria were independent, one should observe the following contingency table:

      Expected        Good Payer   Bad Payer   Total
      Married         480          120         600
      Not Married     320          80          400
      Total           800          200         1000

    Compare observed and expected counts using a chi-squared test statistic

    Chi-squared Based Filter


    The test statistic

    \[
    \chi^2 = \frac{(500-480)^2}{480} + \frac{(100-120)^2}{120} + \frac{(300-320)^2}{320} + \frac{(100-80)^2}{80} = 10.41
    \]

    is chi-squared distributed with k-1 degrees of freedom, with k being the number of classes of the characteristic

    The bigger (lower) the value of chi-squared, the more (less) predictive the attribute is

    Apply as a filter: rank all attributes with respect to their p-value and select the most predictive ones

    Note: Cramer's V,

    \[
    V = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{10.41}{1000}} \approx 0.10,
    \]

    is always bounded between 0 and 1; higher values indicate a better predictive input
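    In SAS, the chi-squared statistic and Cramer's V come out of a single cross-tabulation; a minimal sketch on a hypothetical applicants data set:

    proc freq data=applicants;
       tables married*good_bad / chisq;    /* CHISQ prints chi-squared and Cramer's V */
    run;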

    Example: Filters (German credit data set)

    [Figure: filter values for the inputs of the German credit data set (Van Gestel, Baesens, 2008), not reproduced here]

    Fisher Score


    Defined as

    \[
    \text{Fisher score} = \frac{|\bar{x}_G - \bar{x}_B|}{\sqrt{s_G^2 + s_B^2}}
    \]

    where \(\bar{x}_G\) (\(\bar{x}_B\)) denotes the mean of the input for the goods (bads) and \(s_G^2\) (\(s_B^2\)) the corresponding variance

    Higher value indicates a more predictive input

    Can also be used if the target is continuous and the input discrete

    [Figure: Fisher scores for the inputs of the German credit data set (Van Gestel, Baesens, 2008), not reproduced here]
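    A quick numeric illustration with made-up values: if the goods have a mean income of 3000 and the bads 2400, with standard deviations 800 and 600 respectively, then

    \[
    \text{Fisher score} = \frac{|3000 - 2400|}{\sqrt{800^2 + 600^2}} = \frac{600}{1000} = 0.6 .
    \]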

    Wrapper/Stepwise Regression


    Use the p-value to decide upon the importance of the inputs:
      p-value < 0.01: highly significant
      0.01 < p-value < 0.05: significant
      0.05 < p-value < 0.10: weakly significant
      p-value > 0.10: not significant

    Can be used in different ways:
      Forward
      Backward
      Stepwise

    Search Strategies


    Direction of search

    Forward selection

    Backward elimination

    Floating search

    Oscillating search

    Traversing the input space
      Greedy hill-climbing search (best-first search)

    Beam search

    Genetic algorithms

    Example Search Space for Four Inputs


    [Figure: lattice of candidate input subsets, from the empty set to the full set]

    {}
    {I1}  {I2}  {I3}  {I4}
    {I1,I2}  {I1,I3}  {I1,I4}  {I2,I3}  {I2,I4}  {I3,I4}
    {I1,I2,I3}  {I1,I2,I4}  {I1,I3,I4}  {I2,I3,I4}
    {I1,I2,I3,I4}

    Stepwise Logistic Regression


    SELECTION=FORWARD: PROC LOGISTIC first estimates parameters for effects forced into the model. These effects are the intercepts and the first n explanatory effects in the MODEL statement, where n is the number specified by the START= or INCLUDE= option in the MODEL statement (n is zero by default). Next, the procedure computes the score chi-square statistic for each effect not in the model and examines the largest of these statistics. If it is significant at the SLENTRY= level, the corresponding effect is added to the model. Once an effect is entered in the model, it is never removed from the model. The process is repeated until none of the remaining effects meet the specified level for entry or until the STOP= value is reached.


    Stepwise Logistic Regression


    SELECTION=BACKWARD: Parameters for the complete model as specified in the MODEL statement are estimated unless the START= option is specified. In that case, only the parameters for the intercepts and the first n explanatory effects in the MODEL statement are estimated, where n is the number specified by the START= option. Results of the Wald test for individual parameters are examined. The least significant effect that does not meet the SLSTAY= level for staying in the model is removed. Once an effect is removed from the model, it remains excluded. The process is repeated until no other effect in the model meets the specified level for removal or until the STOP= value is reached.

    Stepwise Logistic Regression


    SELECTION=STEPWISE: Similar to the SELECTION=FORWARD option, except that effects already in the model do not necessarily remain. Effects are entered into and removed from the model in such a way that each forward selection step may be followed by one or more backward elimination steps. The stepwise selection process terminates if no further effect can be added to the model or if the effect just entered into the model is the only effect removed in the subsequent backward elimination.

    SELECTION=SCORE: PROC LOGISTIC uses the branch-and-bound algorithm of Furnival and Wilson (1974).

    Stepwise Logistic Regression: Example


    proc logistic data=bart.dmagecr;
       class checking history purpose savings employed marital coapp
             resident property age other housing;
       model good_bad = checking duration history purpose amount savings
                        employed installp marital coapp resident property other
                        / selection=stepwise slentry=0.10 slstay=0.01;
    run;

    Note the asymmetric thresholds: a variable entering at the lenient 10% level (SLENTRY=0.10) must reach the stricter 1% level (SLSTAY=0.01) to remain in the model!

    AUC-Based Pruning


    Start from a model (e.g. logistic regression) with all n inputs

    Remove each input in turn and re-estimate the model

    Drop the input whose removal yields the best (highest) AUC

    Repeat this procedure until the AUC performance decreases significantly

    [Figure: average AUC performance plotted against the number of remaining inputs]


    Section 8

    Reject Inference

    Population


    [Figure: the through-the-door population splits into rejects and accepts; goods and bads are known for the accepts but unknown (?) for the rejects]

    Reject Inference


    Problem:
      Good/bad information is only available for past accepts, not for past rejects
      Reject bias when building classification models
      Need to create models for the through-the-door population
      If the past application policy had been random, reject inference would pose no problem; this is, however, not realistic

    Solution:
      Reject inference: a process whereby the performance of previously rejected applications is analyzed to estimate their behavior

    Reject Inference


    Through-the-door population: 10,000
      Rejects: 2,950
        Inferred bads: 914
        Inferred goods: 2,036
      Accepts: 7,050
        Known bads: 874
        Known goods: 6,176

    Techniques to Perform Reject Inference


    Define as bad:
      Define all rejects as bads
      Reinforces the credit scoring policy and prejudices of the past

    In SAS Enterprise Miner (reject inference node):
      Hard cut-off augmentation
      Parcelling
      Fuzzy augmentation

    Hard Cut-Off Augmentation


    AKA Augmentation 1

    Build a model using known goods and bads

    Score rejects using this model and establish expected bad rates

    Set an expected bad rate level above which an account is deemed bad

    Classify rejects as good or bad based on this level

    Add to known goods/bads

    Remodel

    Parcelling


    Rather than classifying rejects as good or bad, parcelling assigns them in proportion to the expected bad rate at each score band:
      Score the rejects with the good/bad model
      Split the rejects into proportional good and bad groups

      Score band   # Bad   # Good   % Bad   % Good   # Rejects   Rej-Bad   Rej-Good
      0-99         24      10       70.3%   29.7%    342         240       102
      100-199      54      196      21.6%   78.4%    654         141       513
      200-299      43      331      11.5%   88.5%    345         40        305
      300-399      32      510      5.9%    94.1%    471         28        443
      400+         29      1,232    2.3%    97.7%    778         18        760

    Parcelling


    However, the bad proportion among the rejects cannot be the same as among the approved

    Therefore, allocate a higher proportion of bads to the rejects

    Rule of thumb: the bad rate for rejects should be 2-4 times that of the approves (a scripted variant is sketched below)
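    A parcelling pass can be scripted as a banded data step. The sketch below is one illustrative variant: it reuses the band bad rates from the table above, scales them by a factor of 2 per the rule of thumb, and randomly assigns each scored reject (the data set rejects and variable score are hypothetical names):

    data parcelled;
       set rejects;                        /* rejects scored by the accepts model */
       /* accepts' band bad rates, scaled by 2 for the reject population */
       if score < 100 then p_bad = min(1, 2*0.703);   /* capped at 1 */
       else if score < 200 then p_bad = 2*0.216;
       else if score < 300 then p_bad = 2*0.115;
       else if score < 400 then p_bad = 2*0.059;
       else p_bad = 2*0.023;
       length good_bad $4;
       if ranuni(12345) < p_bad then good_bad = 'bad';
       else good_bad = 'good';             /* random allocation within the band */
    run;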

    Nearest Neighbor Methods


    Create two sets of clusters: goods and bads

    Run rejects through both clusters

    Compare Euclidean distances to assign the most likely performance

    Combine accepts and rejects and remodel

    However, rejects may be situated in regions far away from the accepts

    Bureau-Based Reject Inference


    Bureau data: performance of declined applications on similar products with other companies

    Legal issues

    Difficult to implement in practice: timings, definitions, programming

    [Figure: timeline; a declined applicant obtains credit elsewhere (Jan 99) and its performance there is analyzed over the following period (to Dec 00)]

    Additional Techniques to Perform Reject Inference


    Approve all applications

    Three-group approach

    Iterative re-classification

    Mixture distribution modelling

    Reject Inference


    "There is no unique best method of universal applicability, unless extra information is obtained. That is, the best solution is to obtain more information (perhaps by granting loans to some potential rejects) about those applicants who fall in the reject region." (David Hand, 1998)

    Cost of granting credit to rejects versus the benefit of a better scoring model (for example, work by Jonathan Crook)

    Reject Inference


    Banasik et al. were in the exceptional situation of being able to observe the repayment behavior of customers that would normally be rejected. They concluded that the scope for improving scorecard performance by including the rejected applicants in the model development process is present but modest.

    Banasik J., Crook J.N., Thomas L.C., Sample selection bias in credit scoring models, In: Proceedings of the Seventh Conference on Credit Scoring and Credit Control (CSCC VII 2001), Edinburgh, Scotland, 2001.

    Reject Inference


    No method of universal applicability

    Much controversy

    Withdrawal inference

    Competitive market

    Withdrawal Inference


    [Figure: the through-the-door population splits into withdrawn, rejects and accepts; goods and bads are unknown (?) for both the withdrawn and the rejects, and known only for the accepts]


    Section 10

    Behavioral Scoring

    Why Use Behavioral Scoring?


    Debt provisioning

    Setting credit limits (revolving credit)

    Authorizing accounts to go in excess

    Collection strategies

    What preventive actions should be taken if the customer starts going into arrears? (early warning system)

    Marketing applications (for example, customer segmentation and targeted mailings)

    Basel II!

    Behavioral Scoring


    Once behavioral scores have been obtained, they can be used to increase/decrease the credit limits

    But: the score measures the risk of default given the current operating policy and credit limit!

    So, using the score to change the credit limit invalidates the effectiveness of the score!

    Analogy: only those who have no or few accidents when they drive a car at 30 mph in towns should be allowed to drive at 70 mph on the motorways (Thomas, Edelman, and Crook, 2002)
      Other skills may be needed to drive faster!

    Similarly, other characteristics are required to manage accounts with large credit limits compared to accounts with small credit limits

    Typically, divide the behavioral score into bands and give a different credit limit to each band


    Variables Used in Behavioural Scoring

    Define derived variables measuring the financial solvency of the customer


    Customers can have many products and a different role for each product
      For example, primary owner, secondary owner, guarantor; private versus professional products, ...

    Think about a product taxonomy to define customer behavior
      For example, for an average savings amount, look at the different savings products

    The best way to augment the performance of a scorecard is to create better variables!

    Aggregation Functions

    Average (sensitive to outliers)

    Median (insensitive to outliers)


    Minimum/Maximum
      For example, worst status during the last 6 months, highest utilization during the last 12 months, ...

    Absolute trend (over, e.g., a 6-month window): \(x_T - x_{T-6}\)

    Relative trend: \((x_T - x_{T-6}) / x_{T-6}\)

    Most recent value, value 1 month ago, ...

    Ratio variables
      E.g. obligation/debt ratio: calculated by adding the proposed monthly house payment and any regular monthly installment and revolving debt, and dividing by the monthly income
      Others: current balance/credit limit, ...

    (A scripted aggregation example follows below.)
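    Such aggregates are easily derived from a monthly history table; a minimal sketch with hypothetical names (monthly holds one row per customer per month for, say, the last six months):

    proc sort data=monthly;
       by customer_id;
    run;

    proc means data=monthly noprint;
       by customer_id;
       var balance;
       output out=aggr mean=avg_balance median=med_balance
              min=min_balance max=max_balance;
    run;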

    Impact of Using Aggregation Functions


    Impact of missing values

    How to calculate trends?

    Ratio variables might have distributions with fat tails (large positive and negative values)
      Use truncation/winsorisation!

    The data set expands in the number of attributes
      Input selection!

    What if input selection allows you to choose between, for example, the average and the most recent value of a ratio?
      Choose the average value for cyclically dependent ratios
      Choose the most recent value for structurally (non-cyclically) dependent ratios

    Behavioral Scoring


    Implicit assumption: the relationship between the performance characteristics and the subsequent delinquency status of a customer is the same now as 2 or 3 years ago

    Classification approach to behavioral scoring

    Markov chain approach to behavioral scoring

    Classification Approach to Behavioral Scoring


    Performance period versus observation period

    Develop a classification model estimating the probability of default during the observation period

    Length of the performance period and the observation period
      Trade-off
      Typically 6 months to 1 year

    Seasonality effects

    [Figure: timeline showing the performance period running between the observation point and the outcome point]

    How to Choose the Observation Point?

    Seasonality depends upon the observation point


    Develop 12 PD models (one per observation month)
      Less maintainable
      Need to backtest/stress test/monitor all these models!

    Use sampling methods to remove seasonality (see the sketch below)
      Use 3 years of data and sample the observation point randomly for each customer during the second year
      Only 1 model
      Gives a seasonally neutral data set
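    The random observation point can be drawn in a one-line data step; a sketch with hypothetical names:

    data sampled;
       set customers;                      /* one row per customer */
       /* random observation month within the second year of history */
       obs_month = ceil(12 * ranuni(12345));
    run;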