

The National Mortgage Database (NMDB)

Robert Avery, Ken Brevoort, Theresa DiVenti, Carla Inclan, Ian Keith, Jessica Lee, Lexian Liu, Ismail Mohamed, Forrest Pafenberg, Jay Schultz, Cynthia Waldron, Xun Wang, Claudia Wood, Peter Zorn

Federal Housing Finance Agency
Consumer Financial Protection Bureau
Freddie Mac

Urban Institute
June 11, 2013

The views expressed are those of the authors and do not necessarily represent those of the Consumer Financial Protection Bureau, the Federal Housing Finance Agency, Freddie Mac or their staff.


What is the NMDB?

A new, nationally representative, loan-level mortgage database jointly funded and managed by the FHFA and CFPB, based on a prototype developed by Freddie Mac.
» 1st lien mortgages reported to the credit bureaus are used as both the sampling frame and the source of performance data. No new data is collected; the NMDB will make better use of data that already exists.
» The database is a 1/20 sample (not a registry of loans).
» Because the credit bureaus archive their data, the NMDB recovers data that would have been available had the project been started years ago. The initial 1/20 sample is representative of all mortgages open at any time from January 1998 to June 2012 and (with weights) any borrower who had at least one mortgage during that period.
» Going forward, a 1/20 representative sample of newly originated mortgages will be added each quarter, and terminated mortgages will exit the sample.
» 10.1 million mortgages are in the initial historic database. In the future the database will track about 3.5 million active mortgages.

Credit bureau data are comprehensive. However, they are raw servicing data that require significant cleaning to be useful, and data from other sources must be added.
» Major commitment of government staff to do this. Never done before.
» Working with active cooperation of credit bureau staff.

NMDB will also have a survey component. Each quarter a representative subset of borrowers associated with loans newly added to the database will be sent a mail survey soliciting information on their mortgage shopping and origination experience.
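The 1/20 design implies a simple scaling rule for population estimates: each sampled loan stands in for about 20 loans. The Python sketch below illustrates that arithmetic only. It assumes equal inverse-probability weights, which the slides do not confirm (the actual NMDB weights are not described here), and the function names are hypothetical.

```python
SAMPLING_RATE = 1 / 20  # stated on the slide: the NMDB is a 1/20 sample


def base_weight(sampling_rate: float = SAMPLING_RATE) -> float:
    """Inverse-probability weight for an equal-probability sample (assumption)."""
    return 1.0 / sampling_rate


def estimate_population_count(sample_count: float,
                              sampling_rate: float = SAMPLING_RATE) -> float:
    """Scale a count observed in the sample up to the full population."""
    return sample_count * base_weight(sampling_rate)


if __name__ == "__main__":
    # Sample sizes quoted on the slide: 10.1 million loans in the historic
    # file and about 3.5 million active loans tracked going forward.
    print(estimate_population_count(10.1e6))
    print(estimate_population_count(3.5e6))
```

Under the equal-weight assumption, any count taken from the sample is multiplied by 20 to approximate the corresponding population count.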


Four Overlapping Databases

The basic unit of observation is the mortgage.
» The database will contain full credit information for all borrowers associated with the sampled mortgages.
» Borrower data will be gathered from one year prior to sampled mortgage origination to one year after termination and tracked quarterly.
» Performance on the sample mortgages will be collected monthly.

The NMDB will also make available a historic database containing full credit data (including scores) from 1998 to 2012 for a representative sample of borrowers associated with an active mortgage during the 1998 to 2012 period.
» The database will contain information on all mortgages taken out by these borrowers during the 1998 to 2012 period.
» Data will also be gathered on all other credit obligations active during this period.
» Performance for each mortgage will be tracked from 2000 to 2012.

The NMDB will maintain a separate database of a representative 1-in-20 sample of individuals who have ever had an active mortgage from 1998 onward.
» Quarterly information will be maintained on these individuals from one year prior to taking out their first mortgage (or 1998) until they die.
» Persons will be added to the database when they take out their first mortgage.

The NMDB origination survey data will also be maintained as a separate database.
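The borrower observation window described above (one year before origination through one year after termination, tracked quarterly) can be illustrated with a short Python sketch. The function names and the "YYYYQn" labels are hypothetical, and treating "one year" as four calendar quarters is an assumption rather than a documented NMDB rule.

```python
from __future__ import annotations

from datetime import date


def quarter_index(d: date) -> int:
    """Map a date to a running quarter number (year * 4 + zero-based quarter)."""
    return d.year * 4 + (d.month - 1) // 3


def borrower_quarters(originated: date, terminated: date) -> list[str]:
    """Quarter labels from one year before origination to one year after termination."""
    start = quarter_index(originated) - 4  # four quarters earlier (assumed meaning of "one year")
    end = quarter_index(terminated) + 4    # four quarters later
    return [f"{q // 4}Q{q % 4 + 1}" for q in range(start, end + 1)]


if __name__ == "__main__":
    # Illustrative dates only, not taken from the NMDB.
    print(borrower_quarters(date(2005, 5, 15), date(2007, 11, 1)))
```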


Why is the NMDB Needed?

• HMDA:
– Not fully representative: does not include HMDA non-reporters.
– Lacks detailed borrower, loan or performance data.
– Available only 9 to 21 months after mortgages are originated.

• LPS McDash and/or CoreLogic:
– Servicing files from 26 large servicers versus 2,000 servicers in credit bureaus.
– Not representative: poor coverage of portfolio loans.
– Same problems as the underlying NMDB data (duplication, hanging performance and servicing sales), but not cleaned as the NMDB will be, so users cannot see them.
– No information on other obligations, previous or subsequent mortgages, or borrowers.

• NY Fed Equifax:
– Similar source as NMDB, but the unit of observation is borrowers, not loans.
– Same problems as the underlying NMDB data (duplication, hanging performance and servicing sales), but not cleaned as the NMDB will be, so users cannot see them.
– Little supplementation with other data and difficult to link files over time.


What is Missing in Bureau Data?

Key items missing are property value (LTV) and characteristics, borrower characteristics (e.g. age, race, income, gender), and some mortgage characteristics (e.g. ARM status, PMI, origination channel).

The database is being supplemented with information obtained from matching to existing external sources (some still under negotiation):
» Home Mortgage Disclosure Act (HMDA) data (70% match rate gives income/race).
» Property transaction (deed/title) data (55% match rate).
» MLS data (useful for purchase price in non-disclosure states).
» Property appraisal data.
» Household moving/address information on last three addresses.
» Third party servicing data (e.g. LP, LPS). Private label MBS data. Maybe securities data as well (e.g. Ginnie Mae, GSEs).
» Administrative files (FHA, VA, RHS, GSEs, home loan banks and possibly large banks). 47% of sample loans are Gov’t-backed. An additional 17% of borrowers have a non-sampled Gov’t-backed loan.
» Data on age, gender and marital status from public records collected by the credit bureau.


What Specific Fields will NMDB Have?

For each sample mortgage the database will have:
» Monthly: Performance (delinquent, current, foreclosure); balance; scheduled payment; actual payment; escrow payment; amortizing contract rate.
» Fixed Characteristics: Date opened; term; amount borrowed; number of borrowers; mortgage purpose (home purchase, refinance, new mortgage on free and clear property); owner occupancy status; type of mortgage (FHA/VA/RHS/home improvement/manufactured housing/other); GSE (Fannie/Freddie/Ginnie/Private MBS); servicer type; balloon amount and date; appraised property value, APR, CLTV, LTV and DTI used in underwriting; ARM status; PMI; date closed, payoff amount, and termination form (if closed).
» Modification/foreclosure status: Date entered modification/foreclosure; change in terms; special program (HARP/HAMP); part of bankruptcy; charge-off amount.

For each sample mortgage co-signer the database will have:
» Age (date of birth); gender; marital status; deceased indicator; race/ethnicity (from HMDA); income at the time of origination (from HMDA).
» Quarterly Vantage Credit Score, bankruptcy, and income estimator.
» Whether they live in the property associated with the mortgage; first-time homebuyer status; census tract/zip code and timing of last three addresses.


Specific Fields (continued)

For the property associated with each sample mortgage the database will have:
» Quarterly: LTV; CLTV; and value (from AVM model).
» Fixed Characteristics: Date purchased; purchase amount; location (census tract, MSA and Zip Code); type of property (e.g. single family); age of structure; square footage; assessed value; owner-occupied.

For all concurrent 2nd liens on the property associated with each sample mortgage the database will have:
» Monthly: Performance (delinquent, current, foreclosure); balance; scheduled payment; actual payment; escrow payment; amortizing contract rate; credit limit (if a HELOC).
» Fixed Characteristics: Date opened (piggyback or not); term; open- or closed-end; amount borrowed (or credit limit); number of borrowers; same servicer as 1st; date closed, payoff amount, and termination form (if closed).


Specific Fields (continued)

For all other mortgages, credit cards, installment loans, student loans, auto loans, lines of credit, and other consumer loans associated with sample mortgage co-borrowers the database will have:
» Monthly: Performance (delinquent, current, foreclosure); balance; scheduled payment; actual payment; escrow payment; amortizing contract rate; credit limit (if open-ended).
» Fixed Characteristics: Type of credit; date opened; term; open- or closed-end; amount borrowed (or credit limit); number of borrowers; same servicer/property as sample mortgage; date closed, payoff amount, and termination form (if closed).

Information on inquiries and public records for all borrowers associated with sample mortgages will also be gathered.

An origination survey will be mailed each quarter to a representative subset of borrowers whose mortgages are newly added to the database. The survey has been pre-tested three times, with response rates of 60 and 45 percent for the last two pilots. The survey is designed to capture information on issues like loan shopping and suitability that is not available from any other source.


Timeline

Contract signed with Experian on September 27, 2012.

Initial data delivery took place in December 2012: a 1/20 sample of all loans in existence between January 1998 and June 2012 (10.1 million loans and 14.7 million borrowers after preliminary cleaning).

An analytic group at FHFA, Freddie Mac and CFPB is processing and cleaning the data and will match it to external sources, impute data for loans that cannot be matched, and develop a series of regular reports and queries to facilitate use of the NMDB.
» It will likely take until next spring to finish cleaning the data.
» 8 FTEs working on the project, a major commitment of FHFA.
» Many challenges in following people and mortgages (e.g. servicing is sold; people die or are added to mortgages).

An existing pilot prototype dataset has been in development for 2 ½ years, funded by Freddie Mac (1/500 sample of loans outstanding since 2003).
» Prototype will be maintained and updated until at least summer 2013.
» Already used in FHFA’s 2012 HERA-mandated report.
» Pilot testing of an additional Origination survey and a Delinquency Survey.


Access and the Future

NMDB is being set up as a public good. We believe that the contract signed with Experian is a model for data access.

The challenge is to (1) protect borrower/lender personally identifiable information and (2) provide useful data. Local geography is critical for mortgage analysis.

Our solution: Data is physically housed only on an FHFA/CFPB server.

Access, however, is allowed for any federal government/reserve bank/GSE employee going through the access process:
» Must sign an agreement not to reverse engineer identity of borrower or lender. Severe penalties for violations of agreement.
» All work behind a firewall; data can’t be removed.
» NMDB software must support a variety of purposes, from simple queries (number of new mortgages in California) to complex research projects.
» We are working to allow broader academic/research public access via Census-style programs.


Examples of how NMDB can be used

Example 1: Second liens

Example 2: Loan performance transition matrix

Example 3: Credit tightening

Example 4: Market Comparisons

All examples with 2010 data using the Prototype


Example 1: Second liens

NMDB coverage is more extensive than HMDA’s

[Figure: millions of mortgages (axis 0.0 to 1.5) by open date, 2004 to 2010. Series: NMDB First Lien; NMDB First with Second; NMDB Second by Second Open Date; HMDA First Lien; HMDA Second Lien.]


Example 1: Second liens (continued)

Default rates are higher for firsts with seconds

[Figure, two panels: default rate (90 days or worse; axis 0 to 40%) by first lien open date, 2004 to 2009. Series: First without Second; First with Second; First with Concurrent non-HELOC; First with Subsequent non-HELOC; First with Concurrent HELOC; First with Subsequent HELOC.]


Example 1 continued

Performance of firsts w/ different types of seconds

                                       ALL   Concurrent  Concurrent  Subsequent  Subsequent
                                                  HELOC   non-HELOC       HELOC   non-HELOC
Seconds and firsts perform similarly   87%          89%         87%         88%         78%
Seconds perform better                  8%           7%          8%          7%         15%
Seconds perform worse                   5%           4%          5%          4%          7%

88% of firsts and their associated seconds perform similarly

When performance diverges, seconds tend to out-perform their associated firsts.

GSE firsts and their associated seconds perform better than non-GSE loans.

Firsts with piggyback non-HELOC (closed end) seconds have the highest default rates.


Example 2: Loan performance transition matrix

60D+ loans tend to worsen in performance

Row percent. Rows: April 2010 performance. Columns: May 2010 performance.

April 2010  Current      30D      60D      90D   120+ D      FCL  No hist   Closed    Total
Current      95.36%    0.96%    0.06%    0.02%    0.04%    0.02%    1.83%    1.72%     100%
30D          26.36%   41.84%   25.33%    0.42%    0.20%    0.18%    3.92%    1.75%     100%
60D           7.23%   12.83%   35.13%   38.33%    0.27%    0.97%    3.11%    2.13%     100%
90D           6.20%    1.71%    5.40%   25.27%   46.50%    8.31%    5.18%    1.45%     100%
120+ D        2.91%    0.75%    0.48%    1.04%   75.56%    9.90%    6.83%    2.53%     100%
FCL           1.59%    0.06%    0.03%    0.00%    2.53%   85.27%    3.82%    6.70%     100%
No hist      10.43%    0.11%    0.23%    0.04%    0.79%    0.28%   86.51%    1.60%     100%
Total        82.12%    1.72%    0.86%    0.48%    2.24%    2.15%    8.59%    1.84%     100%

95% of current mortgages remain current the next month.

Slightly over 40% of 30-day delinquent loans remain 30-day delinquent the next month (the mode), with roughly equal percentages transitioning into current and 60-day delinquent.

A disproportionate share of loans delinquent 60 or more days transitions into an even worse performing state the next month.
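A row-percent transition matrix of this kind can be computed from two aligned monthly status snapshots. The Python sketch below shows one minimal way to do so. The state labels follow the slide, but the function name and the toy input are hypothetical, and this is not necessarily how the NMDB team produced the table.

```python
from collections import Counter, defaultdict

# State labels as they appear in the slide's table.
STATES = ["Current", "30D", "60D", "90D", "120+ D", "FCL", "No hist", "Closed"]


def transition_matrix(prev_status, next_status):
    """Row-percent transition matrix from two aligned lists of loan states.

    prev_status[i] and next_status[i] are the states of the same loan in
    consecutive months. Each row of the result sums to 100.
    """
    counts = defaultdict(Counter)
    for before, after in zip(prev_status, next_status):
        counts[before][after] += 1
    matrix = {}
    for before, row in counts.items():
        total = sum(row.values())
        matrix[before] = {s: 100.0 * row[s] / total for s in STATES}
    return matrix


if __name__ == "__main__":
    # Toy data for six loans, purely illustrative.
    april = ["Current", "Current", "Current", "Current", "30D", "30D"]
    may = ["Current", "Current", "Current", "30D", "Current", "60D"]
    for state, row in transition_matrix(april, may).items():
        print(state, row)
```

Each row is normalized by the number of loans that started the month in that state, which is why the rows, not the columns, sum to 100%.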


Example 2: Loan performance transition matrix (continued)

Firsts without seconds cure more frequently

[Figure: grid of panels by performance state (Current, 30 D, 60 D, 90 D, 120+ D), June 2006 to June 2009. Series: Firsts with Second; Firsts no Second.]


Example 3

Credit quality of originations

[Figure: Vantage Score Distribution for Purchase and Refinance Originations. Box plots of Vantage score (axis 500 to 1000) by origination date, half-yearly from 2003 H1 to 2010 H2. Series: Purchase; Refinance.]

Note: The data are weighted values from the NMDB and include jumbo loans. Purpose is identified using credit bureau and HMDA data. The box represents the middle 50% of the observations, the median is marked by the white line in the box, and the dotted lines extend to the 5th and 95th percentiles. The widths of the boxes are proportionate to the volume of loans.

Score distributions are the tightest (lowest risk) since 2003.
» The Vantage score cutoff, as measured by the 5th percentile of the score distribution, is currently set higher for both purchase and refinance loans than at any point since 2003.

Refinance mortgages appear to face an especially high score cutoff.
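The box plot summaries described in the figure note (weighted 5th, 25th, 50th, 75th and 95th percentiles) can be sketched in Python as follows. The percentile rule used here, the smallest score whose cumulative weight share reaches the target, is one common convention and an assumption; the slides do not say which definition was used. The function names and sample scores are hypothetical.

```python
def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative weight share reaches pct (0 to 100)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    threshold = total * pct / 100.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= threshold:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall


def box_summary(scores, weights):
    """Percentiles used by the box plots: whiskers, box edges and median."""
    return {p: weighted_percentile(scores, weights, p) for p in (5, 25, 50, 75, 95)}


if __name__ == "__main__":
    # Made-up scores and sampling weights, for illustration only.
    scores = [600, 650, 700, 750, 800]
    weights = [1, 1, 1, 1, 1]
    print(box_summary(scores, weights))
```

In this framing, the 5th percentile entry corresponds to the "score cutoff" discussed above.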


Example 3 continued

Credit quality of originations—GSE comparison

[Figure: Vantage Score Distribution for Conventional, Conforming GSE and non-GSE Purchase and Refinance Originations. Box plots of Vantage score (axis 500 to 1000) by origination date, half-yearly from 2003 H1 to 2010 H2. Series: GSE Purchase; non-GSE Purchase; GSE Refinance; non-GSE Refinance.]

Note: The data are weighted values from the NMDB and exclude jumbo and FHA/VA loans. Purpose is identified using credit bureau and HMDA data. The box represents the middle 50% of the observations, the median is marked by the white line in the box, and the dotted lines extend to the 5th and 95th percentiles. The widths of the boxes are proportionate to the volume of loans.

The credit quality of GSE loans significantly exceeds that of non-GSE loans, especially for purchase money mortgages.


Example 4: Market comparisons

Comparison of FHA Originations

[Figure: Vantage Score Distribution for Non-FHA Loans and FHA Loans. Box plots of Vantage score (rescaled; axis 600 to 800) by origination date, half-yearly from 2000 H1 to 2011 H2. Series: Non-FHA/VA; FHA/VA.]

Note: The data are weighted values from the NMDB and include jumbo loans. The box represents the middle 50% of the observations, the median is marked by the white line in the box, and the lines extend to the 5th and 95th percentiles. The widths of the boxes are proportionate to the volume of loans.

The credit quality of FHA/VA market originations is consistently lower than that of non-FHA/VA market originations.

This difference in quality diminished somewhat during the height of the boom (2004 through 2006), and has increased since 2007.


Example 4: Market comparisons (continued)

Monitoring and benchmarking FHA

• Monitoring, as of June 2010 (for loans originated since 2003):
– 13.4% of FHA loans were either in a state of delinquency or were closed with a loss.
– 11.5% of all open FHA loans were in a state of delinquency.
– Comparable figures for VA were 8.5% and 6.9%, respectively.
– Comparable figures for RHS were 6.9% and 6.5%, respectively.

• Benchmarking, controlling for loan size, geography (state) and cohort:
– FHA is underperforming. The average delinquency rate of loans with FHA’s mix of loan size, state and cohort is 7.9% and 6.2%, respectively, on the two measures above.
– FHA’s worst performing book year is 2007, with an “excess delinquency rate” of 12.4% above average.
– Newly eligible FHA loans (above old limits) are performing worse than market by about 4 percentage points. However, this may be a market effect: FHA loans in the same markets but below the old limits have about the same excess delinquency.
– VA and RHS are performing as predicted (+/- 0.5 percentage points).
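One way to read the benchmarking step is as a direct standardization: compute the market delinquency rate within each cell of loan size, state and cohort, then average those rates using the program's own mix of loans across cells. The Python sketch below illustrates that idea. The slides do not state the exact method used, so this is an assumed approach, and the function names, cell keys and numbers are hypothetical.

```python
def benchmark_rate(program_counts, market_rates):
    """Market delinquency rate reweighted to the program's mix of cells.

    program_counts: {cell: number of program loans in that cell}
    market_rates:   {cell: market-wide delinquency rate in that cell}
    A cell is a (loan size bucket, state, cohort) tuple.
    """
    total = sum(program_counts.values())
    weighted = sum(n * market_rates[cell] for cell, n in program_counts.items())
    return weighted / total


def excess_delinquency(actual_rate, program_counts, market_rates):
    """Program's actual rate minus the rate predicted from its loan mix."""
    return actual_rate - benchmark_rate(program_counts, market_rates)


if __name__ == "__main__":
    # Made-up cells and rates, for illustration only.
    counts = {("small", "CA", 2007): 60, ("large", "TX", 2007): 40}
    rates = {("small", "CA", 2007): 0.10, ("large", "TX", 2007): 0.05}
    print(benchmark_rate(counts, rates))
    print(excess_delinquency(0.12, counts, rates))
```

A positive excess means the program performs worse than a market portfolio with the same composition; a value near zero corresponds to "performing as predicted".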