29
Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference Nancy Wallace Haas School of Business Real Estate and Financial Markets Laboratory Fisher Center for Real Estate and Urban Economics Macro Financial Modeling Conference January 29, 2016 1

*0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

  • Upload
    dotuong

  • View
    217

  • Download
    3

Embed Size (px)

Citation preview

Page 1: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Mortgage and Housing Microdata: DatabaseConstruction and Challenges for Statistical

Inference

Nancy Wallace

Haas School of BusinessReal Estate and Financial Markets Laboratory

Fisher Center for Real Estate and Urban Economics

Macro Financial Modeling ConferenceJanuary 29, 2016

1

Page 2: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Statistical Inference in the Mortgage Market

I Standards should be very high given policy objectives.

I Mortgage origination is a high dimensional contracting problem so mostcontract elements are jointly endogenous.

I Lenders set the contract menus based on pricing embedded options (defaultand prepayment) – leads to ex post sample selection and sorting in mortgageorigination data.

I Pre-crisis mortgage regulation controlled by competing regulatory authorities,Federal TILA, state consumer protection and foreclosure laws, zip code levelUnderserved Area Goals.• Induces non-random assignment and rationing of contract types even at

the level of zip codes.

I No standardized loan identification number exists in U.S. – leads tochallenges in data set assembly and sampling.

I Available mortgage data sets are problematic because none of them spanthe contracting space.

2

Page 3: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

No Single Data Set Spans the Contracting Space HMDA LPS/

EquifaxABSNet/Lewtan

DataQuick/Corelogic

FNMA/FHLMC

Coverage ~90%Orig.

~20%Orig. ~100%PLS LiensandTransfers

N.A.

Storage(G) 100 8,500 360 2,100 130Loans(M) 262.97 67.40 20.00 118.47 31.31Orig.($T) 44.70 11.50 4.48 36.48 5.61

StaticFieldsLoanLevelProductTypes

All PrimarilyGSE

Non-GSE All GSEFRM

GeographicIdentifiers

CensusTracts

ZipCodes,Zip(3)GSE

ZipCodes Address Zip(3)

BorrowerIncome

Yes No Yes No DTI

InterestRate No Yes Yes No NoOrig.Date Year YM YM YMD NoOrig.Amount Yes Yes Yes Yes YesPurpose P,R P,R,Other P,R,Other P,R P,ROriginator Yes No Aggregator Yes AggregatorHousePrice No Yes–LTV Yes–LTV Yes Yes-LTVOtherLiens No IfEquifax Partial Yes NoSec.Status Partial Yes Yes No Partial

DynamicFieldsLoanLevelPaymentPerformance

No YMD YMD PayoffYMD Y

HousePrice No No No Yes,notattributes

No

CreditScore No IfEquifax No No No

3

Page 4: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Linking mortgage data: Without Loan IDs

DataQuickCharacteris/cs

DataQuickTransac/ons

HMDA LPS

ABSNetRMBS

GSEs

HousePriceModel

MortgagePopula/on

MicroNetworkModel

Compe//on&Funding

MicroCorporateStressTest

MacroStressTest

InterestRateModel

Equifax FlowofFunds Y-9C

Analy&

cs

Data

Techno

logy

HighPerformanceDataManagement:Cloudera/Impala/HadoopHighPerformanceCompu/ngCluster(HPCC)

MicroLoanLevel

StressTest

EDGARBondCUSIP

RMBSBonds

ABSNetMort.

Liens

OtherCredit

4

Page 5: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Combined Matching and Learning Algorithms Success Rate

I Average match rates in the literature are usually around 30% or less.

I DataQuick and ABSNet Merge (2000 - 2013): For zip codes that arepresent in DQ (90% coverage in U.S.) overall match is 87%, 97% forCalifornia.

I DataQuick and HMDA Merge (2005 - 2013): Overall match of 60%,90% for California.

I LPS and DQ merge is currently underway.

I Exploit network structure of lenders and subsidiaries to enhancematches.

5

Page 6: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Why does this Matter?

I Challenges of sampling and population inference using existing data sets.

I Example 1: Inference concerning the competitive structure of the mortgagemarket.

I Example 2: Inference concerning house price dynamics and mortgageperformance.

6

Page 7: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Herfindahls using only HMDA: Market Appears Competitive

Stanton et al. (2014)

25°N

45°N

100°W 80°W

0.035

0.043

0.051

0.058

0.064

0.073

0.082

0.093

0.104

0.117

0.134

0.157

0.185

0.225

0.277

0.367

0.462

0.604

0.850

1.000

7

Page 8: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Mortgage Market Loan-Flow Networks

1.3 Aggregator Subsidiary: Correspondent, wholesale, retail

1.4 Ind./Affil.Dep.

1.5 Ind. MC1.6 Ind.Broker

1.1 HoldingCo. Sec.. Shelf

1.2 HoldingCo. Sec. Shelf

1.0 Bank/ThriftHolding Company

2.0 IBank/Ind.Holding Company

2.1 HoldingCo. Sec Shelf

2.2 HoldingCo. Sec Shelf

2.3 Aggregator Subsidiary: Correspondent, wholesale

2.4 Ind. Dep. 2.5 Ind. MC2.6 Ind.Broker

Supply Chain

Network

8

Page 9: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Example: Trees – 2006 Mortgage Originations

WEPCO FCU

WESTWORKS MORTGAGE

WHITE SANDS FCU

WEST TEXAS STATE BANK

VVV MANAGEMENT CORP

VNB NEW YORK CORP

WESTERN MORTGAGE

BNC MORTGAGE INC.

INDYMAC BANCORP INC

H&R BLOCK INC

WAUWATOSA SAVINGS BANK

WOODFOREST NATIONAL MORTGAGE

TAYLOR, BEAN & WHITAKER

WRENTHAM CP BANK

NEW CENTURY

LUMINENT MORTGAGE

VOLUNTEER FED L S&L

CAPSTEAD MORTGAGE CO

WASHINGTON MUTUAL CAP MARKETS

CITIGROUP INC

WOODS GUARANTEE SAVINGS

WESTERN CITIES FINANCIAL

WEICHERT FINANCIAL C

COUNTRYWIDE FINANCIAL

GENERAL MOTORS CORPORATION

WESTERN CU

WESTERN PLAINS MORTGAGE

WEYERHAEUSER EMPLOYEE CU

WESTERN SIERRA BANK

WEST−AIR COMM FCU

WHITMAN MORTGAGE

YANKEE FARM CREDIT

WASHINGTION MUTUAL BANK

9

Page 10: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Example: Networks – 2006 Mortgage Originations

WEPCO FCU

WESTWORKS MORTGAGE

WHITE SANDS FCU

WEST TEXAS STATE BANK

VVV MANAGEMENT CORP

VNB NEW YORK CORP

WESTERN MORTGAGE

BNC MORTGAGE INC.

INDYMAC BANCORP INC

H&R BLOCK INC

WAUWATOSA SAVINGS BANK

WOODFOREST NATIONAL MORTGAGE

TAYLOR, BEAN & WHITAKER

WRENTHAM CP BANK

NEW CENTURY

LUMINENT MORTGAGE

VOLUNTEER FED L S&L

CAPSTEAD MORTGAGE CO

WASHINGTON MUTUAL CAP MARKETS

CITIGROUP INC

WOODS GUARANTEE SAVINGS

WESTERN CITIES FINANCIAL

WEICHERT FINANCIAL C

COUNTRYWIDE FINANCIAL

GENERAL MOTORS CORPORATION

WESTERN CU

WESTERN PLAINS MORTGAGE

WEYERHAEUSER EMPLOYEE CU

WESTERN SIERRA BANK

WEST−AIR COMM FCU

WHITMAN MORTGAGE

YANKEE FARM CREDIT

WASHINGTION MUTUAL BANK

10

Page 11: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

“Mortgage Loan-Flow Networks and Financial Norms,” Stanton,

Walden, and Wallace (2015)I Develop an equilibrium model of intermediaries that share risk in a

financial network.I Key properties:

• Participation in network is voluntary (assume pairwise stability as inJackson and Wolinsky, 1996)

• Agents decide whether to invest in quality• Financial norms represented by investment decisions• Model simple enough to allow for numerical approximation of equilibrium

in mid- large-sized networks.

I Three general implications:1. Network structure influences financial norms – both direct and indirect

counterparties.2. Heterogeneous financial norms often coexist in the network.3. Network proximity is related to financial norms.

11

Page 12: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Key Results

I Decisions to share risk and to invest in screening are both non-monotonic incapital requirements (analytic).

I Heterogeneous screening (quality) policies arise naturally within lendingnetworks, leading to clusters of high and low quality lending (analytic).

I High quality lenders tend to have more connections and these tend to be ofhigher quality (simulation).

I Ex post, a sub-network of concentrated, low quality lending channels isobservable in the 2006 U.S. mortgage market (empirical).

I Systemically, pivotal firms can be forecast (empirical and analytical).

12

Page 13: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Pivotal Firm Linkages

−1 −0.5 0 0.5 1−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

13

Page 14: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

No Single Data Set Links Lenders, House Prices with

Time-Matched House Characteristics and Mortgage IDs

DataQuick ModifiedDataQuick

Corelogic Trulia

TransactionPrices

Yes Yes Yes Yes

HouseCharacteristics

Static UpdatedAnnually

Static Static

LienRecordingHistory

Yes Yes No No

Foreclosure Yes Yes Yes YesRecordedMortgageOriginator

Yes Yes No No

MortgageID No No No NoHousePriceIndices

RepeatSales/Hedonic

DynamicHedonic

RepeatSales RepeatSales

14

Page 15: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Shortcomings of repeat-sales indices

I Small sample size: Ignores houses that sell just once during sampleperiod.

I Changing Quality/Quantity: Assumes house characteristics remainconstant (and that relative prices of each characteristic remainconstant).• Adding a new bedroom will make us think prices of unaltered houses

have gone up.I Volatility: Volatility of index will underestimate volatility of individual

house prices.• Construction of index automatically induces smoothing.• Even ignoring this, index is a (somewhat) diversified portfolio, not an

individual house.

I The first-differencing required in repeat-sales estimators mechanicallyinduces serial correlation depending on the timing of sales.

I Sample-selection bias: Are houses that sell representative of housesthat don’t?

15

Page 16: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Problems with repeat-sales indices 1Most houses don’t sell very often

I A house is only included if it sells more than once, so we throw out ahuge fraction of house sales.

I In San Francisco between 2003–2012,• 24,342 houses sold at least once.• 20,778 (85.4%) sold only once, so are discarded.• Index considers only 3,564 houses (14.6%)

I In Los Angeles between 2003–2012,• 236,406 houses sold at least once.• 194,375 (82.2%) sold only once, so are discarded.• Index considers only 42,031 houses (17.8%)

16

Page 17: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Problems with repeat-sales indices 2Changes in property characteristics

I In San Francisco, 2502 Leavenworth sold in 2003 and 2008.• In 2003:

F Square footage = 1,752F Total rooms = 5F Bathrooms = 1F Price = $1,503,000

• In 2008:F Square footage = 2,913F Total rooms = 8F Bathrooms = 3F Price = $5,500,000

I Changes in size (or quality) are wrongly counted as “returns”

17

Page 18: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Problems with Repeat-Sales indices 3The index is not a house price!

I For mortgage valuation, stress testing, etc., we need the distribution offuture house prices.

I In repeat-sales methodology, each period’s index growth is aconstant.• No specification of inter-period dynamics.

I Even if we assume house prices follow (say) geometric Brownianmotion, we can’t just use the volatility of the index.• Index levels are estimated (though standard errors are never shown).• Construction of index automatically induces smoothing.• Volatility of index will underestimate volatility of individual house prices.• Even ignoring this, index is a (somewhat) diversified portfolio, not an

individual house.I We also can’t use index properties to test for (e.g.) serial correlation in

house prices.• Estimation procedure mechanically induces serial correlation in index.

F Even when there is none in individual house prices.

18

Page 19: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

“A New Dynamic House-Price Index for Mortgage Valuation and

Stress Testing,” Stanton and Wallace, 2015

I Objective to account for:• Dynamics flexibly.• Property characteristics.• Macroeconomic variables.• All sales, not just repeats.• Unobserved heterogeneity across properties.

I Purpose of the index• Develop suitable dynamic house price index for applications in mortgage

valuation.• Develop suitable index for DFAST nine-quarter-ahead planning horizon –

incorporate macro fundamentals directly into index.

19

Page 20: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

New IndexWrite log-price, yi ,t , as

yi ,t =

house price index︷ ︸︸ ︷Abxi ,t + Bbξt + αb,t + µi + εi ,t ,

= X tβb + αb,t + µi + εi ,t ,

αb,t = ραb,t−1 + ηt , where

εi ,t ∼ i.i.d. N(0,σ2ε ),

µi ∼ i.i.d. N(0,σ2µ),

ηi ,t ∼ i.i.d. N(0,σ2η).

I xi ,t = home hedonics (beds, size, lot size).I ξt = macro fundamentals (interest rates, population, unemployment).I µi = house-specific random effect (due to unobservable differences).I αb,t = unexplained (and unobserved) portion of the index.

20

Page 21: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Estimation

I Model can be written as a standard, linear state-space model if weaugment the state, st , to include µi :

st =

αb,t

β1,t

β2,t...

βk ,t

µ1,t

µ2,t...

µI,t

αb,t

β1

β2...βk

µ1

µ2...µI

=

α+ ραb,t−1 + ηt

β1,t−1

β2,t−1...

βk ,t−1

µ1,t−1

µ2,t−1...

µI,t−1

.

21

Page 22: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Estimation

I In principle, we could estimate the model’s parameters by maximizingthe likelihood function for this state space model using standardKalman Filter (Kalman and Bucy, 1961).

I We don’t do it this way.• Dimensionality of problem is very high.• Lots of missing data.• Want to be able to extend to even more general settings.

F E.g., more general distributional assumptions.

I Instead we use Markov Chain Monte Carlo (MCMC) Bayesianmethods (e.g., Johannes and Polson, 2009).• Details in paper.

22

Page 23: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Alameda County: Comparison WRS Indices and Dynamic Index

23

Page 24: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

San Diego County: Comparison WRS indices and Dynamic Index

24

Page 25: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

San Francisco County: WRS indices and Dynamic Index

25

Page 26: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Alameda Zip = 94618: Comparison Zillow Medians, Medians and

Model

26

Page 27: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

San Diego Zip = 92129: Comparison Zillow Medians, Medians and

Model

27

Page 28: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

San Francisco, Zip = 94111: Comparison Zillow Medians,

Medians, and Model

28

Page 29: *0.5in @let@token thebibliography[1]![1][]== Mortgage and … ·  · 2016-01-29Mortgage and Housing Microdata: Database Construction and Challenges for Statistical Inference

Policy Challenge Network Structures House Price Indices Conclusions

Conclusions

I Without a mortgage loan ID integrating existing data sets is a significanthurdle – regulatory oversight is challenging

I Random sampling from non-integrated loan-level mortgage data sets maylead to biased estimates of the true population distributions of the contractspace, the geographic coverage of the contracts, and industrial organizationof the industry.

I Lack of dynamic house price data sets present significant challenges fordisentangling price from quality changes in the housing market – especially achallenge in coastal states with high real option values.

• Currently, market and regulators rely on WRS indices due to lack of data.• Use of a fundamentally static price indices leads to significant problems

in mortgage valuation and stress testing.

I New large data set matching and learning algorithms hold considerablepromise in addressing both the mortgage and the house price datachallenges for economists and regulators.

29