Big Data in Real Estate: A Twitter Case Study
Clifford A. Lipscomb, Ph.D., MRICS
Vice Chairman and Co-Managing Director
Greenfield Advisors, Inc.
C2ER – Denver, Colorado
June 12-16, 2017
1
Acknowledgements
*Co-authors:
Kimberly Winson-Geideman
Andy Krause
Nick Evangelopoulos
*Real Estate Analysis in the Information Age: Techniques for Big Data and Statistical Modeling (Taylor & Francis/Routledge)
2
Traditional Real Estate Data
• Sales Transactions
• Micro-Property Level Data
  • Property characteristics, listings, TAV, longitude/latitude
• Macro-Property Level Data
  • HPI
  • Demographic/Economic
3
Traditional Real Estate Data – Examples
• Tax assessor data
• MLS listings data
• FHFA House Price Index (HPI)
• Income distribution by Census tract
4
“Big” Real Estate Data
• Extremely large datasets that cannot be processed or analyzed without significantly more computing power or new tools
• Big data is driven by everyday business activities and decisions
• Data are collected more frequently, potentially at the sub-second level
5
Big Real Estate Data (cont.)
• Three Vs – volume, velocity, variety
  • Big data is characterized by its immense size (volume), its constant streaming (velocity), and the many forms it takes (variety).
• Structured or unstructured
6
Structured v. Unstructured Data
• Unstructured Twitter Data
• Structured Twitter Data
7
Big Real Estate Data - examples
• Social media – Facebook, Twitter, Instagram
• Search engine data – Google Trends
• Real estate search engines – Realtor.com, Zillow, Redfin, Loopnet, RealtyTrac, and more
• Data aggregators – ATTOM Data Solutions, CoreLogic, Black Knight
8
Twitter Case Study – Seattle, Washington
• Explores two relationships:
  1. Crime reports and housing prices
  2. House price changes and citizen sentiment
• Data Used:
  1. Traditional real estate data – sales data from King County, WA (retrieved from King County Assessor)
  2. Peripheral (Big) data – Twitter data (downloaded using Twitter’s API)
9
Data Cleaning Process
• House sale data narrowed to sales within the City of Seattle
• Twitter data gathered based on 51 Seattle Police Beats
10
Seattle Police Beat Map
11
Modeling and Analysis
• Hedonic price model (OLS) – base model
  SalePrice = f(chars, location, nhood factors) + ε
• Two forms
  1. Aspatial model
  2. Spatial model – designed to check for spatial autocorrelation
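As a rough illustration of the base model above, the following sketch fits a hedonic regression by ordinary least squares on synthetic data. It is not the authors' code: the two regressors, the simulated coefficients, and the choice of log price as the dependent variable are all assumptions made for the example, and `fit_hedonic_ols` is a hypothetical helper name.

```python
import numpy as np

def fit_hedonic_ols(X, y):
    """Fit y = b0 + X @ b + e by ordinary least squares.

    Returns (coefficients with the intercept first, residuals, R-squared).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, residuals, 1.0 - ss_res / ss_tot

# Synthetic illustration only: made-up homes, NOT the King County sales data.
rng = np.random.default_rng(0)
n = 500
home_sqft = rng.uniform(600, 4000, n)   # property characteristic
waterfront = rng.integers(0, 2, n)      # location dummy (0 or 1)
log_price = (12.0 + 0.0003 * home_sqft + 0.5 * waterfront
             + rng.normal(0.0, 0.1, n))  # assumed log-price specification

beta, resid, r2 = fit_hedonic_ols(
    np.column_stack([home_sqft, waterfront]), log_price
)
print("intercept, sqft, waterfront:", beta)
print("R-squared:", r2)
```

In practice a statistics package would also report standard errors, t values, and AIC; the sketch keeps only the point estimates and R-squared to show the mechanics.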
12
Base Model Results
13
Variable                      Estimate       Std. Error    t value        Pr(>|t|)
Intercept                     8.490406913    0.525465268   16.15788412    7.95E-58
Townhouse                     -0.101920896   0.011174443   -9.120892815   9.26E-20
Lot Size (sq ft)              -1.02E-05      1.01E-06      -10.1595524    4.19E-24
Home Size (sq ft)             0.000267764    8.73E-06      30.66486801    1.78E-195
Basement Size (sq ft)         -0.000106377   1.03E-05      -10.31576837   8.51E-25
Attached Garage Size (sq ft)  -0.000168928   2.22E-05      -7.597485443   3.36E-14
Deck Size (sq ft)             9.85E-05       1.66E-05      5.916721641    3.42E-09
Building Quality              0.204892437    0.00461437    44.40312313    0
Condition = 3                 0.092331334    0.026218839   3.521564506    0.000431368
Condition = 4                 0.144597114    0.026548528   5.446520862    5.29E-08
Condition = 5                 0.158582069    0.027755814   5.713472101    1.15E-08
Effective Age                 0.002500578    0.000130316   19.18856588    2.73E-80
Baths                         0.035425066    0.006272717   5.647483687    1.68E-08
Bedrooms                      -0.013670654   0.004203397   -3.25228721    0.001149487
Traffic Noise: Moderate       -0.02668539    0.009266726   -2.879699715   0.003990968
Traffic Noise: Severe         -0.058541486   0.010154198   -5.765249506   8.45E-09
View of Cascades              0.062800723    0.016796975   3.738811539    0.000186165
View of City Skyline          0.257997537    0.07526905    3.427670954    0.00061181
View of Olympics              0.078245028    0.017347408   4.510473653    6.56E-06
Other View                    0.019784136    0.011393782   1.736397652    0.082531545
Waterfront                    0.521253357    0.043127046   12.08646106    2.41E-33
Sales Date                    0.000158876    3.08E-05      5.150160958    2.66E-07

Diagnostics
R-squared                     0.66778926
Std. error                    0.259759736
F-statistic                   775.8173452
AIC                           1177.034876
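To help read the table, the sketch below recomputes t values (estimate divided by standard error) for a few rows and converts dummy-variable coefficients into approximate percentage price effects. The percentage conversion assumes the dependent variable is the natural log of sale price; the slides do not state this, although the size of the intercept and coefficients is consistent with it. The helper names are hypothetical.

```python
import math

# Estimates and standard errors copied from the Base Model Results table.
base_model = {
    "Townhouse": (-0.101920896, 0.011174443),
    "View of City Skyline": (0.257997537, 0.07526905),
    "Waterfront": (0.521253357, 0.043127046),
}

def t_value(estimate, std_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return estimate / std_error

def percent_effect(estimate):
    """Percent price change for a one-unit change in the regressor.

    ASSUMPTION: the model is semi-log (log price on the left-hand side),
    which the slides do not confirm.
    """
    return (math.exp(estimate) - 1.0) * 100.0

for name, (est, se) in base_model.items():
    print(f"{name}: t = {t_value(est, se):.2f}, "
          f"effect = {percent_effect(est):+.1f}%")
```

Under that assumption, a waterfront location would be associated with a markedly higher price than the raw coefficient of 0.52 suggests at first glance, because the exponential transformation is not linear for large coefficients.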
Aspatial (Base Model) vs. Spatial Model
14
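The slides do not say how spatial autocorrelation was tested. One common check, sketched here under that caveat, is Moran's I computed on the base model's residuals: values near zero suggest no spatial pattern, while clearly positive values suggest that nearby sales have similar residuals. The binary k-nearest-neighbor weights, the value of `k`, and the synthetic coordinates are all assumptions for illustration.

```python
import numpy as np

def morans_i(values, coords, k=5):
    """Moran's I using binary k-nearest-neighbor spatial weights."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(values)
    # Pairwise Euclidean distances between observation locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    # w[i, j] = 1 if j is among the k nearest neighbors of i.
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.arange(n)[:, None], nearest] = 1.0
    z = values - values.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Synthetic illustration only: random locations in a unit square.
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 1.0, size=(300, 2))
trended = coords[:, 0] + rng.normal(0.0, 0.05, 300)  # varies smoothly in space
noise = rng.normal(0.0, 1.0, 300)                    # no spatial structure

i_trended = morans_i(trended, coords)
i_noise = morans_i(noise, coords)
print("Moran's I, spatially trended residuals:", i_trended)
print("Moran's I, random residuals:", i_noise)
```

A residual series that behaves like `trended` would be a reason to prefer a spatial specification over the aspatial base model.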
Crime and House Prices
• The Seattle Police Department’s “Tweets by Beat” program tweets reported crimes along with their location and crime type.
  • 5 major crime types – violent, property, behavioral, traffic, and all others
• What is the impact of localized crime on house prices?
• Crime counts are added as a variable to the hedonic model to explain price variation
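The step of turning crime tweets into model variables can be sketched as follows: count reports per police beat and crime type, then attach those counts to each sale by beat. This is a minimal sketch, not the authors' pipeline; the record layout, the beat labels, and the function names are hypothetical, and the tweets are assumed to be already parsed into (beat, crime type) pairs.

```python
from collections import Counter

# The five categories named in the slides; "other" stands for "all others".
CRIME_TYPES = ("violent", "property", "behavioral", "traffic", "other")

def count_crimes_by_beat(reports):
    """Map each beat to a Counter of crime types from (beat, type) pairs."""
    counts = {}
    for beat, crime_type in reports:
        if crime_type not in CRIME_TYPES:
            crime_type = "other"  # fold unrecognized labels into "other"
        counts.setdefault(beat, Counter())[crime_type] += 1
    return counts

def attach_crime_counts(sales, counts):
    """Return copies of the sale records with one count column per type."""
    enriched = []
    for sale in sales:
        beat_counts = counts.get(sale["beat"], Counter())
        row = dict(sale)
        for crime_type in CRIME_TYPES:
            # A Counter returns 0 for types with no reports in that beat.
            row[f"crime_{crime_type}"] = beat_counts[crime_type]
        enriched.append(row)
    return enriched

# Hypothetical records for illustration only.
reports = [("B1", "property"), ("B1", "property"),
           ("B1", "violent"), ("B2", "noise complaint")]
sales = [{"sale_id": 1, "beat": "B1", "price": 500000},
         {"sale_id": 2, "beat": "B3", "price": 650000}]

counts = count_crimes_by_beat(reports)
enriched = attach_crime_counts(sales, counts)
print(enriched[0])
```

The resulting `crime_*` columns could then enter the hedonic regression alongside the property characteristics.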
15
Results – Crime Types and House Prices
16
Sentiment and House Prices
• Citizen sentiment is defined as the sentiment of Twitter users within Seattle’s city limits.
• Examined the relationship between local sentiment and local house price movements
• Using textual analysis, a sentiment score is computed from the positivity or negativity of the text within tweets.
  • Area Sentiment = Positive words – Negative words
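The Area Sentiment formula above can be sketched as a simple word count. The tiny word lists here are placeholders invented for the example; the slides do not say which sentiment lexicon or tokenizer the study used, and the function names are hypothetical.

```python
import re

# Placeholder lexicons for illustration; a real analysis would use a
# published sentiment word list.
POSITIVE_WORDS = {"good", "great", "love", "happy", "beautiful"}
NEGATIVE_WORDS = {"bad", "awful", "hate", "sad", "terrible"}

def count_sentiment_words(text):
    """Return (positive word count, negative word count) for one tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    return positive, negative

def area_sentiment(tweets):
    """Area Sentiment = positive words minus negative words over all tweets."""
    positive_total = 0
    negative_total = 0
    for tweet in tweets:
        positive, negative = count_sentiment_words(tweet)
        positive_total += positive
        negative_total += negative
    return positive_total - negative_total

# Made-up tweets for one area.
sample_tweets = ["I love this beautiful park", "Traffic is awful today"]
score = area_sentiment(sample_tweets)
print(score)
```

Because the raw difference grows with the number of tweets, comparing areas of different activity levels would likely require some normalization, which the slides do not describe.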
17
Seattle Sentiment – 2016
18
Price Movement vs. Sentiment
19
Case Study Conclusions
• Weak relationships were found both between crime and house prices and between local sentiment and house price movement
  • Other factors responsible for house price movement?
  • Inadequate approach to measure these relationships?
20
The Future of Big Data
• Big data will continue to influence the way real estate analyses are performed.
• Big data will improve efficiency for buyers, sellers, practitioners, and researchers.
• Big data frameworks will be further entrenched in other areas (e.g. federal statistics).
21
Clifford A. Lipscomb, Ph.D., MRICS
Vice Chairman and Co-Managing Director
Greenfield Advisors, Inc.
106 N. Bartow Street | Cartersville, GA 30120 | USA
Office: 770.334.3952
Cell: 770.289.1923
E-mail: [email protected]
Website: www.greenfieldadvisors.com
22