Electronic copy available at: https://ssrn.com/abstract=2878903
THE EFFECTS OF MOBILE APPS ON SHOPPER PURCHASES AND PRODUCT RETURNS
Unnati Narang and Venkatesh Shankar*
December 1, 2016
* Unnati Narang ([email protected]) is a PhD student in Business Administration-Marketing at Mays Business School, Texas A&M University. Venkatesh Shankar ([email protected]) is Professor of Marketing and Coleman Chair in Marketing and Director of Research, Center for Retailing Studies, Mays Business School, Texas A&M University. We thank participants at the 2016 Theory and Practice in Marketing (TPM) Conference and the 2016 Professors’ Institute meeting of Marketing EDGE for valuable comments.
Do mobile apps influence shopper purchases and product returns? We model the effects of app adoption in the context of a large omnichannel retailer with 32 million shoppers. We leverage the launch of a mobile app by the retailer and use a difference-in-differences approach to identify and estimate the differences between app adopters and non-adopters in shopping outcomes, such as the incidence and monetary value of purchases and product returns. We find that app adopters buy 21% more often but spend 12% less per purchase occasion and return 73% more often than non-adopters in the month after adoption. Overall, app adoption results in a 24% increase in net monetary value of purchases. Our findings are robust to alternative explanations and measures. Furthermore, our analysis of the drivers of app use reveals that exposure to offers and rewards through the app plays a key role in driving shopping outcomes. Surprisingly, the number of unique app features accessed by the shopper has an inverted U-shaped relationship with shopping outcomes, suggesting managerial caution against “all-in-one” app designs.
Keywords: difference-in-differences, exponential Type II Tobit, mobile marketing, mobile apps, quasi-experiments
1. Introduction
In recent years, the penetration of mobile devices has reached unprecedented levels. By January
2016, 79.1% of the United States’ (U.S.) population or 198.5 million people owned a smartphone
(comScore 2016a). By the end of 2016, over two billion people worldwide will be smartphone
users (eMarketer 2014) and by 2020, more than 70% of the world population will own a
smartphone (Ericsson Mobility Report 2016).
Mobile devices play a unique role in influencing shoppers along and beyond their paths to
purchase. Mobile devices are interactive, engaging, portable, wireless, location-specific, and
personal. As a result, they are uniquely positioned to influence shoppers at various stages of
the shopping process – need recognition, information search, alternative evaluation, purchase,
and post purchase (Shankar and Balasubramanian 2009).
Little wonder that mobile marketing is becoming a strategic priority for firms. U.S. firms spent
over $28 billion on mobile advertising in 2015 and are projected to double this level by 2018
(eMarketer 2016). Chief Marketing Officers (CMOs) of leading firms already allocate up to 20%
of their budget to mobile (Forrester 2016). Terry Lundgren, Macy’s CEO, views mobile as the
starting point for shopping when he says, “shoppers are starting the journey with their phone,
doing their research. Then they might buy in the store or they’ll buy at Macys.com or
Bloomingdales.com” (Peterson 2015). Mobile devices have changed the way people shop, giving
rise to an emerging area of mobile shopper marketing and revolutionizing retail (Shankar et al.
2016). More than 80% of U.S. shoppers use a mobile device to shop even within a store (Google
M/A/R/C Study 2013). Nearly 70% of Amazon’s customers used mobile to shop in the 2015
holiday season (Eadicicco 2015).
Mobile apps are increasingly dominating mobile device use. Mobile apps account for 87% of
mobile usage, which constitutes the bulk of digital media time (comScore 2016b). By 2015, there
were over 250 billion app downloads from the App Store and Google Play (Sims 2015). About
20% of all Starbucks transactions originated from its “order and pay” app (Forbes 2015).
Do mobile apps influence shopper behavior? Mobile apps offer informational (e.g., product
and store information) and experiential (e.g., offer, loyalty program reward redemption) benefits.
These benefits may lead shoppers to purchase more often and spend more money. However,
while a mobile app can induce purchases through the app or mobile web, does it increase overall
purchases across all the channels, including brick-and-mortar and online? Furthermore, a mobile
app can prompt a shopper to act and make a purchase, but such an action can also result in post-
purchase regret, leading to higher product returns. Therefore, the net effect of mobile apps on the
monetary value of purchases is unclear.
Furthermore, app features, such as product search, store check-in, loyalty programs, and
promotional offers, may have specific effects on shopper purchases and returns and help explain
the effects of a mobile app on shopping outcomes. The use of a greater number of app features
may lead to more purchases and even product returns.
Despite the widespread use and the potentially significant effects of mobile apps and their
features, the consequences of mobile apps and the effects of app features have been
underexplored. A few studies (e.g., Kim et al. 2015; Gill et al. 2016) have considered the effects
of mobile apps on loyalty points accrued, purchase intent, website visits, or aggregate purchase
amounts. We extend prior research by isolating the effects of mobile app adoption on a richer,
managerially important set of outcomes, such as individual purchase incidence and amount and
return incidence and amount. Importantly, we explain these effects through app feature-related
mechanisms. Specifically, we address three research questions:
1. Does mobile app adoption lead to higher or lower incidence and monetary value of purchases and returns?
2. What are the sizes of differences in purchases and returns between app adopters and non-adopters?
3. What are the effects of the number and type of app features on shopping outcomes, and how do they help explain the overall effects of a mobile app on shopping outcomes?
We address our research questions using a unique dataset from a large omnichannel retailer
of video games, consumer electronics and wireless services. Teasing out the effects of mobile
apps on purchase outcomes is complex. A major challenge is endogeneity and self-selection
potentially confounding the effects of variables affecting both app adoption and shopping
outcomes. We tackle this challenge in two ways: (a) by combining the difference-in-differences
method with propensity score matching and Heckman selection correction, and (b) by
carrying out a series of robustness tests to rule out alternative explanations.
Our results show that app adopters buy 21% more often but spend 12% less per purchase
occasion and return 73% more often than non-adopters in the month after adoption. Overall, app
adoption results in a 24% increase in net monetary value of purchases. Surprisingly, the number
of unique app features accessed by the shopper has an inverted U-shaped relationship with
shopping outcomes, suggesting managerial caution against “all-in-one” app designs.
Our research contributes to the mobile marketing and omnichannel marketing literatures in at
least three ways. First, we expand the scope of mobile marketing literature by examining the
impact of mobile apps on both purchases and returns across channels. To our knowledge, no
other mobile marketing study has examined the net monetary value by accounting for returns.
Second, unlike most prior studies that focus on associations between mobile interventions (e.g.,
coupons) and purchase outcomes, we isolate the effect of mobile app adoption on purchases and
returns. Finally, we examine the effects of a comprehensive set of mobile app features that
enable managers to improve resource allocation to the conceptualization, development, launch,
and maintenance of mobile apps.
2. Related Literature and Framework for Empirical Analysis
Mobile devices influence shoppers both in- and out-of-store by offering convenient and
interactive anytime-anywhere access to relevant information (Shankar et al. 2016). Mobile apps
may affect purchases in two major ways. First, mobile apps can provide information benefits at
the right time and place to shoppers. Such benefits include product information, product reviews,
and store location information (Danaher et al. 2015; Dubé et al. 2015; Fong et al. 2015). Second,
mobile apps can offer experiential/interactive benefits through loyalty program use, notifications,
offers, and store check-ins (Shankar and Balasubramanian 2009). For example, Starbucks offers
consumers location-based promotions via its mobile app for declaring their loyalty on social
networks and status badges for store check-ins and a mobile pay option (Andrews et al. 2016a).
Two streams of research are relevant to our research questions. First, the literature on mobile
apps has focused on the effects of apps on a few outcomes. Mobile app use improves brand
attitude and purchase intention (Bellman et al. 2011). A brand’s mobile app also promotes visits
to its website (Xu et al. 2014), enhances loyalty points accrued (Kim et al. 2015), and can
influence purchase probability (Dinner et al. 2015). Furthermore, the use of a mobile app can
increase shoppers’ spending in the business-to-consumer (B2C) (Einav et al. 2014) as well as the
business-to-business (B2B) (Gill et al. 2016) context. Informational apps are more effective in
driving purchase intention than experiential apps (Bellman et al. 2011), and app features, such as
information lookup, significantly affect loyalty points accrual (Kim et al. 2015).
Second, studies on mobile device adoption have concentrated on its effects on purchase
incidence or monetary value. Wang et al. (2015) examine changes in shopper spending after
mobile device adoption and find that purchase incidence and monetary value of purchases
increase. In contrast, Lee et al. (2016) find that the monetary value of each purchase is lower for
shoppers who transact more using mobile than web. Xu et al. (2016) study the effect of tablet
adoption on commerce through smartphones and computers in the online retail context and
conclude that commerce through tablets substitutes desktop commerce but complements
smartphone commerce.
Our study complements and extends these research streams as shown in Table 1. We study
the effects of app adoption on a variety of shopping outcomes: purchase incidence, purchase
amount, return incidence, and return amount. In addition, we examine the effects of a
comprehensive set of app features on shopping outcomes.
< Table 1 about here >
Based on the evidence from these related research streams, we develop a conceptual
framework delineating the drivers of shopper decisions and outcomes at various shopping stages.
It appears in Figure 1. In this framework, some existing shoppers of the firm adopt the app, while
others do not. The app adoption decision depends on shopper demographics, past shopping
behavior, and the data connectivity environment. Both adopters and non-adopters of the mobile
app make purchases. In addition to app adoption, recency of last purchase, income and customer
tenure drive the incidence and monetary value of purchases. Some shoppers return some of the
purchases made. The recency of last purchase, distance to the nearest store, number of stores in
the shopper’s zipcode, and order size determine the incidence and monetary value of returns.
< Figure 1 about here >
3. Data and Research Setting
We collect data from a large US-based retailer of video games, consumer electronics and
wireless services. Our data span July 2014-June 2015. In addition to transactions-related data
from the retailer’s 4,175 stores across the U.S. and ecommerce website, we have access to data
on mobile app usage of over 32 million customers and members of the retailer’s loyalty program.
The loyalty program contributes to nearly 75 percent of total transactions, reflecting the retailer’s
overall customer base. The retailer’s primary channel is its store network; only a small
proportion of sales transactions take place through its ecommerce site. We have data on
shoppers’ transactions. From these data, we identify the relevant outcomes to map shopper
behavior. Purchase and return incidence (whether a shopper makes a purchase or return) and the
monetary value of purchases and returns are our key outcome variables. The key variables, their
operationalization and descriptive statistics appear in Table 2. We supplement these data with
publicly available data on region-specific data connectivity information (e.g., number of wireless
providers, data speeds) from the US Federal Communications Commission (FCC).
< Table 2 about here >
Our focal independent variable is app adoption. The retailer launched its app in July 2014
without any targeted campaign. A subset of shoppers adopted the app over time. The purpose of
the app is to allow shoppers to browse the retailer’s catalog of products, get exposure to deals
and offers, order online, or locate store information to buy offline. The app allows shoppers to
learn about the retailer’s stores, including nearby locations, opening hours, phone numbers, and
driving directions. The app does not offer in-app purchases. Web Appendix A provides
screenshots from the app.
The mobile app data are organized at the shopper/app session level. A new session is
recorded when a user first starts the app or loads the app after not loading it in the previous 15
minutes. For each shopper/app session, the data contain a random session ID and the app features
accessed by the shopper. App features capture in-app activities during the session (e.g., browsing
product catalog, clicking offers and checking reward points).
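The session rule described above can be illustrated with a minimal sketch. It assumes one shopper's app-load timestamps are available as a list; the function name, the sample timestamps, and the treatment of a gap of exactly 15 minutes (kept in the same session) are illustrative assumptions, not details taken from the retailer's logging system.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=15)  # inactivity threshold stated in the text


def assign_sessions(load_times):
    """Return a session index for each app load of a single shopper.

    A new session starts at the first load, or when more than SESSION_GAP
    has elapsed since the previous load (boundary handling is an assumption).
    """
    session_ids = []
    current = -1
    previous = None
    for ts in sorted(load_times):
        if previous is None or ts - previous > SESSION_GAP:
            current += 1
        session_ids.append(current)
        previous = ts
    return session_ids


# Hypothetical load times for one shopper
loads = [
    datetime(2014, 12, 1, 9, 0),
    datetime(2014, 12, 1, 9, 10),  # 10 minutes after the previous load: same session
    datetime(2014, 12, 1, 9, 40),  # 30-minute gap: new session
    datetime(2014, 12, 1, 9, 50),  # same session
    datetime(2014, 12, 2, 8, 0),   # next day: new session
]
print(assign_sessions(loads))
```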
For the analysis, we divide our data into two periods: a calibration period and an estimation
period. We use the three-month period (August-October 2014) as the calibration period. The
rationale for setting aside this period is to compute shoppers’ past behavioral measures, such as
past spending to help identify similar shoppers for a valid comparison of app adopters and non-
adopters. We treat the months starting November 2014 as our estimation period, identifying and
using shoppers’ app adoption timing as cut-off points for estimating the effects of adoption.
4. Analyses
4.1. Relationship between App Adoption and Shopping Outcomes: Descriptive Analysis
We define app adopters as shoppers who started using the app for the first time during our data
period. Non-adopters are those who did not access the app even once during the study period.
Unlike most prior studies (e.g., Kim et al. 2015; Gill et al. 2016), our data allow us to uniquely
identify each shopper’s app adoption date. We draw random samples of app adopters from
different periods and compare their pre- and post-app adoption outcomes relative to app
non-adopters. Our main analysis reports results for app adopters from December 2014.¹
¹ We subsequently performed robustness checks using other samples (see Web Appendix B for details).
We draw a random sample of adopters and non-adopters with complete demographic
information and who have made at least one purchase in the calibration period (Xu et al. 2016).
Table 3 reports the mean statistics for 1,629 random app adopters who started using the app on
December 1, 2014 and for 7,956 non-adopters. A simple comparison of shopping outcomes
shows that the average monetary value of purchases increased 43.48% ($126.25 to $181.15 per
month) for app adopters, while it increased only by 25.93% ($46.89 to $59.05 per month) for
non-adopters one month before and after adoption (p < 0.001). However, the average monetary
value of returns for app adopters also increased by 96.09% ($9.46 to $18.55 per month)
compared to app non-adopters who experienced a marginal increase of 0.75% ($4.02 to $4.05
per month) in the same period (p < 0.001). Overall, the net monetary value increased by
39.22% for app adopters compared with 28.30% for non-adopters. It is notable that the
number of purchase transactions for the app adopters increased by over 60% relative to non-
adopters who experienced only a 16% increase. As a result, the monetary value per incidence
declined by 10.33% ($85.66 to $76.81 per month) for app adopters, while it increased for non-
adopters by 8.63% ($66.28 to $72 per month). Histograms depicting these model-free data
appear in Figure 2.
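The percentage changes in this paragraph can be checked directly from the reported monthly means. The short sketch below recomputes them and also forms a raw, unadjusted difference-in-differences in net monetary value (purchases minus returns). The variable and function names are illustrative, the results agree with the reported percentages only up to rounding of the published means, and the raw difference is a model-free quantity, not the matched and selection-corrected estimate reported later.

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# Monthly means reported in the text, in USD: (month before, month after)
MEANS = {
    "adopters": {"purchases": (126.25, 181.15), "returns": (9.46, 18.55)},
    "non_adopters": {"purchases": (46.89, 59.05), "returns": (4.02, 4.05)},
}


def net_value(group, period):
    """Purchases minus returns; period 0 is the month before, 1 the month after."""
    return MEANS[group]["purchases"][period] - MEANS[group]["returns"][period]


for group in MEANS:
    purchases = pct_change(*MEANS[group]["purchases"])
    returns = pct_change(*MEANS[group]["returns"])
    net = pct_change(net_value(group, 0), net_value(group, 1))
    print(f"{group}: purchases {purchases:.2f}%, returns {returns:.2f}%, net {net:.2f}%")

# Raw difference-in-differences: change for adopters minus change for non-adopters
raw_did = (net_value("adopters", 1) - net_value("adopters", 0)) - (
    net_value("non_adopters", 1) - net_value("non_adopters", 0)
)
print(f"raw difference-in-differences in net value: ${raw_did:.2f} per month")
```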
< Table 3 and Figure 2 about here >
4.2. Econometric Model: A Quasi-experimental Approach
To estimate the effect of mobile app adoption, the ideal approach would be to compare the
shopping outcomes when shoppers adopt the mobile app to the counterfactual, that is, to
outcomes when the same shoppers do not adopt the mobile app. However, because we do not
observe the counterfactual (a shopper cannot be both an app adopter and a non-adopter) and
because the treatment is not randomly assigned (a shopper self-selects into adopting the app), we
develop a quasi-experimental design to replicate the ideal experimental scenario under
reasonable assumptions (Campbell and Stanley 1963). Specifically, we employ a difference-in-
differences (DIFF-IN-DIFF) approach to compare pre- and post- adoption outcomes for app
adopters and similar non-adopters. After specifying the baseline difference-in-differences
regression model, in the next section, we outline our strategy to address the endogeneity of
treatment using Propensity Score Matching (Rosenbaum and Rubin 1983) and Heckman
correction procedures (Heckman 1979). We rule out several competing explanations for our
results in the robustness checks. The complete list of analyses is laid out in Table 4.
< Table 4 about here >
Baseline Difference-in-Differences Model
We adopt a difference-in-differences approach to compare the change in outcomes for the
app adopters one month before and one month after app adoption to the change in outcomes for
the non-adopters over the same time period.² Formally, our baseline difference-in-differences
model can be specified as a two-period linear regression model:
$$Y_{it} = \alpha_0 + \alpha_1 A_i + \alpha_2 P_t + \alpha_3 A_i P_t + \vartheta_{it} \tag{1}$$
where i is individual, t is month, Y is the outcome variable (number and monetary value of
purchases and returns), A is a dummy variable denoting treatment (1 if shopper i is an app
adopter and 0 otherwise), P is a dummy variable denoting the period (1 for the period after the
app has been downloaded and 0 otherwise), $\alpha$ is a coefficient vector, and $\vartheta$ is an error term. The
coefficient of $A_i P_t$ (TREAT * POST) identifies the treatment effect.
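To make the role of the interaction coefficient concrete, the following sketch fits Equation 1 by ordinary least squares on simulated two-period data and confirms that the estimated coefficient on TREAT * POST equals the difference-in-differences of the four cell means. The data-generating values, sample size, and helper names are hypothetical, and the small pure-Python solver stands in for a statistics package.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


random.seed(0)
TRUE_EFFECT = 30.0  # hypothetical treatment effect used only to simulate data
rows, y = [], []
for shopper in range(2000):
    a = 1 if shopper < 1000 else 0  # A_i: adopter dummy
    for p in (0, 1):                # P_t: post-adoption dummy
        rows.append([1.0, a, p, a * p])
        y.append(50 + 20 * a + 10 * p + TRUE_EFFECT * a * p + random.gauss(0, 5))

alpha = ols(rows, y)  # [alpha_0, alpha_1, alpha_2, alpha_3]


def cell_mean(a, p):
    values = [yi for r, yi in zip(rows, y) if r[1] == a and r[2] == p]
    return sum(values) / len(values)


did_from_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
print(round(alpha[3], 3), round(did_from_means, 3))
```

Because the two-by-two specification is saturated, the regression and the cell-mean calculation give the same number; the regression form becomes useful once covariates or the selection-correction term are added.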
The underlying identification strategy in the DIFF-IN-DIFF approach is that the change in
outcomes observed in the non-adopter group offers a good counterfactual for the change in
outcomes that would have been observed in the adopter group in the absence of app adoption.
The validity of this assumption relies on the (a) similarity between the app adopters and non-
adopters along their observed and unobserved characteristics (including, for example, common
trends in outcomes in the pre-treatment periods), and (b) absence of any idiosyncratic shock to
either group in the study period (e.g., no unique marketing promotions should have been sent to
one group and not to the other). In our case, assumption (b) holds since there were no unique
shocks to either group in the data period. In the absence of natural randomization, we ensure the
validity of assumption (a) by employing matching estimates and the Heckman two-step
correction process, carefully observing the balance between two groups through several checks.
² We subsequently also examined alternative time periods in our (a) comparisons of 15-, 45- and 60-day pre- and post- outcomes (Web Appendix B), and (b) robustness check of app use vs. non-use for extended periods (Table 8).
4.3. Endogeneity and Self-selection
In the absence of randomization in an observational study, endogeneity of treatment becomes a
major challenge in estimating the causal effects. Two sources of endogeneity exist in our setting.
First, omitted variables can affect both app adoption and shopping behavior. Consider customers
who are gaming and technology enthusiasts. It is possible that they are more interested in the
video game product category and therefore purchase gaming-related products. They may also be
spending more time on their mobile devices, including exploring the available apps. As a result,
their likelihood of adopting the app is higher. Similarly, it is possible that they read app reviews
and technology columns in the media and become more aware of app functionalities. In such
cases, both game purchases and app adoption are likely, but the effect may not necessarily be
causal. Second, mobile app usage and purchase transaction may occur together. Imagine a
customer using a mobile device while purchasing at a store or at a website. Due to the
simultaneous occurrence, it is difficult to tease out causality. This issue can result in endogeneity
from simultaneity (Wooldridge 2002).
Our quasi-experimental research design combined with a series of robustness and
falsification checks allows us to address the endogeneity concern.
4.3.1. Selection on Observables: Matching Estimates
Propensity score matching allows us to match app adopters and non-adopters on observed
demographic and behavioral covariates, while tackling the curse of dimensionality. Underlying
propensity score matching is the idea of conceptualizing “the observational data set as having
risen from a complex randomized experiment, where the rules used to assign the treatment
condition have been lost and must be reconstructed” (Rubin 2008; Guo and Fraser 2014).
We begin by calculating each shopper’s propensity score, which is defined as the shopper’s
probability of adopting the app. We do this using a binomial logit model.³ Next, we identify
non-adopters similar to adopters based on the estimated propensity scores to create a control group.
This approach is in line with Rosenbaum and Rubin’s approach to create a control group “that is
similar to a treated group with respect to the distribution of observed covariates” (Rosenbaum
and Rubin 1983). We match each app adopter to a non-adopter using 1:1 nearest neighbor
matching without replacement.⁴ Formally, if $P(X_i)$ is individual i’s propensity score, the treated
individual i is matched to the control individual j that minimizes $\lVert P(X_i) - P(X_j) \rVert$, creating
matched pairs closest to each other (Wangenheim and Bayón 2007; Huang et al. 2012).
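The matching step can be sketched as follows, assuming propensity scores have already been estimated. The greedy processing order (treated shoppers taken in the order given) and the shopper identifiers and scores are illustrative assumptions; the text does not state the order in which adopters are matched.

```python
def match_nearest_neighbor(treated_scores, control_scores):
    """Greedy 1:1 nearest-neighbor matching on propensity scores, without replacement.

    Both arguments map a shopper id to an estimated propensity score.
    Returns a dict {treated_id: matched_control_id}.
    """
    available = dict(control_scores)  # controls not yet used
    pairs = {}
    for t_id, t_score in treated_scores.items():
        if not available:
            break  # more adopters than non-adopters: the remainder stay unmatched
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs[t_id] = c_id
        del available[c_id]  # "without replacement": each control is used once
    return pairs


# Hypothetical propensity scores
treated = {"a1": 0.62, "a2": 0.35}
controls = {"n1": 0.60, "n2": 0.30, "n3": 0.66, "n4": 0.10}
print(match_nearest_neighbor(treated, controls))
```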
What factors explain the decision to adopt the app? Consistent with extant literature (Hung et
al. 2003; Kim et al. 2015), we model app adoption as dependent on individual shopper
demographics (e.g., age, gender), behavioral measures (e.g., past spend, past returns, past online
buying) and other related measures (e.g., distance to the nearest store, number of stores in the
shopper’s zip code, presence of competitor stores, loyalty program membership level on the
adoption day) that are likely to influence shoppers.
$$\text{Propensity/Probability of App Adoption: } A_i = \frac{\exp(U_i)}{1 + \exp(U_i)} \tag{2}$$
$$U_i = \gamma + \delta D_i + \varepsilon_i \tag{3}$$
where i is customer, U is the utility from app adoption, D is a vector of covariates, ($\gamma$, $\delta$) is a
coefficient vector, and $\varepsilon$ is an error term, distributed as double exponential. We also include
squared terms of the covariates to allow for nonlinear relationships and for improved model fit
(Huang et al. 2012).
³ We also estimated propensity scores using a probit model and found no significant difference in our results.
⁴ We present alternative matching estimates, including caliper and Mahalanobis metric in Web Appendix C.
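As a small illustration of Equations 2 and 3 with squared covariate terms, the sketch below evaluates the logit propensity for one shopper. The covariate names and every coefficient value are hypothetical placeholders; in practice the coefficients would come from fitting the logit model to the adoption data.

```python
import math


def adoption_propensity(covariates, gamma, delta, delta_sq):
    """Logit propensity exp(U) / (1 + exp(U)), with U linear in covariates and their squares."""
    u = gamma
    for name, value in covariates.items():
        u += delta[name] * value + delta_sq[name] * value ** 2
    return 1.0 / (1.0 + math.exp(-u))  # algebraically equal to exp(u) / (1 + exp(u))


# Hypothetical shopper and hypothetical coefficients
shopper = {"age": 0.4, "past_spend": 1.2, "distance_to_store": -0.3}
delta = {"age": -0.5, "past_spend": 0.8, "distance_to_store": -0.2}
delta_sq = {"age": 0.1, "past_spend": -0.15, "distance_to_store": 0.05}

score = adoption_propensity(shopper, gamma=-1.0, delta=delta, delta_sq=delta_sq)
print(round(score, 4))
```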
We conduct a series of statistical analyses to test the goodness of our propensity score
matches, including the Kolmogorov-Smirnov test, Standardized Bias Reduction and
Rosenbaum’s Hidden Bias Sensitivity test (Rosenbaum 2005). The tests show that the match
balance between adopters and non-adopters improved significantly after matching and that there
was no concern for sensitivity of outcomes to hidden bias. The detailed results of these checks as
well as alternative matching methods are reported in Web Appendix C.
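One of the balance diagnostics named above, the standardized bias, can be sketched as below. The formula used here (difference in group means divided by the square root of the average of the two group variances, times 100) is a common definition and is assumed rather than taken from the Web Appendix; the covariate values are made up for illustration.

```python
import math
import statistics


def standardized_bias(treated, control):
    """Standardized difference in means, in percent, for one covariate."""
    pooled_sd = math.sqrt((statistics.variance(treated) + statistics.variance(control)) / 2)
    return 100.0 * (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Hypothetical past-spend values (USD per month)
adopters = [120.0, 130.0, 110.0, 140.0]
all_non_adopters = [40.0, 60.0, 50.0, 150.0, 45.0, 55.0]
matched_non_adopters = [115.0, 125.0, 112.0, 138.0]

before = standardized_bias(adopters, all_non_adopters)
after = standardized_bias(adopters, matched_non_adopters)
reduction = 100.0 * (1 - abs(after) / abs(before))
print(f"bias before: {before:.1f}%, after: {after:.1f}%, reduction: {reduction:.1f}%")
```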
4.3.2. Selection on Unobservables: Heckman Correction
To more formally account for the non-randomness of the app adoption due to unobserved
factors, we use a two-stage Heckman correction procedure (Heckman 1979). In general, the rich
set of demographic and behavioral covariates used for matching and the common trends between
treated and control outcomes in the pre-treatment period should offer convincing evidence that
the groups are comparable. However, we further test for any unobserved confounders through
the Heckman procedure (Gill et al. 2016). In the first stage, we model the choice to adopt the app
using a probit model. For identification, we require an exclusion restriction that affects the
decision of the shopper to adopt the app without affecting the shopping outcomes. We identify
three such exclusion restrictions that relate to data connectivity and technology environment:
1. Local wireless network access, operationalized as the proportion of the population in the
shoppers’ counties with access to four or more wireless providers, will likely affect the
shoppers’ mobile usage patterns and hence, shoppers’ probability of downloading apps. If
there is greater access to wireless networks, shoppers are likely to engage in more mobile
use and more app download activity regardless of their intrinsic preference for a specific
firm. At the same time, wireless network access is not likely to affect purchases from any
one retailer, in particular, in stores, which serve as the primary channel for the retailer in
this setting. This measure serves as a proxy for unobserved endogenous firm preference
by making app download a function of exogenous network access.
2. Symmetric upload and download speeds, operationalized as the percentage of population
in the shoppers’ state with symmetric Digital Subscriber Lines (DSL) (same download
and upload speed) relative to asymmetric DSL (higher download speeds than upload),
may lead to low adoption of apps in those regions due to slower downloads, without
affecting purchases. This measure serves as a proxy for unobserved endogenous firm
preference by making app downloading a function of exogenous network speeds.
3. Online purchases in the calibration period, operationalized as whether the shoppers used
the online channel to make at least one purchase or not, are likely to affect the shoppers’
perceived value of the mobile app but may not influence how they buy across channels. If
an online purchase was made, on the one hand, it may be valuable for online buyers to
adopt the app to augment their experience. However, since the app does not allow direct
purchase, it may be perceived as less valuable by those who already buy online. This
measure serves as a proxy for shoppers’ tech-savviness.
These exclusion restrictions result in the following first-stage selection equation for modeling Ai,
the probability of shopper i adopting the app.
$$\Pr(A_i = 1 \mid \mathit{Covariates}_i, \epsilon_i) = \Phi(\partial_0 \mathit{WIRENET}_i + \partial_1 \mathit{SYMSPEED}_i + \partial_2 \mathit{ONLINEBUYER}_i + \partial_3 Q_i + \epsilon_i) \tag{4}$$
where WIRENET is local wireless network, SYMSPEED is symmetry of network speed,
ONLINEBUYER is a dummy indicating prior online purchases, Q is a vector of other covariates,
$\partial$ is a coefficient vector, and $\epsilon$ is an error term. We compute the inverse Mills ratio from this
probit regression. In the second stage, we augment the difference-in-differences model by
including the inverse Mills ratio as an additional covariate. In a further robustness check for
selection due to unobservables, we use future app adopters to identify a similar control group for
the current app adopters. The premise for doing this is that there should not be unobserved
differences among app adopters who adopt the app at different points in time. Thus, the DIFF-
IN-DIFF estimating the treatment effects for current app adopter (treated) cohorts relative to
future app adopters (control) across the same time period acts as a falsification test (Manchanda
et al. 2015) and provides consistent results (Web Appendix D).
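The selection-correction term from the first stage can be illustrated with a minimal sketch. It assumes the form commonly used when the selected variable is a treatment dummy: the ratio of the standard normal density to the distribution function for adopters, and its negative counterpart for non-adopters. Whether this exact form matches the authors' implementation is an assumption, and the first-stage coefficients below are hypothetical.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def first_stage_index(wirenet, symspeed, online_buyer, coef):
    """Linear index of the first-stage probit (other covariates omitted for brevity)."""
    return (coef["const"] + coef["wirenet"] * wirenet
            + coef["symspeed"] * symspeed + coef["online_buyer"] * online_buyer)


def inverse_mills_ratio(index, adopted):
    """Selection term: phi(z)/Phi(z) for adopters, -phi(z)/(1 - Phi(z)) for non-adopters."""
    if adopted:
        return normal_pdf(index) / normal_cdf(index)
    return -normal_pdf(index) / (1.0 - normal_cdf(index))


# Hypothetical first-stage coefficients and one hypothetical shopper
coef = {"const": -0.8, "wirenet": 0.9, "symspeed": -0.4, "online_buyer": 0.3}
z = first_stage_index(wirenet=0.7, symspeed=0.2, online_buyer=1, coef=coef)
print(round(z, 3), round(inverse_mills_ratio(z, adopted=True), 4))
```

The resulting value would enter the second-stage difference-in-differences regression as one additional covariate.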
4.4. Decomposing Treatment Effects: Exponential Type II Tobit Model
While the difference-in-differences model provides estimates for the aggregate effects, it is not
informative about the source of the effects. Do app adopters buy relatively more or less
frequently or do they buy more or less whenever they decide to buy? How do the monetary
values of purchases and returns vary conditional on the decision to purchase or return?
We jointly model the incidence (whether or not to purchase or return) and the monetary value
of purchases and returns. We use an exponential Type II Tobit for the following reasons: (1) we
have a censored model with mass point at zero, (2) our outcome of interest is an empirical
counterpart of a latent variable (utility from purchase or return), and (3) we want our fitted
values to remain in the range of the LDV (limited dependent variable), which in this case is non-
negative for monetary values, and within 0 and 1 for incidence. In the first stage of our model,
we specify a probit model for modeling the binary outcome of whether a shopper purchases
(returns) in a given period. In the second stage, we subsequently model the monetary value of
purchases (returns) per occasion (Wooldridge 2002).
We now describe our model setup. A shopper i chooses whether to make a purchase or not at
time t. If the shopper’s expected utility from purchasing is greater than zero, we expect the
purchase incidence to be positive for that time period. The latent utility depends on mobile app
adoption and other covariates.⁵ Mobile app adoption in our sample is exogenous to outcomes
conditional on propensity scores. In other words, we use the propensity score matched samples to
estimate the Tobit models.
Purchase Incidence: Let $h_{it}$, the purchase incidence of shopper i in period t, be given by:
$$h_{it} = \begin{cases} 0 & \text{if } h^{*}_{it} \le 0 \\ 1 & \text{if } h^{*}_{it} > 0 \end{cases} \tag{5}$$
where $h^{*}_{it} \mid A_i, P_t, X_{it} = \beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it} + \varepsilon^{PI}_{it}$ is the latent utility of
purchasing; X is a vector of covariates, $\boldsymbol{\beta}$ is a coefficient vector, and $\varepsilon^{PI}$ is an error term. The
probability of the ith shopper making a purchase at time t is:
$$P^{PI}_{it}(h_{it} = 1 \mid A_i, P_t, X_{it}) = \Phi\left(\frac{\beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it}}{\sigma^{PI}}\right) \tag{6}$$
where $\sigma^{PI}$ is the standard deviation of the error term $\varepsilon^{PI}$. We next create our conditional
likelihood function across t periods for any individual i to apply the Maximum Likelihood
Estimator (MLE):
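$$\int_{-\infty}^{\infty} \left\{ \prod_{t=1}^{T} \left[ \Phi\left( \frac{\beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it}}{\sigma^{PI}} \right) \right]^{h_{it}} \left[ 1 - \Phi\left( \frac{\beta^{PI} + \beta_1 A_i + \beta_2 P_t + \beta_3 A_i P_t + \beta_4 X_{it}}{\sigma^{PI}} \right) \right]^{(1 - h_{it})} \right\} \tag{7}$$
⁵ As a robustness check, we estimated the Tobit models without the covariates for purchase and return amounts, and found consistent estimates for the treatment effect.
Monetary Value of Purchases: Let $g_{it}$, the monetary value of purchases per purchase
occasion for shopper i in period t, be given by: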
$$g_{it} = 1[h^{*}_{it} > 0] \, g^{*}_{it}, \quad \text{where } g^{*}_{it} = \exp\left(\beta^{PA} + \beta_5 A_i + \beta_6 P_t + \beta_7 A_i P_t + \beta_8 V_{it} + \varepsilon^{PA}_{it}\right) \tag{8}$$
where $g_{it}$ is the log of monetary value of purchases per purchase occasion for shopper i in time t.
We observe it when a shopper makes a purchase.⁶ $\beta$ is a coefficient vector and $\varepsilon^{PA}$ is an error
term. The purchase incidence and monetary value form an exponential Type II Tobit model.
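The two-part structure in Equations 5 and 8 can be illustrated by simulation. The sketch below draws an incidence indicator and a monetary value for a shopper-period, and evaluates the probit log-likelihood contribution that corresponds to the product term in Equation 7. It makes simplifying assumptions that are not in the text: the covariate vectors X and V are omitted, the two error terms are drawn independently (a Type II Tobit generally allows them to be correlated), and all coefficient values are hypothetical.

```python
import math
import random


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_index(beta, a, p):
    """beta[0] + beta[1]*A + beta[2]*P + beta[3]*A*P (covariates omitted)."""
    return beta[0] + beta[1] * a + beta[2] * p + beta[3] * a * p


def simulate_shopper_period(a, p, rng, beta_inc, beta_amt, sigma_amt=0.5):
    """Draw (h, g): purchase incidence and monetary value per occasion."""
    h_star = linear_index(beta_inc, a, p) + rng.gauss(0.0, 1.0)
    g_star = math.exp(linear_index(beta_amt, a, p) + rng.gauss(0.0, sigma_amt))
    h = 1 if h_star > 0 else 0
    return h, h * g_star  # g is zero unless a purchase occurs


def incidence_log_likelihood(h, index, sigma=1.0):
    """Sum over periods of h*log(Phi) + (1 - h)*log(1 - Phi)."""
    total = 0.0
    for h_t, idx in zip(h, index):
        prob = normal_cdf(idx / sigma)
        total += math.log(prob) if h_t == 1 else math.log(1.0 - prob)
    return total


rng = random.Random(1)
BETA_INC = [-0.2, 0.1, 0.05, 0.3]   # hypothetical incidence coefficients
BETA_AMT = [4.0, 0.1, 0.05, -0.12]  # hypothetical log-amount coefficients

draws = [simulate_shopper_period(1, 1, rng, BETA_INC, BETA_AMT) for _ in range(1000)]
h_values = [h for h, _ in draws]
indices = [linear_index(BETA_INC, 1, 1)] * len(draws)
print(sum(h_values) / len(h_values), round(incidence_log_likelihood(h_values, indices), 2))
```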
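Return Incidence: Let $r_{it}$, the return incidence of shopper i in period t, be given by:
$$r_{it} = \begin{cases} 0 & \text{if } r^{*}_{it} \le 0 \\ 1 & \text{if } r^{*}_{it} > 0 \end{cases} \tag{9}$$
where $r^{*}_{it} \mid A_i, P_t, Z_{it} = \beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it} + \varepsilon^{RI}_{it}$ is the latent utility of
returning. Z is a vector of covariates, $\varepsilon^{RI}$ is an error term, and the other terms are as defined
earlier. The probability of the ith shopper making a return at time t is:
$$P^{RI}_{it}(r_{it} = 1 \mid A_i, P_t, Z_{it}) = \Phi\left(\frac{\beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it}}{\sigma^{RI}}\right) \tag{10}$$
where $\sigma^{RI}$ is the standard deviation of the error term $\varepsilon^{RI}$. The conditional likelihood function for t
periods for any individual i conditional on $Z_{it}$ is:
$$\int_{-\infty}^{\infty} \left\{ \prod_{t=1}^{T} \left[ \Phi\left( \frac{\beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it}}{\sigma^{RI}} \right) \right]^{r_{it}} \left[ 1 - \Phi\left( \frac{\beta^{RI} + \beta_9 A_i + \beta_{10} P_t + \beta_{11} A_i P_t + \beta_{12} Z_{it}}{\sigma^{RI}} \right) \right]^{(1 - r_{it})} \right\} \tag{11}$$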
6 V is a vector of shopper covariates, such as income (proxied by average monthly past spending) and tenure (time elapsed since becoming a customer), denoting the income effect on spending and the experience effect on spending, respectively (Thaler 1990; Bolton 1998). Prior research (e.g., Ailawadi and Neslin 1998; Kushwaha et al. 2015) shows that the monetary value of purchases also depends on the inventory effect, modeled as the durability of the product category last purchased. In our context, it would make sense to model video game console buyers differently from games-only buyers because unlike games, consoles are infrequent and high value purchases. We use the data in the calibration period to classify shoppers into console-only buyers, game-only buyers, and buyers of both categories. We find that less than 1% of our sample bought a console. Moreover, 99% of the video game buyers are game-only buyers, possibly because they already own a console. Furthermore, there is no distinction in the percentage of multiple category shoppers between the treatment and control groups; 30% of shoppers in both groups are multi-category buyers. Therefore, we do not control for the inventory effect.
Monetary Value of Returns: Let \( t_{it} \), the monetary value of returns per return occasion for shopper
i in period t, be given by:
(12) \( t_{it} = \mathbf{1}[r^{*}_{it} > 0] \, t^{*}_{it} \)
where \( t^{*}_{it} = \exp\left( \beta_{RA} + \beta_{13} A_i + \beta_{14} P_t + \beta_{15} A_i P_t + \beta_{16} W_{it} + \varepsilon^{RA}_{it} \right) \)
and \( t_{it} \) is the log of monetary value of returns per occasion for shopper i in time t. We observe
it when a shopper makes a return. W is a vector of covariates and \( \varepsilon^{RA} \) is an error term. Return
incidence and monetary value form an exponential Type II Tobit model. We assume that the
errors in the Tobit models are normally distributed: \( \varepsilon^{PI}_{it} \sim N(0, \sigma^{2}_{PI}) \), \( \varepsilon^{PA}_{it} \sim N(0, \sigma^{2}_{PA}) \),
\( \varepsilon^{RI}_{it} \sim N(0, \sigma^{2}_{RI}) \), and \( \varepsilon^{RA}_{it} \sim N(0, \sigma^{2}_{RA}) \).
A summary of the covariates and their likely relationships with shopping outcomes appears
in Table 6. The table also lists the covariate notations and the relevant supporting research.
< Table 6 about here >
5. Results and Robustness Checks
5.1. Results
The results from the baseline difference-in-differences model in Panel (A) of Table 5 show a
positive and significant effect of app adoption on the incidence and monetary value of purchases
and returns (p < 0.001). App adopters spend $42.73 more than non-adopters in the month after
adoption and engage in a higher number of purchases (α3=0.772, p < 0.001). Interestingly,
relative to non-adopters, app adopters also return $9.06 worth more of products and make a
greater number of returns (α3=0.133, p < 0.001) each month.
Panel (B) of Table 5 refines these estimates by controlling for self-selection and ensuring that
the two groups, app adopters and non-adopters, are comparable based on a propensity score
matching model with 1:1 nearest neighbor matching without replacement. App adoption has a
positive and significant effect on shopper outcomes, including the number and monetary value of
purchases and returns (p < 0.001). Relative to the baseline model, the coefficients reflect a higher
positive significant effect of app adoption for the matched sample. App adopters spend $47.91
more than non-adopters in the month after adoption and buy a greater number of times
(α3=0.869, p < 0.001). Relative to non-adopters, app adopters also return $10.76 worth more of
products and return products a greater number of times (α3=0.144, p < 0.001) each month.
Overall, app adoption leads to a higher net monetary value of $37.15 (p < 0.001).
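As an illustration of the difference-in-differences logic behind these estimates, the following Python sketch computes a difference-in-differences contrast from group means and reproduces the net monetary value arithmetic ($47.91 in additional purchases less $10.76 in additional returns). The function name and the spending lists are hypothetical and are not the paper's data; the reported estimates come from regression models, not from this simple contrast of means.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """(treated post - treated pre) - (control post - control pre), using group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical monthly spending in dollars (NOT the paper's data).
treated_pre, treated_post = [40.0, 60.0], [90.0, 110.0]
control_pre, control_post = [45.0, 55.0], [50.0, 60.0]
effect = did_estimate(treated_pre, treated_post, control_pre, control_post)

# Net monetary value implied by the matched-sample estimates reported in the text.
net_value = round(47.91 - 10.76, 2)
```

In the hypothetical data, the treated group's spending rises by $50 and the control group's by $5, so the contrast is $45; the net value computed from the reported estimates is $37.15.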
The results hold even after we include the Heckman correction term in the model (see panel
(C) of Table 5). We do not find evidence for selection on unobservables as the coefficient of the
inverse Mills ratio is insignificant (p > 0.4). The detailed results of the first stage probit model
appear in Web Appendix D. Histograms depicting the differences in monetary value of purchases
and returns for the matched samples appear in Figure 3. The pattern is similar to that in Figure 2.
< Table 5 and Figure 3 about here >
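For readers unfamiliar with the Heckman correction term mentioned above, the following Python sketch computes an inverse Mills ratio from a first-stage probit index. It assumes the standard textbook form, the normal density divided by the normal distribution function; the paper does not spell out the exact form it uses, and the function name is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def inverse_mills_ratio(z):
    """phi(z) / Phi(z) for a first-stage probit index z (standard form, assumed here)."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)
```

The ratio would enter the outcome equation as an additional regressor; an insignificant coefficient on it, as reported above, is the usual sign that selection on unobservables is not detected.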
Histograms showing the distribution of propensity scores for the treated and the control
group before and after matching appear in Figure 4. Matching on propensity scores improves the
percentage balance of propensity scores by 99.32%, making the matched treated and control
groups comparable.7
< Figure 4 about here >
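The 1:1 nearest neighbor matching without replacement described above can be sketched in Python as follows. This is a simple greedy version under stated assumptions: it processes treated shoppers in the order given and breaks ties by control index, which may differ from the matching routine actually used in the paper. The function name and the propensity scores are hypothetical.

```python
def match_nearest_neighbor(treated_scores, control_scores):
    """Greedy 1:1 nearest-neighbor matching on propensity scores, without replacement.

    Returns a list of (treated_index, control_index) pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break  # no controls left to match
        # Closest remaining control; ties are broken by the smaller index.
        best = min(available, key=lambda c: (abs(control_scores[c] - t_score), c))
        pairs.append((t_idx, best))
        available.remove(best)  # each control is used at most once
    return pairs


# Hypothetical propensity scores for two adopters and four non-adopters.
pairs = match_nearest_neighbor([0.30, 0.62], [0.10, 0.33, 0.60, 0.90])
```

Here the first adopter is paired with the control scoring 0.33 and the second with the control scoring 0.60.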
How do the monetary values of purchases and returns vary conditional on the decision to
purchase or return? The results of the exponential Type II Tobit model (Table 7) provide rich
insights. Interestingly, while app adopters are likely to buy 21% more often, the effect on the
monetary value of purchases per purchase occasion is negative and significant (p < 0.05) relative
7 Appendix C reports the evidence of similarity of the two groups in the mean values of each observed covariate before and after matching, additional checks, and the results of the logit model used to compute propensity scores.
to non-adopters. The magnitude is close to 88%, implying that adopters in the post-period spend
12% less per purchase occasion than they would have in the absence of the app, relative to the
pre-period. From the purchase models in panel (A) of Table 7, we further note that recency
negatively influences purchase incidence (p < 0.001) and income and tenure positively influence
the monetary value of purchases per occasion (p < 0.05).
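Because the monetary value equation is exponential, a coefficient translates into a multiplicative effect on spending per occasion. The following Python sketch shows this conversion under the assumption that the 88% figure above is the exponentiated interaction coefficient; the coefficient value used is back-calculated for illustration and is not taken from Table 7, and the function name is hypothetical.

```python
from math import exp, log


def percent_change_from_log_coefficient(beta):
    """Percentage change in the outcome implied by a coefficient inside exp(.)."""
    return (exp(beta) - 1.0) * 100.0


# Illustrative only: a multiplier of 0.88 corresponds to a coefficient of ln(0.88).
beta_interaction = log(0.88)
change = percent_change_from_log_coefficient(beta_interaction)
```

A multiplier of 0.88 corresponds to a coefficient of roughly -0.128 and a 12% reduction in the monetary value per purchase occasion.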
< Table 7 about here >
From the exponential Type II Tobit returns model in panel (B) of Table 7, we observe a
negative effect of recency on return incidence (p < 0.001). Distance to the nearest store and the
number of stores in the shoppers’ zip codes do not have significant effects (p > 0.10). Our key
finding from this model is that relative to non-adopters, app adopters are 73% more likely to
return products (p < 0.001). Conditional on return incidence, however, there is no significant
effect on the monetary value of returns per occasion (p > 0.10). We explain the intuition and
possible mechanisms for these results in Section 6.
5.2. Checking Robustness and Ruling out Alternative Explanations
We perform several robustness checks and tests to rule out alternative explanations for the effect
of app adoption on purchase and returns. A summary appears in Table 8.
< Table 8 about here >
5.2.1. Alternative measures of app adoption: To rule out idiosyncrasy in the app adoption
measure, we estimated an alternative model with a more nuanced measure of app adoption. In
our main model, we compared app adopters and non-adopters along their outcomes one month
before and after app download. In the alternative model, we apply a fine-grained measure of
mobile app adoption based on app usage. In this test, we create a new quasi-experiment. We
focus on only app adopters in the post-adoption period. We segment adopters into app users and
non-users. A user is any app adopter who logs into the app at least once in a given month. Thus,
a user in one month could be a non-user in the next month. This process acts
as a robustness check in that it shows that the effects are not driven by individual characteristics
over time (Xu et al. 2016). Column (A) of Table 8 reports the results from the comparison of app
users and non-users. Further, we re-estimate this model with propensity score matching, by
matching users and non-users for each month’s activity dynamically. We report the estimates
from four different propensity score matches, those for users and non-users based on usage status
in the four months from January to April 2015 in Web Appendix B; the outcomes are measured
from December 2014 to June 2015. The effects are robust and consistent with previous findings.
5.2.2. Outliers: Another possible explanation for the effects could be outliers, such as the top
spenders (and not the average shopper). We test this possible explanation by first removing the
top spenders from our sample (the mean plus two standard deviations of spending in the pre-
period) and then carrying out propensity score matching and difference-in-differences methods
as done earlier. Our estimates are robust, as shown in column (B) of Table 8. We also test
robustness to outliers based on spending in the calibration period and find consistent results.
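The outlier rule described above (dropping shoppers whose pre-period spending exceeds the mean plus two standard deviations) can be sketched in Python as follows. The function name and shopper identifiers are hypothetical, and the use of the population standard deviation is an assumption; the text does not say which variant was used.

```python
from statistics import mean, pstdev


def drop_top_spenders(pre_period_spend, n_sd=2.0):
    """Keep shoppers whose pre-period spending is at or below mean + n_sd * SD.

    pre_period_spend maps a shopper identifier to pre-period spending in dollars.
    The population standard deviation (pstdev) is an assumption of this sketch.
    """
    values = list(pre_period_spend.values())
    cutoff = mean(values) + n_sd * pstdev(values)
    return {shopper: spend for shopper, spend in pre_period_spend.items() if spend <= cutoff}
```

The filtered sample would then go through propensity score matching and the difference-in-differences estimation as in the main analysis.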
5.2.3. Shopper heterogeneity: While we match shoppers on a broad set of covariates, one
possible untested alternative explanation is that the effects are driven largely by already deal-
prone shoppers and not by the use of the app. In general, marketing promotions and offers are
not a threat to our difference-in-differences estimates because the retailer did not send any
unique offers to mobile app adopters that the non-adopters did not receive, or vice-versa. We do
not expect deals to affect one set of shoppers idiosyncratically. Yet, to rule out the possibility
that actual redemption or use of offers prior to adoption could influence the two groups
differently, we repeat the analyses after removing deal-prone customers. Column (C) of Table 8
reports the estimates for the sample that did not use deals in the pre-period. We find robust
estimates. We also test robustness to deal-proneness based on offer use in the calibration period
and find consistent results. This finding is managerially significant because it shows that app
adopters buy more even when they are not deal-sensitive. However, we do notice
that the share of shoppers using deals increases after app adoption for
app adopters (from 6% to 50% of shoppers) while remaining stagnant for non-adopters,
suggesting goal-switching effects of app usage (Shankar et al. 2016).
5.2.4. Alternative matching methods: Our main analysis relies on the commonly used 1:1
nearest neighbor matching algorithm. In addition, we use the Mahalanobis metric and a refined
caliper matching approach by defining the bandwidth within which to identify matched control
units (Silverman 1986). We test using a bandwidth of 0.16 times the standard deviation of the
propensity scores, in line with the Silverman rule of thumb. To enhance support for our matches,
we also adopt a trimming approach, in which we drop the observations whose propensity score is
smaller than the minimum and larger than the maximum in the opposite group (Caliendo and
Kopeinig 2005). Web Appendix C reports the results for these alternative matched samples. We
find that these estimates are consistent with those from our proposed method.
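The caliper width and the trimming rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the routine used in the paper: the function names and scores are hypothetical, and the population standard deviation is assumed for the caliper.

```python
from statistics import pstdev


def caliper_width(all_scores, multiplier=0.16):
    """Caliper as a multiple of the standard deviation of the propensity scores."""
    return multiplier * pstdev(all_scores)


def within_caliper(treated_score, control_score, width):
    """True if a candidate control lies within the caliper of the treated unit."""
    return abs(treated_score - control_score) <= width


def trim_to_common_support(treated_scores, control_scores):
    """Drop scores below the minimum or above the maximum of the opposite group."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    keep = lambda scores: [s for s in scores if low <= s <= high]
    return keep(treated_scores), keep(control_scores)
```

Trimming leaves only the region where both groups have propensity scores, so every retained adopter has at least some non-adopters in a comparable score range.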
5.2.5. Alternative samples: Our main analysis reports the results for a sample of 3,258
shoppers; the treatment group for this random sample comprises those who started using the
mobile app on December 1, 2014. To verify that the results are generalizable to other samples,
we replicated the analyses for two different types of samples: (a) a random sample selected on a
different date, February 1, 2015 (column D of Table 8) and (b) a random sample selected from
each month for a period of four months during (February-May) 2015 (see Web Appendix B for
details). In (b), the treatment group users could have started using the app on any date in that
month. We treat the month of adoption as part of the post-treatment period, similar to other
studies (Xu et al. 2016). Our results are consistent across four such samples of 11,380 shoppers.
5.2.6. App novelty effect: An alternative explanation for the increased net monetary value of
purchases after app adoption could be the novelty of the app. It is possible that the app triggers a
heightened shopping response only due to a temporary novelty effect that fades after a few days.
To test this explanation, we re-estimate our models using extended windows of time, that is, 45
days and 60 days, instead of one month. The effects of the app persist in these varying windows
of time, and in fact seem to increase over time (Table B4 in Web Appendix B).
5.2.7. Future adopters as control group: Column E of Table 8 shows the results of an
alternative difference-in-differences model that uses future app adopters as a control group for current
adopters. The results are substantively similar.
6. Mechanisms Explaining App Adoption Effects: App Features and Usage Patterns
Our findings provide robust evidence for the influence of mobile app adoption on shopper
behavior. We find that app adopters buy 21% more often but spend 12% less per purchase
occasion and return 73% more often than non-adopters in the month after adoption. Overall, app
adoption results in a 24% increase in net monetary value of purchases.
What mechanisms underlie higher purchases and returns due to app adoption? App adopters’
use of app features may help answer this question. An investigation into the use of app features
shows that the two most commonly used features, the offer feature (e.g., clicking current deals on
products) and the loyalty reward feature (e.g., checking loyalty points), could potentially explain app
adoption effects on shopping outcomes. These features primarily involve an experiential
outcome via interactivity (Bellman et al. 2011). The interactivity of mobile devices is
characterized by the control that users have over the device and the notion of presence, that is,
the ability to experience an environment closely through the technology. Activities like
redeeming reward points and activating offers will likely lead to greater engagement and
spending by shoppers (Kim et al. 2015). Over 90% of app adopters who accessed the app use its
interactive features.
An analysis of app adopters’ use of offer and loyalty reward features is noteworthy because it
helps explain our key finding of lower monetary value per purchase occasion. The descriptive
statistics pertaining to the use of offer and loyalty features pre and post app adoption appear in
Tables 9 and 10, respectively. From these tables, we examine the differences in shopping
outcomes for app adopters who access the offer and loyalty features versus those who did not. As
expected, the value of each purchase for users of these features falls by about 16-20% per
purchase occasion between the pre- and post-period while remaining virtually the same for those
who do not use such features. Furthermore, in the data, shoppers who use the mobile app show
increasing instances of offer usage post adoption, from 6% of shoppers using offers in the pre-
adoption period to over 50% using offers in the post-adoption period. To verify the mechanism of offer
exposure through the app, we further examined the nature of app usage by shoppers on the day
they make a purchase and one day before. Indeed, out of 1,712 transactions made by app users in
the post adoption period, over 42% were triggered after the use of offer-related features and 53%
after the use of loyalty rewards-related features.
< Tables 9 and 10 about here >
If increased offers and rewards exposure is indeed one of the mechanisms for higher
purchase incidence and lower purchase values, we can easily explain higher returns for app
adopters. Higher incidence of returns due to app adoption could result from three related reasons.
First, buying decisions made at the lure of an offer by app adopters may lead to post-purchase
disutility and returns. Second, higher return incidence could result from exposure to
disconfirming information after the shopper has made the purchase because app users are likely
to get exposed to negative information about the product through reviews. Our conversations
with managers from the retail company confirmed this insight; the executives revealed that when
social media opinion leaders and influencers share negative reviews about a video game, even if
the game received positive reactions and pre-orders in the pre-release period, it is common to see
spikes in return incidence among buyers. Third, app adopters may be engaging in returns more
often simply because they become less inhibited after using the app; some studies indicate that
the lack of social cues in electronic device mediated communications prompts individuals to
become less inhibited (Sproull and Kiesler 1986).
Finally, contrary to intuition, shoppers who access a higher number of unique features do not
always buy more. In fact, beyond a point, their sales and returns outcomes weaken. Figure 5
demonstrates this inverted U-shaped relationship through a scatter plot between users’ average
number of unique features accessed in the app and their shopping outcomes. Shoppers who use a
very high number of features in the app may experience disutility from information overload and
an increased focus and attention on the device itself, rather than on actively thinking about a
purchase and taking actions.
< Figure 5 about here >
7. Managerial Implications
Our results offer several key managerial implications. First, based on the difference in net
monetary value of purchases due to app adoption from the propensity score matched difference-in-
differences model ($37) and the most conservative estimate from the robustness checks ($23), across
the retailer’s two million adopters, we estimate the retailer’s net annual revenue increase due to
app launch to range from $550 million to $890 million. This estimate provides a useful
benchmark for managers to evaluate any app introduction decision. This estimate is likely to be
higher if the retailer can convince more shoppers from its 32 million shopper base to adopt its
mobile app.
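The revenue range above can be reproduced with simple arithmetic, as in the following Python sketch. It assumes that the monthly net gain per adopter is annualized by multiplying by 12 and scaled by two million adopters; the text does not state the calculation explicitly, so this is one reading that matches the reported range after rounding. The function name is hypothetical.

```python
def annual_revenue_increase(monthly_net_gain_per_adopter, adopters, months=12):
    """Monthly net gain per adopter, scaled to all adopters and annualized (assumed x12)."""
    return monthly_net_gain_per_adopter * adopters * months


ADOPTERS = 2_000_000  # adopter base reported in the text

low = annual_revenue_increase(23, ADOPTERS)   # most conservative robustness estimate
high = annual_revenue_increase(37, ADOPTERS)  # propensity score matched estimate
```

Under these assumptions the bounds are $552 million and $888 million, which round to the reported range of $550 million to $890 million.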
Second, the findings that the purchase frequency (monetary value of purchases per occasion)
is higher (lower) for adopters than non-adopters suggest that managers should plan for shoppers
visiting the physical and online stores more often and spending less on each occasion. These
findings also suggest reduced interaction of store associates and online agents with shoppers on a
given store visit, so the key task of associates is to encourage shoppers to visit again.
Third, the finding that app adoption leads to greater product returns exposes managers to a
darker side of apps. Managers need to proactively monitor return incidence from app adopters
and devise interventions to keep product returns in check. To the extent that some of the returns
are due to the gap between the expected and the actual product delivered, managers can minimize the gap by
offering clearer pictures, videos, and descriptions of the products in the app.
Fourth, the findings on the role of offers and reward features indicate the importance of
dynamic experiential content in the app that provides additional value to shoppers. To promote
engagement, managers need to ensure that interactive features such as redeeming reward points
and activating offers are easily accessible.
Finally, we caution managers against an all-in-one app design and in favor of a more
thoughtful combination of app features to avoid information-overload. In doing so, managers
should adapt their mobile app design strategies to their context, including the product category.
8. Conclusion, Limitations, and Extension
We addressed our three research questions rigorously using two complementary research
designs, a difference-in-differences method and the exponential Type II Tobit model. We tested
a variety of alternative explanations that could contaminate our estimates. Our main findings are
as follows. First, mobile app adoption
leads to higher purchase incidence, return incidence, and net monetary value of purchases.
Second, app adopters buy 21% more often but spend 12% less per purchase occasion and return
73% more often than non-adopters in the month after adoption. Overall, app adoption results in a
24% increase in net monetary value of purchases. Third, experiential app features like
promotional offers and loyalty rewards significantly affect shopping outcomes, and the number
of unique app features accessed by the shopper has an inverted U-shaped relationship with
shopping outcomes.
We tested our results for several alternative explanations including shopper heterogeneity in
deal-proneness. The robustness of our estimates to pre-adoption deal-proneness of shoppers
shows that mobile apps influence non deal-prone shoppers as well. However, once they adopt the
app, exposure to offers and reward features plays an instrumental role in driving app adopters’
shopping outcomes.
Although our research is the first to quantify the effects of mobile apps on a broad range of
shopping outcomes, including returns, it has some limitations that future research can address.
First, our research relies on a quasi-experimental approach. The gold standard for causal
inference is randomized field experiments. Randomized field experiments, if feasible, could
provide future researchers with unique opportunities for testing specific app-related
manipulations. Second, while we have examined the net monetary value of purchases, our data
do not contain cost information. If cost data are available, it would be interesting to study the
effect of app adoption and use on customer lifetime value. Third, we have data from only one
retailer across channels. Future studies on mobile apps can examine data on multiple retailers to
map shoppers’ brand loyalty and preference resulting from app adoption. Likewise, future
studies can examine these research questions in the context of other retailer types such as pure
play retailers with a growing bricks-and-mortar presence (e.g., Warby Parker, Bonobos). Such a
setting could also offer interesting comparative insights on the effects of mobile apps on
shopping outcomes in different channels. Finally, there is immense potential to continue to
uncover the mechanisms underlying engagement and use of mobile apps. Furthermore, what
marketing mix strategies should firms adopt to improve adoption of and engagement through
apps? These are ripe areas for future investigation.
References
Ailawadi KL, Neslin SA (1998) The effect of promotion on consumption: Buying more and consuming it faster. J. Marketing Res. 35(3):390–398.
Anderson E, Hansen K, Simester D (2009) The option value of returns: Theory and empirical evidence. Marketing Sci. 28(3):405-423.
Andrews M, Goehring J, Hui S, Pancras J, Thornswood L (2016a) Mobile promotions: A framework and research priorities. J. Interactive Marketing 34:15–24.
Andrews M, Luo X, Fang Z, Ghose A (2016b) Mobile ad effectiveness: Hyper-contextual targeting with crowdedness. Marketing Sci. 35(2):218-233.
Bellman S, Potter RF, Treleaven-Hassard S, Robinson JA, Varan D (2011) The effectiveness of branded mobile phone apps. J. Interactive Marketing 25(4):191–200.
Bolton RN (1998) A dynamic model of the duration of the customer’s relationship with a continuous service provider: The role of satisfaction. Marketing Sci. 17(1):45–65.
Caliendo M, Kopeinig S (2005) Some practical guidance for the implementation of propensity score matching. IZA DP N:1588.
Campbell D, Stanley JC (1963) Experimental and Quasi-Experimental Designs for Research (Houghton Mifflin, Boston).
comScore (2016a) comScore reports January 2016 US smartphone subscriber market share. ComScore. Accessed July 12, 2016, http://tinyurl.com/gq3x5s5
comScore (2016b) The 2016 US mobile app report. ComScore. Accessed October 20, 2016, http://tinyurl.com/j43tjyw
Cragg JG (1971) Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica 39(5):829-844.
Danaher PJ, Smith MS, Ranasinghe K, Danaher TS (2015) Where, when, and how long: Factors that influence the redemption of mobile phone coupons. J. Marketing Res. 52(5):710–725.
Dinner I, Heerde HV, Neslin S (2015) Creating customer engagement via mobile apps: How app usage drives purchase behavior. Working paper.
Dubé JP, Fang Z, Fong NM, Luo X (2015) Competitive price targeting with smartphone coupons. NBER Working Paper No. 22067.
Eadicicco L (2015) More people now shop on Amazon using smartphones and tablets than computers. TIME. Accessed October 2, 2016, http://tinyurl.com/hddh88p
Einav L, Levin J, Popov I, Sundaresan N (2014) Growth, adoption, and use of mobile e-commerce. Amer. Econom. Rev. 104(5):489-494.
eMarketer (2014) 2 Billion consumers worldwide to get smart(phones) by 2016. Emarketer. Accessed October 2, 2016, http://tinyurl.com/kkpxevo
eMarketer (2016) Mobile ad spend to top $100 billion worldwide in 2016, 51% of digital market. Emarketer. Accessed August 10, 2016, http://tinyurl.com/p79lymk
Ericsson (2016) Ericsson Mobility Report. Ericsson. Accessed August 10, 2016, http://tinyurl.com/gmnezg6
Fong NM, Fang Z, Luo X (2015) Geo-conquesting: Competitive locational targeting of mobile promotions. J. Marketing Res. 52(5):726–735.
Forbes (2015) How mobile ordering can impact Starbucks’ valuation. Forbes. Accessed October 2, 2016, http://tinyurl.com/zrekrwp
Forrester (2016) 2016 Mobile and app marketing trends. Forrester.
Gill M, Sridhar S, Grewal R (2016) On returns to business-to-business mobile engagement apps. Working paper.
Google M/A/R/C Study (2013) Mobile in-store research: How in-store shoppers are using mobile devices. Google M/A/R/C. Accessed October 2, 2016, http://tinyurl.com/gr7ghpn
Guo S, Fraser MW (2014) Propensity Score Analysis: Statistical Methods and Applications (Sage Publications).
Heckman J (1979) Sample selection bias as a specification error. Econometrica 47(1):153-161.
Huang Q, Nijs VR, Hansen K, Anderson ET (2012) Wal-Mart’s impact on supplier profits. J. Marketing Res. 49(2):131–143.
Hui SK, Inman JJ, Huang Y, Suher J (2013) The effect of in-store travel distance on unplanned spending: Applications to mobile promotion strategies. J. Marketing 77(2):1–16.
Hung SY, Ku CY, Chang CM (2003) Critical factors of WAP services adoption: An empirical study. Electronic Commerce Research and Applications 2(1):42-60.
Jing X, Lewis M (2011) Stockouts in online retailing. J. Marketing Res. 48(2):342–354.
Kim SJ, Wang R J-H, Malthouse EC (2015) The effects of adopting and using a brand’s mobile application on customers’ subsequent purchase behavior. J. Interactive Marketing 31:28–41.
Kushwaha T, Shankar V, Li S (2015) Multichannel marketing: Asymmetries across customer-channel segments and optimal marketing allocation. Working paper.
Lee J, Zhuang M, Kozlenkova I, Fang E (2016) The dark side of mobile channel expansion strategies. MSI working paper.
Lewis M, Singh V, Fay S (2006) An empirical study of the impact of nonlinear shipping and handling fees on purchase incidence and expenditure decisions. Marketing Sci. 25(1):51-64.
Manchanda P, Packard G, Pattabhiramaiah A (2015) Social dollars: The economic impact of customer participation in a firm-sponsored online customer community. Marketing Sci. 34(3):367-387.
Ofek E, Katona Z, Sarvary M (2011) “Bricks and clicks”: The impact of product returns on the strategies of multichannel retailers. Marketing Sci. 30(1):42-60.
Peterson H (2015) Macy’s CEO says there’s one thing everyone is getting wrong about the retail industry. Business Insider. Accessed October 2, 2016, http://tinyurl.com/h7yf6ab
Rosenbaum PR (2005) Sensitivity analysis in observational studies. Everitt BS, Howell DC, eds. Encyclopedia of Statistics in Behavioral Science (Wiley, New York), 1809-1814.
Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55.
Rubin DB (2008) For objective causal inference, design trumps analysis. The Ann. of Appl. Statis. 2(3):808–840.
Shankar V, Balasubramanian S (2009) Mobile marketing: A synthesis and prognosis. J. Interactive Marketing 23(2):118–129.
Shankar V, Kleijnen M, Ramanathan S, Rizley R, Holland S, Morrissey S (2016) Mobile shopper marketing: Key issues, current insights, and future research avenues. J. Interactive Marketing 34:37-48.
Silverman BW (1986) Density Estimation for Statistics and Data Analysis (Chapman and Hall, London).
Sims G (2015) Google Play store vs the Apple App store: By the numbers. Android Authority. Accessed October 2, 2016, http://tinyurl.com/zp4ufdq
Sproull L, Kiesler S (1986) Reducing social context cues: Electronic mail in organizational communication. Management Sci. 32(11):1492-1512.
Thaler RH (1990) Saving, fungibility, and mental accounts. J. Economic Perspectives 4(1):193–205.
Wang RJ-H, Malthouse EC, Krishnamurthi L (2015) On the go: How mobile shopping affects customer purchase behavior. J. Retailing 91(2):217–234.
Wangenheim FV, Bayón T (2007) Behavioral consequences of overbooking service capacity. J. Marketing 71(4):36–47.
Wooldridge JM (2002) Econometric Analysis of Cross Section and Panel Data (MIT Press, Cambridge).
Xu J, Forman C, Kim JB, Ittersum KV (2014) News media channels: Complements or substitutes? Evidence from mobile phone usage. J. Marketing 78(4):97–112.
Xu K, Chan J, Ghose A, Han SP (2016) Battle of the channels: The impact of tablets on digital commerce. Management Sci. Forthcoming.
Table 1. Selected Related Literature and Our Contribution
Columns: Paper | Focus | DV = PI | DV = PA | DV = RI | DV = RA | Other DV | Comprehensive app features | Methods** | Context
Prior Research on the Effects of App Adoption on Dependent Variables (DVs*)
Bellman et al. (2011)
Effect of app use on brand attitude and purchase intention
Purchase intent
Online survey and lab study
Multiple branded retail apps
Einav et al. (2014)
Analysis of eBay’s mobile app adoption and platform revenues
✔ Descriptive analysis
Online retail
Xu et al. (2014)
Effect of mobile app on demand at the mobile site
Site visit Diff-in-diff Online news
Dinner et al. (2015)
Effect of app adoption on probability of making an online and offline purchase
✔ Fixed effects panel
High-end clothing retailer’s iOS app (online and a store)
Kim et al. (2015)
Effect of use of app check-ins and information look-ups on loyalty point accruals
Loyalty point accruals
PSM and Diff-in-diff
Air Miles Reward Program app
Gill et al. (2016)
Effect of manufacturer's mobile app on B2B revenues
✔ Diff-in-diff B2B engagement app of a tools manufacturer
Prior Research on the Effects of Mobile Device Adoption on Dependent Variables (DVs)
Wang et al. (2015)
Changes in customers’ spending behavior upon adopting M-shopping
✔ ✔ PSM, log-log, hazard models
Online grocery retailer’s app
Lee et al. (2016)
Effect of mobile shopping ratio (mobile vs. web) on purchases
✔ ✔ Panel Data Regression
Online retail
Xu et al. (2016)
Effect of tablet adoption on digital commerce via smartphones and PC devices
✔ Diff-in-diff Online retail
Our study
Our paper (2016)
Effects of app adoption on purchase and returns across all channels and the roles of app features on shopping outcomes
✔ ✔ ✔ ✔ - ✔
PSM, Diff-in-Diff, exponential Type II Tobit
Retailer with a chain of stores and an ecommerce site
Notes: *DV refers to the key dependent variables used in the study, including purchase incidence (PI), monetary value of purchase (PA), return incidence (RI), and monetary value of returns (RA); **Methods include difference-in-differences approach (diff-in-diff) and propensity score matching (PSM).
Table 2. Variable Definitions and Descriptive Statistics
Variable | Notation | Operationalization | Mean | St. Dev. | Min. | Max.
Purchase Incidence | PI/h | Dummy variable indicating if at least one purchase was made in the time period (=1); else (=0) | 0.48 | 0.50 | 0 | 1
Monetary Value of Purchases | PA/g | Amount associated with purchases in the period ($) | 70.09 | 154.87 | 0 | 3733.89
Return Incidence | RI/r | Dummy variable indicating if at least one return was made in the time period (=1); else (=0) | 0.07 | 0.25 | 0 | 1
Monetary Value of Returns | RA/t | Amount associated with returns in the period ($) | 5.73 | 35.91 | 0 | 919.96
App Adoption | TREAT/A | Dummy variable indicating if the shopper adopted the app (=1) or not (=0) in the data period | 0.17 | 0.38 | 0 | 1
Time Period | POST/P | Dummy variable indicating if the time period is before (=0) or after (=1) adoption | 0.5 | 0.5 | 0 | 1
Recency | RECENCY | Number of days since the shopper’s last purchase at the start of time t | 36.81 | 29.01 | 1 | 118
Tenure | TENURE | Number of days of being a customer at the start of time t | 453.41 | 51.33 | 153 | 482
Order Size | QNT | Number of items in an order | 0.95 | 1.38 | 0 | 39
Age | AGE | Age of shopper in years at the start of the data period | 32.57 | 11.13 | 11 | 82
Gender | GENDER | Gender of shopper (Female=1, Male=0) | 0.21 | 0.41 | 0.00 | 1.00
Distance to Nearest Store | DIST | Distance in miles between the geographical centers of the shopper’s and the nearest store’s zip codes | 4.29 | 7.86 | 2.06 | 196.12
Number of Stores | NSTORES | Number of focal retailer’s stores in shopper’s zip code | 0.57 | 0.72 | 0 | 4
Loyalty Program Level | LPROG | Dummy indicating if the shopper is enrolled in the basic (=0) or professional (=1) membership on app adoption date | 0.43 | 0.49 | 0 | 1
Area Population | AREAPOPL | Population of zipcode based on 2010 US census | 31,611 | 19,009 | 6 | 113,916
Competitor Stores | COMPSTORE | Number of competing stores in shopper’s zipcode | 0.53 | 0.67 | 0 | 5
Online Buyer | ONLINEBUYER | Dummy variable indicating whether the shopper made an online purchase (=1) or not (=0) in the calibration period | 0.05 | 0.21 | 0 | 1
Past Purchase Amount | PASTSPEND | Monetary value of average monthly purchases in the calibration period ($) | 44.56 | 62.19 | 0 | 844.12
Past Return Amount | PASTRETURN | Monetary value of average monthly returns in the calibration period ($) | 4.17 | 17.82 | 0 | 482.64
Average Purchase Frequency | APF | Average number of sales transactions in a month in the calibration period | 0.79 | 0.73 | 0 | 15
Row groups (side labels in the original table): Estimation Window; Calibration Window
Table 3. Model-free Evidence: Mean Statistics

| Variable | Treated pre period | Treated post period | Control pre period | Control post period |
|---|---|---|---|---|
| Purchase incidence | 0.686 | 0.822 | 0.416 | 0.441 |
| Number of purchases | 1.474 | 2.359 | 0.707 | 0.820 |
| Monetary value of purchases | 126.252 | 181.149 | 46.887 | 59.048 |
| Return incidence | 0.096 | 0.185 | 0.049 | 0.062 |
| Number of returns | 0.134 | 0.282 | 0.062 | 0.077 |
| Monetary value of returns | 9.461 | 18.555 | 4.021 | 4.052 |
| Net monetary value of purchases | 116.792 | 162.594 | 42.866 | 54.995 |
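The cell means in Table 3 imply difference-in-differences estimates directly. The following Python sketch is an illustration rather than the authors' estimation code, and the function and dictionary names are ours. It assumes a saturated two-by-two specification containing only TREAT, POST, and TREAT*POST with no other covariates, in which case each coefficient is a simple difference of cell means. Under that assumption, the values agree with Panel A of Table 5 up to rounding.

```python
# Cell means from Table 3, ordered as
# (treated pre, treated post, control pre, control post).
TABLE3_MEANS = {
    "number_of_purchases": (1.474, 2.359, 0.707, 0.820),
    "monetary_value_of_purchases": (126.252, 181.149, 46.887, 59.048),
    "number_of_returns": (0.134, 0.282, 0.062, 0.077),
    "monetary_value_of_returns": (9.461, 18.555, 4.021, 4.052),
    "net_monetary_value_of_purchases": (116.792, 162.594, 42.866, 54.995),
}


def two_by_two_coefficients(treated_pre, treated_post, control_pre, control_post):
    """Map four cell means to the coefficients of a saturated 2x2 regression."""
    return {
        "Intercept": control_pre,
        "TREAT": treated_pre - control_pre,
        "POST": control_post - control_pre,
        # Change for adopters minus change for non-adopters.
        "TREAT*POST": (treated_post - treated_pre) - (control_post - control_pre),
    }


for outcome, means in TABLE3_MEANS.items():
    coefs = two_by_two_coefficients(*means)
    print(outcome, {name: round(value, 3) for name, value in coefs.items()})
```

For the monetary value of purchases, for example, the interaction term is (181.149 - 126.252) - (59.048 - 46.887) = 42.736.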
Table 4. Overview of Analyses

| Section | Analysis | Objective | Key insight/Conclusion |
|---|---|---|---|
| 4.2 | Baseline Difference-in-Differences (DIFF-IN-DIFF) Regression | Quantifying the treatment effect of app adoption on shopping outcomes | App adoption leads to higher incidence and monetary value of purchase and returns than non-adoption |
| 4.3 | DIFF-IN-DIFF Regression with (a) Selection on observables: Propensity Score Matched (PSM) Sample; (b) Selection on unobservables: Heckman correction | Correcting for potential bias in treatment effects due to self-selection | App adoption leads to higher incidence and monetary value of purchase and returns than non-adoption after correcting for endogeneity of app adoption |
| 4.4 | Exponential Type II Tobit | Decomposing the effects of app adoption into incidence of purchase (returns), and conditional on it, the monetary value of purchase (returns) per occasion in a two-stage model | App adoption leads to higher purchase and return incidence but lower monetary value of purchase per occasion |
| 5.2 | Robustness Checks: (a) Alternative measure of app adoption; (b) Outliers; (c) Customer heterogeneity in deal proneness; (d) Alternative matching; (e) Alternative samples; (f) App novelty and alternative time periods | Ruling out alternative explanations for the results | App adoption treatment effects are robust to alternative explanations, such as outliers, customer deal-proneness, app novelty, and other adoption measures, samples, and time periods |
| Web Appendix | Additional Checks. Web Appendix B: Visual plots for common trends. Web Appendix C: (a) Kolmogorov-Smirnov Test; (b) Standardized bias reduction; (c) Hidden bias sensitivity analysis. Web Appendix D: Alternative control group using future adopters to tackle unobservables | Evaluating robustness to any other potential threats to main methods | (a) Pre-adoption purchase trends in the control and treated groups are parallel. (b) PSM significantly improves balance between the treated and control groups. (c) Effects are robust to alternative control groups. |
Table 5. Results of Difference-in-Differences Models
Coeff. (Std. Err.)

Panel A. Unmatched Samples

| Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases |
|---|---|---|---|---|---|
| TREAT | 0.767*** (0.042) | 79.365*** (6.099) | 0.072*** (0.013) | 5.44*** (1.24) | 73.925*** (5.767) |
| POST | 0.113*** (0.019) | 12.16*** (1.953) | 0.015* (0.005) | 0.032 (0.458) | 12.129*** (1.819) |
| TREAT*POST | 0.772*** (0.07) | 42.736*** (8.637) | 0.133*** (0.024) | 9.063*** (2.094) | 33.673*** (8.009) |
| Intercept | 0.707*** (0.013) | 46.887*** (1.332) | 0.062*** (0.004) | 4.021*** (0.348) | 42.866*** (1.226) |

Panel B. Propensity Score Matched Samples

| Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases |
|---|---|---|---|---|---|
| TREAT | 0.499*** (0.052) | 64.708*** (6.837) | 0.039* (0.017) | 3.205 (1.586) | 61.503*** (6.352) |
| POST | 0.016 (0.048) | 6.983 (4.861) | 0.004 (0.015) | -1.665 (1.269) | 8.648 (4.441) |
| TREAT*POST | 0.869*** (0.083) | 47.913*** (9.717) | 0.144*** (0.028) | 10.759*** (2.405) | 37.154*** (8.976) |
| Intercept | 0.975*** (0.033) | 61.545*** (3.364) | 0.095*** (0.011) | 6.256*** (1.048) | 55.289*** (2.932) |

Panel C. Matched Sample with Heckman Correction using Inverse Mills Ratio (IMR) as Covariate

| Variable | Number of purchases | Monetary value of purchases | Number of returns | Monetary value of returns | Net monetary value of purchases |
|---|---|---|---|---|---|
| TREAT | 0.499*** (0.052) | 64.731*** (6.84) | 0.039* (0.017) | 3.21 (1.585) | 61.521*** (6.355) |
| POST | 0.016 (0.048) | 6.983 (4.862) | 0.004 (0.015) | -1.665 (1.269) | 8.648 (4.441) |
| TREAT*POST | 0.869*** (0.083) | 47.913*** (9.717) | 0.144*** (0.028) | 10.759*** (2.405) | 37.154*** (8.975) |
| IMR | 0.022 (0.095) | 5.754 (11.22) | 0.024 (0.03) | 1.296 (2.791) | 4.458 (10.321) |
| Intercept | 0.944*** (0.14) | 53.33** (16.534) | 0.06 (0.045) | 4.406 (4.182) | 48.924** (15.183) |

Notes: Robust standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.
Table 6. Covariates and their Relationships with Outcomes for Exponential Type II Tobit Model

| Variable | Notation | PI | PA | RI | RA | Support from research in other contexts |
|---|---|---|---|---|---|---|
| Mobile app | A*P | ✔ | ✔ | ✔ | ✔ | (Kim et al. 2015; Wang et al. 2015) |
| Recency | RECENCY | ✔ | | ✔ | | (Lewis et al. 2006; Jing and Lewis 2011) |
| Income/Past Spend | INCOME/PASTSPEND | | ✔ | | | Thaler (1990) |
| Tenure | TENURE | | ✔ | | | (Bolton 1998; Kushwaha et al. 2015) |
| Distance to nearest store | DIST | | | ✔ | | (Anderson et al. 2009; Ofek et al. 2011) |
| Number of stores | NSTORES | | | ✔ | | (Anderson et al. 2009; Ofek et al. 2011) |
| Order size | QNT | | | | ✔ | (Anderson et al. 2009) |

Note: Purchase incidence (PI), monetary value of purchases (PA), return incidence (RI), and monetary value of returns (RA).
Table 7. Results of Exponential Tobit Type II Model

| Variable (A) | Coeff. (Std. Err.) | Variable (B) | Coeff. (Std. Err.) |
|---|---|---|---|
| Log Value of Purchases Per Occasion | | Log Value of Returns Per Occasion | |
| POST (P) | 0.134** (0.046) | POST (P) | -0.293* (0.117) |
| TREAT (A) | 0.259*** (0.047) | TREAT (A) | 0.088 (0.117) |
| TREAT * POST (A * P) | -0.125* (0.062) | TREAT * POST (A * P) | 0.251 (0.171) |
| TENURE | 0.383*** (0.102) | QNT | 0.05 (0.027) |
| INCOME/PAST SPEND | 0.0004* (0.001) | | |
| INTERCEPT | 3.367*** (0.553) | INTERCEPT | 1.308* (0.621) |
| Purchase Incidence | | Return Incidence | |
| POST (P) | -0.014 (0.044) | POST (P) | 0.081 (0.067) |
| TREAT (A) | 0.470*** (0.045) | TREAT (A) | 0.213** (0.065) |
| TREAT * POST (A * P) | 0.442*** (0.066) | TREAT * POST (A * P) | 0.308*** (0.088) |
| RECENCY | -0.007*** (0.001) | DIST | -0.005 (0.004) |
| INTERCEPT | 0.246*** (0.036) | NSTORES | -0.007 (0.035) |
| | | RECENCY | -0.007*** (0.001) |
| | | INTERCEPT | -1.283*** (0.061) |
| Log Likelihood | -9359.39 | Log Likelihood | -2982.73 |
| Correlation (rho) | 0.171 | Correlation (rho) | 0.193 |

Notes: There are 2,421 (5,827) censored observations for purchase (returns) model; standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.
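Because the value equations in Table 7 are specified in logs, a coefficient can be read as an approximate percentage change through the transformation 100(e^β - 1). The Python sketch below applies this standard conversion to the TREAT * POST coefficients. It is an illustration under that log-linear interpretation, and the function name is ours; this section does not show how the authors derived their reported percentages.

```python
import math


def percent_change_from_log_coefficient(beta):
    """Percent change in the outcome implied by a coefficient in a log-linear equation."""
    return 100.0 * (math.exp(beta) - 1.0)


# TREAT * POST coefficients from the two log-value equations in Table 7.
purchase_value_effect = percent_change_from_log_coefficient(-0.125)
return_value_effect = percent_change_from_log_coefficient(0.251)

print(f"Purchase value per occasion: {purchase_value_effect:.1f}%")
print(f"Return value per occasion: {return_value_effect:.1f}%")
```

The purchase coefficient of -0.125 corresponds to roughly 12% lower spending per purchase occasion. The return coefficient of 0.251 is not statistically significant in Table 7, so its implied percentage should be read with caution.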
Table 8. Robustness Checks for Treatment Effects

| Variable | (A) App use vs. non-use for six months | (B) Outliers | (C) Deal use heterogeneity | (D) Alternative sample | (E) Future treated as control |
|---|---|---|---|---|---|
| Number of purchases | 0.433*** (0.051) | 0.88*** (0.083) | 0.945*** (0.089) | 0.647*** (0.094) | 0.688*** (0.097) |
| Monetary value of purchases | 25.505*** (4.934) | 87.592*** (7.662) | 66.781*** (11.295) | 36.521*** (9.064) | 34.718** (11.646) |
| Number of returns | 0.055*** (0.015) | 0.13*** (0.028) | 0.139*** (0.027) | 0.073* (0.03) | 0.108** (0.036) |
| Monetary value of returns | 2.236* (1.054) | 10.81*** (1.974) | 11.943*** (2.528) | 8.986*** (2.126) | 5.25 (2.871) |
| Net monetary value of purchases | 23.269*** (4.668) | 76.782*** (7.113) | 54.838*** (10.616) | 27.536*** (8.349) | 29.468** (10.664) |
| Number of observations | 9,774 | 5,828 | 4,624 | 4,356 | 6,516 |

Notes: Robust standard errors in parentheses; *** p < 0.001, ** p < 0.01, * p < 0.05.
Table 9. Shopping Outcomes for Subgroups of App Adopters based on Offer Feature Usage

| Variable | Offer features used: Pre | Offer features used: Post | Offer features not used: Pre | Offer features not used: Post |
|---|---|---|---|---|
| Number of purchases | 1.55 | 2.61 | 1.39 | 2.06 |
| Monetary value of purchases | 136.27 | 191.58 | 114.19 | 168.59 |
| Number of returns | 0.16 | 0.32 | 0.10 | 0.23 |
| Monetary value of returns | 10.80 | 20.50 | 7.85 | 16.22 |
| Net monetary value of purchases | 125.47 | 171.08 | 106.34 | 152.37 |
| Monetary value of purchases per occasion | 87.92 | 73.40 | 82.15 | 81.84 |
Table 10. Shopping Outcomes for Subgroups of App Adopters based on Loyalty Feature Usage

| Variable | Loyalty reward features used: Pre | Loyalty reward features used: Post | Loyalty reward features not used: Pre | Loyalty reward features not used: Post |
|---|---|---|---|---|
| Number of purchases | 1.50 | 2.40 | 1.40 | 2.26 |
| Monetary value of purchases | 130.35 | 178.17 | 116.62 | 188.15 |
| Number of returns | 0.14 | 0.27 | 0.13 | 0.31 |
| Monetary value of returns | 8.95 | 17.99 | 10.67 | 19.89 |
| Net monetary value of purchases | 121.40 | 160.19 | 105.95 | 168.26 |
| Monetary value of purchases per occasion | 86.9 | 74.23 | 83.30 | 83.25 |
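The derived rows in Tables 9 and 10 can be checked with simple arithmetic. The Python sketch below assumes that the net monetary value is purchases minus returns and that the per-occasion value is the ratio of the group's mean purchase value to its mean number of purchases. The second assumption is ours: the authors may instead average shopper-level ratios, so small rounding differences from the tables are possible. The helper names are hypothetical.

```python
def value_per_occasion(monetary_value, number_of_purchases):
    """Mean purchase value divided by mean number of purchases."""
    return monetary_value / number_of_purchases


def net_value(purchases, returns):
    """Monetary value of purchases net of returns."""
    return purchases - returns


# Offer-feature users in Table 9:
# (number of purchases, monetary value of purchases, monetary value of returns).
offer_users = {"pre": (1.55, 136.27, 10.80), "post": (2.61, 191.58, 20.50)}

for period, (n_purchases, purchase_value, return_value) in offer_users.items():
    print(
        period,
        round(value_per_occasion(purchase_value, n_purchases), 2),
        round(net_value(purchase_value, return_value), 2),
    )
```

For offer-feature users, the per-occasion value falls from about 87.92 to about 73.40 even though the number of purchases rises, which is the pattern the per-occasion row of Table 9 reports.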
Figure 1. Mobile Apps and Shopper Choices
Figure 2. Model-free Evidence: Monetary Value of Purchases and Returns for App Adopters and Non-adopters
Figure 3. Propensity Score Matched Sample: Monetary Value of Purchases and Returns for App Adopters and Non-adopters
Figure 4. Distribution of Propensity Scores Pre- and Post-Matching
Figure 5. Model Free Evidence: Number of App Features and Monetary Value of Purchases and Returns
[Figure 5 panels. Horizontal axis in both panels: number of unique app features used. Vertical axes: monetary value of purchases ($) and monetary value of returns ($).]
The Effects of Mobile Apps on Shopper Purchases and Product Returns
WEB APPENDIX
Web Appendix A. Screenshots
Figure A1. App Screenshots on iPhone
Web Appendix B. Robustness Check for Alternative Samples, Periods and Common Trends in the Pre-period
In this section, we present the results for robustness checks relating to two alternative samples
(Tables B1-B2), alternative app adoption measures (Table B3), varying time periods (Table B4)
and the common trends plots (Figures B1-B4) for the treated and control groups.
The two samples are: (a) a random sample drawn for a different adoption date, February 1, 2015,
constructed in the same way as the December 1, 2014 sample in our main analysis, with app adopters
selected at a specific date of adoption, and (b) random samples drawn from each of the four months
from February to May 2015. In case (b), we treat the month of adoption as the post-period; the
implicit assumption is that the shoppers who adopted the app did so at the beginning of the month.
Such an aggregation approach induces a downward bias in our estimates because it assumes that
shoppers begin to show increased spending right at the beginning of the period, even when they
adopted later in the month (Manchanda et al. 2015).
Similar to our main estimation, we match app adopters and non-adopters using a rich set of
covariates in a binary logit model and subsequently carry out a difference-in-differences
estimation. The binary logit model specifications are tailored for best fit. For instance, for the
February 2015 sample, we matched the samples on each past month’s spending instead of
average past spending. We also replicate the analysis for an alternative control group sample of
random non-adopters. In Tables B1 and B2, we present the results for the two sets of samples
described earlier.
Next, in Table B3, we report the estimates for a refined measure of app adoption: app use
versus non-use, based on December adopters’ usage in the four months from January to April 2015.
These estimates indicate that the effect is indeed due to app use and is robust to individual
characteristics over time. Finally, in Table B4, we report the estimates for varying time windows
to rule out a possible novelty effect of the app. Specifically, we find robust results using
shorter (15-day) and longer (45- and 60-day) pre-post windows than the 30-day window used in
the main estimation.
In Figures B1-B4, we present the graphs showing the monetary values of purchases for
adopters and non-adopters before app adoption to illustrate that the common trends assumption
central to the difference-in-differences design holds.
Table B1. Difference-in-Differences Model Results for Feb 1, 2015 Sample
Treatment Effect Coeff. (Std. Err.)

| Variable | (A) Nearest neighbor matches | (B) Caliper matches | (C) Excluding outliers | (D) Excluding offer users |
|---|---|---|---|---|
| Number of purchases | 0.647*** (0.094) | 0.6153*** (0.0887) | 0.6417*** (0.092) | 0.5569*** (0.0952) |
| Monetary value of purchases | 36.521*** (9.064) | 34.7187*** (8.5425) | 41.6567*** (8.7232) | 29.1181** (9.1233) |
| Number of returns | 0.073* (0.03) | 0.088** (0.0287) | 0.0623* (0.0299) | 0.0449 (0.0309) |
| Monetary value of returns | 8.986*** (2.126) | 7.3061*** (2.0938) | 5.8777** (2.1032) | 5.887** (2.1571) |
| Net monetary value of purchases | 27.536*** (8.349) | 27.4126*** (7.8975) | 35.779*** (8.084) | 23.2311** (8.4989) |
| Number of individuals | 2,178 | 2,090 | 1,926 | 1,828 |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.
Table B2. Difference-in-Differences Model Results for February-May 2015 Samples
Treatment Effect Coeff. (Std. Err.)

| Variable | (A) February adopters | (B) March adopters | (C) April adopters | (D) May adopters |
|---|---|---|---|---|
| Number of purchases | 1.191*** (0.0862) | 1.032*** (0.0936) | 1.088*** (0.0818) | 1.152*** (0.0838) |
| Monetary value of purchases | 120.551*** (8.6463) | 107.667*** (9.8333) | 86.048*** (18.8776) | 71.956*** (7.2029) |
| Number of returns | 0.101** (0.0335) | 0.175*** (0.0332) | 0.131*** (0.0299) | 0.139*** (0.026) |
| Monetary value of returns | 8.658* (4.1254) | 18.449*** (3.2732) | 7.229** (2.6187) | 8.411*** (2.1607) |
| Net monetary value of purchases | 111.894*** (7.4844) | 89.217*** (8.5666) | 78.819*** (18.5387) | 63.546*** (6.4654) |
| Number of individuals | 3,180 | 2,750 | 2,804 | 2,646 |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.
Table B3. Results of Alternative Model Comparing App Users and Non-users
Effect of App Use Coeff. (Std. Err.)

| Variable | (A) Jan 2015 matches | (B) Feb 2015 matches | (C) March 2015 matches | (D) April 2015 matches |
|---|---|---|---|---|
| Number of purchases | 0.805*** (0.052) | 0.786*** (0.053) | 0.844*** (0.061) | 0.791*** (0.054) |
| Monetary value of purchases | 60.822*** (4.78) | 58.268*** (5.179) | 59.120*** (5.715) | 58.938*** (5.883) |
| Number of returns | 0.104*** (0.016) | 0.112*** (0.016) | 0.133*** (0.018) | 0.083*** (0.016) |
| Monetary value of returns | 6.939*** (1.194) | 9.418*** (1.544) | 7.665*** (1.372) | 6.840*** (1.543) |
| Net monetary value of purchases | 53.883*** (4.421) | 48.849*** (4.567) | 51.455*** (5.234) | 52.098*** (5.378) |
| Number of individuals | 348 | 298 | 273 | 234 |
| Number of observations | 4,872 | 4,172 | 3,822 | 3,276 |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses.
Table B4. Difference-in-Differences Model Results for Varying Periods Based on Time from Adoption
Treatment Effect Coeff. (Std. Err.)

| Variable | (A) 15 days pre and post | (B) 45 days pre and post | (C) 60 days pre and post |
|---|---|---|---|
| Number of purchases | 0.556*** (0.054) | 1.304*** (0.1) | 1.629*** (0.116) |
| Monetary value of purchases | 32.806*** (7.6) | 75.089*** (10.847) | 92.645*** (11.83) |
| Number of returns | 0.074*** (0.019) | 0.225*** (0.034) | 0.247*** (0.037) |
| Monetary value of returns | 5.773** (1.752) | 15.713*** (2.859) | 17.41*** (3.039) |
| Net monetary value of purchases | 27.033*** (7.11) | 59.376*** (9.945) | 75.235*** (10.836) |

Notes: *** p < 0.001, ** p < 0.01, * p < 0.05; robust standard errors are in parentheses; number of individuals for these models is the same as the main sample, that is, 3,258 treated and control individuals.
Figure B1. Purchase Trends Before App Adoption for the February 2015 Sample
[Line plot. Horizontal axis: Month (Aug-14 to Jan-15). Vertical axis: Monetary Value of Purchases ($), 0 to 120. Series: Non Adopters, Adopters.]
Figure B2. Purchase Trends Before App Adoption for the March 2015 Sample
[Line plot. Horizontal axis: Month (Aug-14 to Feb-15). Vertical axis: Monetary Value of Purchases ($), 0 to 100. Series: Non Adopters, Adopters.]
Figure B3. Purchase Trends Before App Adoption for the April 2015 Sample
[Line plot. Horizontal axis: Month (Aug-14 to Mar-15). Vertical axis: Monetary Value of Purchases ($), 0 to 120. Series: Non Adopters, Adopters.]
Figure B4. Purchase Trends Before App Adoption for the May 2015 Sample
[Line plot. Horizontal axis: Month (Aug-14 to Apr-15). Vertical axis: Monetary Value of Purchases ($), 0 to 100. Series: Non Adopters, Adopters.]
Web Appendix C. Tests for Propensity Score Matching
First, we report the results of the binomial logit model of app adoption used to compute
propensity scores. Table C1 presents the logit model coefficients for the likelihood
of a shopper becoming an app adopter. Shoppers who are more likely to adopt the retailer’s app
tend to be younger, male, online buyers, paid loyalty members on the day of app adoption, and
higher-frequency shoppers. We select this logit model after evaluating the model fit of several
other specifications, including probit and logit models with and without non-linear covariates.
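To make the matching step concrete, the following Python sketch converts logit indices into propensity scores and pairs each adopter with the closest unused non-adopter. It is a minimal illustration under our own assumptions: greedy one-to-one matching without replacement, an optional caliper, and hypothetical index values. The authors' exact matching routine is not specified in this appendix, and the function names are ours.

```python
import math


def propensity(linear_index):
    """Logistic transform of a logit index into an adoption probability."""
    return 1.0 / (1.0 + math.exp(-linear_index))


def nearest_neighbor_match(treated_scores, control_scores, caliper=None):
    """Greedy 1:1 matching without replacement on the propensity score.

    Returns a list of (treated_index, control_index) pairs. A treated unit
    stays unmatched if its nearest control lies outside the caliper.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break
        c_idx = min(available, key=lambda c: abs(control_scores[c] - t_score))
        if caliper is None or abs(control_scores[c_idx] - t_score) <= caliper:
            pairs.append((t_idx, c_idx))
            available.remove(c_idx)
    return pairs


# Hypothetical logit indices for three adopters and five non-adopters.
treated = [propensity(x) for x in (-0.8, -1.2, -0.2)]
controls = [propensity(x) for x in (-2.0, -1.1, -0.9, -0.3, -1.6)]
print(nearest_neighbor_match(treated, controls))
```

Greedy matching depends on the order in which adopters are processed, which is one reason to compare results across alternative matching methods.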
Next, we present the results of various post-matching checks. First, the Kolmogorov-Smirnov
test (Table C2) shows that the distributions of propensity scores of the matched treated and
control groups are statistically similar. Second, the percentage reduction in bias after matching
shows substantial improvements in covariate balance across app adopters and non-adopters,
thus making the two groups comparable (Table C3). Finally, the results are not sensitive to
potential hidden bias due to unobservables (Table C4) or to alternative matching
methods (Table C5).
We discuss these checks in detail next. The results in Table C2 show that the distribution of
propensity scores is nearly identical after matching. Table C3 shows the standardized bias before
and after matching. We calculate it as follows (Rosenbaum and Rubin 1983):
$$SB = \frac{\bar{X}_t - \bar{X}_c}{0.5 \times \left(SD^2_{X_t} + SD^2_{X_c}\right)}$$

where, for the standardized bias before matching $SB_{BM}$, the numerator is the difference between
the mean of covariate X for the treated individuals before matching ($\bar{X}_t$) and its mean for all
unmatched control individuals before matching ($\bar{X}_c$), and the denominator is the equally weighted
variance of the two groups. Likewise, the standardized bias after matching $SB_{AM}$ uses the treated means
and control means after matching. We then calculate the percentage reduction in bias as:
$$PRB = 100\left(1 - \frac{SB_{AM}}{SB_{BM}}\right)$$
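The two balance formulas can be sketched in a few lines of Python. The code follows the description in the text, in which the mean difference is divided by the equally weighted variance of the two groups. Some presentations of the standardized bias divide by the square root of that quantity instead; that variant is not shown here. The input values are hypothetical and are not taken from the paper, and the function names are ours.

```python
def standardized_bias(mean_treated, mean_control, var_treated, var_control):
    """Mean difference scaled by the equally weighted variance of the two groups."""
    return (mean_treated - mean_control) / (0.5 * (var_treated + var_control))


def percent_reduction_in_bias(sb_before, sb_after):
    """Percentage reduction in standardized bias achieved by matching."""
    return 100.0 * (1.0 - sb_after / sb_before)


# Hypothetical covariate summaries (means and variances), for illustration only.
sb_bm = standardized_bias(3.5, 3.2, 1.0, 1.2)    # before matching
sb_am = standardized_bias(3.5, 3.47, 1.0, 1.1)   # after matching

print(round(sb_bm, 3), round(sb_am, 3), round(percent_reduction_in_bias(sb_bm, sb_am), 1))
```

A negative result from `percent_reduction_in_bias` means that matching widened the gap on that covariate, which can happen when the groups were already close before matching.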
In Table C4, we summarize the results of a sensitivity test that assesses whether hidden bias is a cause for
concern. In this test, we vary the estimated odds of receiving the treatment to see how
much the estimated treatment effects may change. In other words, we check that the estimates are
robust to a range of possible “hidden bias.” According to Rosenbaum (2005), a sensitivity
analysis in an observational study asks what the unmeasured covariate would have to be like to
alter the conclusions of the study.
Suppose we have two individuals j and k. Assume they have the same observed covariates but different
chances of receiving the treatment. In other words, Xj and Xk are the same but Aj and Ak, the
probabilities of adopting the app, may differ. The odds that they adopt the app are Aj/(1-Aj)
and Ak/(1-Ak). We assume that the ratio of these odds is at most gamma, where gamma is the exponential
of delta, the indicator of hidden bias. For various gamma values starting with one, we calculate
bounds on the p-values that show the uncertainty due to hidden bias.
$$\frac{1}{e^{\Delta}} \le \frac{A_j(1 - A_k)}{A_k(1 - A_j)} \le e^{\Delta}$$
Let $\gamma = e^{\Delta}$. If gamma were exactly one, or equivalently delta exactly zero, then there would
be no hidden bias: if Xj and Xk are equal, then so are their odds of being treated.
Gamma is a measure of the degree of departure from a study free of hidden bias (Guo and Fraser
2014). Our test finds that for values of gamma from 1 through 2, our inference is unchanged; that is, the
study is insensitive to hidden bias in this range. In other words, gamma would have to exceed 2
to change the inference.
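The logic of these bounds can be illustrated with a simpler statistic than the one used in Table C4. The Python sketch below computes Rosenbaum-style bounds for a one-sided sign test on matched pairs, whereas the appendix reports bounds for the Wilcoxon signed-rank test. Under hidden bias of at most gamma, the chance that the treated member of a pair has the higher outcome under the null lies between 1/(1+gamma) and gamma/(1+gamma). The pair counts are hypothetical, and the function names are ours.

```python
from math import comb


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def sign_test_bounds(n_pairs, n_treated_higher, gamma):
    """Lower and upper bounds on the one-sided sign-test p-value for a given gamma."""
    p_low = 1.0 / (1.0 + gamma)
    p_high = gamma / (1.0 + gamma)
    return (
        binomial_upper_tail(n_pairs, n_treated_higher, p_low),
        binomial_upper_tail(n_pairs, n_treated_higher, p_high),
    )


# Hypothetical data: the adopter has the higher outcome in 70 of 100 matched pairs.
for gamma in (1.0, 1.5, 2.0):
    lower, upper = sign_test_bounds(100, 70, gamma)
    print(f"gamma={gamma}: lower={lower:.4g}, upper={upper:.4g}")
```

At gamma equal to one the two bounds coincide with the usual p-value. As gamma grows the interval widens, and the inference is called sensitive at the smallest gamma for which the upper bound crosses the chosen significance level.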
Table C1. App Adoption Model: Logit Estimates

| No. | Variable | Coeff. (Std. Err.) |
|---|---|---|
| 1 | (Intercept) | 67.187* (26.133) |
| 2 | ln(1+AGE) | -0.872*** (0.085) |
| 3 | GENDER (F=1, M=0) | -0.454*** (0.079) |
| 4 | ln(1+DIST) | 0.584* (0.252) |
| 5 | ln(1+NSTORES) | 1.122 (0.609) |
| 6 | ln(1+AREAPOPL) | 0.091 (0.359) |
| 7 | ln(1+COMPSTORE) | -0.044 (0.09) |
| 8 | ln(1+TENURE) | -24.709** (9.166) |
| 9 | LPROG | 0.549*** (0.058) |
| 10 | ONLINEBUYER | 0.424*** (0.116) |
| 11 | ln(1+PASTSPEND) | -0.098 (0.125) |
| 12 | ln(1+PASTRETURN) | -0.117 (0.086) |
| 13 | ln(1+NSTORES) Sq. | -0.538 (0.356) |
| 14 | ln(1+AREAPOPL) Sq. | -0.004 (0.019) |
| 15 | ln(1+TENURE) Sq. | 2.226* (0.801) |
| 16 | ln(1+DIST) Sq. | -0.135* (0.058) |
| 17 | ln(1+PASTSPEND) Sq. | 0.006 (0.018) |
| 18 | ln(1+PASTRETURN) Sq. | 0.024 (0.023) |
| 19 | ln(1+APF) | 2.371*** (0.317) |
| 20 | ln(1+APF) Sq. | -0.507** (0.168) |

Notes: Null deviance: 8,737.9 on 9,584 degrees of freedom; Residual deviance: 8,109.8 on 9,565 degrees of freedom; AIC: 8,149.8; Number of Fisher Scoring iterations: 5; Log likelihood: -4054.91; McFadden’s Pseudo R squared: 0.072; *** p < 0.001, ** p < 0.01, * p < 0.05.
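The fit statistics in the notes to Table C1 are linked by simple identities, which the Python sketch below checks. It assumes an ungrouped binary outcome, so that the residual deviance equals minus twice the log likelihood, and it counts 20 estimated parameters (the intercept plus the 19 covariate terms listed in the table). The variable names are ours.

```python
# Values reported in the notes to Table C1.
null_deviance = 8737.9
residual_deviance = 8109.8
n_parameters = 20  # intercept plus the 19 covariate terms in Table C1

# For an ungrouped binary logit, deviance = -2 * log likelihood.
log_likelihood = -residual_deviance / 2.0

# McFadden's pseudo R-squared compares the fitted and intercept-only models.
mcfadden_r2 = 1.0 - residual_deviance / null_deviance

# AIC = -2 * log likelihood + 2 * number of parameters.
aic = residual_deviance + 2 * n_parameters

print(round(log_likelihood, 2), round(mcfadden_r2, 3), round(aic, 1))
```

The computed values agree with the reported log likelihood, pseudo R squared of 0.072, and AIC of 8,149.8, up to rounding of the deviances.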
Table C2. KS Test Results
Two-sample Kolmogorov-Smirnov test
Before matching: D = 0.29467, p-value < 2.2e-16
After matching: D = 0.004911, p-value = 1
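The D statistic in Table C2 is the largest vertical gap between the empirical distribution functions of the propensity scores in the two groups. The Python sketch below computes it for two small samples. The scores are hypothetical, the function name is ours, and the p-value calculation is omitted.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical propensity scores for adopters and non-adopters.
adopters = [0.18, 0.22, 0.25, 0.31, 0.40]
non_adopters = [0.08, 0.11, 0.15, 0.21, 0.26]
print(ks_statistic(adopters, non_adopters))
```

A value of D near zero, as in the after-matching row of Table C2, indicates that the two score distributions nearly coincide.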
Table C3. Propensity Score Matching Results: Percentage Reduction in Bias After Matching

| Variable | Means treated (before matching) | Means control (before matching) | Means control (after matching) | Percent balance improvement |
|---|---|---|---|---|
| (Intercept) | 0.227 | 0.158 | 0.226 | 99.318 |
| ln(1+AGE) | 3.379 | 3.476 | 3.380 | 98.467 |
| GENDER (FEMALE) | 0.135 | 0.225 | 0.141 | 93.883 |
| ln(1+DIST) | 1.053 | 1.050 | 1.041 | -269.836 |
| ln(1+NSTORES) | 0.357 | 0.359 | 0.352 | -136.892 |
| ln(1+AREAPOPL) | 10.100 | 10.092 | 10.127 | -240.089 |
| ln(1+COMPSTORE) | 0.342 | 0.343 | 0.334 | -472.658 |
| ln(1+TENURE) | 6.086 | 6.074 | 6.087 | 87.540 |
| LPROG | 0.561 | 0.400 | 0.570 | 94.299 |
| ONLINEBUYER | 0.079 | 0.040 | 0.076 | 92.149 |
| ln(1+PASTSPEND) | 3.532 | 3.175 | 3.528 | 98.853 |
| ln(1+PASTRETURN) | 0.604 | 0.411 | 0.642 | 80.196 |
| ln(1+NSTORES) Sq. | 0.302 | 0.303 | 0.296 | -326.628 |
| ln(1+AREAPOPL) Sq. | 102.810 | 102.676 | 103.290 | -257.693 |
| ln(1+TENURE) Sq. | 37.053 | 36.919 | 37.069 | 88.140 |
| ln(1+DIST) Sq. | 2.239 | 2.260 | 2.145 | -349.191 |
| ln(1+PASTSPEND) Sq. | 13.699 | 11.196 | 13.687 | 99.507 |
| ln(1+PASTRETURN) Sq. | 1.957 | 1.268 | 2.063 | 84.568 |
| ln(1+APF) | 0.655 | 0.501 | 0.649 | 96.139 |
| ln(1+APF) Sq. | 0.557 | 0.327 | 0.541 | 93.251 |

Notes: 1,629 adopters are matched with 1,629 non-adopters out of a pool of 7,956 non-adopters pre-matching.
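Table C4. Hidden Bias Sensitivity Test Results
Rosenbaum Sensitivity Test for Wilcoxon Signed Rank P-Value
Unconfounded estimate: 0

| Gamma | Lower bound | Upper bound |
|---|---|---|
| 1 | 0 | 0 |
| 1.1 | 0 | 0 |
| 1.2 | 0 | 0 |
| 1.3 | 0 | 0 |
| 1.4 | 0 | 0 |
| 1.5 | 0 | 0 |
| 1.6 | 0 | 0 |
| 1.7 | 0 | 0 |
| 1.8 | 0 | 0 |
| 1.9 | 0 | 0 |
| 2 | 0 | 0 |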
Table C5. Results of Difference-in-Differences Model with Different Matching Methods
Treatment Effect Coeff. (Std. Err.)

| Variable | (M1) Matching (Mahalanobis metric) | (M2) Matching (calipers) | (M3) Matching (common support with trimming) |
|---|---|---|---|
| Number of purchases | 0.799*** (0.083) | 0.823*** (0.082) | 0.899*** (0.084) |
| Monetary value of purchases | 39.61*** (9.785) | 45.294*** (9.485) | 48.157*** (9.949) |
| Number of returns | 0.139*** (0.028) | 0.137*** (0.027) | 0.14*** (0.027) |
| Value of returns | 9.446*** (2.286) | 11.109*** (2.253) | 10.052*** (2.337) |
| Net monetary value of purchases | 30.164** (9.092) | 34.186*** (8.862) | 56.777*** (6.583) |
| Number of observations | 6,516 | 6,404 | 6,484 |

Note: *** p < 0.001, ** p < 0.01.
Web Appendix D. Selection on Unobservables
In this section, we present (a) the results of the first-stage probit model used for the Heckman
correction in Table D1, and (b) the results for an alternative difference-in-differences using
future treated cohorts of app adopters as controls in Table D2.
Table D1. First-Stage Probit Model Results
DV = App Adoption

| Variable | Coeff. (Std. Err.) |
|---|---|
| Wireless network (WIRENET)* | 0.196** (0.092) |
| Online buying (ONLINEBUYER)* | 0.381*** (0.066) |
| Symmetry in upload and download speeds (SYMSPEED)* | -0.255** (0.103) |
| Age | -0.015*** (0.001) |
| Gender | -0.256*** (0.042) |
| Tenure | 0.002*** (0.0000) |
| Distance | -0.002 (0.002) |
| Loyalty program level | 0.35*** (0.031) |
| Competitor stores | -0.017 (0.024) |
| Intercept | -1.152*** (0.14) |

Notes: *** p < 0.01, ** p < 0.05; * indicates exclusion restrictions.
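The first-stage probit yields a linear index for each shopper, from which the inverse Mills ratio (IMR) used as a covariate in the second stage can be computed. The Python sketch below shows one common convention, in which adopters receive the ratio of the normal density to the normal distribution function and non-adopters receive the corresponding negative term. This convention is an assumption on our part: the appendix does not spell out the exact form used. The index value in the usage lines is hypothetical, and the function names are ours.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index, adopted):
    """Selection-correction term from a probit index.

    Uses pdf/cdf for adopters and -pdf/(1 - cdf) for non-adopters.
    """
    if adopted:
        return normal_pdf(index) / normal_cdf(index)
    return -normal_pdf(index) / (1.0 - normal_cdf(index))


# Hypothetical probit index for one shopper.
index = -0.4
print(round(inverse_mills_ratio(index, adopted=True), 3))
print(round(inverse_mills_ratio(index, adopted=False), 3))
```

An IMR coefficient that is small and statistically insignificant in the second stage, as in Panel C of Table 5, suggests limited selection on unobservables once the included covariates are accounted for.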
Table D2. Alternative Difference-in-Differences Model Results with Future App Adopters as Control Group
Treatment Effect Coeff. (Std. Err.)

| Variable | Unmatched sample | Matched sample |
|---|---|---|
| Number of purchases | 0.758**** (0.076) | 0.688**** (0.097) |
| Monetary value of purchases | 37.236** (11.749) | 34.718*** (11.646) |
| Number of returns | 0.122**** (0.026) | 0.108*** (0.036) |
| Monetary value of returns | 7.216**** (2.207) | 5.25* (2.871) |
| Net monetary value of purchases | 30.021* (11.236) | 29.468*** (10.664) |

Notes: **** p < 0.001, *** p < 0.01, ** p < 0.05, * p < 0.10; in this method, future app adopters from Feb-May 2015 are used as controls for current app adopters from December 2014.